From patchwork Wed Nov 8 18:02:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 162923
From: Yu Kuai
To: xni@redhat.com, song@kernel.org, maan@systemlinux.org, neilb@suse.de
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
 yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
 yangerkun@huawei.com
Subject: [PATCH -next] md: synchronize flush io with array reconfiguration
Date: Thu, 9 Nov 2023 02:02:10 +0800
Message-Id: <20231108180210.3657203-1-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Yu Kuai

Currently rcu is used to protect iterating rdev from submit_flushes(),
but this does not prevent the following race with
remove_and_add_spares():

submit_flushes                      remove_and_add_spares
                                     synchronize_rcu
                                     pers->hot_remove_disk()
 rcu_read_lock()
 rdev_for_each_rcu
  if (rdev->raid_disk >= 0)
                                     rdev->raid_disk = -1;
   atomic_inc(&rdev->nr_pending)
   rcu_read_unlock()
   bi = bio_alloc_bioset()
   bi->bi_end_io = md_end_flush
   bi->bi_private = rdev
   submit_bio
   // issue io for removed rdev

Fix this problem by grabbing 'active_io' before iterating rdev, to make
sure that remove_and_add_spares() won't run concurrently with
submit_flushes().

Fixes: a2826aa92e2e ("md: support barrier requests on all personalities.")
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4ee4593c874a..eb3e455bcbae 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -529,6 +529,9 @@ static void md_end_flush(struct bio *bio)
 	rdev_dec_pending(rdev, mddev);
 
 	if (atomic_dec_and_test(&mddev->flush_pending)) {
+		/* The pair is percpu_ref_tryget() from md_flush_request() */
+		percpu_ref_put(&mddev->active_io);
+
 		/* The pre-request flush has finished */
 		queue_work(md_wq, &mddev->flush_work);
 	}
@@ -548,12 +551,8 @@ static void submit_flushes(struct work_struct *ws)
 	rdev_for_each_rcu(rdev, mddev)
 		if (rdev->raid_disk >= 0 &&
 		    !test_bit(Faulty, &rdev->flags)) {
-			/* Take two references, one is dropped
-			 * when request finishes, one after
-			 * we reclaim rcu_read_lock
-			 */
 			struct bio *bi;
-			atomic_inc(&rdev->nr_pending);
+
 			atomic_inc(&rdev->nr_pending);
 			rcu_read_unlock();
 			bi = bio_alloc_bioset(rdev->bdev, 0,
@@ -564,7 +563,6 @@ static void submit_flushes(struct work_struct *ws)
 			atomic_inc(&mddev->flush_pending);
 			submit_bio(bi);
 			rcu_read_lock();
-			rdev_dec_pending(rdev, mddev);
 		}
 	rcu_read_unlock();
 	if (atomic_dec_and_test(&mddev->flush_pending))
@@ -617,6 +615,17 @@ bool md_flush_request(struct mddev *mddev, struct bio *bio)
 	/* new request after previous flush is completed */
 	if (ktime_after(req_start, mddev->prev_flush_start)) {
 		WARN_ON(mddev->flush_bio);
+		/*
+		 * Grab a reference to make sure mddev_suspend() will wait for
+		 * this flush to be done.
+		 *
+		 * md_flush_request() is called under md_handle_request() and
+		 * 'active_io' is already grabbed, hence percpu_ref_tryget()
+		 * won't fail, percpu_ref_tryget_live() can't be used because
+		 * percpu_ref_kill() can be called by mddev_suspend()
+		 * concurrently.
+		 */
+		percpu_ref_tryget(&mddev->active_io);
 		mddev->flush_bio = bio;
 		bio = NULL;
 	}
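
Note for readers: the reference pairing this patch relies on can be
modelled in user space. The sketch below is only an illustration, not
kernel code. Every model_* name is hypothetical, plain C11 atomics stand
in for the percpu_ref 'active_io', and it assumes that reconfiguration
may proceed only once 'active_io' has dropped to zero, which is the
condition mddev_suspend() is described as waiting for. The reference is
taken once per flush request and dropped by whichever flush completion
brings 'flush_pending' to zero.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the relevant mddev fields. */
struct model_mddev {
	atomic_int active_io;     /* models the percpu_ref 'active_io' */
	atomic_int flush_pending; /* flush bios still in flight */
};

/* Models md_flush_request(): pin the array for the whole flush. */
void model_flush_request(struct model_mddev *m, int nr_rdevs)
{
	atomic_fetch_add(&m->active_io, 1);
	atomic_store(&m->flush_pending, nr_rdevs);
}

/* Models md_end_flush(): only the last completion drops the reference. */
void model_end_flush(struct model_mddev *m)
{
	/* atomic_fetch_sub() returns the old value; 1 means we hit zero. */
	if (atomic_fetch_sub(&m->flush_pending, 1) == 1)
		atomic_fetch_sub(&m->active_io, 1);
}

/* Models the condition a suspend would wait for before reconfiguring. */
bool model_can_reconfigure(struct model_mddev *m)
{
	return atomic_load(&m->active_io) == 0;
}
```

With two member devices, model_can_reconfigure() stays false after the
first model_end_flush() and becomes true only after the second, so a
device cannot be removed while a flush bio for it is still outstanding.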