Message ID: 20230304070133.1134975-1-yukuai1@huaweicloud.com
State: New
Headers:
Return-Path: <linux-kernel-owner@vger.kernel.org>
From: Yu Kuai <yukuai1@huaweicloud.com>
To: song@kernel.org, jgq516@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH -next] raid10: fix leak of io accounting
Date: Sat, 4 Mar 2023 15:01:33 +0800
Message-Id: <20230304070133.1134975-1-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.31.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Series: [-next] raid10: fix leak of io accounting
Commit Message
Yu Kuai
March 4, 2023, 7:01 a.m. UTC
From: Yu Kuai <yukuai3@huawei.com>

handle_read_error() will resubmit r10_bio via raid10_read_request(), which
will call bio_start_io_acct() again, while bio_end_io_acct() will only
be called once.

Fix the problem by not accounting io again from handle_read_error().

Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid10.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
Comments
Hi,

What do you mean 'leak' here?

On 3/4/23 15:01, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
>
> handle_read_error() will resubmit r10_bio via raid10_read_request(), which
> will call bio_start_io_acct() again, while bio_end_io_acct() will only
> be called once.
>
> Fix the problem by not accounting io again from handle_read_error().

My understanding is it caused inaccurate io stats for a bio which had a read
error.

> Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting")
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/raid10.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 6c66357f92f5..4f8edb6ea3e2 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1173,7 +1173,7 @@ static bool regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>  }
>
>  static void raid10_read_request(struct mddev *mddev, struct bio *bio,
> -				struct r10bio *r10_bio)
> +				struct r10bio *r10_bio, bool handle_error)
>  {
>  	struct r10conf *conf = mddev->private;
>  	struct bio *read_bio;
> @@ -1244,7 +1244,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>  	}
>  	slot = r10_bio->read_slot;
>
> -	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
> +	if (!handle_error && blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>  		r10_bio->start_time = bio_start_io_acct(bio);

I think a simpler way is just to check R10BIO_ReadError here.

Thanks,
Guoqing
Hi,

On 2023/03/09 14:36, Guoqing Jiang wrote:
> Hi,
>
> What do you mean 'leak' here?

I mean that the inflight counting is leaked, because it is increased
twice for one io.

> On 3/4/23 15:01, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> handle_read_error() will resubmit r10_bio via raid10_read_request(), which
>> will call bio_start_io_acct() again, while bio_end_io_acct() will only
>> be called once.
>>
>> Fix the problem by not accounting io again from handle_read_error().
>
> My understanding is it caused inaccurate io stats for a bio which had a
> read error.
>
>> Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting")
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>  drivers/md/raid10.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 6c66357f92f5..4f8edb6ea3e2 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -1173,7 +1173,7 @@ static bool regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>>  }
>>
>>  static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>> -				struct r10bio *r10_bio)
>> +				struct r10bio *r10_bio, bool handle_error)
>>  {
>>  	struct r10conf *conf = mddev->private;
>>  	struct bio *read_bio;
>> @@ -1244,7 +1244,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>  	}
>>  	slot = r10_bio->read_slot;
>>
>> -	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>> +	if (!handle_error && blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>>  		r10_bio->start_time = bio_start_io_acct(bio);
>
> I think a simpler way is just to check R10BIO_ReadError here.

No, I'm afraid this is incorrect because handle_read_error() clears the
state before resubmitting the r10_bio.

Thanks,
Kuai
On 3/9/23 14:56, Yu Kuai wrote:
> Hi,
>
> On 2023/03/09 14:36, Guoqing Jiang wrote:
>> Hi,
>>
>> What do you mean 'leak' here?
>
> I mean that the inflight counting is leaked, because it is increased
> twice for one io.

How about changing the subject to something like:

'md/raid10: Don't call bio_start_io_acct twice for bio which experienced
read error'

>> On 3/4/23 15:01, Yu Kuai wrote:
>>> From: Yu Kuai <yukuai3@huawei.com>
>>>
>>> handle_read_error() will resubmit r10_bio via raid10_read_request(), which
>>> will call bio_start_io_acct() again, while bio_end_io_acct() will only
>>> be called once.
>>>
>>> Fix the problem by not accounting io again from handle_read_error().
>>
>> My understanding is it caused inaccurate io stats for a bio which had a
>> read error.
>>
>>> Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting")
>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>> ---
>>>  drivers/md/raid10.c | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index 6c66357f92f5..4f8edb6ea3e2 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -1173,7 +1173,7 @@ static bool regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>>>  }
>>>
>>>  static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>> -				struct r10bio *r10_bio)
>>> +				struct r10bio *r10_bio, bool handle_error)
>>>  {
>>>  	struct r10conf *conf = mddev->private;
>>>  	struct bio *read_bio;
>>> @@ -1244,7 +1244,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>>  	}
>>>  	slot = r10_bio->read_slot;
>>>
>>> -	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>>> +	if (!handle_error && blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>>>  		r10_bio->start_time = bio_start_io_acct(bio);
>>
>> I think a simpler way is just to check R10BIO_ReadError here.
>
> No, I'm afraid this is incorrect because handle_read_error() clears the
> state before resubmitting the r10_bio.

Right,

Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>

Thanks,
Guoqing
Hi,

On 2023/03/09 15:27, Guoqing Jiang wrote:
> On 3/9/23 14:56, Yu Kuai wrote:
>> Hi,
>>
>> On 2023/03/09 14:36, Guoqing Jiang wrote:
>>> Hi,
>>>
>>> What do you mean 'leak' here?
>>
>> I mean that the inflight counting is leaked, because it is increased
>> twice for one io.
>
> How about changing the subject to something like:
>
> 'md/raid10: Don't call bio_start_io_acct twice for bio which experienced
> read error'

Of course, I'll change that in v2.

>>> On 3/4/23 15:01, Yu Kuai wrote:
>>>> From: Yu Kuai <yukuai3@huawei.com>
>>>>
>>>> handle_read_error() will resubmit r10_bio via raid10_read_request(), which
>>>> will call bio_start_io_acct() again, while bio_end_io_acct() will only
>>>> be called once.
>>>>
>>>> Fix the problem by not accounting io again from handle_read_error().
>>>
>>> My understanding is it caused inaccurate io stats for a bio which had a
>>> read error.
>>>
>>>> Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting")
>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>> ---
>>>>  drivers/md/raid10.c | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>>> index 6c66357f92f5..4f8edb6ea3e2 100644
>>>> --- a/drivers/md/raid10.c
>>>> +++ b/drivers/md/raid10.c
>>>> @@ -1173,7 +1173,7 @@ static bool regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>>>>  }
>>>>
>>>>  static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>>> -				struct r10bio *r10_bio)
>>>> +				struct r10bio *r10_bio, bool handle_error)
>>>>  {
>>>>  	struct r10conf *conf = mddev->private;
>>>>  	struct bio *read_bio;
>>>> @@ -1244,7 +1244,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>>>  	}
>>>>  	slot = r10_bio->read_slot;
>>>>
>>>> -	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>>>> +	if (!handle_error && blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>>>>  		r10_bio->start_time = bio_start_io_acct(bio);
>>>
>>> I think a simpler way is just to check R10BIO_ReadError here.
>>
>> No, I'm afraid this is incorrect because handle_read_error() clears the
>> state before resubmitting the r10_bio.
>
> Right,
>
> Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>

Thanks for the review.

Kuai
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6c66357f92f5..4f8edb6ea3e2 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1173,7 +1173,7 @@ static bool regular_request_wait(struct mddev *mddev, struct r10conf *conf,
 }
 
 static void raid10_read_request(struct mddev *mddev, struct bio *bio,
-				struct r10bio *r10_bio)
+				struct r10bio *r10_bio, bool handle_error)
 {
 	struct r10conf *conf = mddev->private;
 	struct bio *read_bio;
@@ -1244,7 +1244,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
 	}
 	slot = r10_bio->read_slot;
 
-	if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
+	if (!handle_error && blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
 		r10_bio->start_time = bio_start_io_acct(bio);
 
 	read_bio = bio_alloc_clone(rdev->bdev, bio, gfp, &mddev->bio_set);
@@ -1578,7 +1578,7 @@ static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
 		conf->geo.raid_disks);
 
 	if (bio_data_dir(bio) == READ)
-		raid10_read_request(mddev, bio, r10_bio);
+		raid10_read_request(mddev, bio, r10_bio, false);
 	else
 		raid10_write_request(mddev, bio, r10_bio);
 }
@@ -2980,7 +2980,7 @@ static void handle_read_error(struct mddev *mddev, struct r10bio *r10_bio)
 	rdev_dec_pending(rdev, mddev);
 	allow_barrier(conf);
 	r10_bio->state = 0;
-	raid10_read_request(mddev, r10_bio->master_bio, r10_bio);
+	raid10_read_request(mddev, r10_bio->master_bio, r10_bio, true);
 }
 
 static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)