From patchwork Wed Feb 28 11:43:23 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 207804
From: Yu Kuai
To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH md-6.9 v3 01/11] md: add a new helper rdev_has_badblock()
Date: Wed, 28 Feb 2024 19:43:23 +0800
Message-Id: <20240228114333.527222-2-yukuai1@huaweicloud.com>
In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com>
References: <20240228114333.527222-1-yukuai1@huaweicloud.com>
From: Yu Kuai

The current API is_badblock() must be passed 'first_bad' and 'bad_sectors'; however, many callers only want to know whether there are bad blocks or not, and those callers have to define two local variables that are never used. Add a new helper rdev_has_badblock() that only returns whether there are bad blocks or not, remove the unnecessary local variables, and replace is_badblock() with the new helper in many places.

There are no functional changes, and the new helper will also be used later to refactor read_balance(). (A small user-space sketch of the call-site simplification appears after this patch's diff.)

Co-developed-by: Paul Luse
Signed-off-by: Paul Luse
Signed-off-by: Yu Kuai
Reviewed-by: Xiao Ni
---
drivers/md/md.h | 10 ++++++++++ drivers/md/raid1.c | 26 +++++++------------------- drivers/md/raid10.c | 45 ++++++++++++++------------------------------- drivers/md/raid5.c | 35 +++++++++++++---------------------- 4 files changed, 44 insertions(+), 72 deletions(-) diff --git a/drivers/md/md.h b/drivers/md/md.h index 8d881cc59799..a49ab04ab707 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -222,6 +222,16 @@ static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors, } return 0; } + +static inline int rdev_has_badblock(struct md_rdev *rdev, sector_t s, + int sectors) +{ + sector_t first_bad; + int bad_sectors; + + return is_badblock(rdev, s, sectors, &first_bad, &bad_sectors); +} + extern int rdev_set_badblocks(struct md_rdev *rdev, sector_t s, int sectors, int is_new); extern int rdev_clear_badblocks(struct md_rdev *rdev, sector_t s, int sectors, diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 286f8b16c7bd..a145fe48b9ce 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -498,9 +498,6 @@ static void raid1_end_write_request(struct bio *bio) * to user-side. So if something waits for IO, then it * will wait for the 'master' bio.
*/ - sector_t first_bad; - int bad_sectors; - r1_bio->bios[mirror] = NULL; to_put = bio; /* @@ -516,8 +513,8 @@ static void raid1_end_write_request(struct bio *bio) set_bit(R1BIO_Uptodate, &r1_bio->state); /* Maybe we can clear some bad blocks. */ - if (is_badblock(rdev, r1_bio->sector, r1_bio->sectors, - &first_bad, &bad_sectors) && !discard_error) { + if (rdev_has_badblock(rdev, r1_bio->sector, r1_bio->sectors) && + !discard_error) { r1_bio->bios[mirror] = IO_MADE_GOOD; set_bit(R1BIO_MadeGood, &r1_bio->state); } @@ -1944,8 +1941,6 @@ static void end_sync_write(struct bio *bio) struct r1bio *r1_bio = get_resync_r1bio(bio); struct mddev *mddev = r1_bio->mddev; struct r1conf *conf = mddev->private; - sector_t first_bad; - int bad_sectors; struct md_rdev *rdev = conf->mirrors[find_bio_disk(r1_bio, bio)].rdev; if (!uptodate) { @@ -1955,14 +1950,11 @@ static void end_sync_write(struct bio *bio) set_bit(MD_RECOVERY_NEEDED, & mddev->recovery); set_bit(R1BIO_WriteError, &r1_bio->state); - } else if (is_badblock(rdev, r1_bio->sector, r1_bio->sectors, - &first_bad, &bad_sectors) && - !is_badblock(conf->mirrors[r1_bio->read_disk].rdev, - r1_bio->sector, - r1_bio->sectors, - &first_bad, &bad_sectors) - ) + } else if (rdev_has_badblock(rdev, r1_bio->sector, r1_bio->sectors) && + !rdev_has_badblock(conf->mirrors[r1_bio->read_disk].rdev, + r1_bio->sector, r1_bio->sectors)) { set_bit(R1BIO_MadeGood, &r1_bio->state); + } put_sync_write_buf(r1_bio, uptodate); } @@ -2279,16 +2271,12 @@ static void fix_read_error(struct r1conf *conf, struct r1bio *r1_bio) s = PAGE_SIZE >> 9; do { - sector_t first_bad; - int bad_sectors; - rdev = conf->mirrors[d].rdev; if (rdev && (test_bit(In_sync, &rdev->flags) || (!test_bit(Faulty, &rdev->flags) && rdev->recovery_offset >= sect + s)) && - is_badblock(rdev, sect, s, - &first_bad, &bad_sectors) == 0) { + rdev_has_badblock(rdev, sect, s) == 0) { atomic_inc(&rdev->nr_pending); if (sync_page_io(rdev, sect, s<<9, conf->tmppage, REQ_OP_READ, false)) diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 7412066ea22c..d5a7a621f0f0 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -518,11 +518,7 @@ static void raid10_end_write_request(struct bio *bio) * The 'master' represents the composite IO operation to * user-side. So if something waits for IO, then it will * wait for the 'master' bio. - */ - sector_t first_bad; - int bad_sectors; - - /* + * * Do not set R10BIO_Uptodate if the current device is * rebuilding or Faulty. This is because we cannot use * such device for properly reading the data back (we could @@ -535,10 +531,9 @@ static void raid10_end_write_request(struct bio *bio) set_bit(R10BIO_Uptodate, &r10_bio->state); /* Maybe we can clear some bad blocks. 
*/ - if (is_badblock(rdev, - r10_bio->devs[slot].addr, - r10_bio->sectors, - &first_bad, &bad_sectors) && !discard_error) { + if (rdev_has_badblock(rdev, r10_bio->devs[slot].addr, + r10_bio->sectors) && + !discard_error) { bio_put(bio); if (repl) r10_bio->devs[slot].repl_bio = IO_MADE_GOOD; @@ -1330,10 +1325,7 @@ static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio) } if (rdev && test_bit(WriteErrorSeen, &rdev->flags)) { - sector_t first_bad; sector_t dev_sector = r10_bio->devs[i].addr; - int bad_sectors; - int is_bad; /* * Discard request doesn't care the write result @@ -1342,9 +1334,8 @@ static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio) if (!r10_bio->sectors) continue; - is_bad = is_badblock(rdev, dev_sector, r10_bio->sectors, - &first_bad, &bad_sectors); - if (is_bad < 0) { + if (rdev_has_badblock(rdev, dev_sector, + r10_bio->sectors) < 0) { /* * Mustn't write here until the bad block * is acknowledged @@ -2290,8 +2281,6 @@ static void end_sync_write(struct bio *bio) struct mddev *mddev = r10_bio->mddev; struct r10conf *conf = mddev->private; int d; - sector_t first_bad; - int bad_sectors; int slot; int repl; struct md_rdev *rdev = NULL; @@ -2312,11 +2301,10 @@ static void end_sync_write(struct bio *bio) &rdev->mddev->recovery); set_bit(R10BIO_WriteError, &r10_bio->state); } - } else if (is_badblock(rdev, - r10_bio->devs[slot].addr, - r10_bio->sectors, - &first_bad, &bad_sectors)) + } else if (rdev_has_badblock(rdev, r10_bio->devs[slot].addr, + r10_bio->sectors)) { set_bit(R10BIO_MadeGood, &r10_bio->state); + } rdev_dec_pending(rdev, mddev); @@ -2597,11 +2585,8 @@ static void recovery_request_write(struct mddev *mddev, struct r10bio *r10_bio) static int r10_sync_page_io(struct md_rdev *rdev, sector_t sector, int sectors, struct page *page, enum req_op op) { - sector_t first_bad; - int bad_sectors; - - if (is_badblock(rdev, sector, sectors, &first_bad, &bad_sectors) - && (op == REQ_OP_READ || test_bit(WriteErrorSeen, &rdev->flags))) + if (rdev_has_badblock(rdev, sector, sectors) && + (op == REQ_OP_READ || test_bit(WriteErrorSeen, &rdev->flags))) return -1; if (sync_page_io(rdev, sector, sectors << 9, page, op, false)) /* success */ @@ -2658,16 +2643,14 @@ static void fix_read_error(struct r10conf *conf, struct mddev *mddev, struct r10 s = PAGE_SIZE >> 9; do { - sector_t first_bad; - int bad_sectors; - d = r10_bio->devs[sl].devnum; rdev = conf->mirrors[d].rdev; if (rdev && test_bit(In_sync, &rdev->flags) && !test_bit(Faulty, &rdev->flags) && - is_badblock(rdev, r10_bio->devs[sl].addr + sect, s, - &first_bad, &bad_sectors) == 0) { + rdev_has_badblock(rdev, + r10_bio->devs[sl].addr + sect, + s) == 0) { atomic_inc(&rdev->nr_pending); success = sync_page_io(rdev, r10_bio->devs[sl].addr + diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 14f2cf75abbd..9241e95ef55c 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -1210,10 +1210,8 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s) */ while (op_is_write(op) && rdev && test_bit(WriteErrorSeen, &rdev->flags)) { - sector_t first_bad; - int bad_sectors; - int bad = is_badblock(rdev, sh->sector, RAID5_STRIPE_SECTORS(conf), - &first_bad, &bad_sectors); + int bad = rdev_has_badblock(rdev, sh->sector, + RAID5_STRIPE_SECTORS(conf)); if (!bad) break; @@ -2855,8 +2853,6 @@ static void raid5_end_write_request(struct bio *bi) struct r5conf *conf = sh->raid_conf; int disks = sh->disks, i; struct md_rdev *rdev; - sector_t first_bad; - int bad_sectors; int replacement = 
0; for (i = 0 ; i < disks; i++) { @@ -2888,9 +2884,8 @@ static void raid5_end_write_request(struct bio *bi) if (replacement) { if (bi->bi_status) md_error(conf->mddev, rdev); - else if (is_badblock(rdev, sh->sector, - RAID5_STRIPE_SECTORS(conf), - &first_bad, &bad_sectors)) + else if (rdev_has_badblock(rdev, sh->sector, + RAID5_STRIPE_SECTORS(conf))) set_bit(R5_MadeGoodRepl, &sh->dev[i].flags); } else { if (bi->bi_status) { @@ -2900,9 +2895,8 @@ static void raid5_end_write_request(struct bio *bi) if (!test_and_set_bit(WantReplacement, &rdev->flags)) set_bit(MD_RECOVERY_NEEDED, &rdev->mddev->recovery); - } else if (is_badblock(rdev, sh->sector, - RAID5_STRIPE_SECTORS(conf), - &first_bad, &bad_sectors)) { + } else if (rdev_has_badblock(rdev, sh->sector, + RAID5_STRIPE_SECTORS(conf))) { set_bit(R5_MadeGood, &sh->dev[i].flags); if (test_bit(R5_ReadError, &sh->dev[i].flags)) /* That was a successful write so make @@ -4674,8 +4668,6 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s) /* Now to look around and see what can be done */ for (i=disks; i--; ) { struct md_rdev *rdev; - sector_t first_bad; - int bad_sectors; int is_bad = 0; dev = &sh->dev[i]; @@ -4719,8 +4711,8 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s) rdev = conf->disks[i].replacement; if (rdev && !test_bit(Faulty, &rdev->flags) && rdev->recovery_offset >= sh->sector + RAID5_STRIPE_SECTORS(conf) && - !is_badblock(rdev, sh->sector, RAID5_STRIPE_SECTORS(conf), - &first_bad, &bad_sectors)) + !rdev_has_badblock(rdev, sh->sector, + RAID5_STRIPE_SECTORS(conf))) set_bit(R5_ReadRepl, &dev->flags); else { if (rdev && !test_bit(Faulty, &rdev->flags)) @@ -4733,8 +4725,8 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s) if (rdev && test_bit(Faulty, &rdev->flags)) rdev = NULL; if (rdev) { - is_bad = is_badblock(rdev, sh->sector, RAID5_STRIPE_SECTORS(conf), - &first_bad, &bad_sectors); + is_bad = rdev_has_badblock(rdev, sh->sector, + RAID5_STRIPE_SECTORS(conf)); if (s->blocked_rdev == NULL && (test_bit(Blocked, &rdev->flags) || is_bad < 0)) { @@ -5463,8 +5455,8 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio) struct r5conf *conf = mddev->private; struct bio *align_bio; struct md_rdev *rdev; - sector_t sector, end_sector, first_bad; - int bad_sectors, dd_idx; + sector_t sector, end_sector; + int dd_idx; bool did_inc; if (!in_chunk_boundary(mddev, raid_bio)) { @@ -5493,8 +5485,7 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio) atomic_inc(&rdev->nr_pending); - if (is_badblock(rdev, sector, bio_sectors(raid_bio), &first_bad, - &bad_sectors)) { + if (rdev_has_badblock(rdev, sector, bio_sectors(raid_bio))) { rdev_dec_pending(rdev, mddev); return 0; } From patchwork Wed Feb 28 11:43:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 207823 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp3303010dyb; Wed, 28 Feb 2024 04:12:52 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCWZDeXgWotiAtCsbYjE+9Lflu5pZ2OcqJiUoY4k+vg9YIBe1PB1aCqOVI4S3PKt9TpXr4oujzLCgSElqLpzRJu++R+HEA== X-Google-Smtp-Source: AGHT+IEfbLzos/G/YfijFMoWgZCqOTH4hB8k3t9Dw/RuzdimbZekSaDNH8eHlF7GciHaacfZkB9z X-Received: by 2002:a05:6808:140c:b0:3bf:db6b:9a7d with SMTP id w12-20020a056808140c00b003bfdb6b9a7dmr5952117oiv.8.1709122372035; Wed, 28 Feb 2024 04:12:52 -0800 (PST) ARC-Seal: 
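To make the call-site simplification from patch 01 concrete, here is a minimal user-space sketch (not kernel code: the md_rdev struct, the single bad range, and the is_badblock() stub are invented stand-ins). It contrasts the old pattern, where every caller declared throw-away 'first_bad'/'bad_sectors' locals, with the new rdev_has_badblock() style.

/* Minimal user-space sketch of the call-site simplification from patch 01.
 * The types and the is_badblock() stub below are stand-ins, not the kernel API. */
#include <stdio.h>

typedef unsigned long long sector_t;

struct md_rdev { sector_t bad_start; int bad_len; };

/* stand-in for the kernel's is_badblock(): fills the out-parameters */
static int is_badblock(struct md_rdev *rdev, sector_t s, int sectors,
                       sector_t *first_bad, int *bad_sectors)
{
	if (s + sectors > rdev->bad_start && s < rdev->bad_start + rdev->bad_len) {
		*first_bad = rdev->bad_start;
		*bad_sectors = rdev->bad_len;
		return 1;
	}
	return 0;
}

/* the wrapper the patch adds: callers that only need a yes/no answer
 * no longer have to declare dummy out-parameters */
static int rdev_has_badblock(struct md_rdev *rdev, sector_t s, int sectors)
{
	sector_t first_bad;
	int bad_sectors;

	return is_badblock(rdev, s, sectors, &first_bad, &bad_sectors);
}

int main(void)
{
	struct md_rdev rdev = { .bad_start = 100, .bad_len = 8 };

	/* old style: two throw-away locals at every call site */
	sector_t first_bad;
	int bad_sectors;
	printf("old: %d\n", is_badblock(&rdev, 96, 16, &first_bad, &bad_sectors));

	/* new style: intent is obvious, no unused locals */
	printf("new: %d\n", rdev_has_badblock(&rdev, 96, 16));
	return 0;
}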
From: Yu Kuai
To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH md-6.9 v3 02/11] md/raid1: factor out helpers to add rdev to conf
Date: Wed, 28 Feb 2024 19:43:24 +0800
Message-Id: <20240228114333.527222-3-yukuai1@huaweicloud.com>
In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com>
References: <20240228114333.527222-1-yukuai1@huaweicloud.com>

From: Yu Kuai

There are no functional changes; this just makes the code cleaner and prepares to record disk non-rotational information while adding and removing an rdev to/from the conf. (A simplified user-space sketch of the new helpers appears after this patch's diff.)

Signed-off-by: Yu Kuai
---
drivers/md/raid1.c | 74 ++++++++++++++++++++++++++++------------------ 1 file changed, 46 insertions(+), 28 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index
a145fe48b9ce..1940ff398c23 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1757,6 +1757,40 @@ static int raid1_spare_active(struct mddev *mddev) return count; } +static bool raid1_add_conf(struct r1conf *conf, struct md_rdev *rdev, int disk) +{ + struct raid1_info *info = conf->mirrors + disk; + + if (info->rdev) + return false; + + rdev->raid_disk = disk; + info->head_position = 0; + info->seq_start = MaxSector; + WRITE_ONCE(info->rdev, rdev); + + return true; +} + +static bool raid1_remove_conf(struct r1conf *conf, int disk) +{ + struct raid1_info *info = conf->mirrors + disk; + struct md_rdev *rdev = info->rdev; + + if (!rdev || test_bit(In_sync, &rdev->flags) || + atomic_read(&rdev->nr_pending)) + return false; + + /* Only remove non-faulty devices if recovery is not possible. */ + if (!test_bit(Faulty, &rdev->flags) && + rdev->mddev->recovery_disabled != conf->recovery_disabled && + rdev->mddev->degraded < conf->raid_disks) + return false; + + WRITE_ONCE(info->rdev, NULL); + return true; +} + static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev) { struct r1conf *conf = mddev->private; @@ -1792,15 +1826,13 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev) disk_stack_limits(mddev->gendisk, rdev->bdev, rdev->data_offset << 9); - p->head_position = 0; - rdev->raid_disk = mirror; + raid1_add_conf(conf, rdev, mirror); err = 0; /* As all devices are equivalent, we don't need a full recovery * if this was recently any drive of the array */ if (rdev->saved_raid_disk < 0) conf->fullsync = 1; - WRITE_ONCE(p->rdev, rdev); break; } if (test_bit(WantReplacement, &p->rdev->flags) && @@ -1810,13 +1842,11 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev) if (err && repl_slot >= 0) { /* Add this device as a replacement */ - p = conf->mirrors + repl_slot; clear_bit(In_sync, &rdev->flags); set_bit(Replacement, &rdev->flags); - rdev->raid_disk = repl_slot; + raid1_add_conf(conf, rdev, repl_slot); err = 0; conf->fullsync = 1; - WRITE_ONCE(p[conf->raid_disks].rdev, rdev); } print_conf(conf); @@ -1833,27 +1863,20 @@ static int raid1_remove_disk(struct mddev *mddev, struct md_rdev *rdev) if (unlikely(number >= conf->raid_disks)) goto abort; - if (rdev != p->rdev) - p = conf->mirrors + conf->raid_disks + number; + if (rdev != p->rdev) { + number += conf->raid_disks; + p = conf->mirrors + number; + } print_conf(conf); if (rdev == p->rdev) { - if (test_bit(In_sync, &rdev->flags) || - atomic_read(&rdev->nr_pending)) { - err = -EBUSY; - goto abort; - } - /* Only remove non-faulty devices if recovery - * is not possible. - */ - if (!test_bit(Faulty, &rdev->flags) && - mddev->recovery_disabled != conf->recovery_disabled && - mddev->degraded < conf->raid_disks) { + if (!raid1_remove_conf(conf, number)) { err = -EBUSY; goto abort; } - WRITE_ONCE(p->rdev, NULL); - if (conf->mirrors[conf->raid_disks + number].rdev) { + + if (number < conf->raid_disks && + conf->mirrors[conf->raid_disks + number].rdev) { /* We just removed a device that is being replaced. * Move down the replacement. We drain all IO before * doing this to avoid confusion. 
@@ -3000,15 +3023,10 @@ static struct r1conf *setup_conf(struct mddev *mddev) || disk_idx < 0) continue; if (test_bit(Replacement, &rdev->flags)) - disk = conf->mirrors + mddev->raid_disks + disk_idx; - else - disk = conf->mirrors + disk_idx; + disk_idx += mddev->raid_disks; - if (disk->rdev) + if (!raid1_add_conf(conf, rdev, disk_idx)) goto abort; - disk->rdev = rdev; - disk->head_position = 0; - disk->seq_start = MaxSector; } conf->raid_disks = mddev->raid_disks; conf->mddev = mddev;
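As a rough illustration of the helpers factored out in patch 02, here is a simplified user-space model (field names, the fixed-size mirrors array, and the reduced busy/in-sync checks are stand-ins for the real raid1 structures and bit flags); it shows how the slot bookkeeping for adding and removing an rdev is centralized in two functions.

/* User-space model of the add/remove helpers factored out in patch 02.
 * All names and checks are simplified stand-ins, not the kernel structures. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_SECTOR (~0ULL)

struct rdev { int raid_disk; bool in_sync; int nr_pending; };
struct info { struct rdev *rdev; unsigned long long head_position, seq_start; };
struct conf { struct info mirrors[4]; int raid_disks; };

/* all slot bookkeeping lives in one place, as in raid1_add_conf() */
static bool add_conf(struct conf *conf, struct rdev *rdev, int disk)
{
	struct info *info = &conf->mirrors[disk];

	if (info->rdev)
		return false;		/* slot already occupied */

	rdev->raid_disk = disk;
	info->head_position = 0;
	info->seq_start = MAX_SECTOR;
	info->rdev = rdev;		/* the kernel publishes this with WRITE_ONCE() */
	return true;
}

/* reduced model of raid1_remove_conf(): refuse removal while the rdev is busy */
static bool remove_conf(struct conf *conf, int disk)
{
	struct info *info = &conf->mirrors[disk];
	struct rdev *rdev = info->rdev;

	if (!rdev || rdev->in_sync || rdev->nr_pending)
		return false;

	info->rdev = NULL;
	return true;
}

int main(void)
{
	struct conf conf = { .raid_disks = 2 };
	struct rdev r = { 0 };

	printf("add: %d\n", add_conf(&conf, &r, 0));		/* 1: slot was free */
	printf("add again: %d\n", add_conf(&conf, &r, 0));	/* 0: occupied */
	printf("remove: %d\n", remove_conf(&conf, 0));		/* 1: idle, not in_sync */
	return 0;
}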
From patchwork Wed Feb 28 11:43:25 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 207812

From: Yu Kuai
To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH md-6.9 v3 03/11] md/raid1: record nonrot rdevs while adding/removing rdevs to conf
Date: Wed, 28 Feb 2024 19:43:25 +0800
Message-Id: <20240228114333.527222-4-yukuai1@huaweicloud.com>
In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com>
References: <20240228114333.527222-1-yukuai1@huaweicloud.com>
From: Yu Kuai

For raid1, each read iterates over all the rdevs in the conf and checks whether any rdev is non-rotational; if so, it chooses the rdev with the fewest inflight IOs, otherwise the rdev with the closest head position. The non-rotational flag of a disk can be changed through the sysfs entry /sys/block/[disk_name]/queue/rotational; however, that should only be used for testing, and users really shouldn't do this in real life.

Record the number of non-rotational disks in the conf, to avoid checking each rdev in the IO fast path and to simplify read_balance() a little bit. (A minimal user-space sketch of this bookkeeping appears after this patch's diff.)

Co-developed-by: Paul Luse
Signed-off-by: Paul Luse
Signed-off-by: Yu Kuai
---
drivers/md/md.h | 1 + drivers/md/raid1.c | 17 ++++++++++------- drivers/md/raid1.h | 1 + 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/md/md.h b/drivers/md/md.h index a49ab04ab707..b2076a165c10 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -207,6 +207,7 @@ enum flag_bits { * check if there is collision between raid1 * serial bios.
*/ + Nonrot, /* non-rotational device (SSD) */ }; static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors, diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 1940ff398c23..032a6a6c3730 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -599,7 +599,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect int sectors; int best_good_sectors; int best_disk, best_dist_disk, best_pending_disk; - int has_nonrot_disk; int disk; sector_t best_dist; unsigned int min_pending; @@ -620,7 +619,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect best_pending_disk = -1; min_pending = UINT_MAX; best_good_sectors = 0; - has_nonrot_disk = 0; choose_next_idle = 0; clear_bit(R1BIO_FailFast, &r1_bio->state); @@ -637,7 +635,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect sector_t first_bad; int bad_sectors; unsigned int pending; - bool nonrot; rdev = conf->mirrors[disk].rdev; if (r1_bio->bios[disk] == IO_BLOCKED @@ -703,8 +700,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect /* At least two disks to choose from so failfast is OK */ set_bit(R1BIO_FailFast, &r1_bio->state); - nonrot = bdev_nonrot(rdev->bdev); - has_nonrot_disk |= nonrot; pending = atomic_read(&rdev->nr_pending); dist = abs(this_sector - conf->mirrors[disk].head_position); if (choose_first) { @@ -731,7 +726,7 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect * small, but not a big deal since when the second disk * starts IO, the first disk is likely still busy. */ - if (nonrot && opt_iosize > 0 && + if (test_bit(Nonrot, &rdev->flags) && opt_iosize > 0 && mirror->seq_start != MaxSector && mirror->next_seq_sect > opt_iosize && mirror->next_seq_sect - opt_iosize >= @@ -763,7 +758,7 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect * mixed ratation/non-rotational disks depending on workload. */ if (best_disk == -1) { - if (has_nonrot_disk || min_pending == 0) + if (READ_ONCE(conf->nonrot_disks) || min_pending == 0) best_disk = best_pending_disk; else best_disk = best_dist_disk; @@ -1764,6 +1759,11 @@ static bool raid1_add_conf(struct r1conf *conf, struct md_rdev *rdev, int disk) if (info->rdev) return false; + if (bdev_nonrot(rdev->bdev)) { + set_bit(Nonrot, &rdev->flags); + WRITE_ONCE(conf->nonrot_disks, conf->nonrot_disks + 1); + } + rdev->raid_disk = disk; info->head_position = 0; info->seq_start = MaxSector; @@ -1787,6 +1787,9 @@ static bool raid1_remove_conf(struct r1conf *conf, int disk) rdev->mddev->degraded < conf->raid_disks) return false; + if (test_and_clear_bit(Nonrot, &rdev->flags)) + WRITE_ONCE(conf->nonrot_disks, conf->nonrot_disks - 1); + WRITE_ONCE(info->rdev, NULL); return true; } diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h index 14d4211a123a..5300cbaa58a4 100644 --- a/drivers/md/raid1.h +++ b/drivers/md/raid1.h @@ -71,6 +71,7 @@ struct r1conf { * allow for replacements. 
*/ int raid_disks; + int nonrot_disks; spinlock_t device_lock;
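The following is a minimal user-space sketch of the bookkeeping patch 03 introduces (all names are stand-ins; the kernel uses bdev_nonrot(), set_bit()/test_and_clear_bit() on rdev->flags, and WRITE_ONCE() on conf->nonrot_disks): the non-rotational count is maintained when members are added or removed, so the read fast path can test a single counter instead of calling bdev_nonrot() for every rdev.

/* Sketch of the counter maintained in patch 03; names are illustrative only. */
#include <stdbool.h>
#include <stdio.h>

struct rdev { bool nonrot_flag; bool is_nonrot_device; };
struct conf { int nonrot_disks; };

static void add_rdev(struct conf *conf, struct rdev *rdev)
{
	if (rdev->is_nonrot_device) {	/* kernel: bdev_nonrot(rdev->bdev) */
		rdev->nonrot_flag = true;	/* kernel: set_bit(Nonrot, &rdev->flags) */
		conf->nonrot_disks++;	/* kernel: WRITE_ONCE(conf->nonrot_disks, ...) */
	}
}

static void remove_rdev(struct conf *conf, struct rdev *rdev)
{
	if (rdev->nonrot_flag) {	/* kernel: test_and_clear_bit(Nonrot, ...) */
		rdev->nonrot_flag = false;
		conf->nonrot_disks--;
	}
}

int main(void)
{
	struct conf conf = { 0 };
	struct rdev ssd = { .is_nonrot_device = true };
	struct rdev hdd = { .is_nonrot_device = false };

	add_rdev(&conf, &ssd);
	add_rdev(&conf, &hdd);
	printf("nonrot disks: %d\n", conf.nonrot_disks);	/* 1 */

	remove_rdev(&conf, &ssd);
	printf("nonrot disks: %d\n", conf.nonrot_disks);	/* 0 */
	return 0;
}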
From patchwork Wed Feb 28 11:43:26 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 207806

From: Yu Kuai
To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH md-6.9 v3 04/11] md/raid1: fix choose next idle in read_balance()
Date: Wed, 28 Feb 2024 19:43:26 +0800
Message-Id: <20240228114333.527222-5-yukuai1@huaweicloud.com>
In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com>
References: <20240228114333.527222-1-yukuai1@huaweicloud.com>
From: Yu Kuai

Commit 12cee5a8a29e ("md/raid1: prevent merging too large request") added the "choose next idle" case in read_balance(): read_balance: for_each_rdev if(next_seq_sect == this_sector || dist == 0) -> sequential reads best_disk = disk; if (...)
choose_next_idle = 1 continue; for_each_rdev -> iterate next rdev if (pending == 0) best_disk = disk; -> choose the next idle disk break; if (choose_next_idle) -> keep using this rdev if there are no other idle disk contine However, commit 2e52d449bcec ("md/raid1: add failfast handling for reads.") remove the code: - /* If device is idle, use it */ - if (pending == 0) { - best_disk = disk; - break; - } Hence choose next idle will never work now, fix this problem by following: 1) don't set best_disk in this case, read_balance() will choose the best disk after iterating all the disks; 2) add 'pending' so that other idle disk will be chosen; 3) add a new local variable 'sequential_disk' to record the disk, and if there is no other idle disk, 'sequential_disk' will be chosen; Fixes: 2e52d449bcec ("md/raid1: add failfast handling for reads.") Co-developed-by: Paul Luse Signed-off-by: Paul Luse Signed-off-by: Yu Kuai Reviewed-by: Xiao Ni --- drivers/md/raid1.c | 32 ++++++++++++++++++++++---------- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 032a6a6c3730..97db9add27df 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -598,13 +598,12 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect const sector_t this_sector = r1_bio->sector; int sectors; int best_good_sectors; - int best_disk, best_dist_disk, best_pending_disk; + int best_disk, best_dist_disk, best_pending_disk, sequential_disk; int disk; sector_t best_dist; unsigned int min_pending; struct md_rdev *rdev; int choose_first; - int choose_next_idle; /* * Check if we can balance. We can balance on the whole @@ -615,11 +614,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect sectors = r1_bio->sectors; best_disk = -1; best_dist_disk = -1; + sequential_disk = -1; best_dist = MaxSector; best_pending_disk = -1; min_pending = UINT_MAX; best_good_sectors = 0; - choose_next_idle = 0; clear_bit(R1BIO_FailFast, &r1_bio->state); if ((conf->mddev->recovery_cp < this_sector + sectors) || @@ -712,7 +711,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect int opt_iosize = bdev_io_opt(rdev->bdev) >> 9; struct raid1_info *mirror = &conf->mirrors[disk]; - best_disk = disk; /* * If buffered sequential IO size exceeds optimal * iosize, check if there is idle disk. If yes, choose @@ -731,15 +729,22 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect mirror->next_seq_sect > opt_iosize && mirror->next_seq_sect - opt_iosize >= mirror->seq_start) { - choose_next_idle = 1; - continue; + /* + * Add 'pending' to avoid choosing this disk if + * there is other idle disk. + */ + pending++; + /* + * If there is no other idle disk, this disk + * will be chosen. + */ + sequential_disk = disk; + } else { + best_disk = disk; + break; } - break; } - if (choose_next_idle) - continue; - if (min_pending > pending) { min_pending = pending; best_pending_disk = disk; @@ -751,6 +756,13 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect } } + /* + * sequential IO size exceeds optimal iosize, however, there is no other + * idle disk, so choose the sequential disk. + */ + if (best_disk == -1 && min_pending != 0) + best_disk = sequential_disk; + /* * If all disks are rotational, choose the closest disk. 
If any disk is * non-rotational, choose the disk with less pending request even the From patchwork Wed Feb 28 11:43:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 207811 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp3298742dyb; Wed, 28 Feb 2024 04:05:27 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCU8LEzROPzWVB1SZbv4Tg2MZU5GITlXAm1POXlhCgzZ55qBNUWa3HCUD44vP9lTFwP6SEVI3s+TVXdBFdAFt6jtO7kNew== X-Google-Smtp-Source: AGHT+IHcH980vQEgitJoN6B7zxQnt4EISlN5wM2MwdErLlWTk0O59ejty5/5qaCWr1D9lofKvAgD X-Received: by 2002:a17:902:db01:b0:1dc:adad:f54b with SMTP id m1-20020a170902db0100b001dcadadf54bmr7435074plx.44.1709121927577; Wed, 28 Feb 2024 04:05:27 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709121927; cv=pass; d=google.com; s=arc-20160816; b=DRbkZTKX3F7vbUlmNBOdXXYCjSUnjQgEo1uDyC5EFlZ4m0MJPp9yKTcr9T/5nWyglk rj4n0SCjv6EWlV3Pqq44BR7gCMw3SYgNUFC76Dk3H1EtEjn+ijOy8IGIvjL+y9vzUm9S OOfhWcGK0fuFVaZpR9VcnqUHuCwrk1CiKtCdOJIqZwUMTpPZCIIGfSvCpNjwWHZjLBBX aQTITKVPhv4y64xpYbPXjZXazjXKWfRNfFuAsJlJxbRd2vqivYv0sIQtU1P5P+sNWlhn btkGrhN2i1mjn75wGL+CcoQy5U+jz7apR9Bwin6k10eL6X9CWFdM6Kf9TG7xAttn5tOd xpuQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=AYI14mbTiQ5uTwOzEYZMcKeL8x961l4a6aNfhBWGGBQ=; fh=Ytj0OQL2yyVgRSgKlJizbwNzVtV64xy+1TGtb7SiHpY=; b=UY9ETLm3UQwzR0ncmFFDCsVaQpN3DeFHG5oEOzRPz2eG78wG0yvT9smSeU8opbZsWL OHRG/cBJxbU1GrFIRmrbCEtAF7cEr75RaXS/C9/Usd4Ks0Z1kxHPUzDhvpPxhjweQXBJ a/kuPlq/5QjgreLCATGPKrplYSCEh/KuYNfXRvEEPZX/U3Sw+kdV1fdfXnYgsWkCVGLx 9mvCRxCxr8sR6tU0X+EtI1R65gLFS0xPByfeeQhgeOc/9ApPQ7Agl2ZX/o1g4LgLg85z SjBSu8XP/KjjkClexXckPCw7VmBEgjHQRmG1jNJWntHsWHeDlLTl7emvMqtTr8WrGD6A zTWQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84961-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84961-ouuuleilei=gmail.com@vger.kernel.org" Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. 
From patchwork Wed Feb 28 11:43:27 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 207811

From: Yu Kuai
To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH md-6.9 v3 05/11] md/raid1-10: add a helper raid1_check_read_range()
Date: Wed, 28 Feb 2024 19:43:27 +0800
Message-Id: <20240228114333.527222-6-yukuai1@huaweicloud.com>
In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com>
References: <20240228114333.527222-1-yukuai1@huaweicloud.com>
git-send-email 2.39.2 In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com> References: <20240228114333.527222-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: cCh0CgAn9g7IHd9l+eamFQ--.6969S9 X-Coremail-Antispam: 1UD129KBjvJXoW7AFy8Zw15tF1rurW5CF4kWFg_yoW8KFy5pr 4Yya43tr1UK3y3W3W3uF1xC34FyayfWFW8GrWfX3WDWry5Ga9akF97JryjgFyDWry3Xw12 qa1j9rWxua47CaDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP214x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZ X7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1792144234349032348 X-GMAIL-MSGID: 1792144234349032348 From: Yu Kuai The checking and handler of bad blocks appear many timers during read_balance() in raid1 and raid10. This helper will be used in later patches to simplify read_balance() a lot. Co-developed-by: Paul Luse Signed-off-by: Paul Luse Signed-off-by: Yu Kuai Reviewed-by: Xiao Ni --- drivers/md/raid1-10.c | 49 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 512746551f36..9bc0f0022a6c 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -227,3 +227,52 @@ static inline bool exceed_read_errors(struct mddev *mddev, struct md_rdev *rdev) return false; } + +/** + * raid1_check_read_range() - check a given read range for bad blocks, + * available read length is returned; + * @rdev: the rdev to read; + * @this_sector: read position; + * @len: read length; + * + * helper function for read_balance() + * + * 1) If there are no bad blocks in the range, @len is returned; + * 2) If the range are all bad blocks, 0 is returned; + * 3) If there are partial bad blocks: + * - If the bad block range starts after @this_sector, the length of first + * good region is returned; + * - If the bad block range starts before @this_sector, 0 is returned and + * the @len is updated to the offset into the region before we get to the + * good blocks; + */ +static inline int raid1_check_read_range(struct md_rdev *rdev, + sector_t this_sector, int *len) +{ + sector_t first_bad; + int bad_sectors; + + /* no bad block overlap */ + if (!is_badblock(rdev, this_sector, *len, &first_bad, &bad_sectors)) + return *len; + + /* + * bad block range starts offset into our range so we can return the + * number of sectors before the bad blocks start. + */ + if (first_bad > this_sector) + return first_bad - this_sector; + + /* read range is fully consumed by bad blocks. 
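
For illustration only, here is a minimal userspace sketch of the three outcomes documented in the kernel-doc above, assuming a single hard-coded bad range of sectors [100, 110) and a simplified stand-in for is_badblock(); this is not kernel code and the sector numbers are hypothetical (the rest of the hunk continues right after the sketch):

/* Sketch only: models raid1_check_read_range() against one fixed bad range. */
#include <stdio.h>

static const long long first_bad = 100;   /* hypothetical bad range [100, 110) */
static const int bad_sectors = 10;

static int check_read_range(long long this_sector, int *len)
{
	/* 1) no overlap with the bad range: the whole request is readable */
	if (this_sector + *len <= first_bad ||
	    this_sector >= first_bad + bad_sectors)
		return *len;

	/* 3a) bad range starts inside the request: return the good prefix */
	if (first_bad > this_sector)
		return first_bad - this_sector;

	/* 2) request fully covered by bad blocks */
	if (this_sector + *len <= first_bad + bad_sectors)
		return 0;

	/* 3b) bad range covers the start: return 0, *len = length of bad prefix */
	*len = first_bad + bad_sectors - this_sector;
	return 0;
}

int main(void)
{
	int len;

	len = 8;  printf("%d\n", check_read_range(80, &len));   /* 8: all good       */
	len = 16; printf("%d\n", check_read_range(90, &len));   /* 10: good prefix   */
	len = 4;  printf("%d\n", check_read_range(102, &len));  /* 0: all bad        */
	len = 20; printf("%d\n", check_read_range(105, &len));  /* 0, len is now 5   */
	return 0;
}
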
*/ + if (this_sector + *len <= first_bad + bad_sectors) + return 0; + + /* + * final case, bad block range starts before or at the start of our + * range but does not cover our entire range so we still return 0 but + * update the length with the number of sectors before we get to the + * good ones. + */ + *len = first_bad + bad_sectors - this_sector; + return 0; +} From patchwork Wed Feb 28 11:43:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 207799 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp3293376dyb; Wed, 28 Feb 2024 03:55:59 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCXcQPPtj8rxvtTFZSCxUAJGl9E5In3ho41r8ET6z3TloW0KYBDQEBwHoacH/EYwvync9ByipnqThjQxyC2i46/gJqcxww== X-Google-Smtp-Source: AGHT+IEeCQDn0jl1IBUdK40UO31xHNwlENIPhOqQlsh6QVsHc9QL84VfMbQjrffSb/0bJVZrY1Bq X-Received: by 2002:a17:906:1183:b0:a3e:e869:a151 with SMTP id n3-20020a170906118300b00a3ee869a151mr8375552eja.45.1709121359655; Wed, 28 Feb 2024 03:55:59 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709121359; cv=pass; d=google.com; s=arc-20160816; b=YGLwv4kqQIdnhQk7LZK6/a6UthDjd1YI2VjqICz/CnQnbiAmvHmh3NbGe2QX1R5hnE nPO8axferH358wOWrbKpko+8fmOBA6pGgmzOc4wBdtkQeSPJ7zofrRMnE912Ee8E7n36 foYhAvrRkV45ZiynCbg03CTOhk781LaRBFFdeJdbodv/tDiTbmPa50DLtVN2q8THKHGY Ub1JK6MZtMOPUnTQoxis91H7A59B3KVTnr/AmpfZDF0sRK/awFtOKI7MIG+cs2kkkY9T 22cNwpYqo28y4ZXw68WSfEWmOchOG7/4NERaQ0m10ZGMBVnZ1ZgJJUX2BNaL0PsqsBEY 0Tuw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=S3MtL+BwuZllyiGUvJHWRN9J9vlTkbXy+HrWYVpGX1o=; fh=Ytj0OQL2yyVgRSgKlJizbwNzVtV64xy+1TGtb7SiHpY=; b=V5n7OC1nffFpIDKvCJ+VEdSZOFXZR76DThy2nHYOl+9UqQB1gttNod2QQ9HPtzAg4H 7Fm091WkoHwWMk13tNmScHr0rEDcvvh6e+QxDRSd8VjM0el5bKJz+CZWS8qcXAU71RRC dZVUy7K3acPaU/vvWTIzmS7RTxEpTPuHGOrM9z2G9G2nrZl05yA3qKeu0MSeA1FB0yZw dHPugl3U2/XlVUwlhLk/EuWaDB2eGiTwxCKM21h/njMXx5bMWxKPBdy7ZgI+yDPcyhWa 1Zk4SQgrd8PMRoJQpqmuccrcVTXAGYE1r45jHy8HGh4eiLAfb1xLBf46E+2CIdnUSvZo O+CQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84965-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84965-ouuuleilei=gmail.com@vger.kernel.org" Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. 
[147.75.80.249]) by mx.google.com with ESMTPS id z21-20020a1709060ad500b00a440f1c762dsi295093ejf.50.2024.02.28.03.55.59 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Feb 2024 03:55:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-84965-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84965-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84965-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 1FE381F22DF0 for ; Wed, 28 Feb 2024 11:55:59 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 4F4FF15A4A7; Wed, 28 Feb 2024 11:49:44 +0000 (UTC) Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F1A216CDD0; Wed, 28 Feb 2024 11:49:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120981; cv=none; b=aR9OEjx3lhiaRxOOrNnH906C5Z+g53LRts3dWJxy5JR9fMEAHb6K2ltCM1IUD12z+MqPadbp/cYd++Vcdf81urgpaO4FP4page/iXJEt9MoLk1NXHCfqUJOl7RFh1Onua1E6Ij1wcedylkeSVR71NHJtLMKPgbiHYIbjBszKnnw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120981; c=relaxed/simple; bh=kfl4DSCdwzHvDQU2lLSIAUyUCCZRL9xKBKE9s71DjHo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=AIfr1CxGZmMqAzCGFAsRuLPThRetfEn7/KUgKvmvKOEmBDwMJE4xzi0hTO3zC9wg9oJUxXcTrDh4wYMN2+y5690pZN/SxoUtXE8YeGfHWNhO/K5Nm612dltnaI5O0wr15UcZpN61vbSgumV4J0eDXN4yzGLnGL4/ZzVR3FMd8Cg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TlCLz16bMz4f3kKM; Wed, 28 Feb 2024 19:49:31 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 80AD41A0175; Wed, 28 Feb 2024 19:49:34 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgAn9g7IHd9l+eamFQ--.6969S10; Wed, 28 Feb 2024 19:49:34 +0800 (CST) From: Yu Kuai To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.9 v3 06/11] md/raid1-10: factor out a new helper raid1_should_read_first() Date: Wed, 28 Feb 2024 19:43:28 +0800 Message-Id: 
<20240228114333.527222-7-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com> References: <20240228114333.527222-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: cCh0CgAn9g7IHd9l+eamFQ--.6969S10 X-Coremail-Antispam: 1UD129KBjvJXoWxXr17Jw4DJrWruF4DWr4xCrg_yoWrWF43pw 4avF93AryUKay3Aws8A3yDua4Sy34rWFWUKFWxWw4kuFySqFW5Way5GryY9r1DuF95Jw17 Xa45GrW5C3ZrJFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP214x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JV WxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZ X7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1792143638649484355 X-GMAIL-MSGID: 1792143638649484355 From: Yu Kuai If resync is in progress, read_balance() should find the first usable disk, otherwise, data could be inconsistent after resync is done. raid1 and raid10 implement the same checking, hence factor out the checking to make code cleaner. Noted that raid1 is using 'mddev->recovery_cp', which is updated after all resync IO is done, while raid10 is using 'conf->next_resync', which is inaccurate because raid10 update it before submitting resync IO. Fortunately, raid10 read IO can't concurrent with resync IO, hence there is no problem. And this patch also switch raid10 to use 'mddev->recovery_cp'. Co-developed-by: Paul Luse Signed-off-by: Paul Luse Signed-off-by: Yu Kuai Reviewed-by: Xiao Ni --- drivers/md/raid1-10.c | 20 ++++++++++++++++++++ drivers/md/raid1.c | 15 ++------------- drivers/md/raid10.c | 13 ++----------- 3 files changed, 24 insertions(+), 24 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 9bc0f0022a6c..2ea1710a3b70 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -276,3 +276,23 @@ static inline int raid1_check_read_range(struct md_rdev *rdev, *len = first_bad + bad_sectors - this_sector; return 0; } + +/* + * Check if read should choose the first rdev. + * + * Balance on the whole device if no resync is going on (recovery is ok) or + * below the resync window. Otherwise, take the first readable disk. 
+ */ +static inline bool raid1_should_read_first(struct mddev *mddev, + sector_t this_sector, int len) +{ + if ((mddev->recovery_cp < this_sector + len)) + return true; + + if (mddev_is_clustered(mddev) && + md_cluster_ops->area_resyncing(mddev, READ, this_sector, + this_sector + len)) + return true; + + return false; +} diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 97db9add27df..6e3c0d3e0b75 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -605,11 +605,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect struct md_rdev *rdev; int choose_first; - /* - * Check if we can balance. We can balance on the whole - * device if no resync is going on, or below the resync window. - * We take the first readable disk when above the resync window. - */ retry: sectors = r1_bio->sectors; best_disk = -1; @@ -619,16 +614,10 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect best_pending_disk = -1; min_pending = UINT_MAX; best_good_sectors = 0; + choose_first = raid1_should_read_first(conf->mddev, this_sector, + sectors); clear_bit(R1BIO_FailFast, &r1_bio->state); - if ((conf->mddev->recovery_cp < this_sector + sectors) || - (mddev_is_clustered(conf->mddev) && - md_cluster_ops->area_resyncing(conf->mddev, READ, this_sector, - this_sector + sectors))) - choose_first = 1; - else - choose_first = 0; - for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) { sector_t dist; sector_t first_bad; diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index d5a7a621f0f0..8aecdb1ccc16 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -748,17 +748,8 @@ static struct md_rdev *read_balance(struct r10conf *conf, best_good_sectors = 0; do_balance = 1; clear_bit(R10BIO_FailFast, &r10_bio->state); - /* - * Check if we can balance. We can balance on the whole - * device if no resync is going on (recovery is ok), or below - * the resync window. We take the first readable disk when - * above the resync window. 
- */ - if ((conf->mddev->recovery_cp < MaxSector - && (this_sector + sectors >= conf->next_resync)) || - (mddev_is_clustered(conf->mddev) && - md_cluster_ops->area_resyncing(conf->mddev, READ, this_sector, - this_sector + sectors))) + + if (raid1_should_read_first(conf->mddev, this_sector, sectors)) do_balance = 0; for (slot = 0; slot < conf->copies ; slot++) { From patchwork Wed Feb 28 11:43:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 207798 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp3293266dyb; Wed, 28 Feb 2024 03:55:40 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCWP+edlcI2+OMix1MDazU+16J6C5dgqdiT1PfLXDN+5fpd9qCTQglhxkdvuRiYMdpU8CbzRlnDjpBvQsTW+4BRxKX3AvA== X-Google-Smtp-Source: AGHT+IEeqVrpJMfoaUEV/CtUaL81lCPLmbMygI2+WNnM9Ec8MPkvjwA9B7wCBPtvPzE/CoKlTsWs X-Received: by 2002:a1f:c7c3:0:b0:4c0:24e6:f49d with SMTP id x186-20020a1fc7c3000000b004c024e6f49dmr8662851vkf.1.1709121340306; Wed, 28 Feb 2024 03:55:40 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709121340; cv=pass; d=google.com; s=arc-20160816; b=sKB+ef0EVP6BTxQFqCQxKPE5ILaoZqE6C6/uTF9qLUD0QaS+oe7mPEsMTf3QOdsGoY FeNp+sbq6ESvQj9/mXR9BNK9rCx0mkeeIqJZrG8fAtJAvOeg17+0SIupac19va81LjPw eEoPSV//rZ8UxAy5z3Roxq06xtidgEKv6Zcg7fn5p7Dwf+2oCCq7ROOMMhOXVkOoqdhO XwMcp8zqc2Xocc1C31TMnz2hazn+YjbY3IxL4plOOuoT8+5o+yjYyg8At3AsQSQ4UVHK JUWkOvILcaAsO6n9iy/IAgikqo6v2jQMA8VbpgL9Yj707Kh2ZuEc4amXCvAXofGE4n2i 5FTA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=mKKOFsCGz6ewfK5+OdsD1wu0Gz2FdGP8+n0Frovu/is=; fh=Ytj0OQL2yyVgRSgKlJizbwNzVtV64xy+1TGtb7SiHpY=; b=cbJRtNznTZiHD76h1S9h6S6vaHGYsDxT1eL+O4CgL/w3Nb1MjiqZeC+4cJpBeYZR3L mgZsSfBHzQjagMB1yUuKmD5n+jVIqdECclI0QWgwzluHEkg3tibrGOOzlK7Xv7Q/D5KW kAnk6o5731cppGeQwsXyQe/Ut5rJjK4k3o3qGetp4NjwYnuFH+E0qVq6EZw8A5ePgCjB QC97kh+sluctY+KVWziJZA+8lhTS47rVpilT6refWR/Akj3y84U6uNsryD8ij5d2JTym LkSiaLH9wTjmkALHH+NEi5krM954gfGShTxUU185KFakGDuHxbEDdp0ZMiF73JCVXDnq TjWw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84962-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84962-ouuuleilei=gmail.com@vger.kernel.org" Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. 
[2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id y18-20020ac87092000000b0042e7b7c14a8si7565437qto.159.2024.02.28.03.55.40 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Feb 2024 03:55:40 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-84962-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84962-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84962-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id C7B001C23801 for ; Wed, 28 Feb 2024 11:55:28 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 60F39159584; Wed, 28 Feb 2024 11:49:43 +0000 (UTC) Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 242BC6CDD8; Wed, 28 Feb 2024 11:49:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120980; cv=none; b=WEttgoy4OBuBbdSECC9gdr0hzr32LhpAlpWG7oUNgW3hSqEl5ImDz7yXpnwzGnsKqNmPq14iAuXseduwWDasbfke5ZcKGMGX/PPsBfRPG+WH4Uzz/vk5FEuTFXdahhB228iYvpNPfmnMeTcD/NFA2IWXQDxeUPcbl7hKMDP8wdM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120980; c=relaxed/simple; bh=fAkNIprYsZH5PQBVb31gePLzngvp2TZnuFAypduC6Lc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=XhIP5p4vIgDZGbrFnd25sody5QURRQm7+t+9P1mjKcM7hsvKEtpgjLS3msXoAvmDfGI1bCQP46k1bgzyjC8y0CxQdjRTgCx6KKK/clOv966f4g7bulUY8uqExajtnX8Ukg8UY10LM3Ryk0e/5ESGQzdD7899M9qhBFcPqZmPUzQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TlCLz4MtJz4f3kKl; Wed, 28 Feb 2024 19:49:31 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id F00B91A0283; Wed, 28 Feb 2024 19:49:34 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgAn9g7IHd9l+eamFQ--.6969S11; Wed, 28 Feb 2024 19:49:34 +0800 (CST) From: Yu Kuai To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.9 v3 07/11] md/raid1: factor out read_first_rdev() from read_balance() Date: Wed, 28 Feb 2024 19:43:29 +0800 
Message-Id: <20240228114333.527222-8-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com> References: <20240228114333.527222-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: cCh0CgAn9g7IHd9l+eamFQ--.6969S11 X-Coremail-Antispam: 1UD129KBjvJXoWxZFW8CrW5GFW3Jr1UAw13Arb_yoWrCw47pw 45AFZ3tryUXryrZws8J3yDWr93t34fJF48GrZ7Xwnagwn3KrWqgFyUGrya9Fy5Crs8Jw1U Zw15Ar4ak3Z7KFDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP214x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JV WxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZ X7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1792143618295275292 X-GMAIL-MSGID: 1792143618295275292 From: Yu Kuai read_balance() is hard to understand because there are too many status and branches, and it's overlong. This patch factor out the case to read the first rdev from read_balance(), there are no functional changes. Co-developed-by: Paul Luse Signed-off-by: Paul Luse Signed-off-by: Yu Kuai Reviewed-by: Xiao Ni --- drivers/md/raid1.c | 63 +++++++++++++++++++++++++++++++++------------- 1 file changed, 46 insertions(+), 17 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 6e3c0d3e0b75..b42b947bbd34 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -579,6 +579,47 @@ static sector_t align_to_barrier_unit_end(sector_t start_sector, return len; } +static void update_read_sectors(struct r1conf *conf, int disk, + sector_t this_sector, int len) +{ + struct raid1_info *info = &conf->mirrors[disk]; + + atomic_inc(&info->rdev->nr_pending); + if (info->next_seq_sect != this_sector) + info->seq_start = this_sector; + info->next_seq_sect = this_sector + len; +} + +static int choose_first_rdev(struct r1conf *conf, struct r1bio *r1_bio, + int *max_sectors) +{ + sector_t this_sector = r1_bio->sector; + int len = r1_bio->sectors; + int disk; + + for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) { + struct md_rdev *rdev; + int read_len; + + if (r1_bio->bios[disk] == IO_BLOCKED) + continue; + + rdev = conf->mirrors[disk].rdev; + if (!rdev || test_bit(Faulty, &rdev->flags)) + continue; + + /* choose the first disk even if it has some bad blocks. */ + read_len = raid1_check_read_range(rdev, this_sector, &len); + if (read_len > 0) { + update_read_sectors(conf, disk, this_sector, read_len); + *max_sectors = read_len; + return disk; + } + } + + return -1; +} + /* * This routine returns the disk from which the requested read should * be done. 
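
As a side note on the update_read_sectors() helper added above: the sketch below models, in stand-alone userspace C, the sequential-read bookkeeping it centralizes. The field names mirror struct raid1_info but the numbers are hypothetical, and the nr_pending reference bump is intentionally omitted; this is not kernel code.

#include <stdio.h>

/* Sketch only: models the seq_start/next_seq_sect tracking, nothing else. */
struct mirror_model {
	long long seq_start;      /* sector where the current sequential run began   */
	long long next_seq_sect;  /* sector the next read must start at to stay sequential */
};

static void track_read(struct mirror_model *m, long long this_sector, int len)
{
	if (m->next_seq_sect != this_sector)
		m->seq_start = this_sector;   /* run broken: a new run starts here */
	m->next_seq_sect = this_sector + len;
}

int main(void)
{
	struct mirror_model m = { 0, 0 };

	track_read(&m, 0, 8);     /* run starts at sector 0               */
	track_read(&m, 8, 8);     /* sequential: seq_start stays 0        */
	track_read(&m, 100, 8);   /* gap: new run, seq_start becomes 100  */
	printf("seq_start=%lld next_seq_sect=%lld\n", m.seq_start, m.next_seq_sect);
	/* prints: seq_start=100 next_seq_sect=108 */
	return 0;
}
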
There is a per-array 'next expected sequential IO' sector @@ -603,7 +644,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect sector_t best_dist; unsigned int min_pending; struct md_rdev *rdev; - int choose_first; retry: sectors = r1_bio->sectors; @@ -614,10 +654,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect best_pending_disk = -1; min_pending = UINT_MAX; best_good_sectors = 0; - choose_first = raid1_should_read_first(conf->mddev, this_sector, - sectors); clear_bit(R1BIO_FailFast, &r1_bio->state); + if (raid1_should_read_first(conf->mddev, this_sector, sectors)) + return choose_first_rdev(conf, r1_bio, max_sectors); + for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) { sector_t dist; sector_t first_bad; @@ -663,8 +704,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect * bad_sectors from another device.. */ bad_sectors -= (this_sector - first_bad); - if (choose_first && sectors > bad_sectors) - sectors = bad_sectors; if (best_good_sectors > sectors) best_good_sectors = sectors; @@ -674,8 +713,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect best_good_sectors = good_sectors; best_disk = disk; } - if (choose_first) - break; } continue; } else { @@ -690,10 +727,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect pending = atomic_read(&rdev->nr_pending); dist = abs(this_sector - conf->mirrors[disk].head_position); - if (choose_first) { - best_disk = disk; - break; - } /* Don't change to another disk for sequential reads */ if (conf->mirrors[disk].next_seq_sect == this_sector || dist == 0) { @@ -769,13 +802,9 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect rdev = conf->mirrors[best_disk].rdev; if (!rdev) goto retry; - atomic_inc(&rdev->nr_pending); - sectors = best_good_sectors; - - if (conf->mirrors[best_disk].next_seq_sect != this_sector) - conf->mirrors[best_disk].seq_start = this_sector; - conf->mirrors[best_disk].next_seq_sect = this_sector + sectors; + sectors = best_good_sectors; + update_read_sectors(conf, disk, this_sector, sectors); } *max_sectors = sectors; From patchwork Wed Feb 28 11:43:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 207814 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp3300173dyb; Wed, 28 Feb 2024 04:07:43 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCWTZGqQcjnS8Ic8xWQm2g7KTL7iS7cqqBNyOf/Q3/W4ojmf3aNHYDs3qq/AUfa/WkPc49o6MqnrvO0gpZcXEViP1EA/oA== X-Google-Smtp-Source: AGHT+IE14cHg5UMj7FYVFYdflg2tWMtYYKO64oc38AAkaq0XF3j1vd4GWG2VNWcic0Gny2trHCP7 X-Received: by 2002:a92:c08b:0:b0:365:4004:83bc with SMTP id h11-20020a92c08b000000b00365400483bcmr14353668ile.14.1709122063606; Wed, 28 Feb 2024 04:07:43 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709122063; cv=pass; d=google.com; s=arc-20160816; b=Hnka6FHYmmdBVXyGr1QbMITcNhp9BP4Xg1Vv7oDMgfRT0d724U+whotu8kYL610Wph SuzhahGzVa3NtlmGZpq0RNlAQTKpxcr5jAUGK1ejgBm44Jui0xhYIGz0JE7TOCARfcrV sSj/917FOy4BTsBWaRU8JR4/P7h4EBJ8xxgzs/kUw5MigFJIG3qpO2wPzOxSWPICfeoC BECDz6UpvimPzfkVwcqkKAPsNbs3J9mgzSJUCwKXockrmUbuFRme/gBHAC+q92a6zRls EjI9FqI95Cb6nvORLKRWUS6b1nru9GPbaZQPtdVmcvqHu5ep/vkMx/VhIsWy9h7ctnWV PZFg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe 
:list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=cKWV2NNsigWUYW96bZovwBuy0KyL7Uspb6cSJ38ynco=; fh=Ytj0OQL2yyVgRSgKlJizbwNzVtV64xy+1TGtb7SiHpY=; b=DmDY7K5m/XA7ugE8z4poKUpscinclKc6BtnW0MFicCg7jXOzw2M1zthkAd6FbmldMr fq26rO5OWW2C5+ztsIMgE5sLh94d1Hc6yarAMuvR1kxGqCUi+vbPu039Qf2WbErvmcbb kBkzic6PvTcsLRmBtjeqRXWObRVxmSSaxWF9jq5paXoPw+rHv3JMi+Is1aV6EZ/FtWC+ VSVsFva5pjGLoIahIHcJWVzb/cC5hkXhWQvywqnkb5cVW3LX6FpvEwAM4xL3CjbY9rsd N3fNBXO6jMWWdbk8Mkh7Eo0/9e298boW0L+jDIJE+wEJXD0CW0UOjFc2ID4PymYyj+Od u5HQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84963-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84963-ouuuleilei=gmail.com@vger.kernel.org" Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id a126-20020a636684000000b005dc8914839bsi7313163pgc.5.2024.02.28.04.07.43 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Feb 2024 04:07:43 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-84963-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84963-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84963-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id 06DE0289452 for ; Wed, 28 Feb 2024 11:55:33 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 82893159590; Wed, 28 Feb 2024 11:49:43 +0000 (UTC) Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A69CE70CA0; Wed, 28 Feb 2024 11:49:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120980; cv=none; b=jzDIoqRx8NWwodmODZ4Rrw3bLXkDcHkK53EcMjHt3nZ+3xzzbJ0PwQQm54wvCPSnS3O9rjeJTjTxUU5y79HTZ6NCEfCyZbg1fMbdjlfP7zzC7MYsQcMiPCYX9VbjBPYMh1tKmbmwG+BvxzPowfv3N+HCDHYgnKtjntxkZGo9Rm0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120980; c=relaxed/simple; bh=Cxh4fxQ7w5CVZ5id4tlW0SqHG+Sx6hsAJJcAlSeebGg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=e63JZbZBJRW+/0GXwVof8l5m7LtR5582oMvI+HQK4GC+LLBP9kIL8uEcft2lpcsbuBLTvbZjyEcbS/tBBtkzWGHLnQWlboI/c8fbvcxMxigaR1S+k7YEGuRQXC9X4Gjnm+ygfrAbrQ+qZ0/I74sPlqKzFLRnEc9fvGTiPb+jUzw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: 
smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TlCM00WZlz4f3kL1; Wed, 28 Feb 2024 19:49:32 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 6C1831A0283; Wed, 28 Feb 2024 19:49:35 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgAn9g7IHd9l+eamFQ--.6969S12; Wed, 28 Feb 2024 19:49:35 +0800 (CST) From: Yu Kuai To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.9 v3 08/11] md/raid1: factor out choose_slow_rdev() from read_balance() Date: Wed, 28 Feb 2024 19:43:30 +0800 Message-Id: <20240228114333.527222-9-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com> References: <20240228114333.527222-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: cCh0CgAn9g7IHd9l+eamFQ--.6969S12 X-Coremail-Antispam: 1UD129KBjvJXoWxZF4rKw4kZryUuF1DXw1DJrb_yoW5ZF15pa y3CFWSqryUXry7uws8J3yDur9aga4rGFW8GryxJw1S9r9agrZ09FWxGFyagFyUWrWrJFyU Xw15ZrW293WktFDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP214x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JV WxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZ X7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1792144377240716232 X-GMAIL-MSGID: 1792144377240716232 From: Yu Kuai read_balance() is hard to understand because there are too many status and branches, and it's overlong. This patch factor out the case to read the slow rdev from read_balance(), there are no functional changes. 
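
For readers following the series, choose_slow_rdev() below (and choose_bb_rdev() in the next patch) shares one selection shape: scan every candidate, return immediately on a disk that can serve the whole request, otherwise remember the disk with the longest readable prefix and shrink the read to it. A minimal userspace sketch with made-up per-disk results of raid1_check_read_range(), not kernel code:

#include <stdio.h>

static int readable_len[4] = { 0, 3, 8, 5 };  /* hypothetical per-disk readable sectors */

static int pick_disk(int want, int *max_sectors)
{
	int best_disk = -1, best_len = 0;

	for (int disk = 0; disk < 4; disk++) {
		int len = readable_len[disk];

		if (len == want)        /* whole request readable: take it immediately */
			return disk;
		if (len > best_len) {   /* otherwise track the longest partial range   */
			best_disk = disk;
			best_len = len;
		}
	}

	if (best_disk != -1)
		*max_sectors = best_len;  /* caller must trim the read accordingly */
	return best_disk;
}

int main(void)
{
	int max_sectors = 16;
	int disk = pick_disk(16, &max_sectors);

	printf("disk=%d max_sectors=%d\n", disk, max_sectors);
	/* prints: disk=2 max_sectors=8 -- no disk covers all 16 sectors,
	 * so the read is trimmed to the longest readable prefix. */
	return 0;
}
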
Co-developed-by: Paul Luse Signed-off-by: Paul Luse Signed-off-by: Yu Kuai Reviewed-by: Xiao Ni --- drivers/md/raid1.c | 69 ++++++++++++++++++++++++++++++++++------------ 1 file changed, 52 insertions(+), 17 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index b42b947bbd34..ccf05391d597 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -620,6 +620,53 @@ static int choose_first_rdev(struct r1conf *conf, struct r1bio *r1_bio, return -1; } +static int choose_slow_rdev(struct r1conf *conf, struct r1bio *r1_bio, + int *max_sectors) +{ + sector_t this_sector = r1_bio->sector; + int bb_disk = -1; + int bb_read_len = 0; + int disk; + + for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) { + struct md_rdev *rdev; + int len; + int read_len; + + if (r1_bio->bios[disk] == IO_BLOCKED) + continue; + + rdev = conf->mirrors[disk].rdev; + if (!rdev || test_bit(Faulty, &rdev->flags) || + !test_bit(WriteMostly, &rdev->flags)) + continue; + + /* there are no bad blocks, we can use this disk */ + len = r1_bio->sectors; + read_len = raid1_check_read_range(rdev, this_sector, &len); + if (read_len == r1_bio->sectors) { + update_read_sectors(conf, disk, this_sector, read_len); + return disk; + } + + /* + * there are partial bad blocks, choose the rdev with largest + * read length. + */ + if (read_len > bb_read_len) { + bb_disk = disk; + bb_read_len = read_len; + } + } + + if (bb_disk != -1) { + *max_sectors = bb_read_len; + update_read_sectors(conf, bb_disk, this_sector, bb_read_len); + } + + return bb_disk; +} + /* * This routine returns the disk from which the requested read should * be done. There is a per-array 'next expected sequential IO' sector @@ -673,23 +720,8 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect if (!test_bit(In_sync, &rdev->flags) && rdev->recovery_offset < this_sector + sectors) continue; - if (test_bit(WriteMostly, &rdev->flags)) { - /* Don't balance among write-mostly, just - * use the first as a last resort */ - if (best_dist_disk < 0) { - if (is_badblock(rdev, this_sector, sectors, - &first_bad, &bad_sectors)) { - if (first_bad <= this_sector) - /* Cannot use this */ - continue; - best_good_sectors = first_bad - this_sector; - } else - best_good_sectors = sectors; - best_dist_disk = disk; - best_pending_disk = disk; - } + if (test_bit(WriteMostly, &rdev->flags)) continue; - } /* This is a reasonable device to use. It might * even be best. 
*/ @@ -808,7 +840,10 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect } *max_sectors = sectors; - return best_disk; + if (best_disk >= 0) + return best_disk; + + return choose_slow_rdev(conf, r1_bio, max_sectors); } static void wake_up_barrier(struct r1conf *conf) From patchwork Wed Feb 28 11:43:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 207813 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp3299448dyb; Wed, 28 Feb 2024 04:06:36 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVdRo2uwWA0Y0cRa8jALDkkzvDxwrV0rvK9dA2qNrBiAVhLQJ8jSNIj8WJ1st7cFg9QRaBL825nytBTplg4gID7UExKng== X-Google-Smtp-Source: AGHT+IH7UBH4b0y0ONQEOg+pYFz07BBxC8PziGbLMrTNNHDTZtCj+rsaVmVjwy7ka62qBiBOg0LY X-Received: by 2002:a17:902:ea09:b0:1db:cb13:6792 with SMTP id s9-20020a170902ea0900b001dbcb136792mr14839899plg.5.1709121996362; Wed, 28 Feb 2024 04:06:36 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709121996; cv=pass; d=google.com; s=arc-20160816; b=pqLA6GbyPQvKwSBwJycwr+XTT3Aa9CCdzCV9lVoMzbmvhIQnnuVM1z4DGJmcmh52jp WI2dyGy8+xVeQQVdkm1IH2jceQ/dPtlOJQYoZqdAzOKFQHM1j1hXDbTIs3VhYnBEXsBv mCPCHuVU5/oV2HEs3RAUiIN6Ip847r/6G4zgWtvGhPvl0n382L05yhA/Ww1A8rIk6Dvd fMzA4cEm31fRCOcq7ev8qs9ODzsuksF7uoJ2wNgAky3BuEN37QNeBIsxReaRoJDZ63iL jp0G7ChHuEDM6Q9VMn2bezS/LJ5fSC+eLK1N0L453MY3caJ+mAhLxovLpHNHi1hIgGbQ yggQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=Znl21bCxpAspfCoqO+4d1cYJincKOlhgDQv0fqqhpqk=; fh=Ytj0OQL2yyVgRSgKlJizbwNzVtV64xy+1TGtb7SiHpY=; b=C/dXcuiJwZRYGKmjLRe88lmU+F8zUV7Ght7b2WlObLr3L12umISDVHh/w45an+krZA ZEYFFCX9Y+xnKVZ0l1+1/n01KfEwajObIZTIwfbgyYmW51qx8FPtg7q05EKvmSy6RRVx kzk0WSoK32cD9Oy+TieFKTxW0GAnfNAfmLMbwQ24aH5UlRi9i8BjGqMUg9pEzd3FUEwq bjxQzQJ6exSSBNvW2+YFQeyXmB8F4GQLyWDwA6K5nsm1AbxnaMLxeiJJzQWkULHYFf5Y U7LUh2G2vEqlDnJ3iosRt4oPedELtmDKU7n7SQ5tLwB6IFzGvgUpZz8jb3CSqa0WgyNO 4Oog==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84964-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84964-ouuuleilei=gmail.com@vger.kernel.org" Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. 
[139.178.88.99]) by mx.google.com with ESMTPS id p3-20020a170902e74300b001dcb16f6c19si3285902plf.363.2024.02.28.04.06.36 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Feb 2024 04:06:36 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-84964-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84964-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84964-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id B490628D8D2 for ; Wed, 28 Feb 2024 11:55:51 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 17BC915A49C; Wed, 28 Feb 2024 11:49:44 +0000 (UTC) Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F41E66CDB4; Wed, 28 Feb 2024 11:49:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120981; cv=none; b=slyNkaYGqIAiIoqvCWdwSAnCJJzhKdov8TH4pEuO4dv4Y9jDp22K6GPyeEpCt6gIp8b5ybtH+1zQ4FGkoWsjgYfKeUIRD6S/nNSej+2KjN1JSG3Xaidb36BfwdqJAhV2ZT6n+t6ysuMbMIxLdWq6YXI2rImDsMW0741XQuBZ4zU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120981; c=relaxed/simple; bh=pzr8Ku2BRQFv4XCqmFV3WoGNGchB9CY78i4NXD2/Or0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=nLAkcFjZKujbXb4TzrwNf5ynFkH0GsqdZpzbRKjrvgxbmrgAakS/vqRI/NzWgxcUcryQWw9wJAQ2Sh6zqq213xeo+wZquIrDXKOQ/gB3Rvase+jAk2auA8CVTbl0A9AlNLpBOESXtL2LfuDm1FE64ZMb1qsk1QBkIhOXJ2EcNdU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TlCM03q33z4f3kjx; Wed, 28 Feb 2024 19:49:32 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id DC0371A0283; Wed, 28 Feb 2024 19:49:35 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgAn9g7IHd9l+eamFQ--.6969S13; Wed, 28 Feb 2024 19:49:35 +0800 (CST) From: Yu Kuai To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.9 v3 09/11] md/raid1: factor out choose_bb_rdev() from read_balance() Date: Wed, 28 Feb 2024 19:43:31 +0800 Message-Id: <20240228114333.527222-10-yukuai1@huaweicloud.com> 
X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com> References: <20240228114333.527222-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: cCh0CgAn9g7IHd9l+eamFQ--.6969S13 X-Coremail-Antispam: 1UD129KBjvJXoWxZF47Ww4ftFyxtF4fWryUKFg_yoWrJF17pw 43KFWftryUX34fWws8J3yUuryft345Ka18JryxJ3WS9r93Cr90gFW8GryYgFyUCrWrA3W7 Zw15Zr4293WkKFDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Gr 0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQ SdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1792144306530639342 X-GMAIL-MSGID: 1792144306530639342 From: Yu Kuai read_balance() is hard to understand because there are too many status and branches, and it's overlong. This patch factor out the case to read the rdev with bad blocks from read_balance(), there are no functional changes. Co-developed-by: Paul Luse Signed-off-by: Paul Luse Signed-off-by: Yu Kuai Reviewed-by: Xiao Ni --- drivers/md/raid1.c | 79 ++++++++++++++++++++++++++++------------------ 1 file changed, 48 insertions(+), 31 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index ccf05391d597..13cc978dc7c0 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -620,6 +620,44 @@ static int choose_first_rdev(struct r1conf *conf, struct r1bio *r1_bio, return -1; } +static int choose_bb_rdev(struct r1conf *conf, struct r1bio *r1_bio, + int *max_sectors) +{ + sector_t this_sector = r1_bio->sector; + int best_disk = -1; + int best_len = 0; + int disk; + + for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) { + struct md_rdev *rdev; + int len; + int read_len; + + if (r1_bio->bios[disk] == IO_BLOCKED) + continue; + + rdev = conf->mirrors[disk].rdev; + if (!rdev || test_bit(Faulty, &rdev->flags) || + test_bit(WriteMostly, &rdev->flags)) + continue; + + /* keep track of the disk with the most readable sectors. 
*/ + len = r1_bio->sectors; + read_len = raid1_check_read_range(rdev, this_sector, &len); + if (read_len > best_len) { + best_disk = disk; + best_len = read_len; + } + } + + if (best_disk != -1) { + *max_sectors = best_len; + update_read_sectors(conf, best_disk, this_sector, best_len); + } + + return best_disk; +} + static int choose_slow_rdev(struct r1conf *conf, struct r1bio *r1_bio, int *max_sectors) { @@ -708,8 +746,6 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) { sector_t dist; - sector_t first_bad; - int bad_sectors; unsigned int pending; rdev = conf->mirrors[disk].rdev; @@ -722,36 +758,8 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect continue; if (test_bit(WriteMostly, &rdev->flags)) continue; - /* This is a reasonable device to use. It might - * even be best. - */ - if (is_badblock(rdev, this_sector, sectors, - &first_bad, &bad_sectors)) { - if (best_dist < MaxSector) - /* already have a better device */ - continue; - if (first_bad <= this_sector) { - /* cannot read here. If this is the 'primary' - * device, then we must not read beyond - * bad_sectors from another device.. - */ - bad_sectors -= (this_sector - first_bad); - if (best_good_sectors > sectors) - best_good_sectors = sectors; - - } else { - sector_t good_sectors = first_bad - this_sector; - if (good_sectors > best_good_sectors) { - best_good_sectors = good_sectors; - best_disk = disk; - } - } + if (rdev_has_badblock(rdev, this_sector, sectors)) continue; - } else { - if ((sectors > best_good_sectors) && (best_disk >= 0)) - best_disk = -1; - best_good_sectors = sectors; - } if (best_disk >= 0) /* At least two disks to choose from so failfast is OK */ @@ -843,6 +851,15 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect if (best_disk >= 0) return best_disk; + /* + * If we are here it means we didn't find a perfectly good disk so + * now spend a bit more time trying to find one with the most good + * sectors. 
+ */ + disk = choose_bb_rdev(conf, r1_bio, max_sectors); + if (disk >= 0) + return disk; + return choose_slow_rdev(conf, r1_bio, max_sectors); } From patchwork Wed Feb 28 11:43:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 207821 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp3302662dyb; Wed, 28 Feb 2024 04:12:08 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVFRX3vfjn5XK4BV3pnccyq2AG+QnROfhcs5PgPQfiL/Gx5K/ZYGb/MbEp3XsYUG/9M5Y3EuPAO0tAa330HgbBTYtKWUA== X-Google-Smtp-Source: AGHT+IEMQtFqQ3Z2+Sz0nDksgOTACfc+EfEUmrjUANvwDYJKxB1JS3hsx6Ju4hM3dzZUlnQfQ96L X-Received: by 2002:a17:90a:4491:b0:29a:ef31:7982 with SMTP id t17-20020a17090a449100b0029aef317982mr3030762pjg.16.1709122328019; Wed, 28 Feb 2024 04:12:08 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709122328; cv=pass; d=google.com; s=arc-20160816; b=JgUq2THtEW4N5tJZ3p6amk6uXKpV3/G8dJPEi4rRTlsxlfDrk6aGRRilopzOB8zoNI UOBTBKfbgaIyn0TcB7Kipcuxg8MigeXZ27EVgywUDK815QTER3CCUNka1ulh+hjkkLKk B6wV2Y1itKGUBYETm23W4fFs/mjWbXBGGPI3v1phn+MTl8Ok9bRmb6MasFFvkgV4P9AN aOEX7oK8Zmn1jqCrj5mbRlb0yQgXZl3My2bjW6vUgKLvT2t99kxmRAUuhdCvaL91aaRa SyEh0H/drfakK6NJnt23ZzU99sWbjywgqPCenyDEO7woKLU+MIyH9toHdowSLYl2sJEr 7wBQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=oNDIQvnbeJzHM7j6OqzduX/u6j5l9FngvdY55Z5yKg0=; fh=Ytj0OQL2yyVgRSgKlJizbwNzVtV64xy+1TGtb7SiHpY=; b=fLfqsyH7M/92HHok2kEQEyoxMSYC3aHCdYlwITFyCyuPRd5Gw0EQxoW0TmhcznApJD gmouv4Rgg3O9OCqWoTa9PbE+l7y4/mYWwfZEIdMFcJF/9nD6R36/NlZly+qrA4n/N9Fl dooiimJ7g96DlSTpEyRkZbqlA4jk8cAVAg2hyrLNuSuHttRFlh+13jnjmHFkanVRoJHG IlzDRUJlh+uE9Ht+99FuQhqr/dJexhqiO3ZqkwdTLXXGNp/gQUEKXcpcbqqmCMCCEdxX 0ABe36paWFYaspzT1qp67GG9Lb+OF8NBQBLAM3dvjA2sC45LOUfuK0KGL0SlVwRbD9Yv 4m1A==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84966-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84966-ouuuleilei=gmail.com@vger.kernel.org" Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. 
[2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id z19-20020a17090ad79300b0029aa277aadcsi1242772pju.115.2024.02.28.04.12.07 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Feb 2024 04:12:07 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-84966-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-84966-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-84966-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id E4756B2629E for ; Wed, 28 Feb 2024 11:56:09 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 92FB915AAA4; Wed, 28 Feb 2024 11:49:44 +0000 (UTC) Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 78C6570CAC; Wed, 28 Feb 2024 11:49:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120981; cv=none; b=o5E1KgcWn9IjrLgYcDVgYfk/rDComYKNzKZVYLdWk0tNZJU6UlodelE4a3rBG4uzeflfwjbVYb9f9jNq9jSJF/maBjI8aWXhF5s+0ComIfAURXZU1TZEAc0vf4dcDLDSFkjOPA8MWvKUzllM/2Z8NUywr8BUQ43dKUIOK/a5j8w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709120981; c=relaxed/simple; bh=SFtmF0ATdI2sw3sJnBeHXWhmrfqMn824br2CPkYvaxo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Nen0Yp5pcfg4BYwhu6RVVxjGBepG5D5x72yjKCtrAY5t5ioxTUjLcs7Gbh/b00exaO08yPWq6y9koDHPcM+g11j9/ODgOZglc9Rz2EAgKZZkNAtXR/01ucyG8+ytmfUq1nA989Qu16hZIF5s1VnH11IcsQ+pOB1EFcy0hb2J90A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TlCLx069zz4f3lgQ; Wed, 28 Feb 2024 19:49:29 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 57B331A0232; Wed, 28 Feb 2024 19:49:36 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgAn9g7IHd9l+eamFQ--.6969S14; Wed, 28 Feb 2024 19:49:36 +0800 (CST) From: Yu Kuai To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.9 v3 10/11] md/raid1: factor out the code to manage sequential IO Date: Wed, 28 Feb 2024 19:43:32 +0800 Message-Id: 
<20240228114333.527222-11-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com> References: <20240228114333.527222-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: cCh0CgAn9g7IHd9l+eamFQ--.6969S14 X-Coremail-Antispam: 1UD129KBjvJXoWxZF4rAF1ftw4UKrWUGw18Grg_yoW5KFyDpa 1avwn3ZrWkXr9xu3y3Jr4UCryF9w1fGF48GFZ7A34FgrySqrWUta18K3y3Zr97J393J34U X3Z3GrW7C3WkC3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Gr 0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQ SdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1792144653896384475 X-GMAIL-MSGID: 1792144653896384475 From: Yu Kuai There is no functional change for now, make read_balance() cleaner and prepare to fix problems and refactor the handler of sequential IO. Co-developed-by: Paul Luse Signed-off-by: Paul Luse Signed-off-by: Yu Kuai Reviewed-by: Xiao Ni --- drivers/md/raid1.c | 71 ++++++++++++++++++++++++---------------------- 1 file changed, 37 insertions(+), 34 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 13cc978dc7c0..526f0d977040 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -705,6 +705,31 @@ static int choose_slow_rdev(struct r1conf *conf, struct r1bio *r1_bio, return bb_disk; } +static bool is_sequential(struct r1conf *conf, int disk, struct r1bio *r1_bio) +{ + /* TODO: address issues with this check and concurrency. */ + return conf->mirrors[disk].next_seq_sect == r1_bio->sector || + conf->mirrors[disk].head_position == r1_bio->sector; +} + +/* + * If buffered sequential IO size exceeds optimal iosize, check if there is idle + * disk. If yes, choose the idle disk. + */ +static bool should_choose_next(struct r1conf *conf, int disk) +{ + struct raid1_info *mirror = &conf->mirrors[disk]; + int opt_iosize; + + if (!test_bit(Nonrot, &mirror->rdev->flags)) + return false; + + opt_iosize = bdev_io_opt(mirror->rdev->bdev) >> 9; + return opt_iosize > 0 && mirror->seq_start != MaxSector && + mirror->next_seq_sect > opt_iosize && + mirror->next_seq_sect - opt_iosize >= mirror->seq_start; +} + /* * This routine returns the disk from which the requested read should * be done. 
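
A worked example of the should_choose_next() test above, using hypothetical numbers (a non-rotational mirror whose bdev_io_opt() reports 128 KiB, so opt_iosize = 131072 >> 9 = 256 sectors):

    seq_start     = 0      (the sequential run began at sector 0)
    next_seq_sect = 300    (300 sectors already issued sequentially)
    opt_iosize    = 256

    next_seq_sect > opt_iosize               ->  300 > 256  : true
    next_seq_sect - opt_iosize >= seq_start  ->   44 >= 0   : true

Result: should_choose_next() returns true, so read_balance() will prefer an idle mirror for the next read if one exists, and only fall back to this still-sequential disk otherwise.
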
There is a per-array 'next expected sequential IO' sector @@ -768,43 +793,21 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect pending = atomic_read(&rdev->nr_pending); dist = abs(this_sector - conf->mirrors[disk].head_position); /* Don't change to another disk for sequential reads */ - if (conf->mirrors[disk].next_seq_sect == this_sector - || dist == 0) { - int opt_iosize = bdev_io_opt(rdev->bdev) >> 9; - struct raid1_info *mirror = &conf->mirrors[disk]; - - /* - * If buffered sequential IO size exceeds optimal - * iosize, check if there is idle disk. If yes, choose - * the idle disk. read_balance could already choose an - * idle disk before noticing it's a sequential IO in - * this disk. This doesn't matter because this disk - * will idle, next time it will be utilized after the - * first disk has IO size exceeds optimal iosize. In - * this way, iosize of the first disk will be optimal - * iosize at least. iosize of the second disk might be - * small, but not a big deal since when the second disk - * starts IO, the first disk is likely still busy. - */ - if (test_bit(Nonrot, &rdev->flags) && opt_iosize > 0 && - mirror->seq_start != MaxSector && - mirror->next_seq_sect > opt_iosize && - mirror->next_seq_sect - opt_iosize >= - mirror->seq_start) { - /* - * Add 'pending' to avoid choosing this disk if - * there is other idle disk. - */ - pending++; - /* - * If there is no other idle disk, this disk - * will be chosen. - */ - sequential_disk = disk; - } else { + if (is_sequential(conf, disk, r1_bio)) { + if (!should_choose_next(conf, disk)) { best_disk = disk; break; } + /* + * Add 'pending' to avoid choosing this disk if + * there is other idle disk. + */ + pending++; + /* + * If there is no other idle disk, this disk + * will be chosen. 
+ */ + sequential_disk = disk; } if (min_pending > pending) {
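The two helpers factored out by this patch carry the whole sequential-read policy: is_sequential() asks whether the request continues exactly where a given mirror left off, and should_choose_next() asks whether a non-rotational mirror has already absorbed an optimal-sized run so that an idle mirror should be preferred. A minimal stand-alone sketch of that decision logic, using simplified stand-in types instead of the kernel's raid1_info and md_rdev so it can be built in userspace, could look like this:

/*
 * Userspace model of the sequential-read decision in read_balance().
 * The struct below is a simplified stand-in, not a kernel type; only the
 * logic of is_sequential()/should_choose_next() is mirrored.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t sector_t;
#define MaxSector ((sector_t)~0ULL)

struct mirror_model {
	sector_t next_seq_sect;	/* next expected sequential sector */
	sector_t head_position;	/* last known head position */
	sector_t seq_start;	/* start of the current sequential run */
	sector_t opt_iosize;	/* optimal IO size, in sectors */
	bool nonrot;		/* non-rotational (SSD) device */
};

/* The read continues exactly where this mirror left off. */
static bool model_is_sequential(const struct mirror_model *m, sector_t sector)
{
	return m->next_seq_sect == sector || m->head_position == sector;
}

/*
 * On a non-rotational mirror, once the buffered sequential run exceeds the
 * optimal IO size, an idle mirror should be preferred; this one remains the
 * fallback if every other mirror is busy.
 */
static bool model_should_choose_next(const struct mirror_model *m)
{
	if (!m->nonrot)
		return false;
	return m->opt_iosize > 0 && m->seq_start != MaxSector &&
	       m->next_seq_sect > m->opt_iosize &&
	       m->next_seq_sect - m->opt_iosize >= m->seq_start;
}

int main(void)
{
	struct mirror_model m = {
		.next_seq_sect = 128, .head_position = 128,
		.seq_start = 0, .opt_iosize = 256, .nonrot = true,
	};

	if (model_is_sequential(&m, 128) && !model_should_choose_next(&m))
		puts("sequential and run still short: keep reading from this mirror");
	else
		puts("consider an idle mirror instead");
	return 0;
}

The effect, as the longer comment removed from read_balance() explained, is that a non-rotational mirror serves a sequential stream only until it has taken roughly one optimal-sized IO; after that an idle mirror is preferred, so the first disk still sees at least optimal-sized IO while the stream can spread across the array.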
From patchwork Wed Feb 28 11:43:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 207805
From: Yu Kuai
To: xni@redhat.com, paul.e.luse@linux.intel.com, song@kernel.org, shli@fb.com, neilb@suse.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH md-6.9 v3 11/11] md/raid1: factor out helpers to choose the best rdev from read_balance()
Date: Wed, 28 Feb 2024 19:43:33 +0800
Message-Id: <20240228114333.527222-12-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240228114333.527222-1-yukuai1@huaweicloud.com>
References: <20240228114333.527222-1-yukuai1@huaweicloud.com>

From: Yu Kuai

The way that the best rdev is chosen:

1) If the read is sequential from one rdev:
 - if the rdev is rotational, use this rdev;
 - if the rdev is non-rotational, use this rdev until the total read length exceeds the disk's optimal IO size;

2) If the read is not sequential:
 - if there is an idle disk, use it; otherwise:
 - if the array has a non-rotational disk, choose the rdev with the minimal inflight IO;
 - if all the underlying disks are rotational, choose the rdev whose head position is closest to the read.

There are no functional changes; this just makes the code cleaner and prepares for the following refactor.

Co-developed-by: Paul Luse
Signed-off-by: Paul Luse
Signed-off-by: Yu Kuai
Reviewed-by: Xiao Ni
---
 drivers/md/raid1.c | 175 +++++++++++++++++++++++++--------------------
 1 file changed, 98 insertions(+), 77 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 526f0d977040..c00f2aefbc56 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -730,74 +730,71 @@ static bool should_choose_next(struct r1conf *conf, int disk) mirror->next_seq_sect - opt_iosize >= mirror->seq_start; } -/* - * This routine returns the disk from which the requested read should - * be done. There is a per-array 'next expected sequential IO' sector - * number - if this matches on the next IO then we use the last disk. - * There is also a per-disk 'last know head position' sector that is - * maintained from IRQ contexts, both the normal and the resync IO - * completion handlers update this position correctly. If there is no - * perfect sequential match then we pick the disk whose head is closest. - * - * If there are 2 mirrors in the same 2 devices, performance degrades - * because position is mirror, not device based. - * - * The rdev for the device selected will have nr_pending incremented.
- */ -static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sectors) +static bool rdev_readable(struct md_rdev *rdev, struct r1bio *r1_bio) { - const sector_t this_sector = r1_bio->sector; - int sectors; - int best_good_sectors; - int best_disk, best_dist_disk, best_pending_disk, sequential_disk; - int disk; - sector_t best_dist; - unsigned int min_pending; - struct md_rdev *rdev; + if (!rdev || test_bit(Faulty, &rdev->flags)) + return false; - retry: - sectors = r1_bio->sectors; - best_disk = -1; - best_dist_disk = -1; - sequential_disk = -1; - best_dist = MaxSector; - best_pending_disk = -1; - min_pending = UINT_MAX; - best_good_sectors = 0; - clear_bit(R1BIO_FailFast, &r1_bio->state); + /* still in recovery */ + if (!test_bit(In_sync, &rdev->flags) && + rdev->recovery_offset < r1_bio->sector + r1_bio->sectors) + return false; - if (raid1_should_read_first(conf->mddev, this_sector, sectors)) - return choose_first_rdev(conf, r1_bio, max_sectors); + /* don't read from slow disk unless have to */ + if (test_bit(WriteMostly, &rdev->flags)) + return false; + + /* don't split IO for bad blocks unless have to */ + if (rdev_has_badblock(rdev, r1_bio->sector, r1_bio->sectors)) + return false; + + return true; +} + +struct read_balance_ctl { + sector_t closest_dist; + int closest_dist_disk; + int min_pending; + int min_pending_disk; + int sequential_disk; + int readable_disks; +}; + +static int choose_best_rdev(struct r1conf *conf, struct r1bio *r1_bio) +{ + int disk; + struct read_balance_ctl ctl = { + .closest_dist_disk = -1, + .closest_dist = MaxSector, + .min_pending_disk = -1, + .min_pending = UINT_MAX, + .sequential_disk = -1, + }; for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) { + struct md_rdev *rdev; sector_t dist; unsigned int pending; - rdev = conf->mirrors[disk].rdev; - if (r1_bio->bios[disk] == IO_BLOCKED - || rdev == NULL - || test_bit(Faulty, &rdev->flags)) - continue; - if (!test_bit(In_sync, &rdev->flags) && - rdev->recovery_offset < this_sector + sectors) - continue; - if (test_bit(WriteMostly, &rdev->flags)) + if (r1_bio->bios[disk] == IO_BLOCKED) continue; - if (rdev_has_badblock(rdev, this_sector, sectors)) + + rdev = conf->mirrors[disk].rdev; + if (!rdev_readable(rdev, r1_bio)) continue; - if (best_disk >= 0) - /* At least two disks to choose from so failfast is OK */ + /* At least two disks to choose from so failfast is OK */ + if (ctl.readable_disks++ == 1) set_bit(R1BIO_FailFast, &r1_bio->state); pending = atomic_read(&rdev->nr_pending); - dist = abs(this_sector - conf->mirrors[disk].head_position); + dist = abs(r1_bio->sector - conf->mirrors[disk].head_position); + /* Don't change to another disk for sequential reads */ if (is_sequential(conf, disk, r1_bio)) { - if (!should_choose_next(conf, disk)) { - best_disk = disk; - break; - } + if (!should_choose_next(conf, disk)) + return disk; + /* * Add 'pending' to avoid choosing this disk if * there is other idle disk. @@ -807,17 +804,17 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect * If there is no other idle disk, this disk * will be chosen. 
*/ - sequential_disk = disk; + ctl.sequential_disk = disk; } - if (min_pending > pending) { - min_pending = pending; - best_pending_disk = disk; + if (ctl.min_pending > pending) { + ctl.min_pending = pending; + ctl.min_pending_disk = disk; } - if (dist < best_dist) { - best_dist = dist; - best_dist_disk = disk; + if (ctl.closest_dist > dist) { + ctl.closest_dist = dist; + ctl.closest_dist_disk = disk; } } @@ -825,8 +822,8 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect * sequential IO size exceeds optimal iosize, however, there is no other * idle disk, so choose the sequential disk. */ - if (best_disk == -1 && min_pending != 0) - best_disk = sequential_disk; + if (ctl.sequential_disk != -1 && ctl.min_pending != 0) + return ctl.sequential_disk; /* * If all disks are rotational, choose the closest disk. If any disk is @@ -834,25 +831,49 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect * disk is rotational, which might/might not be optimal for raids with * mixed ratation/non-rotational disks depending on workload. */ - if (best_disk == -1) { - if (READ_ONCE(conf->nonrot_disks) || min_pending == 0) - best_disk = best_pending_disk; - else - best_disk = best_dist_disk; - } + if (ctl.min_pending_disk != -1 && + (READ_ONCE(conf->nonrot_disks) || ctl.min_pending == 0)) + return ctl.min_pending_disk; + else + return ctl.closest_dist_disk; +} - if (best_disk >= 0) { - rdev = conf->mirrors[best_disk].rdev; - if (!rdev) - goto retry; +/* + * This routine returns the disk from which the requested read should be done. + * + * 1) If resync is in progress, find the first usable disk and use it even if it + * has some bad blocks. + * + * 2) Now that there is no resync, loop through all disks and skipping slow + * disks and disks with bad blocks for now. Only pay attention to key disk + * choice. + * + * 3) If we've made it this far, now look for disks with bad blocks and choose + * the one with most number of sectors. + * + * 4) If we are all the way at the end, we have no choice but to use a disk even + * if it is write mostly. + * + * The rdev for the device selected will have nr_pending incremented. + */ +static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, + int *max_sectors) +{ + int disk; - sectors = best_good_sectors; - update_read_sectors(conf, disk, this_sector, sectors); - } - *max_sectors = sectors; + clear_bit(R1BIO_FailFast, &r1_bio->state); + + if (raid1_should_read_first(conf->mddev, r1_bio->sector, + r1_bio->sectors)) + return choose_first_rdev(conf, r1_bio, max_sectors); - if (best_disk >= 0) - return best_disk; + disk = choose_best_rdev(conf, r1_bio); + if (disk >= 0) { + *max_sectors = r1_bio->sectors; + update_read_sectors(conf, disk, r1_bio->sector, + r1_bio->sectors); + return disk; + } /* * If we are here it means we didn't find a perfectly good disk so