Message ID | 20240122-reclaim-fix-v1-1-761234a6d005@wdc.com |
---|---|
State | New |
Headers | From: Johannes Thumshirn <johannes.thumshirn@wdc.com> Date: Mon, 22 Jan 2024 02:51:03 -0800 Subject: [PATCH 1/2] btrfs: zoned: use rcu list for iterating devices to collect stats Message-Id: <20240122-reclaim-fix-v1-1-761234a6d005@wdc.com> References: <20240122-reclaim-fix-v1-0-761234a6d005@wdc.com> In-Reply-To: <20240122-reclaim-fix-v1-0-761234a6d005@wdc.com> To: Josef Bacik <josef@toxicpanda.com>, David Sterba <dsterba@suse.com> Cc: Naohiro Aota <naohiro.aota@wdc.com>, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, Johannes Thumshirn <johannes.thumshirn@wdc.com>, Damien Le Moal <dlemoal@kernel.org> X-Mailer: b4 0.12.3 |
Series | btrfs: zoned: kick reclaim earlier on fast zoned devices |
Commit Message
Johannes Thumshirn
Jan. 22, 2024, 10:51 a.m. UTC
As btrfs_zoned_should_reclaim() only has to iterate the device list in order
to collect stats on the devices' total and used bytes, we don't need to
take the full-blown mutex, but can iterate the device list in an rcu_read
context.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/zoned.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Comments
On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
> As btrfs_zoned_should_reclaim only has to iterate the device list in order
> to collect stats on the device's total and used bytes, we don't need to
> take the full blown mutex, but can iterate the device list in a rcu_read
> context.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Looks good.

Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>

On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
> As btrfs_zoned_should_reclaim only has to iterate the device list in order
> to collect stats on the device's total and used bytes, we don't need to
> take the full blown mutex, but can iterate the device list in a rcu_read
> context.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>  fs/btrfs/zoned.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 168af9d000d1..b7e7b5a5a6fa 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
>  	if (fs_info->bg_reclaim_threshold == 0)
>  		return false;
>
> -	mutex_lock(&fs_devices->device_list_mutex);
> -	list_for_each_entry(device, &fs_devices->devices, dev_list) {
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
>  		if (!device->bdev)
>  			continue;
>
>  		total += device->disk_total_bytes;
>  		used += device->bytes_used;
>  	}
> -	mutex_unlock(&fs_devices->device_list_mutex);
> +	rcu_read_unlock();

This is basically only a hint and inaccuracies in the total or used
values would be transient, right? The sum is calculated each time the
function is called, not stored anywhere so in the unlikely case of
device removal it may skip reclaim once, but then pick it up later.
Any actual removal of the block groups is verified again and properly
locked in btrfs_reclaim_bgs_work().

On 22.01.24 22:35, David Sterba wrote:
> On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
>> As btrfs_zoned_should_reclaim only has to iterate the device list in order
>> to collect stats on the device's total and used bytes, we don't need to
>> take the full blown mutex, but can iterate the device list in a rcu_read
>> context.
>>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>> ---
>>  fs/btrfs/zoned.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
>> index 168af9d000d1..b7e7b5a5a6fa 100644
>> --- a/fs/btrfs/zoned.c
>> +++ b/fs/btrfs/zoned.c
>> @@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
>>  	if (fs_info->bg_reclaim_threshold == 0)
>>  		return false;
>>
>> -	mutex_lock(&fs_devices->device_list_mutex);
>> -	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>> +	rcu_read_lock();
>> +	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
>>  		if (!device->bdev)
>>  			continue;
>>
>>  		total += device->disk_total_bytes;
>>  		used += device->bytes_used;
>>  	}
>> -	mutex_unlock(&fs_devices->device_list_mutex);
>> +	rcu_read_unlock();
>
> This is basically only a hint and inaccuracies in the total or used
> values would be transient, right? The sum is calculated each time the
> function is called, not stored anywhere so in the unlikely case of
> device removal it may skip reclaim once, but then pick it up later.
> Any actual removal of the block groups is verified again and properly
> locked in btrfs_reclaim_bgs_work().
>

Yes.

On Tue, Jan 23, 2024 at 07:49:22AM +0000, Johannes Thumshirn wrote:
> On 22.01.24 22:35, David Sterba wrote:
> > On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
> >> As btrfs_zoned_should_reclaim only has to iterate the device list in order
> >> to collect stats on the device's total and used bytes, we don't need to
> >> take the full blown mutex, but can iterate the device list in a rcu_read
> >> context.
> >>
> >> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> >> ---
> >>  fs/btrfs/zoned.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> >> index 168af9d000d1..b7e7b5a5a6fa 100644
> >> --- a/fs/btrfs/zoned.c
> >> +++ b/fs/btrfs/zoned.c
> >> @@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
> >>  	if (fs_info->bg_reclaim_threshold == 0)
> >>  		return false;
> >>
> >> -	mutex_lock(&fs_devices->device_list_mutex);
> >> -	list_for_each_entry(device, &fs_devices->devices, dev_list) {
> >> +	rcu_read_lock();
> >> +	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
> >>  		if (!device->bdev)
> >>  			continue;
> >>
> >>  		total += device->disk_total_bytes;
> >>  		used += device->bytes_used;
> >>  	}
> >> -	mutex_unlock(&fs_devices->device_list_mutex);
> >> +	rcu_read_unlock();
> >
> > This is basically only a hint and inaccuracies in the total or used
> > values would be transient, right? The sum is calculated each time the
> > function is called, not stored anywhere so in the unlikely case of
> > device removal it may skip reclaim once, but then pick it up later.
> > Any actual removal of the block groups is verified again and properly
> > locked in btrfs_reclaim_bgs_work().
> >
>
> Yes.

So please add it to the changelog as an explanation why the mutex -> rcu
switch is safe, thanks.

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 168af9d000d1..b7e7b5a5a6fa 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
 	if (fs_info->bg_reclaim_threshold == 0)
 		return false;
 
-	mutex_lock(&fs_devices->device_list_mutex);
-	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
 		if (!device->bdev)
 			continue;
 
 		total += device->disk_total_bytes;
 		used += device->bytes_used;
 	}
-	mutex_unlock(&fs_devices->device_list_mutex);
+	rcu_read_unlock();
 
 	factor = div64_u64(used * 100, total);
 	return factor >= fs_info->bg_reclaim_threshold;
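
For readers who want to follow the arithmetic that this loop feeds, here is a minimal user-space sketch of the threshold check. It is not the kernel code: `struct sketch_device`, `has_bdev` and `sketch_should_reclaim()` are hypothetical names, a plain array stands in for the RCU-protected device list (no locking is modeled), and the `total == 0` guard is an assumption of this sketch that does not appear in the hunk.

```c
#include <assert.h>   /* only needed for the usage checks shown below */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the loop in the patch reads. */
struct sketch_device {
	bool has_bdev;              /* models device->bdev != NULL */
	uint64_t disk_total_bytes;
	uint64_t bytes_used;
};

/*
 * Sum total and used bytes over all devices that have a block device,
 * then report whether the used share (in percent) reaches the threshold.
 * A threshold of 0 disables the check, as in the patch context.
 */
static bool sketch_should_reclaim(const struct sketch_device *devs, size_t n,
				  unsigned int threshold)
{
	uint64_t total = 0, used = 0;

	if (threshold == 0)
		return false;

	for (size_t i = 0; i < n; i++) {
		if (!devs[i].has_bdev)
			continue;
		total += devs[i].disk_total_bytes;
		used += devs[i].bytes_used;
	}

	/* Assumption of this sketch: avoid dividing by zero in user space. */
	if (total == 0)
		return false;

	return (used * 100) / total >= threshold;
}
```

With two present devices of 1000 bytes each, holding 800 and 700 used bytes, the used share is 75 percent, so `sketch_should_reclaim(devs, n, 75)` is true and a threshold of 76 is false. Because the sums are recomputed on every call and never stored, a device that appears or disappears mid-iteration can only skew a single result, which is the point made in the review above.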