Message ID | 20230408142530.800612-1-qiang1.zhang@intel.com |
---|---|
State | New |
Headers |
From: Zqiang <qiang1.zhang@intel.com>
To: urezki@gmail.com, paulmck@kernel.org, frederic@kernel.org, joel@joelfernandes.org, qiang1.zhang@intel.com
Cc: qiang.zhang1211@gmail.com, rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] rcu/kvfree: Make page cache growing happen on the correct krcp
Date: Sat, 8 Apr 2023 22:25:30 +0800
Message-Id: <20230408142530.800612-1-qiang1.zhang@intel.com> |
Series |
rcu/kvfree: Make page cache growing happen on the correct krcp
|
|
Commit Message
Zqiang
April 8, 2023, 2:25 p.m. UTC
When add_ptr_to_bulk_krc_lock() is invoked to queue a pointer, it calls
krc_this_cpu_lock() to lock and return the current CPU's krcp structure,
and then takes a bnode object from that krcp structure's ->bulk_head
list. If the list is empty, or if the bnode's nr_records has reached
KVFREE_BULK_MAX_ENTR, and can_alloc is set, the current CPU's krcp->lock
is dropped so that a new bnode can be allocated. Afterwards,
krc_this_cpu_lock() is invoked again to look up the current CPU's krcp
structure. If the task migrated to another CPU while the lock was
dropped, the krcp obtained by this second call differs from the first
one, so the bnode is added to the wrong krcp structure's ->bulk_head,
or the page-fill work is triggered on the wrong krcp.
This commit therefore re-acquires the original krcp->lock after the page
allocation instead of calling krc_this_cpu_lock() again, ensuring that
the same krcp is used throughout.
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
kernel/rcu/tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
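The race described in the commit message is easiest to see in a small user-space model. The sketch below is hypothetical throughout: `struct krc`, `krc_this_cpu()`, `current_cpu`, and the `queue_node_*` helpers are illustrative stand-ins for the kernel's krcp lookup, not actual kernel code. It simulates a task that migrates between the first and second per-CPU lookups:

```c
#include <assert.h>
#include <stddef.h>

struct krc {
	int id;        /* which "CPU" owns this structure */
	int nr_nodes;  /* stand-in for the ->bulk_head contents */
};

static struct krc per_cpu_krc[2] = { { 0, 0 }, { 1, 0 } };
static int current_cpu;  /* models smp_processor_id() */

/* Models krc_this_cpu_lock(): look up the current CPU's structure. */
static struct krc *krc_this_cpu(void)
{
	return &per_cpu_krc[current_cpu];
}

/* Buggy pattern: the per-CPU pointer is re-derived after a window in
 * which the task may have migrated, so the node can land on the wrong
 * CPU's structure. */
static struct krc *queue_node_buggy(void)
{
	struct krc *krcp = krc_this_cpu();  /* first lookup, lock held */
	current_cpu = 1;                    /* allocation slept; task migrated */
	krcp = krc_this_cpu();              /* second lookup: different krcp! */
	krcp->nr_nodes++;
	return krcp;
}

/* Fixed pattern: keep the first pointer and relock it directly,
 * mirroring raw_spin_lock_irqsave(&(*krcp)->lock, *flags). */
static struct krc *queue_node_fixed(void)
{
	struct krc *krcp = krc_this_cpu();  /* first lookup, lock held */
	current_cpu = 1;                    /* migration no longer matters */
	krcp->nr_nodes++;                   /* still the original krcp */
	return krcp;
}
```

With the buggy flow the node is accounted to CPU 1's structure even though the operation started on CPU 0; the fixed flow stays on the structure that was locked first.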
Comments
On Sat, Apr 08, 2023 at 10:25:30PM +0800, Zqiang wrote:
> This commit therefore re-hold krcp->lock after allocated page instead
> of re-call krc_this_cpu_lock() to ensure the consistency of krcp.
>
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

Very good, thank you!  Queued for testing and further review, but
please check my wordsmithing.

							Thanx, Paul

------------------------------------------------------------------------

commit a0bbb5785539ed846f4769368f24a296d54bc801
Author: Zqiang <qiang1.zhang@intel.com>
Date:   Sat Apr 8 22:25:30 2023 +0800

    rcu/kvfree: Use consistent krcp when growing kfree_rcu() page cache

    The add_ptr_to_bulk_krc_lock() function is invoked to allocate a new
    kfree_rcu() page, also known as a kvfree_rcu_bulk_data structure.
    The kfree_rcu_cpu structure's lock is used to protect this operation,
    except that this lock must be momentarily dropped when allocating memory.
    It is clearly important that the lock that is reacquired be the same
    lock that was acquired initially via krc_this_cpu_lock().

    Unfortunately, this same krc_this_cpu_lock() function is used to
    re-acquire this lock, and if the task migrated to some other CPU during
    the memory allocation, this will result in the kvfree_rcu_bulk_data
    structure being added to the wrong CPU's kfree_rcu_cpu structure.

    This commit therefore replaces that second call to krc_this_cpu_lock()
    with raw_spin_lock_irqsave() in order to explicitly acquire the lock on
    the correct kfree_rcu_cpu structure, thus keeping things straight even
    when the task migrates.

    Signed-off-by: Zqiang <qiang1.zhang@intel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2699b7acf0e3..41daae3239b5 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3301,7 +3301,7 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 			// scenarios.
 			bnode = (struct kvfree_rcu_bulk_data *)
 				__get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
-			*krcp = krc_this_cpu_lock(flags);
+			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
 		}

 		if (!bnode)
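The shape of the fixed flow (drop the lock, allocate, then relock the structure already held by pointer) can be sketched in user space. Everything below is illustrative: `struct cache`, `grow_cache()`, and the `lock_cnt` counter are hypothetical stand-ins, with the counter replacing a real spinlock so that lock/unlock pairing on the same object can be checked.

```c
#include <assert.h>
#include <stdlib.h>

struct cache {
	int lock_cnt;   /* +1 on "lock", -1 on "unlock"; models a spinlock */
	int pages;      /* models the kfree_rcu() page cache */
};

static void cache_lock(struct cache *c)   { c->lock_cnt++; }
static void cache_unlock(struct cache *c) { c->lock_cnt--; }

/* Grow the page cache of the cache we already hold a pointer to. */
static void grow_cache(struct cache *c)
{
	cache_lock(c);
	if (c->pages == 0) {                /* free list empty: must allocate */
		cache_unlock(c);            /* drop the lock around allocation */
		void *page = malloc(4096);  /* may sleep; task may migrate */
		cache_lock(c);              /* relock the SAME cache, like
					     * raw_spin_lock_irqsave(&(*krcp)->lock, ...) */
		if (page) {
			c->pages++;
			free(page);         /* sketch only: no real free list */
		}
	}
	cache_unlock(c);
}
```

Because the relock goes through the saved pointer `c` rather than a fresh per-CPU lookup, every lock acquired on a structure is released on that same structure no matter where the task runs after the allocation window.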
> This commit therefore re-hold krcp->lock after allocated page instead
> of re-call krc_this_cpu_lock() to ensure the consistency of krcp.
>
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
>
> Very good, thank you!  Queued for testing and further review, but
> please check my wordsmithing.

More clear and detailed description, thanks Paul 😊.

>
> 							Thanx, Paul
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9d9d3772cc45..c9076fa0a954 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3303,7 +3303,7 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 			// scenarios.
 			bnode = (struct kvfree_rcu_bulk_data *)
 				__get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
-			*krcp = krc_this_cpu_lock(flags);
+			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
 		}

 		if (!bnode)