From patchwork Thu Jan 11 06:07:51 2024
X-Patchwork-Submitter: Byungchul Park
X-Patchwork-Id: 187177
From: Byungchul Park
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    namit@vmware.com, xhao@linux.alibaba.com, mgorman@techsingularity.net,
    hughd@google.com, willy@infradead.org, david@redhat.com,
    peterz@infradead.org, luto@kernel.org, tglx@linutronix.de,
    mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Subject: [v5 1/7] x86/tlb: Add APIs manipulating tlb batch's arch data
Date: Thu, 11 Jan 2024 15:07:51 +0900
Message-Id: <20240111060757.13563-2-byungchul@sk.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20240111060757.13563-1-byungchul@sk.com>
References: <20240111060757.13563-1-byungchul@sk.com>

This is a preparation for the migrc mechanism, which needs to recognize
read-only TLB entries during batched TLB flush. It does so by splitting
the TLB batch's arch data in two, one part for read-only entries and the
other for writable ones, and merging the two when needed. Migrc also
needs to trim the set of CPUs to flush by clearing the ones that have
already performed the required TLB flush. To support this, add APIs for
manipulating the arch data on x86.

Signed-off-by: Byungchul Park
---
 arch/x86/include/asm/tlbflush.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 25726893c6f4..fa7e16dbeb44 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -293,6 +294,23 @@ static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
 
 extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
 
+static inline void arch_tlbbatch_clear(struct arch_tlbflush_unmap_batch *batch)
+{
+	cpumask_clear(&batch->cpumask);
+}
+
+static inline void arch_tlbbatch_fold(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	cpumask_or(&bdst->cpumask, &bdst->cpumask, &bsrc->cpumask);
+}
+
+static inline bool arch_tlbbatch_done(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	return cpumask_andnot(&bdst->cpumask, &bdst->cpumask, &bsrc->cpumask);
+}
+
 static inline bool pte_flags_need_flush(unsigned long oldflags,
 					unsigned long newflags,
 					bool ignore_access)

From patchwork Thu Jan 11 06:07:52 2024
X-Patchwork-Submitter: Byungchul Park
X-Patchwork-Id: 187176
From: Byungchul Park
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    namit@vmware.com, xhao@linux.alibaba.com, mgorman@techsingularity.net,
    hughd@google.com, willy@infradead.org, david@redhat.com,
    peterz@infradead.org, luto@kernel.org, tglx@linutronix.de,
    mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Subject: [v5 2/7] arm64: tlbflush: Add APIs manipulating tlb batch's arch data
Date: Thu, 11 Jan 2024 15:07:52 +0900
Message-Id: <20240111060757.13563-3-byungchul@sk.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20240111060757.13563-1-byungchul@sk.com>
References: <20240111060757.13563-1-byungchul@sk.com>

This is a preparation for the migrc mechanism, which requires
manipulating the TLB batch's arch data. Even though arm64 does nothing
with that data, any arch with CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
should provide these APIs.

Signed-off-by: Byungchul Park
---
 arch/arm64/include/asm/tlbflush.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bb2c2833a987..4f2094843e7a 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -328,6 +328,25 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	dsb(ish);
 }
 
+static inline void arch_tlbbatch_clear(struct arch_tlbflush_unmap_batch *batch)
+{
+	/* nothing to do */
+}
+
+static inline void arch_tlbbatch_fold(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	/* nothing to do */
+}
+
+static inline bool arch_tlbbatch_done(struct arch_tlbflush_unmap_batch *bdst,
+				      struct arch_tlbflush_unmap_batch *bsrc)
+{
+	/* nothing to do */
+
+	return false;
+}
+
 /*
  * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
  * necessarily a performance improvement.
From patchwork Thu Jan 11 06:07:53 2024
X-Patchwork-Submitter: Byungchul Park
X-Patchwork-Id: 187178
From: Byungchul Park
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    namit@vmware.com, xhao@linux.alibaba.com, mgorman@techsingularity.net,
    hughd@google.com, willy@infradead.org, david@redhat.com,
    peterz@infradead.org, luto@kernel.org, tglx@linutronix.de,
    mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Subject: [v5 3/7] mm/rmap: Recognize read-only TLB entries during batched TLB flush
Date: Thu, 11 Jan 2024 15:07:53 +0900
Message-Id: <20240111060757.13563-4-byungchul@sk.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20240111060757.13563-1-byungchul@sk.com>
References: <20240111060757.13563-1-byungchul@sk.com>

Functionally, no change. This is a preparation for the migrc mechanism,
which needs to recognize read-only TLB entries and make use of them to
batch flushes more aggressively. In addition, the newly introduced API,
fold_ubc(), will be used by migrc when manipulating TLB batch data.

Signed-off-by: Byungchul Park
---
 include/linux/sched.h |  1 +
 mm/internal.h         |  4 ++++
 mm/rmap.c             | 31 ++++++++++++++++++++++++++++++-
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 292c31697248..0317e7a65151 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1328,6 +1328,7 @@ struct task_struct {
 #endif
 
 	struct tlbflush_unmap_batch	tlb_ubc;
+	struct tlbflush_unmap_batch	tlb_ubc_ro;
 
 	/* Cache last used pipe for splice(): */
 	struct pipe_inode_info		*splice_pipe;
diff --git a/mm/internal.h b/mm/internal.h
index b61034bd50f5..b880f1e78700 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -923,6 +923,7 @@ extern struct workqueue_struct *mm_percpu_wq;
 void try_to_unmap_flush(void);
 void try_to_unmap_flush_dirty(void);
 void flush_tlb_batched_pending(struct mm_struct *mm);
+void fold_ubc(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src);
 #else
 static inline void try_to_unmap_flush(void)
 {
@@ -933,6 +934,9 @@ static inline void try_to_unmap_flush_dirty(void)
 static inline void flush_tlb_batched_pending(struct mm_struct *mm)
 {
 }
+static inline void fold_ubc(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src)
+{
+}
 #endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
 
 extern const struct trace_print_flags pageflag_names[];
diff --git a/mm/rmap.c b/mm/rmap.c
index 7a27a2b41802..da36f23ff7b0 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -605,6 +605,28 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio,
 }
 
 #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+
+void fold_ubc(struct tlbflush_unmap_batch *dst,
+	      struct tlbflush_unmap_batch *src)
+{
+	if (!src->flush_required)
+		return;
+
+	/*
+	 * Fold src to dst.
+	 */
+	arch_tlbbatch_fold(&dst->arch, &src->arch);
+	dst->writable = dst->writable || src->writable;
+	dst->flush_required = true;
+
+	/*
+	 * Reset src.
+	 */
+	arch_tlbbatch_clear(&src->arch);
+	src->flush_required = false;
+	src->writable = false;
+}
+
 /*
  * Flush TLB entries for recently unmapped pages from remote CPUs. It is
  * important if a PTE was dirty when it was unmapped that it's flushed
@@ -614,7 +636,9 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio,
 void try_to_unmap_flush(void)
 {
 	struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
+	struct tlbflush_unmap_batch *tlb_ubc_ro = &current->tlb_ubc_ro;
 
+	fold_ubc(tlb_ubc, tlb_ubc_ro);
 	if (!tlb_ubc->flush_required)
 		return;
 
@@ -645,13 +669,18 @@ void try_to_unmap_flush_dirty(void)
 static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval,
 				      unsigned long uaddr)
 {
-	struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
+	struct tlbflush_unmap_batch *tlb_ubc;
 	int batch;
 	bool writable = pte_dirty(pteval);
 
 	if (!pte_accessible(mm, pteval))
 		return;
 
+	if (pte_write(pteval) || writable)
+		tlb_ubc = &current->tlb_ubc;
+	else
+		tlb_ubc = &current->tlb_ubc_ro;
+
 	arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr);
 	tlb_ubc->flush_required = true;

From patchwork Thu Jan 11 06:07:54 2024
X-Patchwork-Submitter: Byungchul Park
X-Patchwork-Id: 187181
From: Byungchul Park
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    namit@vmware.com, xhao@linux.alibaba.com, mgorman@techsingularity.net,
    hughd@google.com, willy@infradead.org, david@redhat.com,
    peterz@infradead.org, luto@kernel.org, tglx@linutronix.de,
    mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Subject: [v5 4/7] mm: Separate move/undo doing on folio list from migrate_pages_batch()
Date: Thu, 11 Jan 2024 15:07:54 +0900
Message-Id: <20240111060757.13563-5-byungchul@sk.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20240111060757.13563-1-byungchul@sk.com>
References: <20240111060757.13563-1-byungchul@sk.com>

Functionally, no change. This is a preparation for the migrc mechanism,
which requires separate folio lists for its own handling during
migration. Refactor migrate_pages_batch(), splitting out the move and
undo parts that operate on a folio list into their own helpers.

Signed-off-by: Byungchul Park
---
 mm/migrate.c | 134 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 83 insertions(+), 51 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 397f2a6e34cb..bbe1ecef4956 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1611,6 +1611,81 @@ static int migrate_hugetlbs(struct list_head *from, new_folio_t get_new_folio,
 	return nr_failed;
 }
 
+static void migrate_folios_move(struct list_head *src_folios,
+		struct list_head *dst_folios,
+		free_folio_t put_new_folio, unsigned long private,
+		enum migrate_mode mode, int reason,
+		struct list_head *ret_folios,
+		struct migrate_pages_stats *stats,
+		int *retry, int *thp_retry, int *nr_failed,
+		int *nr_retry_pages)
+{
+	struct folio *folio, *folio2, *dst, *dst2;
+	bool is_thp;
+	int nr_pages;
+	int rc;
+
+	dst = list_first_entry(dst_folios, struct folio, lru);
+	dst2 = list_next_entry(dst, lru);
+	list_for_each_entry_safe(folio, folio2, src_folios, lru) {
+		is_thp = folio_test_large(folio) && folio_test_pmd_mappable(folio);
+		nr_pages = folio_nr_pages(folio);
+
+		cond_resched();
+
+		rc = migrate_folio_move(put_new_folio, private,
+				folio, dst, mode,
+				reason, ret_folios);
+		/*
+		 * The rules are:
+		 *	Success: folio will be freed
+		 *	-EAGAIN: stay on the unmap_folios list
+		 *	Other errno: put on ret_folios list
+		 */
+		switch(rc) {
+		case -EAGAIN:
+			*retry += 1;
+			*thp_retry += is_thp;
+			*nr_retry_pages += nr_pages;
+			break;
+		case MIGRATEPAGE_SUCCESS:
+			stats->nr_succeeded += nr_pages;
+			stats->nr_thp_succeeded += is_thp;
+			break;
+		default:
+			*nr_failed += 1;
+			stats->nr_thp_failed += is_thp;
+			stats->nr_failed_pages += nr_pages;
+			break;
+		}
+		dst = dst2;
+		dst2 = list_next_entry(dst, lru);
+	}
+}
+
+static void migrate_folios_undo(struct list_head *src_folios,
+		struct list_head *dst_folios,
+		free_folio_t put_new_folio, unsigned long private,
+		struct list_head *ret_folios)
+{
+	struct folio *folio, *folio2, *dst, *dst2;
+
+	dst = list_first_entry(dst_folios, struct folio, lru);
+	dst2 = list_next_entry(dst, lru);
+	list_for_each_entry_safe(folio, folio2, src_folios, lru) {
+		int old_page_state = 0;
+		struct anon_vma *anon_vma = NULL;
+
+		__migrate_folio_extract(dst, &old_page_state, &anon_vma);
+		migrate_folio_undo_src(folio, old_page_state & PAGE_WAS_MAPPED,
+				anon_vma, true, ret_folios);
+		list_del(&dst->lru);
+		migrate_folio_undo_dst(dst, true, put_new_folio, private);
+		dst = dst2;
+		dst2 = list_next_entry(dst, lru);
+	}
+}
+
 /*
  * migrate_pages_batch() first unmaps folios in the from list as many as
  * possible, then move the unmapped folios.
@@ -1633,7 +1708,7 @@ static int migrate_pages_batch(struct list_head *from,
 	int pass = 0;
 	bool is_thp = false;
 	bool is_large = false;
-	struct folio *folio, *folio2, *dst = NULL, *dst2;
+	struct folio *folio, *folio2, *dst = NULL;
 	int rc, rc_saved = 0, nr_pages;
 	LIST_HEAD(unmap_folios);
 	LIST_HEAD(dst_folios);
@@ -1769,42 +1844,11 @@ static int migrate_pages_batch(struct list_head *from,
 		thp_retry = 0;
 		nr_retry_pages = 0;
 
-		dst = list_first_entry(&dst_folios, struct folio, lru);
-		dst2 = list_next_entry(dst, lru);
-		list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
-			is_thp = folio_test_large(folio) && folio_test_pmd_mappable(folio);
-			nr_pages = folio_nr_pages(folio);
-
-			cond_resched();
-
-			rc = migrate_folio_move(put_new_folio, private,
-					folio, dst, mode,
-					reason, ret_folios);
-			/*
-			 * The rules are:
-			 *	Success: folio will be freed
-			 *	-EAGAIN: stay on the unmap_folios list
-			 *	Other errno: put on ret_folios list
-			 */
-			switch(rc) {
-			case -EAGAIN:
-				retry++;
-				thp_retry += is_thp;
-				nr_retry_pages += nr_pages;
-				break;
-			case MIGRATEPAGE_SUCCESS:
-				stats->nr_succeeded += nr_pages;
-				stats->nr_thp_succeeded += is_thp;
-				break;
-			default:
-				nr_failed++;
-				stats->nr_thp_failed += is_thp;
-				stats->nr_failed_pages += nr_pages;
-				break;
-			}
-			dst = dst2;
-			dst2 = list_next_entry(dst, lru);
-		}
+		/* Move the unmapped folios */
+		migrate_folios_move(&unmap_folios, &dst_folios,
+				put_new_folio, private, mode, reason,
+				ret_folios, stats, &retry, &thp_retry,
+				&nr_failed, &nr_retry_pages);
 	}
 	nr_failed += retry;
 	stats->nr_thp_failed += thp_retry;
@@ -1813,20 +1857,8 @@ static int migrate_pages_batch(struct list_head *from,
 	rc = rc_saved ?
: nr_failed;
 out:
 	/* Cleanup remaining folios */
-	dst = list_first_entry(&dst_folios, struct folio, lru);
-	dst2 = list_next_entry(dst, lru);
-	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
-		int old_page_state = 0;
-		struct anon_vma *anon_vma = NULL;
-
-		__migrate_folio_extract(dst, &old_page_state, &anon_vma);
-		migrate_folio_undo_src(folio, old_page_state & PAGE_WAS_MAPPED,
-				anon_vma, true, ret_folios);
-		list_del(&dst->lru);
-		migrate_folio_undo_dst(dst, true, put_new_folio, private);
-		dst = dst2;
-		dst2 = list_next_entry(dst, lru);
-	}
+	migrate_folios_undo(&unmap_folios, &dst_folios,
+			put_new_folio, private, ret_folios);
 
 	return rc;
 }

From patchwork Thu Jan 11 06:07:55 2024
From: Byungchul Park
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com, namit@vmware.com, xhao@linux.alibaba.com, mgorman@techsingularity.net, hughd@google.com, willy@infradead.org, david@redhat.com, peterz@infradead.org, luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Subject: [v5 5/7] mm: Add APIs to free a folio directly to the buddy bypassing pcp
Date: Thu, 11 Jan 2024 15:07:55 +0900
Message-Id: <20240111060757.13563-6-byungchul@sk.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20240111060757.13563-1-byungchul@sk.com>
References: <20240111060757.13563-1-byungchul@sk.com>

This is a preparation for the migrc mechanism, which frees folios at a
better time later on, rather than at the moment of migrating them. The
folios freed by migrc are too cold to keep in the pcp.

Signed-off-by: Byungchul Park
---
 include/linux/mm.h | 23 +++++++++++++++++++++++
 mm/internal.h      |  1 +
 mm/page_alloc.c    | 10 ++++++++++
 mm/swap.c          |  7 +++++++
 4 files changed, 41 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index da5219b48d52..fc0581cce3a7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1284,6 +1284,7 @@ static inline struct folio *virt_to_folio(const void *x)
 }
 
 void __folio_put(struct folio *folio);
+void __folio_put_small_nopcp(struct folio *folio);
 
 void put_pages_list(struct list_head *pages);
 
@@ -1483,6 +1484,28 @@ static inline void folio_put(struct folio *folio)
 		__folio_put(folio);
 }
 
+/**
+ * folio_put_small_nopcp - Decrement the reference count on a folio.
+ * @folio: The folio.
+ *
+ * This is only for a single page folio to release directly to the buddy
+ * allocator bypassing pcp.
+ *
+ * If the folio's reference count reaches zero, the memory will be
+ * released back to the page allocator and may be used by another
+ * allocation immediately. Do not access the memory or the struct folio
+ * after calling folio_put_small_nopcp() unless you can be sure that it
+ * wasn't the last reference.
+ *
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context. May be called while holding a spinlock.
+ */
+static inline void folio_put_small_nopcp(struct folio *folio)
+{
+	if (folio_put_testzero(folio))
+		__folio_put_small_nopcp(folio);
+}
+
 /**
  * folio_put_refs - Reduce the reference count on a folio.
  * @folio: The folio.
diff --git a/mm/internal.h b/mm/internal.h
index b880f1e78700..3be8fd5604e8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -451,6 +451,7 @@ extern int user_min_free_kbytes;
 
 extern void free_unref_page(struct page *page, unsigned int order);
 extern void free_unref_page_list(struct list_head *list);
+extern void free_pages_nopcp(struct page *page, unsigned int order);
 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 733732e7e0ba..21b8c8cd1673 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -565,6 +565,16 @@ static inline void free_the_page(struct page *page, unsigned int order)
 		__free_pages_ok(page, order, FPI_NONE);
 }
 
+void free_pages_nopcp(struct page *page, unsigned int order)
+{
+	/*
+	 * This function will be used in case that the pages are too
+	 * cold to keep in pcp e.g. migrc mechanism. So it'd better
+	 * release the pages to the tail.
+	 */
+	__free_pages_ok(page, order, FPI_TO_TAIL);
+}
+
 /*
  * Higher-order pages are called "compound pages". They are structured thusly:
  *
diff --git a/mm/swap.c b/mm/swap.c
index cd8f0150ba3a..3f37496a1184 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -106,6 +106,13 @@ static void __folio_put_small(struct folio *folio)
 		free_unref_page(&folio->page, 0);
 }
 
+void __folio_put_small_nopcp(struct folio *folio)
+{
+	__page_cache_release(folio);
+	mem_cgroup_uncharge(folio);
+	free_pages_nopcp(&folio->page, 0);
+}
+
 static void __folio_put_large(struct folio *folio)
 {
 	/*

From patchwork Thu Jan 11 06:07:56 2024
From: Byungchul Park
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com, namit@vmware.com, xhao@linux.alibaba.com, mgorman@techsingularity.net, hughd@google.com, willy@infradead.org, david@redhat.com, peterz@infradead.org, luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Subject: [v5 6/7] mm: Defer TLB flush by keeping both src and dst folios at migration
Date: Thu, 11 Jan 2024 15:07:56 +0900
Message-Id: <20240111060757.13563-7-byungchul@sk.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20240111060757.13563-1-byungchul@sk.com>
References: <20240111060757.13563-1-byungchul@sk.com>

Implement the migrc mechanism, which stands for 'Migration Read Copy'.
While working with tiered memory, e.g. CXL memory, we always face
migration overhead at promotion or demotion, and TLB shootdown turns
out to be a large part of it that is worth getting rid of if possible.

Fortunately, the TLB flush can be deferred if both the source and
destination folios of a migration are kept until all the required TLB
flushes have been done, but only as long as the target PTE entries are
read-only, more precisely, do not have write permission; otherwise the
folio might get corrupted.

To achieve that:

1. For folios that map only to non-writable TLB entries, prevent the
   TLB flush at migration by keeping both the source and destination
   folios, which will be handled later at a better time.

2. When any non-writable TLB entry changes to writable, e.g. through
   the fault handler, give up the migrc mechanism and perform the
   required TLB flush right away.

The following estimation using XSBench shows the improvement:

1. itlb flushes were reduced by 93.9%.
2. dtlb thread flushes were reduced by 43.5%.
3.
stlb flushes were reduced by 24.9%.
4. dtlb store misses were reduced by 34.2%.
5. itlb load misses were reduced by 45.5%.
6. The runtime was reduced by 3.5%.

---

The measurement result:

   Architecture - x86_64
   QEMU - kvm enabled, host cpu
   Numa - 2 nodes (16 CPUs 1GB, no CPUs 8GB)
   Linux Kernel - v6.7, numa balancing tiering on, demotion enabled
   Benchmark - XSBench -p 100000000 (-p option makes the runtime longer)

   run 'perf stat' using events:
   1) itlb.itlb_flush
   2) tlb_flush.dtlb_thread
   3) tlb_flush.stlb_any
   4) dTLB-load-misses
   5) dTLB-store-misses
   6) iTLB-load-misses

   run 'cat /proc/vmstat' and pick:
   1) numa_pages_migrated
   2) pgmigrate_success
   3) nr_tlb_remote_flush
   4) nr_tlb_remote_flush_received
   5) nr_tlb_local_flush_all
   6) nr_tlb_local_flush_one

BEFORE - mainline v6.7
------------------------------------------
$ perf stat -a \
	-e itlb.itlb_flush \
	-e tlb_flush.dtlb_thread \
	-e tlb_flush.stlb_any \
	-e dTLB-load-misses \
	-e dTLB-store-misses \
	-e iTLB-load-misses \
	./XSBench -p 100000000

Performance counter stats for 'system wide':

	85647229	itlb.itlb_flush
	480981504	tlb_flush.dtlb_thread
	323937200	tlb_flush.stlb_any
	238381632579	dTLB-load-misses
	601514255	dTLB-store-misses
	2974157461	iTLB-load-misses

	2252.883892112 seconds time elapsed

$ cat /proc/vmstat
...
numa_pages_migrated 12790664
pgmigrate_success 26835314
nr_tlb_remote_flush 3031412
nr_tlb_remote_flush_received 45234862
nr_tlb_local_flush_all 216584
nr_tlb_local_flush_one 740940
...

AFTER - mainline v6.7 + migrc
------------------------------------------
$ perf stat -a \
	-e itlb.itlb_flush \
	-e tlb_flush.dtlb_thread \
	-e tlb_flush.stlb_any \
	-e dTLB-load-misses \
	-e dTLB-store-misses \
	-e iTLB-load-misses \
	./XSBench -p 100000000

Performance counter stats for 'system wide':

	5240261		itlb.itlb_flush
	271581774	tlb_flush.dtlb_thread
	243149389	tlb_flush.stlb_any
	234502983364	dTLB-load-misses
	395673680	dTLB-store-misses
	1620215163	iTLB-load-misses

	2172.283436287 seconds time elapsed

$ cat /proc/vmstat
...
numa_pages_migrated 14897064
pgmigrate_success 30825530
nr_tlb_remote_flush 198290
nr_tlb_remote_flush_received 2820156
nr_tlb_local_flush_all 92048
nr_tlb_local_flush_one 741401
...

Signed-off-by: Byungchul Park
---
 arch/x86/mm/tlb.c      |   7 ++
 include/linux/mmzone.h |   5 +
 include/linux/sched.h  |   6 +
 mm/internal.h          |  59 ++++++++++
 mm/memory.c            |   8 ++
 mm/migrate.c           | 243 +++++++++++++++++++++++++++++++++++++++--
 mm/page_alloc.c        |  11 +-
 mm/rmap.c              |  10 +-
 8 files changed, 337 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 453ea95b667d..daaf8e9580f5 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1246,6 +1246,9 @@ void __flush_tlb_all(void)
 }
 EXPORT_SYMBOL_GPL(__flush_tlb_all);
 
+extern void migrc_flush_start(void);
+extern void migrc_flush_end(struct arch_tlbflush_unmap_batch *arch);
+
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
 	struct flush_tlb_info *info;
@@ -1254,6 +1257,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 
 	info = get_flush_tlb_info(NULL, 0, TLB_FLUSH_ALL, 0, false,
 				  TLB_GENERATION_INVALID);
+
+	migrc_flush_start();
+
 	/*
 	 * flush_tlb_multi() is not optimized for the common case in which only
 	 * a local TLB flush is needed.
	 * Optimize this use-case by calling
@@ -1268,6 +1274,7 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		local_irq_enable();
 	}
 
+	migrc_flush_end(batch);
 	cpumask_clear(&batch->cpumask);
 
 	put_flush_tlb_info();
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9db36e197712..5df11a1166f9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1002,6 +1002,11 @@ struct zone {
 	/* Zone statistics */
 	atomic_long_t		vm_stat[NR_VM_ZONE_STAT_ITEMS];
 	atomic_long_t		vm_numa_event[NR_VM_NUMA_EVENT_ITEMS];
+
+	/*
+	 * the number of folios pending for TLB flush in the zone
+	 */
+	atomic_t		migrc_pending_nr;
 } ____cacheline_internodealigned_in_smp;
 
 enum pgdat_flags {
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0317e7a65151..0cfb7486ecdd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1330,6 +1330,12 @@ struct task_struct {
 	struct tlbflush_unmap_batch	tlb_ubc;
 	struct tlbflush_unmap_batch	tlb_ubc_ro;
 
+	/*
+	 * whether all the mappings of a folio during unmap are read-only
+	 * so that migrc can work on the folio
+	 */
+	bool				can_migrc;
+
 	/* Cache last used pipe for splice(): */
 	struct pipe_inode_info		*splice_pipe;
 
diff --git a/mm/internal.h b/mm/internal.h
index 3be8fd5604e8..dc72a04d33a8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -925,6 +925,13 @@ void try_to_unmap_flush(void);
 void try_to_unmap_flush_dirty(void);
 void flush_tlb_batched_pending(struct mm_struct *mm);
 void fold_ubc(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src);
+
+static inline void init_tlb_ubc(struct tlbflush_unmap_batch *ubc)
+{
+	arch_tlbbatch_clear(&ubc->arch);
+	ubc->flush_required = false;
+	ubc->writable = false;
+}
 #else
 static inline void try_to_unmap_flush(void)
 {
@@ -938,6 +945,9 @@ static inline void flush_tlb_batched_pending(struct mm_struct *mm)
 static inline void fold_ubc(struct tlbflush_unmap_batch *dst, struct tlbflush_unmap_batch *src)
 {
 }
+static inline void init_tlb_ubc(struct tlbflush_unmap_batch *ubc)
+{
+}
 #endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
 
 extern const struct trace_print_flags pageflag_names[];
@@ -1284,4 +1294,53 @@ static inline void shrinker_debugfs_remove(struct dentry *debugfs_entry,
 }
 #endif /* CONFIG_SHRINKER_DEBUG */
 
+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+/*
+ * Reset the indicator indicating there are no writable mappings at the
+ * beginning of every rmap traverse for unmap. Migrc can work only when
+ * all the mappings are read-only.
+ */
+static inline void can_migrc_init(void)
+{
+	current->can_migrc = true;
+}
+
+/*
+ * Mark the folio is not applicable to migrc, once it found a writable
+ * or dirty pte during rmap traverse for unmap.
+ */
+static inline void can_migrc_fail(void)
+{
+	current->can_migrc = false;
+}
+
+/*
+ * Check if all the mappings are read-only and read-only mappings even
+ * exist.
+ */
+static inline bool can_migrc_test(void)
+{
+	return current->can_migrc && current->tlb_ubc_ro.flush_required;
+}
+
+/*
+ * Return the number of folios pending TLB flush that have yet to get
+ * freed in the zone.
+ */
+static inline int migrc_pending_nr_in_zone(struct zone *z)
+{
+	return atomic_read(&z->migrc_pending_nr);
+}
+
+/*
+ * Perform TLB flush needed and free the folios under migrc's control.
+ */
+bool migrc_flush_free_folios(void);
+#else /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
+static inline void can_migrc_init(void) {}
+static inline void can_migrc_fail(void) {}
+static inline bool can_migrc_test(void) { return false; }
+static inline int migrc_pending_nr_in_zone(struct zone *z) { return 0; }
+static inline bool migrc_flush_free_folios(void) { return false; }
+#endif
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/memory.c b/mm/memory.c
index 6e0712d06cd4..e67de161da8b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3462,6 +3462,14 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 	if (vmf->page)
 		folio = page_folio(vmf->page);
 
+	/*
+	 * The folio may or may not be one that is under migrc's control
+	 * and about to change its permission from read-only to writable.
+	 * Conservatively give up deferring TLB flush just in case.
+	 */
+	if (folio)
+		migrc_flush_free_folios();
+
 	/*
 	 * Shared mapping: we are guaranteed to have VM_WRITE and
 	 * FAULT_FLAG_WRITE set at this point.
diff --git a/mm/migrate.c b/mm/migrate.c
index bbe1ecef4956..181bfe260442 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -57,6 +57,162 @@
 
 #include "internal.h"
 
+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+static struct tlbflush_unmap_batch migrc_ubc;
+static LIST_HEAD(migrc_folios);
+static DEFINE_SPINLOCK(migrc_lock);
+
+/*
+ * Need to synchronize between TLB flush and managing pending CPUs in
+ * migrc_ubc. Take a look at the following scenario:
+ *
+ *	CPU0				CPU1
+ *	----				----
+ *	TLB flush
+ *					Unmap folios (needing TLB flush)
+ *					Add pending CPUs to migrc_ubc
+ *	Clear the CPUs from migrc_ubc
+ *
+ * The pending CPUs added in CPU1 should not be cleared from migrc_ubc
+ * in CPU0 because the TLB flush for migrc_ubc added in CPU1 has not
+ * been performed this turn. To avoid this, using 'migrc_flushing'
+ * variable, prevent adding pending CPUs to migrc_ubc and give up migrc
+ * mechanism if others are in the middle of TLB flush, like:
+ *
+ *	CPU0				CPU1
+ *	----				----
+ *	migrc_flushing++
+ *	TLB flush
+ *					Unmap folios (needing TLB flush)
+ *					If migrc_flushing == 0:
+ *						Add pending CPUs to migrc_ubc
+ *					Else: <--- hit
+ *						Give up migrc mechanism
+ *	Clear the CPUs from migrc_ubc
+ *	migrc_flushing--
+ *
+ * Only the following case would be allowed for migrc mechanism to work:
+ *
+ *	CPU0				CPU1
+ *	----				----
+ *					Unmap folios (needing TLB flush)
+ *					If migrc_flushing == 0: <--- hit
+ *						Add pending CPUs to migrc_ubc
+ *					Else:
+ *						Give up migrc mechanism
+ *	migrc_flushing++
+ *	TLB flush
+ *	Clear the CPUs from migrc_ubc
+ *	migrc_flushing--
+ */
+static int migrc_flushing;
+
+static bool migrc_add_pending_ubc(struct tlbflush_unmap_batch *ubc)
+{
+	struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
+	unsigned long flags;
+
+	spin_lock_irqsave(&migrc_lock, flags);
+	if (migrc_flushing) {
+		spin_unlock_irqrestore(&migrc_lock, flags);
+
+		/*
+		 * Give up migrc mechanism. Just let TLB flush needed
+		 * handled by try_to_unmap_flush() at the caller side.
+		 */
+		fold_ubc(tlb_ubc, ubc);
+		return false;
+	}
+	fold_ubc(&migrc_ubc, ubc);
+	spin_unlock_irqrestore(&migrc_lock, flags);
+	return true;
+}
+
+static bool migrc_add_pending_folios(struct list_head *folios)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&migrc_lock, flags);
+	if (migrc_flushing) {
+		spin_unlock_irqrestore(&migrc_lock, flags);
+
+		/*
+		 * Give up migrc mechanism. The caller should perform
+		 * TLB flush needed using migrc_flush_free_folios() and
+		 * undo some on the folios e.g. restore folios'
+		 * reference count increased by migrc and more.
+		 */
+		return false;
+	}
+	list_splice(folios, &migrc_folios);
+	spin_unlock_irqrestore(&migrc_lock, flags);
+	return true;
+}
+
+void migrc_flush_start(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&migrc_lock, flags);
+	migrc_flushing++;
+	spin_unlock_irqrestore(&migrc_lock, flags);
+}
+
+void migrc_flush_end(struct arch_tlbflush_unmap_batch *arch)
+{
+	LIST_HEAD(folios);
+	struct folio *f, *f2;
+	unsigned long flags;
+
+	spin_lock_irqsave(&migrc_lock, flags);
+	if (!arch_tlbbatch_done(&migrc_ubc.arch, arch)) {
+		list_splice_init(&migrc_folios, &folios);
+		migrc_ubc.flush_required = false;
+		migrc_ubc.writable = false;
+	}
+	migrc_flushing--;
+	spin_unlock_irqrestore(&migrc_lock, flags);
+
+	list_for_each_entry_safe(f, f2, &folios, lru) {
+		folio_put_small_nopcp(f);
+		atomic_dec(&folio_zone(f)->migrc_pending_nr);
+	}
+}
+
+bool migrc_flush_free_folios(void)
+{
+	struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
+	LIST_HEAD(folios);
+	struct folio *f, *f2;
+	unsigned long flags;
+	bool ret = true;
+
+	spin_lock_irqsave(&migrc_lock, flags);
+	list_splice_init(&migrc_folios, &folios);
+	fold_ubc(tlb_ubc, &migrc_ubc);
+	spin_unlock_irqrestore(&migrc_lock, flags);
+
+	if (list_empty(&folios))
+		ret = false;
+
+	try_to_unmap_flush();
+	list_for_each_entry_safe(f, f2, &folios, lru) {
+		folio_put_small_nopcp(f);
+		atomic_dec(&folio_zone(f)->migrc_pending_nr);
+	}
+	return ret;
+}
+#else /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
+static bool migrc_add_pending_ubc(struct tlbflush_unmap_batch *ubc)
+{
+	return false;
+}
+static bool migrc_add_pending_folios(struct list_head *folios)
+{
+	return false;
+}
+#endif
+
 bool isolate_movable_page(struct page *page, isolate_mode_t mode)
 {
 	struct folio *folio = folio_get_nontail_page(page);
@@ -1274,7 +1430,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
 static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 			      struct folio *src, struct folio *dst,
 			      enum migrate_mode mode, enum
migrate_reason reason, - struct list_head *ret) + struct list_head *ret, struct list_head *move_succ) { int rc; int old_page_state = 0; @@ -1321,9 +1477,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private, /* * A folio that has been migrated has all references removed - * and will be freed. + * and will be freed, unless it's under migrc's control. */ - list_del(&src->lru); + if (move_succ) { + folio_get(src); + atomic_inc(&folio_zone(src)->migrc_pending_nr); + list_move_tail(&src->lru, move_succ); + } else + list_del(&src->lru); + /* Drop an anon_vma reference if we took one */ if (anon_vma) put_anon_vma(anon_vma); @@ -1618,7 +1780,7 @@ static void migrate_folios_move(struct list_head *src_folios, struct list_head *ret_folios, struct migrate_pages_stats *stats, int *retry, int *thp_retry, int *nr_failed, - int *nr_retry_pages) + int *nr_retry_pages, struct list_head *move_succ) { struct folio *folio, *folio2, *dst, *dst2; bool is_thp; @@ -1635,7 +1797,7 @@ static void migrate_folios_move(struct list_head *src_folios, rc = migrate_folio_move(put_new_folio, private, folio, dst, mode, - reason, ret_folios); + reason, ret_folios, move_succ); /* * The rules are: * Success: folio will be freed @@ -1712,17 +1874,33 @@ static int migrate_pages_batch(struct list_head *from, int rc, rc_saved = 0, nr_pages; LIST_HEAD(unmap_folios); LIST_HEAD(dst_folios); + LIST_HEAD(unmap_folios_migrc); + LIST_HEAD(dst_folios_migrc); + LIST_HEAD(move_succ); bool nosplit = (reason == MR_NUMA_MISPLACED); + struct tlbflush_unmap_batch pending_ubc; + struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; + struct tlbflush_unmap_batch *tlb_ubc_ro = ¤t->tlb_ubc_ro; + bool do_migrc; + bool migrc_ubc_succ; VM_WARN_ON_ONCE(mode != MIGRATE_ASYNC && !list_empty(from) && !list_is_singular(from)); + /* + * Apply migrc only to numa migration for now. 
+ */ + init_tlb_ubc(&pending_ubc); + do_migrc = (reason == MR_DEMOTION || reason == MR_NUMA_MISPLACED); + for (pass = 0; pass < nr_pass && retry; pass++) { retry = 0; thp_retry = 0; nr_retry_pages = 0; list_for_each_entry_safe(folio, folio2, from, lru) { + bool can_migrc; + is_large = folio_test_large(folio); is_thp = is_large && folio_test_pmd_mappable(folio); nr_pages = folio_nr_pages(folio); @@ -1752,9 +1930,12 @@ static int migrate_pages_batch(struct list_head *from, continue; } + can_migrc_init(); rc = migrate_folio_unmap(get_new_folio, put_new_folio, private, folio, &dst, mode, reason, ret_folios); + can_migrc = do_migrc && can_migrc_test() && !is_large; + /* * The rules are: * Success: folio will be freed @@ -1800,7 +1981,8 @@ static int migrate_pages_batch(struct list_head *from, /* nr_failed isn't updated for not used */ stats->nr_thp_failed += thp_retry; rc_saved = rc; - if (list_empty(&unmap_folios)) + if (list_empty(&unmap_folios) && + list_empty(&unmap_folios_migrc)) goto out; else goto move; @@ -1814,8 +1996,19 @@ static int migrate_pages_batch(struct list_head *from, stats->nr_thp_succeeded += is_thp; break; case MIGRATEPAGE_UNMAP: - list_move_tail(&folio->lru, &unmap_folios); - list_add_tail(&dst->lru, &dst_folios); + if (can_migrc) { + list_move_tail(&folio->lru, &unmap_folios_migrc); + list_add_tail(&dst->lru, &dst_folios_migrc); + + /* + * Gather ro batch data to add + * to migrc_ubc after unmap. + */ + fold_ubc(&pending_ubc, tlb_ubc_ro); + } else { + list_move_tail(&folio->lru, &unmap_folios); + list_add_tail(&dst->lru, &dst_folios); + } break; default: /* @@ -1829,12 +2022,19 @@ static int migrate_pages_batch(struct list_head *from, stats->nr_failed_pages += nr_pages; break; } + /* + * Done with the current folio. Fold the ro + * batch data gathered, to the normal batch. 
+ */ + fold_ubc(tlb_ubc, tlb_ubc_ro); } } nr_failed += retry; stats->nr_thp_failed += thp_retry; stats->nr_failed_pages += nr_retry_pages; move: + /* Should be before try_to_unmap_flush() */ + migrc_ubc_succ = do_migrc && migrc_add_pending_ubc(&pending_ubc); /* Flush TLBs for all unmapped folios */ try_to_unmap_flush(); @@ -1848,7 +2048,30 @@ static int migrate_pages_batch(struct list_head *from, migrate_folios_move(&unmap_folios, &dst_folios, put_new_folio, private, mode, reason, ret_folios, stats, &retry, &thp_retry, - &nr_failed, &nr_retry_pages); + &nr_failed, &nr_retry_pages, NULL); + migrate_folios_move(&unmap_folios_migrc, &dst_folios_migrc, + put_new_folio, private, mode, reason, + ret_folios, stats, &retry, &thp_retry, + &nr_failed, &nr_retry_pages, migrc_ubc_succ ? + &move_succ : NULL); + } + + /* + * In case that migrc_add_pending_ubc() has been added + * successfully but migrc_add_pending_folios() does not. + */ + if (migrc_ubc_succ && !migrc_add_pending_folios(&move_succ)) { + migrc_flush_free_folios(); + + /* + * Undo src folios that have been successfully added to + * move_succ. 
+ */ + list_for_each_entry_safe(folio, folio2, &move_succ, lru) { + list_del(&folio->lru); + folio_put(folio); + atomic_dec(&folio_zone(folio)->migrc_pending_nr); + } } nr_failed += retry; stats->nr_thp_failed += thp_retry; @@ -1859,6 +2082,8 @@ static int migrate_pages_batch(struct list_head *from, /* Cleanup remaining folios */ migrate_folios_undo(&unmap_folios, &dst_folios, put_new_folio, private, ret_folios); + migrate_folios_undo(&unmap_folios_migrc, &dst_folios_migrc, + put_new_folio, private, ret_folios); return rc; } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 21b8c8cd1673..6ef0c22b1109 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2972,6 +2972,8 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark, long min = mark; int o; + free_pages += migrc_pending_nr_in_zone(z); + /* free_pages may go negative - that's OK */ free_pages -= __zone_watermark_unusable_free(z, order, alloc_flags); @@ -3066,7 +3068,7 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order, long usable_free; long reserved; - usable_free = free_pages; + usable_free = free_pages + migrc_pending_nr_in_zone(z); reserved = __zone_watermark_unusable_free(z, 0, alloc_flags); /* reserved may over estimate high-atomic reserves. 
*/ @@ -3273,6 +3275,13 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags, gfp_mask)) { int ret; + if (migrc_pending_nr_in_zone(zone) && + migrc_flush_free_folios() && + zone_watermark_fast(zone, order, mark, + ac->highest_zoneidx, + alloc_flags, gfp_mask)) + goto try_this_zone; + if (has_unaccepted_memory()) { if (try_to_accept_memory(zone, order)) goto try_this_zone; diff --git a/mm/rmap.c b/mm/rmap.c index da36f23ff7b0..79e1827dec89 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -676,9 +676,15 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval, if (!pte_accessible(mm, pteval)) return; - if (pte_write(pteval) || writable) + if (pte_write(pteval) || writable) { tlb_ubc = ¤t->tlb_ubc; - else + + /* + * Migrc cannot work with the folio, once it found a + * writable or dirty mapping on it. + */ + can_migrc_fail(); + } else tlb_ubc = ¤t->tlb_ubc_ro; arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr); From patchwork Thu Jan 11 06:07:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Byungchul Park X-Patchwork-Id: 187182 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:2411:b0:101:2151:f287 with SMTP id m17csp1261085dyi; Wed, 10 Jan 2024 22:25:00 -0800 (PST) X-Google-Smtp-Source: AGHT+IGMNeZ9uI9nyOS5lG29Y2cpr/TpwY33fXsEjesdwKrD6lIPJheyBAmJH5GESII4roHw4JDF X-Received: by 2002:a2e:7013:0:b0:2cc:e386:3772 with SMTP id l19-20020a2e7013000000b002cce3863772mr81900ljc.29.1704954300017; Wed, 10 Jan 2024 22:25:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1704954299; cv=none; d=google.com; s=arc-20160816; b=O2I2M3cndZYjyxAPir2R2H1zHlUIyBiNKymZni6aJXkTLxVH9VsWYEvlIgSXKLZZ0Z wslgXxiB8HZ3s5V9S5gh5EXUB/3F5nHkhPlYPhciReh8C0a6DGv20er4L66fy28RHv98 /gFlNKJy/9PG5pCuGkjdsZOnUoApRaCDRfsBDRrGhc+WUYkkmz5PEcM2sbUXfFbLnGGR fXWhST3RbV5HSutTBQCzNCjDIIJkCgRh487i1g6+nWYU4Fz1sv2paet8m+7Lcj7RQNX2 
From: Byungchul Park
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: kernel_team@skhynix.com, akpm@linux-foundation.org, ying.huang@intel.com,
    namit@vmware.com, xhao@linux.alibaba.com, mgorman@techsingularity.net,
    hughd@google.com, willy@infradead.org, david@redhat.com,
    peterz@infradead.org, luto@kernel.org, tglx@linutronix.de,
    mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Subject: [v5 7/7] mm: Pause migrc mechanism at high memory pressure
Date: Thu, 11 Jan 2024 15:07:57 +0900
Message-Id: <20240111060757.13563-8-byungchul@sk.com>
In-Reply-To: <20240111060757.13563-1-byungchul@sk.com>
References: <20240111060757.13563-1-byungchul@sk.com>
A regression was observed when the system is under high memory pressure
with swap on, where migrc might keep a number of folios in its pending
queue, which possibly makes things worse. So temporarily prevent migrc
from working in that condition.

Signed-off-by: Byungchul Park
---
 mm/internal.h   | 20 ++++++++++++++++++++
 mm/migrate.c    | 16 ++++++++++++++++
 mm/page_alloc.c | 13 +++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/mm/internal.h b/mm/internal.h
index dc72a04d33a8..cade8219b417 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1295,6 +1295,8 @@ static inline void shrinker_debugfs_remove(struct dentry *debugfs_entry,
 #endif /* CONFIG_SHRINKER_DEBUG */

 #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+extern atomic_t migrc_pause_cnt;
+
 /*
  * Reset the indicator indicating there are no writable mappings at the
  * beginning of every rmap traverse for unmap. Migrc can work only when
@@ -1323,6 +1325,21 @@ static inline bool can_migrc_test(void)
 	return current->can_migrc && current->tlb_ubc_ro.flush_required;
 }

+static inline void migrc_pause(void)
+{
+	atomic_inc(&migrc_pause_cnt);
+}
+
+static inline void migrc_resume(void)
+{
+	atomic_dec(&migrc_pause_cnt);
+}
+
+static inline bool migrc_paused(void)
+{
+	return !!atomic_read(&migrc_pause_cnt);
+}
+
 /*
  * Return the number of folios pending TLB flush that have yet to get
  * freed in the zone.
@@ -1340,6 +1357,9 @@ bool migrc_flush_free_folios(void);
 static inline void can_migrc_init(void) {}
 static inline void can_migrc_fail(void) {}
 static inline bool can_migrc_test(void) { return false; }
+static inline void migrc_pause(void) {}
+static inline void migrc_resume(void) {}
+static inline bool migrc_paused(void) { return false; }
 static inline int migrc_pending_nr_in_zone(struct zone *z) { return 0; }
 static inline bool migrc_flush_free_folios(void) { return false; }
 #endif
diff --git a/mm/migrate.c b/mm/migrate.c
index 181bfe260442..d072591c6ce6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -62,6 +62,12 @@ static struct tlbflush_unmap_batch migrc_ubc;
 static LIST_HEAD(migrc_folios);
 static DEFINE_SPINLOCK(migrc_lock);

+/*
+ * Increased on entry to handling high memory pressure, e.g. direct
+ * reclaim, and decreased on exit. See __alloc_pages_slowpath().
+ */
+atomic_t migrc_pause_cnt = ATOMIC_INIT(0);
+
 /*
  * Need to synchronize between TLB flush and managing pending CPUs in
  * migrc_ubc. Take a look at the following scenario:
@@ -1892,6 +1898,7 @@ static int migrate_pages_batch(struct list_head *from,
 	 */
 	init_tlb_ubc(&pending_ubc);
 	do_migrc = (reason == MR_DEMOTION || reason == MR_NUMA_MISPLACED);
+	do_migrc = do_migrc && !migrc_paused();

 	for (pass = 0; pass < nr_pass && retry; pass++) {
 		retry = 0;
@@ -1930,6 +1937,15 @@ static int migrate_pages_batch(struct list_head *from,
				continue;
			}

+			/*
+			 * In case the system is under high memory
+			 * pressure, give up the migrc mechanism this turn.
+			 */
+			if (unlikely(do_migrc && migrc_paused())) {
+				fold_ubc(tlb_ubc, &pending_ubc);
+				do_migrc = false;
+			}
+
 			can_migrc_init();
 			rc = migrate_folio_unmap(get_new_folio, put_new_folio,
					private, folio, &dst, mode, reason,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6ef0c22b1109..366777afce7f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4072,6 +4072,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	unsigned int cpuset_mems_cookie;
 	unsigned int zonelist_iter_cookie;
 	int reserve_flags;
+	bool migrc_paused = false;

 restart:
 	compaction_retries = 0;
@@ -4203,6 +4204,16 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;

+	/*
+	 * The system is under very high memory pressure. Pause migrc
+	 * from expanding its pending queue temporarily.
+	 */
+	if (!migrc_paused) {
+		migrc_pause();
+		migrc_paused = true;
+		migrc_flush_free_folios();
+	}
+
 	/* Caller is not willing to reclaim, we can't balance anything */
 	if (!can_direct_reclaim)
 		goto nopage;
@@ -4330,6 +4341,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	warn_alloc(gfp_mask, ac->nodemask,
			"page allocation failure: order:%u", order);
 got_pg:
+	if (migrc_paused)
+		migrc_resume();
 	return page;
 }