Message ID | 20231119194740.94101-5-ryncsn@gmail.com |
---|---|
State | New |
Headers |
From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>, "Huang, Ying" <ying.huang@intel.com>, David Hildenbrand <david@redhat.com>, Hugh Dickins <hughd@google.com>, Johannes Weiner <hannes@cmpxchg.org>, Matthew Wilcox <willy@infradead.org>, Michal Hocko <mhocko@suse.com>, linux-kernel@vger.kernel.org, Kairui Song <kasong@tencent.com>
Subject: [PATCH 04/24] mm/swap: avoid setting page lock bit and doing extra unlock check
Date: Mon, 20 Nov 2023 03:47:20 +0800
Message-ID: <20231119194740.94101-5-ryncsn@gmail.com>
In-Reply-To: <20231119194740.94101-1-ryncsn@gmail.com>
References: <20231119194740.94101-1-ryncsn@gmail.com>
Reply-To: Kairui Song <kasong@tencent.com> |
Series | Swapin path refactor for optimization and bugfix |
Commit Message
Kairui Song
Nov. 19, 2023, 7:47 p.m. UTC
From: Kairui Song <kasong@tencent.com>

When swapping in a page, mem_cgroup_swapin_charge_folio is called for the
newly allocated folio. Nothing else is referencing the folio at that point,
so there is no need to set the lock bit first; this avoids the unlock check
on the error path.

Signed-off-by: Kairui Song <kasong@tencent.com>
---
 mm/swap_state.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)
Comments
On Sun, Nov 19, 2023 at 11:48 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> From: Kairui Song <kasong@tencent.com>
>
> When swapping in a page, mem_cgroup_swapin_charge_folio is called for new
> allocated folio, nothing else is referencing the folio so no need to set
> the lock bit. This avoided doing unlock check on error path.
>
> Signed-off-by: Kairui Song <kasong@tencent.com>
> ---
>  mm/swap_state.c | 20 +++++++++-----------
>  1 file changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index ac4fa404eaa7..45dd8b7c195d 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -458,6 +458,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,

You move the mem_cgroup_swapin_charge_folio() inside the for loop:

	for (;;) {
		int err;
		/*
		 * First check the swap cache. Since this is normally
		 * called after swap_cache_get_folio() failed, re-calling
		 * that would confuse statistics.
		 */
		folio = filemap_get_folio(swap_address_space(entry),
					  swp_offset(entry));

>                                                 mpol, ilx, numa_node_id());
>         if (!folio)
>                 goto fail_put_swap;
> +       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> +               goto fail_put_folio;

Wouldn't it cause repeat charging of the folio when it is racing
against others in the for loop?

>
>         /*
>          * Swap entry may have been freed since our caller observed it.
> @@ -483,13 +485,9 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>         /*
>          * The swap entry is ours to swap in. Prepare the new page.
>          */
> -
>         __folio_set_locked(folio);
>         __folio_set_swapbacked(folio);
>
> -       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> -               goto fail_unlock;
> -

The original code makes the charge outside of the for loop. Only the
winner can charge once.

Chris
On Mon, Nov 20, 2023 at 12:18, Chris Li <chrisl@kernel.org> wrote:
>
> On Sun, Nov 19, 2023 at 11:48 AM Kairui Song <ryncsn@gmail.com> wrote:
> >
> > From: Kairui Song <kasong@tencent.com>
> >
> > When swapping in a page, mem_cgroup_swapin_charge_folio is called for new
> > allocated folio, nothing else is referencing the folio so no need to set
> > the lock bit. This avoided doing unlock check on error path.
> >
> > Signed-off-by: Kairui Song <kasong@tencent.com>
> > ---
> >  mm/swap_state.c | 20 +++++++++-----------
> >  1 file changed, 9 insertions(+), 11 deletions(-)
> >
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index ac4fa404eaa7..45dd8b7c195d 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -458,6 +458,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>
> You move the mem_cgroup_swapin_charge_folio() inside the for loop:
>
> for (;;) {
>         int err;
>         /*
>          * First check the swap cache. Since this is normally
>          * called after swap_cache_get_folio() failed, re-calling
>          * that would confuse statistics.
>          */
>         folio = filemap_get_folio(swap_address_space(entry),
>                                   swp_offset(entry));
>
> >                                                 mpol, ilx, numa_node_id());
> >         if (!folio)
> >                 goto fail_put_swap;
> > +       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> > +               goto fail_put_folio;
>
> Wouldn't it cause repeat charging of the folio when it is racing
> against others in the for loop?

The race loser will call folio_put and discharge it?

>
> >
> >         /*
> >          * Swap entry may have been freed since our caller observed it.
> > @@ -483,13 +485,9 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >         /*
> >          * The swap entry is ours to swap in. Prepare the new page.
> >          */
> > -
> >         __folio_set_locked(folio);
> >         __folio_set_swapbacked(folio);
> >
> > -       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> > -               goto fail_unlock;
> > -
>
> The original code makes the charge outside of the for loop. Only the
> winner can charge once.

Right, this patch may make the charge/dis-charge path more complex for
race swapin, I'll re-check this part.

>
> Chris
On Mon, Nov 20, 2023 at 3:15 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> > > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > > index ac4fa404eaa7..45dd8b7c195d 100644
> > > --- a/mm/swap_state.c
> > > +++ b/mm/swap_state.c
> > > @@ -458,6 +458,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >
> > You move the mem_cgroup_swapin_charge_folio() inside the for loop:
> >
> > for (;;) {
> >         int err;
> >         /*
> >          * First check the swap cache. Since this is normally
> >          * called after swap_cache_get_folio() failed, re-calling
> >          * that would confuse statistics.
> >          */
> >         folio = filemap_get_folio(swap_address_space(entry),
> >                                   swp_offset(entry));
> >
> > >                                                 mpol, ilx, numa_node_id());
> > >         if (!folio)
> > >                 goto fail_put_swap;
> > > +       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> > > +               goto fail_put_folio;
> >
> > Wouldn't it cause repeat charging of the folio when it is racing
> > against others in the for loop?
>
> The race loser will call folio_put and discharge it?

There are two different charges. Memcg charging and memcg swapin charging.
The folio_put will do the memcg discharge, the corresponding memcg
charge is in follio allocation.

Memcg swapin charge does things differently, it needs to modify the
swap relately accounting.
The memcg uncharge is not a pair for memcg swapin charge.

> > >         /*
> > >          * Swap entry may have been freed since our caller observed it.
> > > @@ -483,13 +485,9 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> > >         /*
> > >          * The swap entry is ours to swap in. Prepare the new page.
> > >          */
> > > -
> > >         __folio_set_locked(folio);
> > >         __folio_set_swapbacked(folio);
> > >
> > > -       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> > > -               goto fail_unlock;
> > > -
> >
> > The original code makes the charge outside of the for loop. Only the
> > winner can charge once.
>
> Right, this patch may make the charge/dis-charge path more complex for
> race swapin, I'll re-check this part.

It is more than just complex, it seems to change the behavior of this code.

Chris
On Tue, Nov 21, 2023 at 01:44, Chris Li <chrisl@kernel.org> wrote:
>
> On Mon, Nov 20, 2023 at 3:15 AM Kairui Song <ryncsn@gmail.com> wrote:
> >
> > > > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > > > index ac4fa404eaa7..45dd8b7c195d 100644
> > > > --- a/mm/swap_state.c
> > > > +++ b/mm/swap_state.c
> > > > @@ -458,6 +458,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> > >
> > > You move the mem_cgroup_swapin_charge_folio() inside the for loop:
> > >
> > > for (;;) {
> > >         int err;
> > >         /*
> > >          * First check the swap cache. Since this is normally
> > >          * called after swap_cache_get_folio() failed, re-calling
> > >          * that would confuse statistics.
> > >          */
> > >         folio = filemap_get_folio(swap_address_space(entry),
> > >                                   swp_offset(entry));
> > >
> > > >                                                 mpol, ilx, numa_node_id());
> > > >         if (!folio)
> > > >                 goto fail_put_swap;
> > > > +       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> > > > +               goto fail_put_folio;
> > >
> > > Wouldn't it cause repeat charging of the folio when it is racing
> > > against others in the for loop?
> >
> > The race loser will call folio_put and discharge it?
>
> There are two different charges. Memcg charging and memcg swapin charging.
> The folio_put will do the memcg discharge, the corresponding memcg
> charge is in follio allocation.

Hi Chris,

I didn't get your idea here... By "memcg swapin charge", do you mean
"memory.swap.*"? And "memcg charging" means "memory.*"? There is no
memcg charge related code in folio allocation (alloc_pages_mpol),
actually the mem_cgroup_swapin_charge_folio here is doing memcg charge
not memcg swapin charge. Swapin path actually need to uncharge
"memory.swap" by mem_cgroup_swapin_uncharge_swap in later part of this
function.

> Memcg swapin charge does things differently, it needs to modify the
> swap relately accounting.
> The memcg uncharge is not a pair for memcg swapin charge.

> > > >         /*
> > > >          * Swap entry may have been freed since our caller observed it.
> > > > @@ -483,13 +485,9 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> > > >         /*
> > > >          * The swap entry is ours to swap in. Prepare the new page.
> > > >          */
> > > > -
> > > >         __folio_set_locked(folio);
> > > >         __folio_set_swapbacked(folio);
> > > >
> > > > -       if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
> > > > -               goto fail_unlock;
> > > > -
> > >
> > > The original code makes the charge outside of the for loop. Only the
> > > winner can charge once.
> >
> > Right, this patch may make the charge/dis-charge path more complex for
> > race swapin, I'll re-check this part.
>
> It is more than just complex, it seems to change the behavior of this code.
>
> Chris
Hi Kairui,

On Wed, Nov 22, 2023 at 9:33 AM Kairui Song <ryncsn@gmail.com> wrote:
> > There are two different charges. Memcg charging and memcg swapin charging.
> > The folio_put will do the memcg discharge, the corresponding memcg
> > charge is in follio allocation.
>
> Hi Chris,
>
> I didn't get your idea here... By "memcg swapin charge", do you mean
> "memory.swap.*"? And "memcg charging" means "memory.*"? There is no

Sorry I should have used the function name then there is no ambiguity.
"memcg swapin charge" I mean function mem_cgroup_swapin_charge_folio().
This function will look up the swap entry and find the memcg by swap
entry then charge to that memcg.

> memcg charge related code in folio allocation (alloc_pages_mpol),
> actually the mem_cgroup_swapin_charge_folio here is doing memcg charge
> not memcg swapin charge. Swapin path actually need to uncharge
> "memory.swap" by mem_cgroup_swapin_uncharge_swap in later part of this
> function.

I still think you have a bug there.

Take this make up example:
Let say the for loop runs 3 times and the 3rd time breaks out the for loop.

The original code will call:
filemap_get_folio() 3 times
folio_put() 2 times
mem_cgroup_swapin_charge_folio() 1 time.

With your patch, it will call:
filemap_get_folio() 3 times
folio_put() 2 times
mem_cgroup_swapin_charge_folio() 3 times.

Do you see the behavior difference there?

Chris
On Thu, Nov 23, 2023 at 04:57, Chris Li <chrisl@kernel.org> wrote:
>
> Hi Kairui,
>
> On Wed, Nov 22, 2023 at 9:33 AM Kairui Song <ryncsn@gmail.com> wrote:
> >
> > > There are two different charges. Memcg charging and memcg swapin charging.
> > > The folio_put will do the memcg discharge, the corresponding memcg
> > > charge is in follio allocation.
> >
> > Hi Chris,
> >
> > I didn't get your idea here... By "memcg swapin charge", do you mean
> > "memory.swap.*"? And "memcg charging" means "memory.*"? There is no
>
> Sorry I should have used the function name then there is no ambiguity.
> "memcg swapin charge" I mean function mem_cgroup_swapin_charge_folio().
> This function will look up the swap entry and find the memcg by swap
> entry then charge to that memcg.
>
> > memcg charge related code in folio allocation (alloc_pages_mpol),
> > actually the mem_cgroup_swapin_charge_folio here is doing memcg charge
> > not memcg swapin charge. Swapin path actually need to uncharge
> > "memory.swap" by mem_cgroup_swapin_uncharge_swap in later part of this
> > function.
>
> I still think you have a bug there.
>
> Take this make up example:
> Let say the for loop runs 3 times and the 3rd time breaks out the for loop.
>
> The original code will call:
> filemap_get_folio() 3 times
> folio_put() 2 times
> mem_cgroup_swapin_charge_folio() 1 time.
>
> With your patch, it will call:
> filemap_get_folio() 3 times
> folio_put() 2 times
> mem_cgroup_swapin_charge_folio() 3 times.
>
> Do you see the behavior difference there?

Hi Chris.

folio_put will discharge a page if it's charged, in original code the
2 folio_put call simply free the page since it's not charged. But in
this patch, folio_put will cancel previous
mem_cgroup_swapin_charge_folio call, so actually the 3
mem_cgroup_swapin_charge_folio calls will only charge once (2 calls
were cancelled by folio_put).

I think this is making it confusing indeed and causing more trouble in
error path (the uncharge could be more expensive than unlock check),
will rework this part.
On Fri, Nov 24, 2023 at 12:15 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> folio_put will discharge a page if it's charged, in original code the
> 2 folio_put call simply free the page since it's not charged. But in
> this patch, folio_put will cancel previous
> mem_cgroup_swapin_charge_folio call, so actually the 3
> mem_cgroup_swapin_charge_folio calls will only charge once. (2 calls
> was cancelled by folio_put).

You are saying the original code case folio_put() will be free without
uncharge. But with your patch folio_put() will free and cancel
mem_cgroup_swapin_charge_folio() charge.

That is because the folio->memcg_data was not set in the first case
and folio->memcg_data was set in the second case?

> I think this is making it confusing indeed and causing more trouble in
> error path (the uncharge could be more expensive than unlock check),
> will rework this part.

Agree. Thanks.

Chris
diff --git a/mm/swap_state.c b/mm/swap_state.c
index ac4fa404eaa7..45dd8b7c195d 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -458,6 +458,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 						mpol, ilx, numa_node_id());
 		if (!folio)
 			goto fail_put_swap;
+		if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
+			goto fail_put_folio;
 
 		/*
 		 * Swap entry may have been freed since our caller observed it.
@@ -483,13 +485,9 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	/*
 	 * The swap entry is ours to swap in. Prepare the new page.
 	 */
-
 	__folio_set_locked(folio);
 	__folio_set_swapbacked(folio);
 
-	if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
-		goto fail_unlock;
-
 	/* May fail (-ENOMEM) if XArray node allocation failed. */
 	if (add_to_swap_cache(folio, entry, gfp_mask & GFP_RECLAIM_MASK, &shadow))
 		goto fail_unlock;
@@ -510,6 +508,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 fail_unlock:
 	put_swap_folio(folio, entry);
 	folio_unlock(folio);
+fail_put_folio:
 	folio_put(folio);
 fail_put_swap:
 	put_swap_device(si);
@@ -873,16 +872,15 @@ struct page *swapin_no_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma,
 				vmf->address, false);
 	if (folio) {
-		__folio_set_locked(folio);
-		__folio_set_swapbacked(folio);
-
-		if (mem_cgroup_swapin_charge_folio(folio,
-					vma->vm_mm, GFP_KERNEL,
-					entry)) {
-			folio_unlock(folio);
+		if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
+						   GFP_KERNEL, entry)) {
 			folio_put(folio);
 			return NULL;
 		}
+
+		__folio_set_locked(folio);
+		__folio_set_swapbacked(folio);
+
 		mem_cgroup_swapin_uncharge_swap(entry);
 		shadow = get_shadow_from_swap_cache(entry);