[v1,3/9] mm/memory: further separate anon and pagecache folio handling in zap_present_pte()
Message ID: 20240129143221.263763-4-david@redhat.com
State: New
Headers:
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand <david@redhat.com>, Andrew Morton <akpm@linux-foundation.org>, Matthew Wilcox <willy@infradead.org>, Ryan Roberts <ryan.roberts@arm.com>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>, Nick Piggin <npiggin@gmail.com>, Peter Zijlstra <peterz@infradead.org>, Michael Ellerman <mpe@ellerman.id.au>, Christophe Leroy <christophe.leroy@csgroup.eu>, "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>, Heiko Carstens <hca@linux.ibm.com>, Vasily Gorbik <gor@linux.ibm.com>, Alexander Gordeev <agordeev@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Sven Schnelle <svens@linux.ibm.com>, Arnd Bergmann <arnd@arndb.de>, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: [PATCH v1 3/9] mm/memory: further separate anon and pagecache folio handling in zap_present_pte()
Date: Mon, 29 Jan 2024 15:32:15 +0100
Message-ID: <20240129143221.263763-4-david@redhat.com>
In-Reply-To: <20240129143221.263763-1-david@redhat.com>
References: <20240129143221.263763-1-david@redhat.com>
Series: mm/memory: optimize unmap/zap with PTE-mapped THP
Commit Message
David Hildenbrand
Jan. 29, 2024, 2:32 p.m. UTC
We don't need up-to-date accessed-dirty information for anon folios and can
simply work with the ptent we already have. Also, we know the RSS counter
we want to update.
We can safely move arch_check_zapped_pte() + tlb_remove_tlb_entry() +
zap_install_uffd_wp_if_needed() after updating the folio and RSS.
While at it, only call zap_install_uffd_wp_if_needed() if there is even
any chance that it would do *something*. That is, just don't bother if
uffd-wp does not apply.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/memory.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
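The reordered control flow can be sketched as a small userspace model (a simplified sketch in plain C, compilable outside the kernel; the flag values, `folio_model` struct, and counter names below are illustrative stand-ins, not the real pte/folio APIs):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the split paths in the patched zap_present_pte():
 * pagecache folios still need the accessed/dirty bits of the cleared
 * PTE, anon folios do not and their RSS counter is known up front.
 * All names here are hypothetical stand-ins for kernel APIs. */

#define MODEL_PTE_DIRTY 0x1u
#define MODEL_PTE_YOUNG 0x2u

enum { MM_FILEPAGES, MM_ANONPAGES, NR_MM_COUNTERS };

struct folio_model {
	bool anon;
	bool dirty;
	bool accessed;
};

/* Returns the RSS counter that was decremented. */
static int model_zap_present(struct folio_model *folio, unsigned int ptent,
			     long rss[NR_MM_COUNTERS])
{
	if (!folio->anon) {
		/* Pagecache: propagate accessed/dirty from the PTE. */
		if (ptent & MODEL_PTE_DIRTY)
			folio->dirty = true;
		if (ptent & MODEL_PTE_YOUNG)
			folio->accessed = true;
		rss[MM_FILEPAGES]--;
		return MM_FILEPAGES;
	}
	/* Anon: accessed/dirty are irrelevant; counter is known. */
	rss[MM_ANONPAGES]--;
	return MM_ANONPAGES;
}
```

The anon branch never touches the folio's dirty/accessed state, mirroring why the patch can use the plain `ptent` it already read instead of the value returned by `ptep_get_and_clear_full()`.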
Comments
On 30.01.24 09:31, Ryan Roberts wrote:
> On 29/01/2024 14:32, David Hildenbrand wrote:
>> We don't need up-to-date accessed-dirty information for anon folios and can
>> simply work with the ptent we already have. Also, we know the RSS counter
>> we want to update.
>>
>> We can safely move arch_check_zapped_pte() + tlb_remove_tlb_entry() +
>> zap_install_uffd_wp_if_needed() after updating the folio and RSS.
>>
>> While at it, only call zap_install_uffd_wp_if_needed() if there is even
>> any chance that pte_install_uffd_wp_if_needed() would do *something*.
>> That is, just don't bother if uffd-wp does not apply.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  mm/memory.c | 16 +++++++++++-----
>>  1 file changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 69502cdc0a7d..20bc13ab8db2 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -1552,12 +1552,9 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
>>  	folio = page_folio(page);
>>  	if (unlikely(!should_zap_folio(details, folio)))
>>  		return;
>> -	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
>> -	arch_check_zapped_pte(vma, ptent);
>> -	tlb_remove_tlb_entry(tlb, pte, addr);
>> -	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
>>
>>  	if (!folio_test_anon(folio)) {
>> +		ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
>>  		if (pte_dirty(ptent)) {
>>  			folio_mark_dirty(folio);
>>  			if (tlb_delay_rmap(tlb)) {
>> @@ -1567,8 +1564,17 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
>>  		}
>>  		if (pte_young(ptent) && likely(vma_has_recency(vma)))
>>  			folio_mark_accessed(folio);
>> +		rss[mm_counter(folio)]--;
>> +	} else {
>> +		/* We don't need up-to-date accessed/dirty bits. */
>> +		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
>> +		rss[MM_ANONPAGES]--;
>>  	}
>> -	rss[mm_counter(folio)]--;
>> +	arch_check_zapped_pte(vma, ptent);
>
> Isn't the x86 (only) implementation of this relying on the dirty bit?
> So doesn't that imply you still need get_and_clear for anon? (And in
> hindsight I think that logic would apply to the previous patch too?)

x86 uses the encoding !writable && dirty to indicate special shadow
stacks. That is, the hw dirty bit is set by software (to create that
combination), not by hardware. So you don't have to sync against any
hw changes of the hw dirty bit. What you had in the original PTE you
read is sufficient.
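The shadow-stack encoding described in the reply can be illustrated with a tiny model (a hedged sketch; the flag values and function names are invented for illustration and are not the real x86 pgtable macros):

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the x86 shadow-stack PTE encoding: Write=0 combined with
 * Dirty=1 marks a shadow-stack entry, and that dirty bit is set by
 * software, so a previously-read PTE value cannot go stale w.r.t. it.
 * All names here are hypothetical stand-ins. */

#define MODEL_PTE_WRITE 0x1u
#define MODEL_PTE_DIRTY 0x2u

static bool model_pte_shstk(unsigned int pte)
{
	/* !writable && dirty identifies a shadow-stack PTE. */
	return !(pte & MODEL_PTE_WRITE) && (pte & MODEL_PTE_DIRTY);
}

/* arch_check_zapped_pte()-style sanity check: finding a shadow-stack
 * PTE in a VMA that is not a shadow-stack VMA would indicate a bug. */
static bool model_zap_is_suspect(unsigned int pte, bool vma_is_shstk)
{
	return model_pte_shstk(pte) && !vma_is_shstk;
}
```

Because hardware never creates the Write=0/Dirty=1 combination on its own, the check works equally well on the `ptent` read before the clear, which is the point being made above.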
On 30.01.24 09:45, Ryan Roberts wrote:
> On 30/01/2024 08:37, David Hildenbrand wrote:
>> On 30.01.24 09:31, Ryan Roberts wrote:
>>> On 29/01/2024 14:32, David Hildenbrand wrote:
>>>> [...]
>>>
>>> Isn't the x86 (only) implementation of this relying on the dirty
>>> bit? So doesn't that imply you still need get_and_clear for anon?
>>> (And in hindsight I think that logic would apply to the previous
>>> patch too?)
>>
>> x86 uses the encoding !writable && dirty to indicate special shadow
>> stacks. That is, the hw dirty bit is set by software (to create that
>> combination), not by hardware.
>>
>> So you don't have to sync against any hw changes of the hw dirty
>> bit. What you had in the original PTE you read is sufficient.
>
> Right, got it. In that case:

Thanks a lot for paying that much attention during your reviews! Highly
appreciated!

> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
diff --git a/mm/memory.c b/mm/memory.c
index 69502cdc0a7d..20bc13ab8db2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1552,12 +1552,9 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 	folio = page_folio(page);
 	if (unlikely(!should_zap_folio(details, folio)))
 		return;
-	ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
-	arch_check_zapped_pte(vma, ptent);
-	tlb_remove_tlb_entry(tlb, pte, addr);
-	zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
 
 	if (!folio_test_anon(folio)) {
+		ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
 		if (pte_dirty(ptent)) {
 			folio_mark_dirty(folio);
 			if (tlb_delay_rmap(tlb)) {
@@ -1567,8 +1564,17 @@ static inline void zap_present_pte(struct mmu_gather *tlb,
 		}
 		if (pte_young(ptent) && likely(vma_has_recency(vma)))
 			folio_mark_accessed(folio);
+		rss[mm_counter(folio)]--;
+	} else {
+		/* We don't need up-to-date accessed/dirty bits. */
+		ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
+		rss[MM_ANONPAGES]--;
 	}
-	rss[mm_counter(folio)]--;
+	arch_check_zapped_pte(vma, ptent);
+	tlb_remove_tlb_entry(tlb, pte, addr);
+	if (unlikely(userfaultfd_pte_wp(vma, ptent)))
+		zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
+
 	if (!delay_rmap) {
 		folio_remove_rmap_pte(folio, page, vma);
 		if (unlikely(page_mapcount(page) < 0))
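The new uffd-wp guard in the tail of the function can be sketched as follows (a simplified userspace model under invented names; `MODEL_PTE_UFFD_WP` and the counter are stand-ins, not kernel symbols):

```c
#include <assert.h>

/* Model of the guarded call added by the patch: the marker-install
 * helper only runs when the zapped PTE was actually uffd-wp
 * protected, so the common case skips it entirely. */

#define MODEL_PTE_UFFD_WP 0x4u

static int model_install_calls; /* counts slow-path invocations */

static void model_zap_install_uffd_wp(void)
{
	model_install_calls++; /* stand-in for installing a pte marker */
}

static void model_zap_tail(unsigned int ptent)
{
	/* userfaultfd_pte_wp()-style check before the helper call. */
	if (ptent & MODEL_PTE_UFFD_WP)
		model_zap_install_uffd_wp();
}
```

This mirrors the commit message's point: when uffd-wp does not apply to the PTE, there is no chance the helper would do anything, so it is not called at all.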