Message ID | 20240103091423.400294-4-peterx@redhat.com |
---|---|
State | New |
Headers |
From: peterx@redhat.com
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton <jthoughton@google.com>, David Hildenbrand <david@redhat.com>, "Kirill A . Shutemov" <kirill@shutemov.name>, Yang Shi <shy828301@gmail.com>, peterx@redhat.com, linux-riscv@lists.infradead.org, Andrew Morton <akpm@linux-foundation.org>, "Aneesh Kumar K . V" <aneesh.kumar@kernel.org>, Rik van Riel <riel@surriel.com>, Andrea Arcangeli <aarcange@redhat.com>, Axel Rasmussen <axelrasmussen@google.com>, Mike Rapoport <rppt@kernel.org>, John Hubbard <jhubbard@nvidia.com>, Vlastimil Babka <vbabka@suse.cz>, Michael Ellerman <mpe@ellerman.id.au>, Christophe Leroy <christophe.leroy@csgroup.eu>, Andrew Jones <andrew.jones@linux.dev>, linuxppc-dev@lists.ozlabs.org, Mike Kravetz <mike.kravetz@oracle.com>, Muchun Song <muchun.song@linux.dev>, linux-arm-kernel@lists.infradead.org, Jason Gunthorpe <jgg@nvidia.com>, Christoph Hellwig <hch@infradead.org>, Lorenzo Stoakes <lstoakes@gmail.com>, Matthew Wilcox <willy@infradead.org>
Subject: [PATCH v2 03/13] mm: Provide generic pmd_thp_or_huge()
Date: Wed, 3 Jan 2024 17:14:13 +0800
Message-ID: <20240103091423.400294-4-peterx@redhat.com>
In-Reply-To: <20240103091423.400294-1-peterx@redhat.com>
References: <20240103091423.400294-1-peterx@redhat.com> |
Series |
mm/gup: Unify hugetlb, part 2
|
|
Commit Message
Peter Xu
Jan. 3, 2024, 9:14 a.m. UTC
From: Peter Xu <peterx@redhat.com>

ARM defines pmd_thp_or_huge(), detecting either a THP or a huge PMD. It
can be a helpful helper if we want to merge more THP and hugetlb code
paths. Make it a generic default implementation, only exist when
CONFIG_MMU. Arch can overwrite it by defining its own version.

For example, ARM's pgtable-2level.h defines it to always return false.

Keep the macro declared with all config, it should be optimized to a false
anyway if !THP && !HUGETLB.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/pgtable.h | 4 ++++
 mm/gup.c                | 3 +--
 2 files changed, 5 insertions(+), 2 deletions(-)
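The override mechanism described in the commit message can be sketched in userspace C. This is only an illustration under stated assumptions: the two helper functions and their bit values are hypothetical stand-ins operating on a plain unsigned long, not the kernel's pmd_t helpers. Only the #ifndef fallback itself is taken from the patch.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for the arch helpers. The real ones take a
 * pmd_t and test arch-specific bits; the bit values here are made up.
 */
static inline int pmd_huge(unsigned long pmd)
{
	return (pmd & 0x2UL) != 0;
}

static inline int pmd_trans_huge(unsigned long pmd)
{
	return (pmd & 0x4UL) != 0;
}

/*
 * An arch that wants its own version defines the macro before the
 * generic header is seen, for example:
 *
 *   #define pmd_thp_or_huge(pmd) (0)
 *
 * in which case the fallback below is skipped.
 */

/* Generic fallback, same shape as the one added by the patch. */
#ifndef pmd_thp_or_huge
#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
#endif
```

Because the fallback is a plain OR of the two existing helpers, it should fold to a constant false wherever both helpers are compile-time false, which is the property the commit message relies on for !THP && !HUGETLB.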
Comments
On Wed, Jan 03, 2024 at 05:14:13PM +0800, peterx@redhat.com wrote:
> From: Peter Xu <peterx@redhat.com>
>
> ARM defines pmd_thp_or_huge(), detecting either a THP or a huge PMD. It
> can be a helpful helper if we want to merge more THP and hugetlb code
> paths. Make it a generic default implementation, only exist when
> CONFIG_MMU. Arch can overwrite it by defining its own version.
>
> For example, ARM's pgtable-2level.h defines it to always return false.
>
> Keep the macro declared with all config, it should be optimized to a false
> anyway if !THP && !HUGETLB.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/linux/pgtable.h | 4 ++++
>  mm/gup.c                | 3 +--
>  2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 466cf477551a..2b42e95a4e3a 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1362,6 +1362,10 @@ static inline int pmd_write(pmd_t pmd)
>  #endif /* pmd_write */
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> +#ifndef pmd_thp_or_huge
> +#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
> +#endif

Why not just use pmd_leaf() ?

This GUP case seems to me exactly like what pmd_leaf() should really
do and be used for..

eg x86 does:

#define pmd_leaf pmd_large
static inline int pmd_large(pmd_t pte)
	return pmd_flags(pte) & _PAGE_PSE;

static inline int pmd_trans_huge(pmd_t pmd)
	return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;

int pmd_huge(pmd_t pmd)
	return !pmd_none(pmd) &&
		(pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;

I spot checked a couple arches and it looks like it holds up.

Further, it looks to me like this site in GUP is the only core code
caller..

So, I'd suggest a small series to go arch by arch and convert the arch
to use pmd_huge() == pmd_leaf(). Then retire pmd_huge() as a public
API.

> diff --git a/mm/gup.c b/mm/gup.c
> index df83182ec72d..eebae70d2465 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -3004,8 +3004,7 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned lo
>  		if (!pmd_present(pmd))
>  			return 0;
>
> -		if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
> -			     pmd_devmap(pmd))) {
> +		if (unlikely(pmd_thp_or_huge(pmd) || pmd_devmap(pmd))) {
>  			/* See gup_pte_range() */
>  			if (pmd_protnone(pmd))
>  				return 0;

And the devmap thing here doesn't make any sense either. The arch
should ensure that pmd_devmap() implies pmd_leaf(). Since devmap is a
purely SW construct it almost certainly does already anyhow.

Jason
On Mon, Jan 15, 2024 at 01:55:51PM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 03, 2024 at 05:14:13PM +0800, peterx@redhat.com wrote:
> > From: Peter Xu <peterx@redhat.com>
> >
> > ARM defines pmd_thp_or_huge(), detecting either a THP or a huge PMD. It
> > can be a helpful helper if we want to merge more THP and hugetlb code
> > paths. Make it a generic default implementation, only exist when
> > CONFIG_MMU. Arch can overwrite it by defining its own version.
> >
> > For example, ARM's pgtable-2level.h defines it to always return false.
> >
> > Keep the macro declared with all config, it should be optimized to a false
> > anyway if !THP && !HUGETLB.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  include/linux/pgtable.h | 4 ++++
> >  mm/gup.c                | 3 +--
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > index 466cf477551a..2b42e95a4e3a 100644
> > --- a/include/linux/pgtable.h
> > +++ b/include/linux/pgtable.h
> > @@ -1362,6 +1362,10 @@ static inline int pmd_write(pmd_t pmd)
> >  #endif /* pmd_write */
> >  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> >
> > +#ifndef pmd_thp_or_huge
> > +#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
> > +#endif
>
> Why not just use pmd_leaf() ?
>
> This GUP case seems to me exactly like what pmd_leaf() should really
> do and be used for..

I think I mostly agree with you, and these APIs are indeed confusing. IMHO
the challenge is about the risk of breaking others on small changes in the
details where evil resides.

>
> eg x86 does:
>
> #define pmd_leaf pmd_large
> static inline int pmd_large(pmd_t pte)
> 	return pmd_flags(pte) & _PAGE_PSE;
>
> static inline int pmd_trans_huge(pmd_t pmd)
> 	return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
>
> int pmd_huge(pmd_t pmd)
> 	return !pmd_none(pmd) &&
> 		(pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;

For example, here I don't think it's strictly pmd_leaf()? As pmd_huge()
will return true if PRESENT=0 && PSE=0 (as long as none pte ruled out
first), while pmd_leaf() will return false; I think that came from
cbef8478bee5.

I'm not sure whether that is the best solution, e.g., from a 1st glance it
seems better to me to process swap entries separately (including both
migration and poisoned entries)..

Sparc has similar things there, which in that case I'm not sure whether a
direct replace is always safe.

Besides that, there're also other cases where it's not clear of such direct
replacement, not until further investigated. E.g., arm-3level has:

#define pmd_leaf(pmd) pmd_sect(pmd)
#define pmd_sect(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \
				 PMD_TYPE_SECT)
#define PMD_TYPE_SECT (_AT(pmdval_t, 1) << 0)

While pmd_huge() there relies on PMD_TABLE_BIT:

int pmd_huge(pmd_t pmd)
{
	return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
}

#define PMD_TABLE_BIT (_AT(pmdval_t, 1) << 1)

These are just the trivial details that I wanted to avoid to touch in this
series, so as to resolve the hugetlb issue separately from others. The new
pmd_thp_or_huge() is not ideal, but that easily isolates all these trivial
details / evils out of the picture, so that we can tackle them one by one.
It is strictly an OR of huge||thp, so it's hopefully safe to not break
anything yet from that regard.

>
> I spot checked a couple arches and it looks like it holds up.
>
> Further, it looks to me like this site in GUP is the only core code
> caller..
>
> So, I'd suggest a small series to go arch by arch and convert the arch
> to use pmd_huge() == pmd_leaf(). Then retire pmd_huge() as a public
> API.
>
> > diff --git a/mm/gup.c b/mm/gup.c
> > index df83182ec72d..eebae70d2465 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -3004,8 +3004,7 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned lo
> >  		if (!pmd_present(pmd))
> >  			return 0;
> >
> > -		if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
> > -			     pmd_devmap(pmd))) {
> > +		if (unlikely(pmd_thp_or_huge(pmd) || pmd_devmap(pmd))) {
> >  			/* See gup_pte_range() */
> >  			if (pmd_protnone(pmd))
> >  				return 0;
>
> And the devmap thing here doesn't make any sense either. The arch
> should ensure that pmd_devmap() implies pmd_leaf(). Since devmap is a
> purely SW construct it almost certainly does already anyhow.

Yep, but only if pmd_leaf() is safe to be put here. A pmd devmap should
always imply pmd_leaf() indeed.

Thanks,
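The x86 point raised in this reply can be checked with a small userspace model. This is a hedged sketch, not kernel code: the x_ prefixed names are hypothetical, entries are plain uint64_t values, the bit positions are illustrative assumptions, and pmd_none() is simplified to an all-zero test. Only the boolean expressions mirror the x86 definitions quoted in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions are illustrative stand-ins, not taken from kernel headers. */
#define X_PAGE_PRESENT (UINT64_C(1) << 0)
#define X_PAGE_PSE     (UINT64_C(1) << 7)
#define X_PAGE_DEVMAP  (UINT64_C(1) << 58)

/* Simplified assumption: an all-zero entry is "none". */
static inline int x_pmd_none(uint64_t v)
{
	return v == 0;
}

/* pmd_leaf() / pmd_large() as quoted: true when PSE is set. */
static inline int x_pmd_leaf(uint64_t v)
{
	return (v & X_PAGE_PSE) != 0;
}

/* pmd_trans_huge() as quoted: PSE set and DEVMAP clear. */
static inline int x_pmd_trans_huge(uint64_t v)
{
	return (v & (X_PAGE_PSE | X_PAGE_DEVMAP)) == X_PAGE_PSE;
}

/* pmd_huge() as quoted: not none, and not "PRESENT without PSE". */
static inline int x_pmd_huge(uint64_t v)
{
	return !x_pmd_none(v) &&
	       (v & (X_PAGE_PRESENT | X_PAGE_PSE)) != X_PAGE_PRESENT;
}

/* The generic helper proposed by the patch. */
static inline int x_pmd_thp_or_huge(uint64_t v)
{
	return x_pmd_huge(v) || x_pmd_trans_huge(v);
}
```

In this model a non-zero value with both PRESENT and PSE clear, such as 0x1000, satisfies x_pmd_huge() but not x_pmd_leaf(), which is the mismatch described above. For PRESENT|PSE both agree, and for PRESENT alone both are false.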
On Wed, Feb 21, 2024 at 05:37:37PM +0800, Peter Xu wrote:
> On Mon, Jan 15, 2024 at 01:55:51PM -0400, Jason Gunthorpe wrote:
> > On Wed, Jan 03, 2024 at 05:14:13PM +0800, peterx@redhat.com wrote:
> > > From: Peter Xu <peterx@redhat.com>
> > >
> > > ARM defines pmd_thp_or_huge(), detecting either a THP or a huge PMD. It
> > > can be a helpful helper if we want to merge more THP and hugetlb code
> > > paths. Make it a generic default implementation, only exist when
> > > CONFIG_MMU. Arch can overwrite it by defining its own version.
> > >
> > > For example, ARM's pgtable-2level.h defines it to always return false.
> > >
> > > Keep the macro declared with all config, it should be optimized to a false
> > > anyway if !THP && !HUGETLB.
> > >
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  include/linux/pgtable.h | 4 ++++
> > >  mm/gup.c                | 3 +--
> > >  2 files changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > > index 466cf477551a..2b42e95a4e3a 100644
> > > --- a/include/linux/pgtable.h
> > > +++ b/include/linux/pgtable.h
> > > @@ -1362,6 +1362,10 @@ static inline int pmd_write(pmd_t pmd)
> > >  #endif /* pmd_write */
> > >  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> > >
> > > +#ifndef pmd_thp_or_huge
> > > +#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
> > > +#endif
> >
> > Why not just use pmd_leaf() ?
> >
> > This GUP case seems to me exactly like what pmd_leaf() should really
> > do and be used for..
>
> I think I mostly agree with you, and these APIs are indeed confusing. IMHO
> the challenge is about the risk of breaking others on small changes in the
> details where evil resides.

These APIs are super confusing, which is why I brought it up.. Adding
even more subtly different variations is not helping.

I think pmd_leaf means the entry is present and refers to a physical
page not another radix level.

> > eg x86 does:
> >
> > #define pmd_leaf pmd_large
> > static inline int pmd_large(pmd_t pte)
> > 	return pmd_flags(pte) & _PAGE_PSE;
> >
> > static inline int pmd_trans_huge(pmd_t pmd)
> > 	return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
> >
> > int pmd_huge(pmd_t pmd)
> > 	return !pmd_none(pmd) &&
> > 		(pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
>
> For example, here I don't think it's strictly pmd_leaf()? As pmd_huge()
> will return true if PRESENT=0 && PSE=0 (as long as none pte ruled out
> first), while pmd_leaf() will return false; I think that came from
> cbef8478bee5.

Yikes, but do you even want to handle non-present entries in GUP
world? Isn't everything gated by !present in the first place?

> Besides that, there're also other cases where it's not clear of such direct
> replacement, not until further investigated. E.g., arm-3level has:
>
> #define pmd_leaf(pmd) pmd_sect(pmd)
> #define pmd_sect(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \
> 				 PMD_TYPE_SECT)
> #define PMD_TYPE_SECT (_AT(pmdval_t, 1) << 0)
>
> While pmd_huge() there relies on PMD_TABLE_BIT:

I looked at that, it looked OK..

#define PMD_TYPE_MASK (_AT(pmdval_t, 3) << 0)
#define PMD_TABLE_BIT (_AT(pmdval_t, 1) << 1)

It is the same stuff, just a little confusingly written

Jason
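The arm-3level comparison comes down to the two low bits of the entry, which can be enumerated in a small userspace model. This is a sketch under assumptions: the a_ prefixed functions are hypothetical, entries are plain uint64_t values instead of pmd_t, and _AT() is replaced by UINT64_C(). The three constants and both expressions follow the definitions quoted in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Constants as quoted in the thread, with _AT(pmdval_t, x) modelled as UINT64_C(x). */
#define PMD_TYPE_MASK (UINT64_C(3) << 0)
#define PMD_TYPE_SECT (UINT64_C(1) << 0)
#define PMD_TABLE_BIT (UINT64_C(1) << 1)

/* pmd_leaf() == pmd_sect(): the two low bits are exactly 01. */
static inline int a_pmd_sect(uint64_t v)
{
	return (v & PMD_TYPE_MASK) == PMD_TYPE_SECT;
}

/* pmd_huge(): non-zero entry with the table bit clear. */
static inline int a_pmd_huge(uint64_t v)
{
	return v != 0 && !(v & PMD_TABLE_BIT);
}
```

With some upper bits set, low bits 01 make both tests true, and low bits 10 or 11 make both false, so the two agree whenever bit 0 is set. They differ only for a non-zero entry whose low bits are 00, where a_pmd_huge() is true and a_pmd_sect() is false. Whether such entries can reach the GUP site is the question about the earlier pmd_present() check, which this sketch does not model.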
On Wed, Feb 21, 2024 at 08:57:53AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 21, 2024 at 05:37:37PM +0800, Peter Xu wrote:
> > On Mon, Jan 15, 2024 at 01:55:51PM -0400, Jason Gunthorpe wrote:
> > > On Wed, Jan 03, 2024 at 05:14:13PM +0800, peterx@redhat.com wrote:
> > > > From: Peter Xu <peterx@redhat.com>
> > > >
> > > > ARM defines pmd_thp_or_huge(), detecting either a THP or a huge PMD. It
> > > > can be a helpful helper if we want to merge more THP and hugetlb code
> > > > paths. Make it a generic default implementation, only exist when
> > > > CONFIG_MMU. Arch can overwrite it by defining its own version.
> > > >
> > > > For example, ARM's pgtable-2level.h defines it to always return false.
> > > >
> > > > Keep the macro declared with all config, it should be optimized to a false
> > > > anyway if !THP && !HUGETLB.
> > > >
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >  include/linux/pgtable.h | 4 ++++
> > > >  mm/gup.c                | 3 +--
> > > >  2 files changed, 5 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > > > index 466cf477551a..2b42e95a4e3a 100644
> > > > --- a/include/linux/pgtable.h
> > > > +++ b/include/linux/pgtable.h
> > > > @@ -1362,6 +1362,10 @@ static inline int pmd_write(pmd_t pmd)
> > > >  #endif /* pmd_write */
> > > >  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> > > >
> > > > +#ifndef pmd_thp_or_huge
> > > > +#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
> > > > +#endif
> > >
> > > Why not just use pmd_leaf() ?
> > >
> > > This GUP case seems to me exactly like what pmd_leaf() should really
> > > do and be used for..
> >
> > I think I mostly agree with you, and these APIs are indeed confusing. IMHO
> > the challenge is about the risk of breaking others on small changes in the
> > details where evil resides.
>
> These APIs are super confusing, which is why I brought it up.. Adding
> even more subtly different variations is not helping.
>
> I think pmd_leaf means the entry is present and refers to a physical
> page not another radix level.
>
> > > eg x86 does:
> > >
> > > #define pmd_leaf pmd_large
> > > static inline int pmd_large(pmd_t pte)
> > > 	return pmd_flags(pte) & _PAGE_PSE;
> > >
> > > static inline int pmd_trans_huge(pmd_t pmd)
> > > 	return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
> > >
> > > int pmd_huge(pmd_t pmd)
> > > 	return !pmd_none(pmd) &&
> > > 		(pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
> >
> > For example, here I don't think it's strictly pmd_leaf()? As pmd_huge()
> > will return true if PRESENT=0 && PSE=0 (as long as none pte ruled out
> > first), while pmd_leaf() will return false; I think that came from
> > cbef8478bee5.
>
> Yikes, but do you even want to handle non-present entries in GUP
> world? Isn't everything gated by !present in the first place?

I am as confused indeed.

>
> > Besides that, there're also other cases where it's not clear of such direct
> > replacement, not until further investigated. E.g., arm-3level has:
> >
> > #define pmd_leaf(pmd) pmd_sect(pmd)
> > #define pmd_sect(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \
> > 				 PMD_TYPE_SECT)
> > #define PMD_TYPE_SECT (_AT(pmdval_t, 1) << 0)
> >
> > While pmd_huge() there relies on PMD_TABLE_BIT:
>
> I looked at that, it looked OK..
>
> #define PMD_TYPE_MASK (_AT(pmdval_t, 3) << 0)
> #define PMD_TABLE_BIT (_AT(pmdval_t, 1) << 1)
>
> It is the same stuff, just a little confusingly written

True, my eyes decided to skip all the shifts. :-(

Ok then, let me see whether I can give it a stab on the pXd_huge() mess.

Thanks,
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 466cf477551a..2b42e95a4e3a 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1362,6 +1362,10 @@ static inline int pmd_write(pmd_t pmd)
 #endif /* pmd_write */
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+#ifndef pmd_thp_or_huge
+#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
+#endif
+
 #ifndef pud_write
 static inline int pud_write(pud_t pud)
 {
diff --git a/mm/gup.c b/mm/gup.c
index df83182ec72d..eebae70d2465 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3004,8 +3004,7 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned lo
 		if (!pmd_present(pmd))
 			return 0;
 
-		if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
-			     pmd_devmap(pmd))) {
+		if (unlikely(pmd_thp_or_huge(pmd) || pmd_devmap(pmd))) {
 			/* See gup_pte_range() */
 			if (pmd_protnone(pmd))
 				return 0;