diff mbox series

[21/23] x86: Allow get_locked_pte() to fail

Message ID	eba2b72f-2180-498b-c8bd-ce8f717fc78a@google.com
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Date: Tue, 9 May 2023 22:08:37 -0700 (PDT) From: Hugh Dickins <hughd@google.com> To: Andrew Morton <akpm@linux-foundation.org> cc: Mike Kravetz <mike.kravetz@oracle.com>, Mike Rapoport <rppt@kernel.org>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>, Matthew Wilcox <willy@infradead.org>, David Hildenbrand <david@redhat.com>, Suren Baghdasaryan <surenb@google.com>, Qi Zheng <zhengqi.arch@bytedance.com>, Russell King <linux@armlinux.org.uk>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Geert Uytterhoeven <geert@linux-m68k.org>, Greg Ungerer <gerg@linux-m68k.org>, Michal Simek <monstr@monstr.eu>, Thomas Bogendoerfer <tsbogend@alpha.franken.de>, Helge Deller <deller@gmx.de>, John David Anglin <dave.anglin@bell.net>, "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>, Michael Ellerman <mpe@ellerman.id.au>, Alexandre Ghiti <alexghiti@rivosinc.com>, Palmer Dabbelt <palmer@dabbelt.com>, Heiko Carstens <hca@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Claudio Imbrenda <imbrenda@linux.ibm.com>, John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>, "David S. Miller" <davem@davemloft.net>, Chris Zankel <chris@zankel.net>, Max Filippov <jcmvbkbc@gmail.com>, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 21/23] x86: Allow get_locked_pte() to fail In-Reply-To: <77a5d8c-406b-7068-4f17-23b7ac53bc83@google.com> Message-ID: <eba2b72f-2180-498b-c8bd-ce8f717fc78a@google.com> References: <77a5d8c-406b-7068-4f17-23b7ac53bc83@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Precedence: bulk
Series	arch: allow pte_offset_map[_lock]() to fail \| [00/23] arch: allow pte_offset_map[_lock]() to fail [01/23] arm: allow pte_offset_map[_lock]() to fail [02/23] arm64: allow pte_offset_map() to fail [03/23] arm64/hugetlb: pte_alloc_huge() pte_offset_huge() [04/23] ia64/hugetlb: pte_alloc_huge() pte_offset_huge() [05/23] m68k: allow pte_offset_map[_lock]() to fail [06/23] microblaze: allow pte_offset_map() to fail [07/23] mips: update_mmu_cache() can replace __update_tlb() [08/23] parisc: add pte_unmap() to balance get_ptep() [09/23] parisc: unmap_uncached_pte() use pte_offset_kernel() [10/23] parisc/hugetlb: pte_alloc_huge() pte_offset_huge() [11/23] powerpc: kvmppc_unmap_free_pmd() pte_offset_kernel() [12/23] powerpc: allow pte_offset_map[_lock]() to fail [13/23] powerpc/hugetlb: pte_alloc_huge() [14/23] riscv/hugetlb: pte_alloc_huge() pte_offset_huge() [15/23] s390: allow pte_offset_map_lock() to fail [16/23] s390: gmap use pte_unmap_unlock() not spin_unlock() [17/23] sh/hugetlb: pte_alloc_huge() pte_offset_huge() [18/23] sparc/hugetlb: pte_alloc_huge() pte_offset_huge() [19/23] sparc: allow pte_offset_map() to fail [20/23] sparc: iounit and iommu use pte_offset_kernel() [21/23] x86: Allow get_locked_pte() to fail [22/23] x86: sme_populate_pgd() use pte_offset_kernel() [23/23] xtensa: add pte_unmap() to balance pte_offset_map()

Commit Message

Hugh Dickins May 10, 2023, 5:08 a.m. UTC

  In rare transient cases, not yet made possible, pte_offset_map() and
pte_offset_map_lock() may not find a page table: handle appropriately.

Signed-off-by: Hugh Dickins <hughd@google.com>
---
 arch/x86/kernel/ldt.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Peter Zijlstra May 10, 2023, 8:18 a.m. UTC | #1

On Tue, May 09, 2023 at 10:08:37PM -0700, Hugh Dickins wrote:
> In rare transient cases, not yet made possible, pte_offset_map() and
> pte_offset_map_lock() may not find a page table: handle appropriately.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>  arch/x86/kernel/ldt.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
> index 525876e7b9f4..eb844549cd83 100644
> --- a/arch/x86/kernel/ldt.c
> +++ b/arch/x86/kernel/ldt.c
> @@ -367,8 +367,10 @@ static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
>  
>  		va = (unsigned long)ldt_slot_va(ldt->slot) + offset;
>  		ptep = get_locked_pte(mm, va, &ptl);
> -		pte_clear(mm, va, ptep);
> -		pte_unmap_unlock(ptep, ptl);
> +		if (ptep) {
> +			pte_clear(mm, va, ptep);
> +			pte_unmap_unlock(ptep, ptl);
> +		}
>  	}

Ow geez, now I have to go remember how the whole PTI/LDT crud worked :/

At first glance this seems wrong; we can't just not unmap the LDT if we
can't find it in a hurry. Also, IIRC this isn't in fact a regular user
mapping, so it should not be subject to THP induced seizures.

... memory bubbles back ... for PTI kernels we need to map this in the
user and kernel page-tables because obviously userspace needs to be able
to have access to the LDT. But it is not directly acessible by
userspace. It lives in the cpu_entry_area as a virtual map of the real
kernel allocation, and this virtual address is used for LLDT.
Modification is done through sys_modify_ldt().

I think I would feel much better if this were something like:

	if (!WARN_ON_ONCE(!ptep))

This really shouldn't fail and if it does, simply skipping it isn't the
right thing either.

Hugh Dickins May 11, 2023, 3:16 a.m. UTC | #2

On Wed, 10 May 2023, Peter Zijlstra wrote:

> On Tue, May 09, 2023 at 10:08:37PM -0700, Hugh Dickins wrote:
> > In rare transient cases, not yet made possible, pte_offset_map() and
> > pte_offset_map_lock() may not find a page table: handle appropriately.
> > 
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > ---
> >  arch/x86/kernel/ldt.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
> > index 525876e7b9f4..eb844549cd83 100644
> > --- a/arch/x86/kernel/ldt.c
> > +++ b/arch/x86/kernel/ldt.c
> > @@ -367,8 +367,10 @@ static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
> >  
> >  		va = (unsigned long)ldt_slot_va(ldt->slot) + offset;
> >  		ptep = get_locked_pte(mm, va, &ptl);
> > -		pte_clear(mm, va, ptep);
> > -		pte_unmap_unlock(ptep, ptl);
> > +		if (ptep) {
> > +			pte_clear(mm, va, ptep);
> > +			pte_unmap_unlock(ptep, ptl);
> > +		}
> >  	}
> 
> Ow geez, now I have to go remember how the whole PTI/LDT crud worked :/

I apologize for sending you back there!

> 
> At first glance this seems wrong; we can't just not unmap the LDT if we
> can't find it in a hurry. Also, IIRC this isn't in fact a regular user
> mapping, so it should not be subject to THP induced seizures.
> 
> ... memory bubbles back ... for PTI kernels we need to map this in the
> user and kernel page-tables because obviously userspace needs to be able
> to have access to the LDT. But it is not directly acessible by
> userspace. It lives in the cpu_entry_area as a virtual map of the real
> kernel allocation, and this virtual address is used for LLDT.
> Modification is done through sys_modify_ldt().

And there must be a user-style page table backing that cpu_entry_area,
because the use of get_locked_pte() and pte_unmap_unlock() implies
that there's a user page table (struct page containing spinlock if
config says so) rather than just a kernel page table mapping it.

> 
> I think I would feel much better if this were something like:
> 
> 	if (!WARN_ON_ONCE(!ptep))
> 
> This really shouldn't fail and if it does, simply skipping it isn't the
> right thing either.

Sure, I'll gladly make that change when I respin - not immediately, let's
get more feedback on this arch series first, but maybe in a week's time.

Thanks for looking so quickly, Peter: I didn't Cc you on this particular
series, but shall certainly be doing so on the ones that follow, because
a few of those patches go into interesting pmdp_get_lockless() territory.

Hugh

Peter Zijlstra May 11, 2023, 7:29 a.m. UTC | #3

On Wed, May 10, 2023 at 08:16:34PM -0700, Hugh Dickins wrote:
> Thanks for looking so quickly, Peter: I didn't Cc you on this particular
> series, but shall certainly be doing so on the ones that follow, because
> a few of those patches go into interesting pmdp_get_lockless() territory.

I'm in the x86@ catch-all, which is how I saw this as quickly :-) A
direct copy won't hurt ofc, the mail system will sort it out.

diff mbox series

Patch

diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index 525876e7b9f4..eb844549cd83 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -367,8 +367,10 @@  static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
 
 		va = (unsigned long)ldt_slot_va(ldt->slot) + offset;
 		ptep = get_locked_pte(mm, va, &ptl);
-		pte_clear(mm, va, ptep);
-		pte_unmap_unlock(ptep, ptl);
+		if (ptep) {
+			pte_clear(mm, va, ptep);
+			pte_unmap_unlock(ptep, ptl);
+		}
 	}
 
 	va = (unsigned long)ldt_slot_va(ldt->slot);