Message ID | 20230921162007.1630149-1-ryan.roberts@arm.com |
---|---|
Headers |
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>, Helge Deller <deller@gmx.de>, Nicholas Piggin <npiggin@gmail.com>, Christophe Leroy <christophe.leroy@csgroup.eu>, Paul Walmsley <paul.walmsley@sifive.com>, Palmer Dabbelt <palmer@dabbelt.com>, Albert Ou <aou@eecs.berkeley.edu>, Heiko Carstens <hca@linux.ibm.com>, Vasily Gorbik <gor@linux.ibm.com>, Alexander Gordeev <agordeev@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Sven Schnelle <svens@linux.ibm.com>, Gerald Schaefer <gerald.schaefer@linux.ibm.com>, "David S. Miller" <davem@davemloft.net>, Arnd Bergmann <arnd@arndb.de>, Mike Kravetz <mike.kravetz@oracle.com>, Muchun Song <muchun.song@linux.dev>, SeongJae Park <sj@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Uladzislau Rezki <urezki@gmail.com>, Christoph Hellwig <hch@infradead.org>, Lorenzo Stoakes <lstoakes@gmail.com>, Anshuman Khandual <anshuman.khandual@arm.com>, Peter Xu <peterx@redhat.com>, Axel Rasmussen <axelrasmussen@google.com>, Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org, stable@vger.kernel.org
Subject: [PATCH v1 0/8] Fix set_huge_pte_at() panic on arm64
Date: Thu, 21 Sep 2023 17:19:59 +0100
Message-Id: <20230921162007.1630149-1-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
List-ID: <linux-kernel.vger.kernel.org> |
Series |
Fix set_huge_pte_at() panic on arm64
|
|
Message
Ryan Roberts
Sept. 21, 2023, 4:19 p.m. UTC
Hi All,

This series fixes a bug in arm64's implementation of set_huge_pte_at(), which
can result in an unprivileged user causing a kernel panic. The problem was
triggered when running the new uffd poison mm selftest for HUGETLB memory. This
test (and the uffd poison feature) was merged for v6.6-rc1. However, upon
inspection there are multiple other pre-existing paths that can trigger this
bug.

Ideally, I'd like to get this fix in for v6.6 if possible? And I guess it should
be backported too, given there are call sites where this can theoretically
happen that pre-date v6.6-rc1 (I've cc'ed stable@vger.kernel.org).


Description of Bug
------------------

arm64's huge pte implementation supports multiple huge page sizes, some of which
are implemented in the page table with contiguous mappings. So set_huge_pte_at()
needs to work out how big the logical pte is, so that it can also work out how
many physical ptes (or pmds) need to be written. It does this by grabbing the
folio out of the pte and querying its size.

However, there are cases when the pte being set is actually a swap entry. This
also used to work fine, because for huge ptes, we only ever saw migration
entries and hwpoison entries. Both of these types of swap entries have a PFN
embedded, so the code would grab that and everything still worked out.

But over time, more calls to set_huge_pte_at() have been added that set swap
entry types that do not embed a PFN. And this causes the code to go bang. The
triggering case is for the uffd poison test, commit 99aa77215ad0 ("selftests/mm:
add uffd unit test for UFFDIO_POISON"), which sets a PTE_MARKER_POISONED swap
entry. But review shows there are other places too (PTE_MARKER_UFFD_WP).

If CONFIG_DEBUG_VM is enabled, we do at least get a BUG(), but otherwise, it
will dereference a bad pointer in page_folio():

    static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
    {
        VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry));

        return page_folio(pfn_to_page(swp_offset_pfn(entry)));
    }

The root cause is commit 18f3962953e4 ("mm: hugetlb: kill
set_huge_swap_pte_at()"), which aimed to simplify the interface to the core code
by removing set_huge_swap_pte_at() (which took a page size parameter) and
replacing it with calls to set_huge_pte_at() where the size was inferred from
the folio, as described above. While that commit didn't break anything at the
time, it did break the interface because it couldn't handle swap entries without
PFNs. And since then new callers have come along which rely on this working.


Fix
---

The simplest fix would have been to revert the dodgy cleanup commit, but since
things have moved on, this would have required an audit of all the new
set_huge_pte_at() call sites to see if they should be converted to
set_huge_swap_pte_at(). As per the original intent of the change, it would also
leave us open to future bugs when people inevitably get it wrong and call the
wrong helper.

So instead, I've converted the first parameter of set_huge_pte_at() to be a vma
rather than an mm. This means that the arm64 code can easily recover the huge
page size in all cases. It's a bigger change, due to needing to touch the arches
that implement the function, but it is entirely mechanical, so in my view, low
risk.

I've compile-tested all touched arches: arm64, parisc, powerpc, riscv, s390 (and
additionally x86_64). I've additionally booted and run mm selftests against
arm64, where I observe the uffd poison test is fixed, and there are no other
regressions.


Patches
-------

patches 1-7: Convert core mm and arches to pass vma instead of mm
patch 8:     Fixes the arm64 bug

Patches based on v6.6-rc2.

Thanks,
Ryan


Ryan Roberts (8):
  parisc: hugetlb: Convert set_huge_pte_at() to take vma
  powerpc: hugetlb: Convert set_huge_pte_at() to take vma
  riscv: hugetlb: Convert set_huge_pte_at() to take vma
  s390: hugetlb: Convert set_huge_pte_at() to take vma
  sparc: hugetlb: Convert set_huge_pte_at() to take vma
  mm: hugetlb: Convert set_huge_pte_at() to take vma
  arm64: hugetlb: Convert set_huge_pte_at() to take vma
  arm64: hugetlb: Fix set_huge_pte_at() to work with all swap entries

 arch/arm64/include/asm/hugetlb.h              |  2 +-
 arch/arm64/mm/hugetlbpage.c                   | 22 ++++----------
 arch/parisc/include/asm/hugetlb.h             |  2 +-
 arch/parisc/mm/hugetlbpage.c                  |  4 +--
 .../include/asm/nohash/32/hugetlb-8xx.h       |  3 +-
 arch/powerpc/mm/book3s64/hugetlbpage.c        |  2 +-
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c  |  2 +-
 arch/powerpc/mm/nohash/8xx.c                  |  2 +-
 arch/powerpc/mm/pgtable.c                     |  7 ++++-
 arch/riscv/include/asm/hugetlb.h              |  2 +-
 arch/riscv/mm/hugetlbpage.c                   |  3 +-
 arch/s390/include/asm/hugetlb.h               |  8 +++--
 arch/s390/mm/hugetlbpage.c                    |  8 ++++-
 arch/sparc/include/asm/hugetlb.h              |  8 +++--
 arch/sparc/mm/hugetlbpage.c                   |  8 ++++-
 include/asm-generic/hugetlb.h                 |  6 ++--
 include/linux/hugetlb.h                       |  6 ++--
 mm/damon/vaddr.c                              |  2 +-
 mm/hugetlb.c                                  | 30 +++++++++----------
 mm/migrate.c                                  |  2 +-
 mm/rmap.c                                     | 10 +++----
 mm/vmalloc.c                                  |  5 +++-
 22 files changed, 80 insertions(+), 64 deletions(-)

--
2.25.1
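The failure mode and the fix described in the cover letter can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: every type and function name below (model_entry, model_vma, size_from_entry(), size_from_vma(), nr_ptes()) is hypothetical, and the sizes are illustrative. It shows why inferring the huge page size from a PFN embedded in the entry cannot work for marker-style swap entries, while taking the size from the vma works for every entry type.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT kernel types. */
enum entry_kind { ENTRY_MIGRATION, ENTRY_HWPOISON, ENTRY_MARKER };

struct model_entry {
	enum entry_kind kind;
	size_t pfn;		/* only meaningful for migration/hwpoison */
};

struct model_folio { size_t size; };
struct model_vma { size_t huge_page_size; };

#define NR_FOLIOS 4

/* Pretend folio table indexed by PFN (illustrative sizes in bytes). */
static const struct model_folio folios[NR_FOLIOS] = {
	{ 4096 }, { 65536 }, { 65536 }, { 2097152 },
};

/*
 * Old approach: infer the size from the folio behind the entry's PFN.
 * Returns 0 where the real code would hit VM_BUG_ON() or dereference
 * a bad pointer, because a marker entry embeds no PFN.
 */
static size_t size_from_entry(struct model_entry e)
{
	if (e.kind != ENTRY_MIGRATION && e.kind != ENTRY_HWPOISON)
		return 0;
	if (e.pfn >= NR_FOLIOS)
		return 0;
	return folios[e.pfn].size;
}

/*
 * New approach: the caller passes the vma, so the size is known
 * without inspecting the entry at all.
 */
static size_t size_from_vma(const struct model_vma *vma, struct model_entry e)
{
	(void)e;	/* the entry type no longer matters */
	return vma->huge_page_size;
}

/* Number of base-size ptes a contiguous mapping of huge_size needs. */
static size_t nr_ptes(size_t huge_size, size_t base_size)
{
	return huge_size / base_size;
}
```

In this model, a marker entry yields a size of 0 from size_from_entry(), whereas size_from_vma() returns the vma's huge page size for the same entry, so the number of ptes to write can always be computed.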
Comments
On Thu, 21 Sep 2023 17:19:59 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:

> Hi All,
>
> This series fixes a bug in arm64's implementation of set_huge_pte_at(), which
> can result in an unprivileged user causing a kernel panic. The problem was
> triggered when running the new uffd poison mm selftest for HUGETLB memory. This
> test (and the uffd poison feature) was merged for v6.6-rc1. However, upon
> inspection there are multiple other pre-existing paths that can trigger this
> bug.
>
> Ideally, I'd like to get this fix in for v6.6 if possible? And I guess it should
> be backported too, given there are call sites where this can theoretically
> happen that pre-date v6.6-rc1 (I've cc'ed stable@vger.kernel.org).

This gets you a naggygram from Greg. The way to request a backport is
to add cc:stable to all the changelogs. I'll make that change to my copy.

> Ryan Roberts (8):
>   parisc: hugetlb: Convert set_huge_pte_at() to take vma
>   powerpc: hugetlb: Convert set_huge_pte_at() to take vma
>   riscv: hugetlb: Convert set_huge_pte_at() to take vma
>   s390: hugetlb: Convert set_huge_pte_at() to take vma
>   sparc: hugetlb: Convert set_huge_pte_at() to take vma
>   mm: hugetlb: Convert set_huge_pte_at() to take vma
>   arm64: hugetlb: Convert set_huge_pte_at() to take vma
>   arm64: hugetlb: Fix set_huge_pte_at() to work with all swap entries
>
>  arch/arm64/include/asm/hugetlb.h              |  2 +-
>  arch/arm64/mm/hugetlbpage.c                   | 22 ++++----------
>  arch/parisc/include/asm/hugetlb.h             |  2 +-
>  arch/parisc/mm/hugetlbpage.c                  |  4 +--
>  .../include/asm/nohash/32/hugetlb-8xx.h       |  3 +-
>  arch/powerpc/mm/book3s64/hugetlbpage.c        |  2 +-
>  arch/powerpc/mm/book3s64/radix_hugetlbpage.c  |  2 +-
>  arch/powerpc/mm/nohash/8xx.c                  |  2 +-
>  arch/powerpc/mm/pgtable.c                     |  7 ++++-
>  arch/riscv/include/asm/hugetlb.h              |  2 +-
>  arch/riscv/mm/hugetlbpage.c                   |  3 +-
>  arch/s390/include/asm/hugetlb.h               |  8 +++--
>  arch/s390/mm/hugetlbpage.c                    |  8 ++++-
>  arch/sparc/include/asm/hugetlb.h              |  8 +++--
>  arch/sparc/mm/hugetlbpage.c                   |  8 ++++-
>  include/asm-generic/hugetlb.h                 |  6 ++--
>  include/linux/hugetlb.h                       |  6 ++--
>  mm/damon/vaddr.c                              |  2 +-
>  mm/hugetlb.c                                  | 30 +++++++++----------
>  mm/migrate.c                                  |  2 +-
>  mm/rmap.c                                     | 10 +++----
>  mm/vmalloc.c                                  |  5 +++-
>  22 files changed, 80 insertions(+), 64 deletions(-)

Looks scary but it's actually a fairly modest patchset. It could
easily be all rolled into a single patch for ease of backporting.
Maybe Greg has an opinion?
On 21/09/2023 18:38, Catalin Marinas wrote:
> On Thu, Sep 21, 2023 at 05:35:54PM +0100, Ryan Roberts wrote:
>> On 21/09/2023 17:30, Andrew Morton wrote:
>>> On Thu, 21 Sep 2023 17:19:59 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>> Ryan Roberts (8):
>>>>   parisc: hugetlb: Convert set_huge_pte_at() to take vma
>>>>   powerpc: hugetlb: Convert set_huge_pte_at() to take vma
>>>>   riscv: hugetlb: Convert set_huge_pte_at() to take vma
>>>>   s390: hugetlb: Convert set_huge_pte_at() to take vma
>>>>   sparc: hugetlb: Convert set_huge_pte_at() to take vma
>>>>   mm: hugetlb: Convert set_huge_pte_at() to take vma
>>>>   arm64: hugetlb: Convert set_huge_pte_at() to take vma
>>>>   arm64: hugetlb: Fix set_huge_pte_at() to work with all swap entries
>>>>
>>>>  arch/arm64/include/asm/hugetlb.h              |  2 +-
>>>>  arch/arm64/mm/hugetlbpage.c                   | 22 ++++----------
>>>>  arch/parisc/include/asm/hugetlb.h             |  2 +-
>>>>  arch/parisc/mm/hugetlbpage.c                  |  4 +--
>>>>  .../include/asm/nohash/32/hugetlb-8xx.h       |  3 +-
>>>>  arch/powerpc/mm/book3s64/hugetlbpage.c        |  2 +-
>>>>  arch/powerpc/mm/book3s64/radix_hugetlbpage.c  |  2 +-
>>>>  arch/powerpc/mm/nohash/8xx.c                  |  2 +-
>>>>  arch/powerpc/mm/pgtable.c                     |  7 ++++-
>>>>  arch/riscv/include/asm/hugetlb.h              |  2 +-
>>>>  arch/riscv/mm/hugetlbpage.c                   |  3 +-
>>>>  arch/s390/include/asm/hugetlb.h               |  8 +++--
>>>>  arch/s390/mm/hugetlbpage.c                    |  8 ++++-
>>>>  arch/sparc/include/asm/hugetlb.h              |  8 +++--
>>>>  arch/sparc/mm/hugetlbpage.c                   |  8 ++++-
>>>>  include/asm-generic/hugetlb.h                 |  6 ++--
>>>>  include/linux/hugetlb.h                       |  6 ++--
>>>>  mm/damon/vaddr.c                              |  2 +-
>>>>  mm/hugetlb.c                                  | 30 +++++++++----------
>>>>  mm/migrate.c                                  |  2 +-
>>>>  mm/rmap.c                                     | 10 +++----
>>>>  mm/vmalloc.c                                  |  5 +++-
>>>>  22 files changed, 80 insertions(+), 64 deletions(-)
>>>
>>> Looks scary but it's actually a fairly modest patchset. It could
>>> easily be all rolled into a single patch for ease of backporting.
>>> Maybe Greg has an opinion?
>>
>> Yes, I thought about doing that; or perhaps 2 patches - one for the interface
>> change across all arches and core code, and one for the actual bug fix?
>
> I think this would make more sense, especially if we want to backport
> it. The first patch would have no functional change, only an interface
> change, followed by the arm64 fix.

OK I'll do it like this for v2.
On Thu, Sep 21, 2023 at 05:35:54PM +0100, Ryan Roberts wrote:
> On 21/09/2023 17:30, Andrew Morton wrote:
> > On Thu, 21 Sep 2023 17:19:59 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
> >
> >> Hi All,
> >>
> >> This series fixes a bug in arm64's implementation of set_huge_pte_at(), which
> >> can result in an unprivileged user causing a kernel panic. The problem was
> >> triggered when running the new uffd poison mm selftest for HUGETLB memory. This
> >> test (and the uffd poison feature) was merged for v6.6-rc1. However, upon
> >> inspection there are multiple other pre-existing paths that can trigger this
> >> bug.
> >>
> >> Ideally, I'd like to get this fix in for v6.6 if possible? And I guess it should
> >> be backported too, given there are call sites where this can theoretically
> >> happen that pre-date v6.6-rc1 (I've cc'ed stable@vger.kernel.org).
> >
> > This gets you a naggygram from Greg. The way to request a backport is
> > to add cc:stable to all the changelogs. I'll make that change to my copy.
>
> Ahh, sorry about that... I just got the same moan from the kernel test robot too.
>
> >
> >
> >> Ryan Roberts (8):
> >>   parisc: hugetlb: Convert set_huge_pte_at() to take vma
> >>   powerpc: hugetlb: Convert set_huge_pte_at() to take vma
> >>   riscv: hugetlb: Convert set_huge_pte_at() to take vma
> >>   s390: hugetlb: Convert set_huge_pte_at() to take vma
> >>   sparc: hugetlb: Convert set_huge_pte_at() to take vma
> >>   mm: hugetlb: Convert set_huge_pte_at() to take vma
> >>   arm64: hugetlb: Convert set_huge_pte_at() to take vma
> >>   arm64: hugetlb: Fix set_huge_pte_at() to work with all swap entries
> >>
> >>  arch/arm64/include/asm/hugetlb.h              |  2 +-
> >>  arch/arm64/mm/hugetlbpage.c                   | 22 ++++----------
> >>  arch/parisc/include/asm/hugetlb.h             |  2 +-
> >>  arch/parisc/mm/hugetlbpage.c                  |  4 +--
> >>  .../include/asm/nohash/32/hugetlb-8xx.h       |  3 +-
> >>  arch/powerpc/mm/book3s64/hugetlbpage.c        |  2 +-
> >>  arch/powerpc/mm/book3s64/radix_hugetlbpage.c  |  2 +-
> >>  arch/powerpc/mm/nohash/8xx.c                  |  2 +-
> >>  arch/powerpc/mm/pgtable.c                     |  7 ++++-
> >>  arch/riscv/include/asm/hugetlb.h              |  2 +-
> >>  arch/riscv/mm/hugetlbpage.c                   |  3 +-
> >>  arch/s390/include/asm/hugetlb.h               |  8 +++--
> >>  arch/s390/mm/hugetlbpage.c                    |  8 ++++-
> >>  arch/sparc/include/asm/hugetlb.h              |  8 +++--
> >>  arch/sparc/mm/hugetlbpage.c                   |  8 ++++-
> >>  include/asm-generic/hugetlb.h                 |  6 ++--
> >>  include/linux/hugetlb.h                       |  6 ++--
> >>  mm/damon/vaddr.c                              |  2 +-
> >>  mm/hugetlb.c                                  | 30 +++++++++----------
> >>  mm/migrate.c                                  |  2 +-
> >>  mm/rmap.c                                     | 10 +++----
> >>  mm/vmalloc.c                                  |  5 +++-
> >>  22 files changed, 80 insertions(+), 64 deletions(-)
> >
> > Looks scary but it's actually a fairly modest patchset. It could
> > easily be all rolled into a single patch for ease of backporting.
> > Maybe Greg has an opinion?
>
> Yes, I thought about doing that; or perhaps 2 patches - one for the interface
> change across all arches and core code, and one for the actual bug fix?

I have no issues with taking a patch series, or one big patch, into stable
trees; they just have to match up with what is in Linus's tree. So if it
makes more sense to have this as a series (like you did here), wonderful,
make it a patch series. Do not go out of your way to do things differently
just for stable kernels; that is not necessary or needed at all.

thanks,

greg k-h