From patchwork Fri Sep 22 11:58:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 14370
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, "James E.J. Bottomley", Helge Deller,
    Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
    Christian Borntraeger, Sven Schnelle, Gerald Schaefer,
    "David S. Miller", Arnd Bergmann, Mike Kravetz, Muchun Song,
    SeongJae Park, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
    Lorenzo Stoakes, Anshuman Khandual, Peter Xu, Axel Rasmussen, Qi Zheng
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
    linux-s390@vger.kernel.org, sparclinux@vger.kernel.org,
    linux-mm@kvack.org, stable@vger.kernel.org
Subject: [PATCH v2 0/2] Fix set_huge_pte_at() panic on arm64
Date: Fri, 22 Sep 2023 12:58:02 +0100
Message-Id: <20230922115804.2043771-1-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1

Hi All,

This series fixes a bug in arm64's implementation of set_huge_pte_at(),
which can result in an unprivileged user causing a kernel panic.
The problem was triggered when running the new uffd poison mm selftest for
HUGETLB memory. This test (and the uffd poison feature) was merged for
v6.5-rc7. Ideally, I'd like to get this fix in for v6.6, and I've cc'ed
stable (correctly this time) to get it backported to v6.5, where the issue
first showed up.

Description of Bug
------------------

arm64's huge pte implementation supports multiple huge page sizes, some of
which are implemented in the page table with multiple contiguous entries.
So set_huge_pte_at() needs to work out how big the logical pte is, so that
it can also work out how many physical ptes (or pmds) need to be written.
It previously did this by grabbing the folio out of the pte and querying
its size.

However, there are cases when the pte being set is actually a swap entry.
This also used to work fine, because for huge ptes we only ever saw
migration entries and hwpoison entries, and both of those swap entry types
have a PFN embedded, so the code would grab that and everything still
worked out.

But over time, more calls to set_huge_pte_at() have been added that set
swap entry types that do not embed a PFN, and this causes the code to go
bang. The triggering case is the uffd poison test, commit 99aa77215ad0
("selftests/mm: add uffd unit test for UFFDIO_POISON"), which causes a
PTE_MARKER_POISONED swap entry to be set, courtesy of commit 8a13897fb0da
("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") - added in
v6.5-rc7. Although review shows that there are other call sites that set
PTE_MARKER_UFFD_WP (which also has no PFN), these don't trigger on arm64
because arm64 doesn't support UFFD WP.

If CONFIG_DEBUG_VM is enabled, we do at least get a BUG(), but otherwise
it will dereference a bad pointer in page_folio():

static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
{
        VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry));

        return page_folio(pfn_to_page(swp_offset_pfn(entry)));
}

Fix
---

The simplest fix would have been to revert the dodgy cleanup commit
18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()"), but since
things have moved on, this would have required an audit of all the new
set_huge_pte_at() call sites to see if they should be converted to
set_huge_swap_pte_at(). As per the original intent of the change, it would
also leave us open to future bugs when people invariably get it wrong and
call the wrong helper.

So instead, I've added a huge page size parameter to set_huge_pte_at().
This means that the arm64 code has the size in all cases. It's a bigger
change, due to needing to touch the arches that implement the function,
but it is entirely mechanical, so in my view, low risk.

I've compile-tested all touched arches: arm64, parisc, powerpc, riscv,
s390, sparc (and additionally x86_64). I've also booted and run mm
selftests against arm64, where I observe that the uffd poison test is
fixed and there are no other regressions.

Patches
-------

patch 1: Convert core mm and arches to pass extra param (no behavioral change)
patch 2: Fix the arm64 bug

Patches based on v6.6-rc2.
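For reference, here is a rough before/after sketch of the arm64 swap-entry
path that these patches touch. This is illustrative only (paraphrased, not
a verbatim extract from the patches), so exact names and details may
differ:

/* Before: the huge page size was inferred from the pte, which falls over
 * for swap entries that carry no PFN (e.g. PTE_MARKER_POISONED). */
void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                     pte_t *ptep, pte_t pte)
{
        ...
        if (!pte_present(pte)) {
                struct folio *folio;

                folio = hugetlb_swap_entry_to_folio(pte_to_swp_entry(pte));
                ncontig = num_contig_ptes(folio_size(folio), &pgsize);

                for (i = 0; i < ncontig; i++, ptep++)
                        set_pte_at(mm, addr, ptep, pte);
                return;
        }
        ...
}

/* After: callers pass the huge page size explicitly, so the pte never
 * needs to be decoded just to size the mapping. */
void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                     pte_t *ptep, pte_t pte, unsigned long sz)
{
        ...
        ncontig = num_contig_ptes(sz, &pgsize);

        if (!pte_present(pte)) {
                for (i = 0; i < ncontig; i++, ptep++)
                        set_pte_at(mm, addr, ptep, pte);
                return;
        }
        ...
}

Passing the size (rather than a vma) also covers the kernel mapping case,
where there is no vma; that is why v2 switched to a size parameter.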
Changes since v1 [1]
--------------------

- Pass extra size param instead of converting mm to vma
  - Passing vma was problematic for the kernel mapping case, which has no vma
- Squash all interface changes into a single patch
- Simplify powerpc so that it doesn't require __set_huge_page_at()
- Added Reviewed-bys

[1] https://lore.kernel.org/linux-arm-kernel/20230921162007.1630149-1-ryan.roberts@arm.com/

Thanks,
Ryan

Ryan Roberts (2):
  mm: hugetlb: Add huge page size param to set_huge_pte_at()
  arm64: hugetlb: Fix set_huge_pte_at() to work with all swap entries

 arch/arm64/include/asm/hugetlb.h              |  2 +-
 arch/arm64/mm/hugetlbpage.c                   | 23 +++-------
 arch/parisc/include/asm/hugetlb.h             |  2 +-
 arch/parisc/mm/hugetlbpage.c                  |  2 +-
 .../include/asm/nohash/32/hugetlb-8xx.h       |  3 +-
 arch/powerpc/mm/book3s64/hugetlbpage.c        |  5 ++-
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c  |  3 +-
 arch/powerpc/mm/nohash/8xx.c                  |  3 +-
 arch/powerpc/mm/pgtable.c                     |  3 +-
 arch/riscv/include/asm/hugetlb.h              |  3 +-
 arch/riscv/mm/hugetlbpage.c                   |  3 +-
 arch/s390/include/asm/hugetlb.h               |  6 ++-
 arch/s390/mm/hugetlbpage.c                    |  8 +++-
 arch/sparc/include/asm/hugetlb.h              |  6 ++-
 arch/sparc/mm/hugetlbpage.c                   |  8 +++-
 include/asm-generic/hugetlb.h                 |  2 +-
 include/linux/hugetlb.h                       |  6 ++-
 mm/damon/vaddr.c                              |  3 +-
 mm/hugetlb.c                                  | 43 +++++++++++--------
 mm/migrate.c                                  |  7 ++-
 mm/rmap.c                                     | 23 +++++++---
 mm/vmalloc.c                                  |  2 +-
 22 files changed, 103 insertions(+), 63 deletions(-)

--
2.25.1