Message ID | 20221117075602.2904324-3-liushixin2@huawei.com |
---|---|
State | New |
Headers |
From: Liu Shixin <liushixin2@huawei.com>
To: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Denys Vlasenko <dvlasenk@redhat.com>, Kefeng Wang <wangkefeng.wang@huawei.com>, Anshuman Khandual <anshuman.khandual@arm.com>, David Hildenbrand <dhildenb@redhat.com>, Rafael Aquini <raquini@redhat.com>, Pasha Tatashin <pasha.tatashin@soleen.com>
CC: <linux-arm-kernel@lists.infradead.org>, <linux-kernel@vger.kernel.org>, Liu Shixin <liushixin2@huawei.com>
Subject: [PATCH v2 2/2] arm64/mm: fix incorrect file_map_count for invalid pmd/pud
Date: Thu, 17 Nov 2022 15:56:02 +0800
Message-ID: <20221117075602.2904324-3-liushixin2@huawei.com>
In-Reply-To: <20221117075602.2904324-1-liushixin2@huawei.com>
References: <20221117075602.2904324-1-liushixin2@huawei.com> |
Series |
arm64: fix two bug about page table check
|
|
Commit Message
Liu Shixin
Nov. 17, 2022, 7:56 a.m. UTC
The page table check triggers BUG_ON() unexpectedly when splitting a huge page:

 ------------[ cut here ]------------
 kernel BUG at mm/page_table_check.c:119!
 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in:
 CPU: 7 PID: 210 Comm: transhuge-stres Not tainted 6.1.0-rc3+ #748
 Hardware name: linux,dummy-virt (DT)
 pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : page_table_check_set.isra.0+0x398/0x468
 lr : page_table_check_set.isra.0+0x1c0/0x468
 [...]
 Call trace:
  page_table_check_set.isra.0+0x398/0x468
  __page_table_check_pte_set+0x160/0x1c0
  __split_huge_pmd_locked+0x900/0x1648
  __split_huge_pmd+0x28c/0x3b8
  unmap_page_range+0x428/0x858
  unmap_single_vma+0xf4/0x1c8
  zap_page_range+0x2b0/0x410
  madvise_vma_behavior+0xc44/0xe78
  do_madvise+0x280/0x698
  __arm64_sys_madvise+0x90/0xe8
  invoke_syscall.constprop.0+0xdc/0x1d8
  do_el0_svc+0xf4/0x3f8
  el0_svc+0x58/0x120
  el0t_64_sync_handler+0xb8/0xc0
  el0t_64_sync+0x19c/0x1a0
 [...]

On arm64, pmd_leaf() returns true even if the pmd is invalid, because of
the pmd_present_invalid() check. So in pmdp_invalidate() the file_map_count
is not only decremented once but also incremented once. Then in set_pte_at()
the file_map_count is incremented again, which triggers the BUG_ON()
unexpectedly.

Fix this problem by adding pmd_valid() to pmd_user_accessible_page().
Also add pud_valid() to pud_user_accessible_page().

Fixes: 42b2547137f5 ("arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK")
Reported-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 arch/arm64/include/asm/pgtable.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
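The accounting problem described in the commit message can be replayed with a small toy model. This is only a sketch: `struct toy_pmd`, the two predicates, and `map_count_after_split()` are hypothetical names invented for illustration, the counter is a plain `int` rather than the checker's real per-page state, and the sequence of steps is taken from the commit message rather than from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a huge-page PMD entry; not the kernel's pmd_t. */
struct toy_pmd {
	bool valid;           /* hardware valid bit */
	bool present_invalid; /* software "present but invalidated" marker */
	bool user;            /* user-accessible */
};

/* Pre-patch shape: "leaf" holds for any present entry, valid or not. */
static bool accessible_before(struct toy_pmd pmd)
{
	bool leaf = pmd.valid || pmd.present_invalid;

	return leaf && pmd.user;
}

/* Patched shape: additionally require the valid bit. */
static bool accessible_after(struct toy_pmd pmd)
{
	bool leaf = pmd.valid || pmd.present_invalid;

	return pmd.valid && leaf && pmd.user;
}

typedef bool (*accessible_fn)(struct toy_pmd);

/*
 * Replay the split sequence from the commit message and return the
 * final map count for one page. A consistent sequence ends at 1,
 * because exactly one mapping (the new pte) is live afterwards.
 */
static int map_count_after_split(accessible_fn accessible)
{
	struct toy_pmd mapped = { .valid = true, .user = true };
	struct toy_pmd invalidated = { .present_invalid = true, .user = true };
	int count = 0;

	/* The huge mapping is installed. */
	if (accessible(mapped))
		count++;

	/* pmdp_invalidate(): the old entry is cleared, the invalidated one is set. */
	if (accessible(mapped))
		count--;
	if (accessible(invalidated))
		count++;

	/* set_pte_at() for the split mapping accounts the page again. */
	count++;

	return count;
}

/* Usage: call from main() to confirm both outcomes. */
static void check_split_accounting(void)
{
	assert(map_count_after_split(accessible_before) == 2);
	assert(map_count_after_split(accessible_after) == 1);
}
```

With the pre-patch predicate the model ends one above the number of live mappings, which is the kind of inconsistency the checker is designed to flag. With the valid-bit test added, the invalidated entry is no longer counted.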
Comments
On 17.11.22 08:56, Liu Shixin wrote:
> [...]

Acked-by: David Hildenbrand <david@redhat.com>
On 2022/11/17 15:56, Liu Shixin wrote:
> [...]

Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
On Thu, Nov 17, 2022 at 03:56:02PM +0800, Liu Shixin wrote:
> [...]
>  static inline bool pmd_user_accessible_page(pmd_t pmd)
>  {
> -	return pmd_leaf(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd));
> +	return pmd_valid(pmd) && pmd_leaf(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd));

Hmm, doesn't this have a funny interaction with PROT_NONE where the pmd is
invalid but present? If you don't care about PROT_NONE, then you could just
do:

	pmd_valid(pmd) && !pmd_table(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd))

but if you do care then you could do:

	pmd_leaf(pmd) && !pmd_present_invalid(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd))

>  static inline bool pud_user_accessible_page(pud_t pud)
>  {
> -	return pud_leaf(pud) && pud_user(pud);
> +	return pud_valid(pud) && pud_leaf(pud) && pud_user(pud);

Not caused by this patch, but why don't we have something like a
pud_user_exec() check here like we do for the pte and pmd levels?

Will
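To see where the three candidate checks in this message part ways, the structural half of each (leaving the user-permission test aside) can be compared on a toy state. Everything here is an assumption made for illustration: `struct pmd_state` and the `st_*`/`check_*` helpers are hypothetical, and the definitions of "present" and "leaf" only encode what the thread itself states, namely that an invalidated pmd and a PROT_NONE pmd are both invalid but present.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy PMD state; the fields are illustrative, not kernel definitions. */
struct pmd_state {
	bool valid;           /* hardware valid bit set */
	bool prot_none;       /* PROT_NONE: present to software, invalid to hardware */
	bool present_invalid; /* transiently invalidated, e.g. during a split */
	bool table;           /* points to a next-level table */
};

/* Assumed: an entry is present if any of the three markers is set. */
static bool st_present(struct pmd_state s)
{
	return s.valid || s.prot_none || s.present_invalid;
}

/* Assumed: a leaf is a present entry that is not a table. */
static bool st_leaf(struct pmd_state s)
{
	return st_present(s) && !s.table;
}

/* Structural part of the v2 patch: pmd_valid() && pmd_leaf(). */
static bool check_v2(struct pmd_state s)
{
	return s.valid && st_leaf(s);
}

/* First suggestion: pmd_valid() && !pmd_table(). */
static bool check_valid_not_table(struct pmd_state s)
{
	return s.valid && !s.table;
}

/* Second suggestion: pmd_leaf() && !pmd_present_invalid(). */
static bool check_leaf_not_invalidated(struct pmd_state s)
{
	return st_leaf(s) && !s.present_invalid;
}

/* Sample states used by the comparison below. */
static const struct pmd_state ST_BLOCK = { .valid = true };
static const struct pmd_state ST_PROT_NONE = { .prot_none = true };
static const struct pmd_state ST_INVALIDATED = { .present_invalid = true };

/* Usage: call from main() to confirm how the variants differ. */
static void compare_checks(void)
{
	/* A valid block mapping passes all three variants. */
	assert(check_v2(ST_BLOCK));
	assert(check_valid_not_table(ST_BLOCK));
	assert(check_leaf_not_invalidated(ST_BLOCK));

	/* An invalidated entry is rejected by all three. */
	assert(!check_v2(ST_INVALIDATED));
	assert(!check_valid_not_table(ST_INVALIDATED));
	assert(!check_leaf_not_invalidated(ST_INVALIDATED));

	/* Only the second suggestion still treats PROT_NONE as a leaf. */
	assert(!check_v2(ST_PROT_NONE));
	assert(!check_valid_not_table(ST_PROT_NONE));
	assert(check_leaf_not_invalidated(ST_PROT_NONE));
}
```

In this model the v2 form and the first suggestion agree on every state, so the first suggestion reads as a simplification, while the second differs only in how it treats PROT_NONE entries. Whether such an entry then passes the full check still depends on the user-permission test, which the model leaves out.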
On 2022/11/18 22:34, Will Deacon wrote:
> On Thu, Nov 17, 2022 at 03:56:02PM +0800, Liu Shixin wrote:
>> [...]
>>  static inline bool pmd_user_accessible_page(pmd_t pmd)
>>  {
>> -	return pmd_leaf(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd));
>> +	return pmd_valid(pmd) && pmd_leaf(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd));
> Hmm, doesn't this have a funny interaction with PROT_NONE where the pmd is
> invalid but present? If you don't care about PROT_NONE, then you could just
> do:
>
> pmd_valid(pmd) && !pmd_table(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd))
>
> but if you do care then you could do:
>
> pmd_leaf(pmd) && !pmd_present_invalid(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd))

I prefer the latter. I will fix it and resend later.

>>  static inline bool pud_user_accessible_page(pud_t pud)
>>  {
>> -	return pud_leaf(pud) && pud_user(pud);
>> +	return pud_valid(pud) && pud_leaf(pud) && pud_user(pud);
> Not caused by this patch, but why don't we have something like a
> pud_user_exec() check here like we do for the pte and pmd levels?

As far as I know, nothing on arm64 uses a user-executable pud, so
pud_user_exec() was never defined.

Thanks,
On Mon, Nov 21, 2022 at 11:15:49AM +0800, Liu Shixin wrote:
> On 2022/11/18 22:34, Will Deacon wrote:
> > On Thu, Nov 17, 2022 at 03:56:02PM +0800, Liu Shixin wrote:
> >> [...]
> >>  static inline bool pud_user_accessible_page(pud_t pud)
> >>  {
> >> -	return pud_leaf(pud) && pud_user(pud);
> >> +	return pud_valid(pud) && pud_leaf(pud) && pud_user(pud);
> > Not caused by this patch, but why don't we have something like a
> > pud_user_exec() check here like we do for the pte and pmd levels?
> As far as I know, nothing on arm64 uses a user-executable pud, so
> pud_user_exec() was never defined.

I can believe they don't get exposed to userspace at all, but exposing only
as non-executable doesn't sound right. So I would have thought that either
pud_user_accessible_page() would always return false or it would need to
check for the executable case too.

Will
On 2022/11/22 2:16, Will Deacon wrote:
> On Mon, Nov 21, 2022 at 11:15:49AM +0800, Liu Shixin wrote:
>> On 2022/11/18 22:34, Will Deacon wrote:
>>> On Thu, Nov 17, 2022 at 03:56:02PM +0800, Liu Shixin wrote:
>>>>  static inline bool pud_user_accessible_page(pud_t pud)
>>>>  {
>>>> -	return pud_leaf(pud) && pud_user(pud);
>>>> +	return pud_valid(pud) && pud_leaf(pud) && pud_user(pud);
>>> Not caused by this patch, but why don't we have something like a
>>> pud_user_exec() check here like we do for the pte and pmd levels?
>> As far as I know, nothing on arm64 uses a user-executable pud, so
>> pud_user_exec() was never defined.
> I can believe they don't get exposed to userspace at all, but exposing only
> as non-executable doesn't sound right. So I would have thought that either
> pud_user_accessible_page() would always return false or it would need to
> check for the executable case too.

Thanks for your advice. I will add the check for the executable case too.
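The gap Will points out at the pud level can be shown with another toy model. This is a sketch only: `struct toy_pud` and both predicates are hypothetical, it assumes an execute-only user mapping is one with user execute permission but no user read/write permission, and it is not the code of any resend.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy PUD entry; the fields are illustrative, not kernel definitions. */
struct toy_pud {
	bool leaf;      /* maps a block rather than a table */
	bool user;      /* user read/write permission */
	bool user_exec; /* user execute permission */
};

/* Shape of the check as posted: execute permission is not consulted. */
static bool pud_accessible_posted(struct toy_pud pud)
{
	return pud.leaf && pud.user;
}

/* Shape of the check asked for in review, mirroring the pmd level. */
static bool pud_accessible_with_exec(struct toy_pud pud)
{
	return pud.leaf && (pud.user || pud.user_exec);
}

/* Hypothetical execute-only user mapping. */
static const struct toy_pud PUD_EXEC_ONLY = { .leaf = true, .user_exec = true };

/* Usage: call from main() to see the difference. */
static void check_pud_exec_only(void)
{
	assert(!pud_accessible_posted(PUD_EXEC_ONLY));
	assert(pud_accessible_with_exec(PUD_EXEC_ONLY));
}
```

In this model an execute-only entry is invisible to the posted check and visible to the pmd-style one, which is the asymmetry between levels raised in the review.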
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index edf6625ce965..3bc64199aa2e 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -863,12 +863,12 @@ static inline bool pte_user_accessible_page(pte_t pte)
 
 static inline bool pmd_user_accessible_page(pmd_t pmd)
 {
-	return pmd_leaf(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd));
+	return pmd_valid(pmd) && pmd_leaf(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd));
 }
 
 static inline bool pud_user_accessible_page(pud_t pud)
 {
-	return pud_leaf(pud) && pud_user(pud);
+	return pud_valid(pud) && pud_leaf(pud) && pud_user(pud);
 }
 #endif