Message ID | 20230218211433.26859-28-rick.p.edgecombe@intel.com |
---|---|
State | New |
Headers |
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
To: x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>, Andy Lutomirski <luto@kernel.org>, Balbir Singh <bsingharora@gmail.com>, Borislav Petkov <bp@alien8.de>, Cyrill Gorcunov <gorcunov@gmail.com>, Dave Hansen <dave.hansen@linux.intel.com>, Eugene Syromiatnikov <esyr@redhat.com>, Florian Weimer <fweimer@redhat.com>, "H. J. Lu" <hjl.tools@gmail.com>, Jann Horn <jannh@google.com>, Jonathan Corbet <corbet@lwn.net>, Kees Cook <keescook@chromium.org>, Mike Kravetz <mike.kravetz@oracle.com>, Nadav Amit <nadav.amit@gmail.com>, Oleg Nesterov <oleg@redhat.com>, Pavel Machek <pavel@ucw.cz>, Peter Zijlstra <peterz@infradead.org>, Randy Dunlap <rdunlap@infradead.org>, Weijiang Yang <weijiang.yang@intel.com>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>, John Allen <john.allen@amd.com>, kcc@google.com, eranian@google.com, rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com, akpm@linux-foundation.org, Andrew.Cooper3@citrix.com, christina.schimpe@intel.com, david@redhat.com, debug@rivosinc.com
Cc: rick.p.edgecombe@intel.com
Subject: [PATCH v6 27/41] x86/mm: Warn if create Write=0,Dirty=1 with raw prot
Date: Sat, 18 Feb 2023 13:14:19 -0800
Message-Id: <20230218211433.26859-28-rick.p.edgecombe@intel.com>
In-Reply-To: <20230218211433.26859-1-rick.p.edgecombe@intel.com>
References: <20230218211433.26859-1-rick.p.edgecombe@intel.com> |
Series |
Shadow stacks for userspace
Commit Message
Edgecombe, Rick P
Feb. 18, 2023, 9:14 p.m. UTC
When user shadow stack is in use, Write=0,Dirty=1 is treated by the CPU as shadow stack memory. So for shadow stack memory this bit combination is valid, but when Dirty=1,Write=1 (conventionally writable) memory is being write-protected, the kernel has been taught to transition the Dirty=1 bit to SavedDirty=1, to avoid inadvertently creating shadow stack memory. It does this inside pte_wrprotect() because it knows the PTE is not intended to be a writable shadow stack entry; it is supposed to be write-protected.

However, when a PTE is created from a raw prot using mk_pte(), mk_pte() can't know whether to adjust Dirty=1 to SavedDirty=1. It can't distinguish between a caller intending to create a shadow stack PTE and one needing the SavedDirty shift.

The kernel has been updated to not do this, and so Write=0,Dirty=1 memory should only be created by the pte_mkfoo() helpers. Add a warning to make sure no new mk_pte() callers start doing this.

Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---
v6:
- New patch (Note, this has already been a useful warning: it caught the newly added set_memory_rox() doing this)
---
 arch/x86/include/asm/pgtable.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
Comments
On Sat, Feb 18, 2023 at 01:14:19PM -0800, Rick Edgecombe wrote:
> When user shadow stack is use, Write=0,Dirty=1 is treated by the CPU as
> shadow stack memory. So for shadow stack memory this bit combination is
> valid, but when Dirty=1,Write=1 (conventionally writable) memory is being
> write protected, the kernel has been taught to transition the Dirty=1
> bit to SavedDirty=1, to avoid inadvertently creating shadow stack
> memory. It does this inside pte_wrprotect() because it knows the PTE is
> not intended to be a writable shadow stack entry, it is supposed to be
> write protected.
>
> However, when a PTE is created by a raw prot using mk_pte(), mk_pte()
> can't know whether to adjust Dirty=1 to SavedDirty=1. It can't
> distinguish between the caller intending to create a shadow stack PTE or
> needing the SavedDirty shift.
>
> The kernel has been updated to not do this, and so Write=0,Dirty=1
> memory should only be created by the pte_mkfoo() helpers. Add a warning
> to make sure no new mk_pte() start doing this.
>
> Tested-by: Pengfei Xu <pengfei.xu@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
>
> ---
> v6:
> - New patch (Note, this has already been a useful warning, it caught the
>   newly added set_memory_rox() doing this)
> ---
>  arch/x86/include/asm/pgtable.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index f3dc16fc4389..db8fe5511c74 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -1032,7 +1032,15 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
>   * (Currently stuck as a macro because of indirect forward reference
>   * to linux/mm.h:page_to_nid())
>   */
> -#define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
> +#define mk_pte(page, pgprot)						  \
> +({									  \
> +	pgprot_t __pgprot = pgprot;					  \
> +									  \
> +	WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&	  \
> +		     (pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
> +		     _PAGE_DIRTY);					  \
> +	pfn_pte(page_to_pfn(page), __pgprot);				  \
> +})

This only warns? Should it also enforce the state?
On Sun, 2023-02-19 at 12:45 -0800, Kees Cook wrote:
> > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> > index f3dc16fc4389..db8fe5511c74 100644
> > --- a/arch/x86/include/asm/pgtable.h
> > +++ b/arch/x86/include/asm/pgtable.h
> > @@ -1032,7 +1032,15 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
> >   * (Currently stuck as a macro because of indirect forward reference
> >   * to linux/mm.h:page_to_nid())
> >   */
> > -#define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
> > +#define mk_pte(page, pgprot)						  \
> > +({									  \
> > +	pgprot_t __pgprot = pgprot;					  \
> > +									  \
> > +	WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&	  \
> > +		     (pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
> > +		     _PAGE_DIRTY);					  \
> > +	pfn_pte(page_to_pfn(page), __pgprot);				  \
> > +})
>
> This only warns? Should it also enforce the state?

Hmm, you mean something like forcing Dirty=0 if Write=0? The thing we are worried about here is some new x86 code that creates Write=0,Dirty=1 PTEs directly because the developer is unaware of or forgot about shadow stack. The issue the warning actually caught was kernel memory being marked Write=0,Dirty=1, which today is more about consistency than any functional issue. But if some future hypothetical code was creating a userspace PTE like this, and depending on the memory being read-only, then the enforcement would be useful and potentially save the day.

The downside is that it adds tricky logic into a low-level helper that shouldn't be required unless strange and wrong new code is added in the future. And then it is still only useful if the warning doesn't catch the issue in testing. And then there would be some slight risk that the Dirty bit was expected to be there in some PTE without shadow stack exposure, and a functional bug would be introduced.

I'm waffling here. I could be convinced either way. Hopefully that helps characterize the dilemma at least.
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f3dc16fc4389..db8fe5511c74 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1032,7 +1032,15 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  * (Currently stuck as a macro because of indirect forward reference
  * to linux/mm.h:page_to_nid())
  */
-#define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
+#define mk_pte(page, pgprot)						  \
+({									  \
+	pgprot_t __pgprot = pgprot;					  \
+									  \
+	WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&	  \
+		     (pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
+		     _PAGE_DIRTY);					  \
+	pfn_pte(page_to_pfn(page), __pgprot);				  \
+})
 
 static inline int pmd_bad(pmd_t pmd)
 {