[v7,27/41] x86/mm: Warn if create Write=0,Dirty=1 with raw prot

Message ID 20230227222957.24501-28-rick.p.edgecombe@intel.com
State New
Series Shadow stacks for userspace

Commit Message

Edgecombe, Rick P Feb. 27, 2023, 10:29 p.m. UTC
When user shadow stack is in use, Write=0,Dirty=1 is treated by the CPU
as shadow stack memory. So for shadow stack memory this bit combination
is valid, but when Dirty=1,Write=1 (conventionally writable) memory is
being write protected, the kernel has been taught to transition the
Dirty=1 bit to SavedDirty=1, to avoid inadvertently creating shadow
stack memory. It does this inside pte_wrprotect() because it knows the
PTE is not intended to be a writable shadow stack entry; it is supposed
to be write protected.
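
For reference, a minimal sketch of that pte_wrprotect() behavior,
assuming the SavedDirty helpers introduced earlier in this series (the
pte_mksaveddirty() name comes from those patches, so see them for the
authoritative form):

static inline pte_t pte_wrprotect(pte_t pte)
{
	pte = pte_clear_flags(pte, _PAGE_RW);

	/*
	 * Blindly clearing _PAGE_RW might accidentally create a shadow
	 * stack PTE (Write=0,Dirty=1). Move the hardware Dirty bit to
	 * the software SavedDirty bit if the PTE was dirty.
	 */
	if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK) && pte_dirty(pte))
		pte = pte_mksaveddirty(pte);

	return pte;
}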

However, when a PTE is created from a raw prot using mk_pte(), mk_pte()
can't know whether to adjust Dirty=1 to SavedDirty=1. It can't
distinguish between a caller intending to create a shadow stack PTE
and one needing the SavedDirty shift.

The kernel has been updated to not do this, and so Write=0,Dirty=1
memory should only be created by the pte_mkfoo() helpers. Add a warning
to make sure no new callers of mk_pte() start doing this.
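
As an illustration of the condition the warning checks for (the full
check is in the patch below), a hypothetical caller passing a raw prot
with Dirty=1 but Write=0 would trip it once X86_FEATURE_USER_SHSTK is
enabled; the example function and prot value below are made up:

/* Illustration only, not part of the patch. */
static pte_t example_raw_prot_pte(struct page *page)
{
	/*
	 * _PAGE_DIRTY without _PAGE_RW looks like shadow stack
	 * (Write=0,Dirty=1), so this mk_pte() call would hit the new
	 * WARN_ON_ONCE(). Shadow stack PTEs should be created via the
	 * pte_mkfoo() helpers rather than a raw prot.
	 */
	return mk_pte(page, __pgprot(_PAGE_PRESENT | _PAGE_ACCESSED |
				     _PAGE_DIRTY));
}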

Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Tested-by: John Allen <john.allen@amd.com>
Tested-by: Kees Cook <keescook@chromium.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---
v6:
 - New patch (Note, this has already been a useful warning, it caught the
   newly added set_memory_rox() doing this)
---
 arch/x86/include/asm/pgtable.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
  

Comments

Kees Cook Feb. 27, 2023, 10:54 p.m. UTC | #1
On February 27, 2023 2:29:43 PM PST, Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:

Reviewed-by: Kees Cook <keescook@chromium.org>
  
Borislav Petkov March 8, 2023, 9:23 a.m. UTC | #2
On Mon, Feb 27, 2023 at 02:29:43PM -0800, Rick Edgecombe wrote:
> When user shadow stack is use, Write=0,Dirty=1 is treated by the CPU as
			   ^
			   in

> shadow stack memory. So for shadow stack memory this bit combination is
> valid, but when Dirty=1,Write=1 (conventionally writable) memory is being
> write protected, the kernel has been taught to transition the Dirty=1
> bit to SavedDirty=1, to avoid inadvertently creating shadow stack
> memory. It does this inside pte_wrprotect() because it knows the PTE is
> not intended to be a writable shadow stack entry, it is supposed to be
> write protected.


> 
> However, when a PTE is created by a raw prot using mk_pte(), mk_pte()
> can't know whether to adjust Dirty=1 to SavedDirty=1. It can't
> distinguish between the caller intending to create a shadow stack PTE or
> needing the SavedDirty shift.
> 
> The kernel has been updated to not do this, and so Write=0,Dirty=1
> memory should only be created by the pte_mkfoo() helpers. Add a warning
> to make sure no new mk_pte() start doing this.

Might wanna add the note from below here:

"... start doing this, like, for example, set_memory_rox() did."

> Tested-by: Pengfei Xu <pengfei.xu@intel.com>
> Tested-by: John Allen <john.allen@amd.com>
> Tested-by: Kees Cook <keescook@chromium.org>
> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> 
> ---
> v6:
>  - New patch (Note, this has already been a useful warning, it caught the
>    newly added set_memory_rox() doing this)

Thx.
  
Edgecombe, Rick P March 8, 2023, 11:35 p.m. UTC | #3
On Wed, 2023-03-08 at 10:23 +0100, Borislav Petkov wrote:
> On Mon, Feb 27, 2023 at 02:29:43PM -0800, Rick Edgecombe wrote:
> > When user shadow stack is use, Write=0,Dirty=1 is treated by the
> > CPU as
> 
>                            ^
>                            in

Oops, yes.
> 
> Might wanna add the note from below here:
> 
> "... start doing this, like, for example, set_memory_rox() did."

Fine by me.

Thanks.
  

Patch

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index e5b3dce0d9fe..7142f99d3fbb 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1032,7 +1032,15 @@  static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  * (Currently stuck as a macro because of indirect forward reference
  * to linux/mm.h:page_to_nid())
  */
-#define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
+#define mk_pte(page, pgprot)						 \
+({									 \
+	pgprot_t __pgprot = pgprot;					 \
+									 \
+	WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&	 \
+		    (pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
+		    _PAGE_DIRTY);					 \
+	pfn_pte(page_to_pfn(page), __pgprot);				 \
+})
 
 static inline int pmd_bad(pmd_t pmd)
 {