[v6,27/41] x86/mm: Warn if create Write=0,Dirty=1 with raw prot

Message ID 20230218211433.26859-28-rick.p.edgecombe@intel.com
State New
Series Shadow stacks for userspace

Commit Message

Edgecombe, Rick P Feb. 18, 2023, 9:14 p.m. UTC
  When user shadow stack is in use, Write=0,Dirty=1 is treated by the CPU
as shadow stack memory. So for shadow stack memory this bit combination
is valid, but when Dirty=1,Write=1 (conventionally writable) memory is
being write protected, the kernel has been taught to transition the
Dirty=1 bit to SavedDirty=1, to avoid inadvertently creating shadow
stack memory. It does this inside pte_wrprotect() because it knows the
PTE is not intended to be a writable shadow stack entry; it is supposed
to be write protected.
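
As a rough illustration of that mechanism, here is a simplified sketch
(hypothetical code, not the actual helper from this series;
sketch_pte_wrprotect() is an illustrative name, and the
_PAGE_SAVED_DIRTY software bit is introduced elsewhere in the series):

/*
 * Simplified sketch of the Dirty->SavedDirty transition described
 * above. The real pte_wrprotect() in this series is structured
 * differently.
 */
static inline pte_t sketch_pte_wrprotect(pte_t pte)
{
	pte = pte_clear_flags(pte, _PAGE_RW);

	/*
	 * Clearing Write on a Dirty=1 PTE would otherwise create the
	 * shadow stack encoding (Write=0,Dirty=1), so carry the dirty
	 * information over to the software SavedDirty bit.
	 */
	if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&
	    (pte_flags(pte) & _PAGE_DIRTY)) {
		pte = pte_clear_flags(pte, _PAGE_DIRTY);
		pte = pte_set_flags(pte, _PAGE_SAVED_DIRTY);
	}

	return pte;
}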

However, when a PTE is created from a raw prot via mk_pte(), mk_pte()
can't know whether to adjust Dirty=1 to SavedDirty=1. It can't
distinguish between a caller intending to create a shadow stack PTE and
one needing the SavedDirty shift.

The kernel has been updated to not do this, and so Write=0,Dirty=1
memory should only be created by the pte_mkfoo() helpers. Add a warning
to make sure no new mk_pte() callers start doing this.
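
For illustration, this is the kind of hypothetical caller the warning
is meant to catch (not from the patch; 'page' stands for some struct
page the caller already has):

	/* Write=0,Dirty=1: reads as shadow stack memory to the CPU */
	pgprot_t prot = __pgprot(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY);
	pte_t pte = mk_pte(page, prot);	/* triggers the new WARN_ON_ONCE() */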

Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

---
v6:
 - New patch (Note: this has already been a useful warning; it caught
   the newly added set_memory_rox() doing this)
---
 arch/x86/include/asm/pgtable.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
  

Comments

Kees Cook Feb. 19, 2023, 8:45 p.m. UTC | #1
On Sat, Feb 18, 2023 at 01:14:19PM -0800, Rick Edgecombe wrote:
> When user shadow stack is in use, Write=0,Dirty=1 is treated by the CPU
> as shadow stack memory. So for shadow stack memory this bit combination
> is valid, but when Dirty=1,Write=1 (conventionally writable) memory is
> being write protected, the kernel has been taught to transition the
> Dirty=1 bit to SavedDirty=1, to avoid inadvertently creating shadow
> stack memory. It does this inside pte_wrprotect() because it knows the
> PTE is not intended to be a writable shadow stack entry; it is supposed
> to be write protected.
> 
> However, when a PTE is created from a raw prot via mk_pte(), mk_pte()
> can't know whether to adjust Dirty=1 to SavedDirty=1. It can't
> distinguish between a caller intending to create a shadow stack PTE and
> one needing the SavedDirty shift.
> 
> The kernel has been updated to not do this, and so Write=0,Dirty=1
> memory should only be created by the pte_mkfoo() helpers. Add a warning
> to make sure no new mk_pte() callers start doing this.
> 
> Tested-by: Pengfei Xu <pengfei.xu@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> 
> ---
> v6:
>  - New patch (Note: this has already been a useful warning; it caught
>    the newly added set_memory_rox() doing this)
> ---
>  arch/x86/include/asm/pgtable.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index f3dc16fc4389..db8fe5511c74 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -1032,7 +1032,15 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
>   * (Currently stuck as a macro because of indirect forward reference
>   * to linux/mm.h:page_to_nid())
>   */
> -#define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
> +#define mk_pte(page, pgprot)						 \
> +({									 \
> +	pgprot_t __pgprot = pgprot;					 \
> +									 \
> +	WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&	 \
> +		    (pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
> +		    _PAGE_DIRTY);					 \
> +	pfn_pte(page_to_pfn(page), __pgprot);				 \
> +})

This only warns? Should it also enforce the state?
  
Edgecombe, Rick P Feb. 20, 2023, 10:32 p.m. UTC | #2
On Sun, 2023-02-19 at 12:45 -0800, Kees Cook wrote:
> > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> > index f3dc16fc4389..db8fe5511c74 100644
> > --- a/arch/x86/include/asm/pgtable.h
> > +++ b/arch/x86/include/asm/pgtable.h
> > @@ -1032,7 +1032,15 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
> >   * (Currently stuck as a macro because of indirect forward reference
> >   * to linux/mm.h:page_to_nid())
> >   */
> > -#define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
> > +#define mk_pte(page, pgprot)						 \
> > +({									 \
> > +	pgprot_t __pgprot = pgprot;					 \
> > +									 \
> > +	WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&	 \
> > +		    (pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
> > +		    _PAGE_DIRTY);					 \
> > +	pfn_pte(page_to_pfn(page), __pgprot);				 \
> > +})
> 
> This only warns? Should it also enforce the state?

Hmm, you mean something like forcing Dirty=0 if Write=0? 

The thing we are worried about here is some new x86 code that creates
Write=0,Dirty=1 PTEs directly because the developer is unaware or
forgot about shadow stack. The issue the warning actually caught was
kernel memory being marked Write=0,Dirty=1, which today is more about
consistency than any functional issue. But if some future hypothetical
code was creating a userspace PTE like this, and depending on the
memory being read-only, then the enforcement would be useful and
potentially save the day.

The downside is that it adds tricky logic into a low level helper that
shouldn't be required unless strange and wrong new code is added in the
future. And then it is still only useful if the warning doesn't catch
the issue in testing. And then there would be some slight risk that the
Dirty bit was expected to be there in some PTE without shadow stack
exposure, and a functional bug would be introduced.

I'm waffling here. I could be convinced either way. Hopefully that
helps characterize the dilemma at least.
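
For concreteness, an enforcing variant along the lines suggested might
look something like the below (a hypothetical sketch, not code from
this series; mk_pte_enforcing() is an illustrative name, and
_PAGE_SAVED_DIRTY is the software bit introduced elsewhere in the
series):

/*
 * Hypothetical enforcing variant (not in this series): still warn,
 * but also move a stray Dirty=1 over to SavedDirty so the resulting
 * PTE can't be interpreted as shadow stack memory.
 */
#define mk_pte_enforcing(page, pgprot)					\
({									\
	pgprot_t __pgprot = pgprot;					\
									\
	if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) && \
			 (pgprot_val(__pgprot) &			\
			  (_PAGE_DIRTY | _PAGE_RW)) == _PAGE_DIRTY)) {	\
		pgprot_val(__pgprot) &= ~_PAGE_DIRTY;			\
		pgprot_val(__pgprot) |= _PAGE_SAVED_DIRTY;		\
	}								\
	pfn_pte(page_to_pfn(page), __pgprot);				\
})

Since WARN_ON_ONCE() evaluates to its condition, the warning and the
fixup can share a single test.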
  

Patch

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f3dc16fc4389..db8fe5511c74 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1032,7 +1032,15 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  * (Currently stuck as a macro because of indirect forward reference
  * to linux/mm.h:page_to_nid())
  */
-#define mk_pte(page, pgprot)   pfn_pte(page_to_pfn(page), (pgprot))
+#define mk_pte(page, pgprot)						 \
+({									 \
+	pgprot_t __pgprot = pgprot;					 \
+									 \
+	WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_USER_SHSTK) &&	 \
+		    (pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
+		    _PAGE_DIRTY);					 \
+	pfn_pte(page_to_pfn(page), __pgprot);				 \
+})
 
 static inline int pmd_bad(pmd_t pmd)
 {