[02/10] x86/mm/cpa: Reject incorrect encryption change requests
Commit Message
Kernel memory is "encrypted" by default. Some callers may "decrypt" it
in order to share it with things outside the kernel like a device or an
untrusted VMM.
There is nothing to stop set_memory_encrypted() from being passed memory
that is already "encrypted" (aka. "private" on TDX). In fact, some
callers do this because ... $REASONS. Unfortunately, part of the TDX
decrypted=>encrypted transition is truly one way*. It can't handle
being asked to encrypt an already encrypted page
Allow __set_memory_enc_pgtable() to detect already-encrypted memory
before it hits the TDX code.
* The one way part is "page acceptance"
[commit log written by Dave Hansen]
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
arch/x86/mm/pat/set_memory.c | 41 +++++++++++++++++++++++++++++++++++-
1 file changed, 40 insertions(+), 1 deletion(-)
Comments
* Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
> Kernel memory is "encrypted" by default. Some callers may "decrypt" it
> in order to share it with things outside the kernel like a device or an
> untrusted VMM.
>
> There is nothing to stop set_memory_encrypted() from being passed memory
> that is already "encrypted" (aka. "private" on TDX). In fact, some
> callers do this because ... $REASONS. Unfortunately, part of the TDX
> decrypted=>encrypted transition is truly one way*. It can't handle
> being asked to encrypt an already encrypted page
>
> Allow __set_memory_enc_pgtable() to detect already-encrypted memory
> before it hits the TDX code.
>
> * The one way part is "page acceptance"
>
> [commit log written by Dave Hansen]
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> arch/x86/mm/pat/set_memory.c | 41 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index bda9f129835e..1238b0db3e33 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2122,6 +2122,21 @@ int set_memory_global(unsigned long addr, int numpages)
> __pgprot(_PAGE_GLOBAL), 0);
> }
>
> +static bool kernel_vaddr_encryped(unsigned long addr, bool enc)
> +{
> + unsigned int level;
> + pte_t *pte;
> +
> + pte = lookup_address(addr, &level);
> + if (!pte)
> + return false;
> +
> + if (enc)
> + return pte_val(*pte) == cc_mkenc(pte_val(*pte));
> +
> + return pte_val(*pte) == cc_mkdec(pte_val(*pte));
> +}
> +
> /*
> * __set_memory_enc_pgtable() is used for the hypervisors that get
> * informed about "encryption" status via page tables.
> @@ -2130,7 +2145,7 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
> {
> pgprot_t empty = __pgprot(0);
> struct cpa_data cpa;
> - int ret;
> + int ret, numpages_in_state = 0;
>
> /* Should not be working on unaligned addresses */
> if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
> @@ -2143,6 +2158,30 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
> cpa.mask_clr = enc ? pgprot_decrypted(empty) : pgprot_encrypted(empty);
> cpa.pgd = init_mm.pgd;
>
> + /*
> + * If any page is already in the right state, bail with an error
> + * because the code doesn't handled it. This is likely because
Grammar mistake here.
> + * something has gone wrong and isn't worth optimizing for.
> + *
> + * If all the memory pages are already in the desired state return
> + * success.
Missing comma.
> + *
> + * kernel_vaddr_encryped() does not synchronize against huge page
> + * splits so take pgd_lock. A caller doing strange things could
Missing comma.
Thanks,
Ingo
On Wed, 2023-10-18 at 10:44 +0200, Ingo Molnar wrote:
> > + /*
> > + * If any page is already in the right state, bail with an
> > error
> > + * because the code doesn't handled it. This is likely
> > because
>
> Grammar mistake here.
>
> > + * something has gone wrong and isn't worth optimizing for.
> > + *
> > + * If all the memory pages are already in the desired state
> > return
> > + * success.
>
> Missing comma.
>
> > + *
> > + * kernel_vaddr_encryped() does not synchronize against
> > huge page
> > + * splits so take pgd_lock. A caller doing strange things
> > could
>
> Missing comma.
Oh, yep. Thanks.
@@ -2122,6 +2122,21 @@ int set_memory_global(unsigned long addr, int numpages)
__pgprot(_PAGE_GLOBAL), 0);
}
+static bool kernel_vaddr_encryped(unsigned long addr, bool enc)
+{
+ unsigned int level;
+ pte_t *pte;
+
+ pte = lookup_address(addr, &level);
+ if (!pte)
+ return false;
+
+ if (enc)
+ return pte_val(*pte) == cc_mkenc(pte_val(*pte));
+
+ return pte_val(*pte) == cc_mkdec(pte_val(*pte));
+}
+
/*
* __set_memory_enc_pgtable() is used for the hypervisors that get
* informed about "encryption" status via page tables.
@@ -2130,7 +2145,7 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
{
pgprot_t empty = __pgprot(0);
struct cpa_data cpa;
- int ret;
+ int ret, numpages_in_state = 0;
/* Should not be working on unaligned addresses */
if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
@@ -2143,6 +2158,30 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
cpa.mask_clr = enc ? pgprot_decrypted(empty) : pgprot_encrypted(empty);
cpa.pgd = init_mm.pgd;
+ /*
+ * If any page is already in the right state, bail with an error
+ * because the code doesn't handled it. This is likely because
+ * something has gone wrong and isn't worth optimizing for.
+ *
+ * If all the memory pages are already in the desired state return
+ * success.
+ *
+ * kernel_vaddr_encryped() does not synchronize against huge page
+ * splits so take pgd_lock. A caller doing strange things could
+ * get a new PMD mid level PTE confused with a huge PMD entry. Just
+ * lock to tie up loose ends.
+ */
+ spin_lock(&pgd_lock);
+ for (int i = 0; i < numpages; i++) {
+ if (kernel_vaddr_encryped(addr + (PAGE_SIZE * i), enc))
+ numpages_in_state++;
+ }
+ spin_unlock(&pgd_lock);
+ if (numpages_in_state == numpages)
+ return 0;
+ else if (numpages_in_state)
+ return 1;
+
/* Must avoid aliasing mappings in the highmem code */
kmap_flush_unused();
vm_unmap_aliases();