[1/5] x86/coco: Use slow_virt_to_phys() in page transition hypervisor callbacks

Message ID 1696011549-28036-2-git-send-email-mikelley@microsoft.com
State New
Series x86/coco: Mark CoCo VM pages not present when changing encrypted state

Commit Message

Michael Kelley (LINUX) Sept. 29, 2023, 6:19 p.m. UTC
In preparation for temporarily marking pages not present during a
transition between encrypted and decrypted, use slow_virt_to_phys()
in the hypervisor callbacks. As long as the PFN is correct,
slow_virt_to_phys() works even if the leaf PTE is not present.
The existing functions that depend on vmalloc_to_page() all
require that the leaf PTE be marked present, so they don't work.

Update the comments for slow_virt_to_phys() to note this broader usage
and the requirement to work even if the PTE is not marked present.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
---
 arch/x86/hyperv/ivm.c        |  9 ++++++++-
 arch/x86/kernel/sev.c        |  8 +++++++-
 arch/x86/mm/pat/set_memory.c | 13 +++++++++----
 3 files changed, 24 insertions(+), 6 deletions(-)
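
The distinction the commit message draws can be illustrated with a small user-space model. This is only a sketch, not kernel code: all `model_*` and `MODEL_*` names are hypothetical, only a single 4K leaf entry is modelled (no multi-level walk, no large pages), and the bit layout assumed is the x86-64 one (bit 0 is PRESENT, bits 12-51 hold the page frame address). One lookup refuses non-present entries, as the commit message says the vmalloc_to_page()-based helpers do; the other trusts the PFN field regardless, which is the property the patch relies on in slow_virt_to_phys().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants modelling an x86-64 4K leaf PTE. */
#define MODEL_PAGE_SHIFT   12
#define MODEL_PAGE_SIZE    (1ULL << MODEL_PAGE_SHIFT)
#define MODEL_PTE_PRESENT  0x1ULL
#define MODEL_PTE_PFN_MASK 0x000ffffffffff000ULL

typedef uint64_t model_pte_t;

/* Translation that gives up when PRESENT is clear. */
static bool model_lookup_present_only(model_pte_t pte, uint64_t vaddr,
				      uint64_t *phys)
{
	if (!(pte & MODEL_PTE_PRESENT))
		return false;
	*phys = (pte & MODEL_PTE_PFN_MASK) | (vaddr & (MODEL_PAGE_SIZE - 1));
	return true;
}

/* Translation that uses the PFN field whether or not PRESENT is set. */
static uint64_t model_lookup_any(model_pte_t pte, uint64_t vaddr)
{
	return (pte & MODEL_PTE_PFN_MASK) | (vaddr & (MODEL_PAGE_SIZE - 1));
}

/* Walk through the scenario: map, clear PRESENT, translate again. */
static void model_demo(void)
{
	uint64_t vaddr = 0xffffc90000001234ULL;	/* arbitrary example address */
	model_pte_t pte = (0x12345ULL << MODEL_PAGE_SHIFT) | MODEL_PTE_PRESENT;
	uint64_t phys = 0;

	/* With PRESENT set, both lookups agree. */
	assert(model_lookup_present_only(pte, vaddr, &phys));
	assert(phys == model_lookup_any(pte, vaddr));

	/* Temporarily clear PRESENT; the PFN bits stay intact. */
	pte &= ~MODEL_PTE_PRESENT;

	/* The strict lookup now fails; the permissive one still works. */
	assert(!model_lookup_present_only(pte, vaddr, &phys));
	assert((model_lookup_any(pte, vaddr) >> MODEL_PAGE_SHIFT) == 0x12345ULL);
}
```

Calling `model_demo()` from a test harness exercises both cases. The model only holds while the PFN field is left valid when PRESENT is cleared, which is the condition the commit message states ("As long as the PFN is correct").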
  

Comments

Tom Lendacky Oct. 2, 2023, 3:52 p.m. UTC | #1
On 9/29/23 13:19, Michael Kelley wrote:
> In preparation for temporarily marking pages not present during a
> transition between encrypted and decrypted, use slow_virt_to_phys()
> in the hypervisor callbacks. As long as the PFN is correct,
> slow_virt_to_phys() works even if the leaf PTE is not present.
> The existing functions that depend on vmalloc_to_page() all
> require that the leaf PTE be marked present, so they don't work.
> 
> Update the comments for slow_virt_to_phys() to note this broader usage
> and the requirement to work even if the PTE is not marked present.
> 
> Signed-off-by: Michael Kelley <mikelley@microsoft.com>
> ---
>   arch/x86/hyperv/ivm.c        |  9 ++++++++-
>   arch/x86/kernel/sev.c        |  8 +++++++-
>   arch/x86/mm/pat/set_memory.c | 13 +++++++++----
>   3 files changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
> index c1088d3..084fab6 100644
> --- a/arch/x86/hyperv/ivm.c
> +++ b/arch/x86/hyperv/ivm.c
> @@ -524,7 +524,14 @@ static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bo
>   		return false;
>   
>   	for (i = 0, pfn = 0; i < pagecount; i++) {
> -		pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE);
> +		/*
> +		 * Use slow_virt_to_phys() because the PRESENT bit has been
> +		 * temporarily cleared in the PTEs.  slow_virt_to_phys() works
> +		 * without the PRESENT bit while virt_to_hvpfn() or similar
> +		 * does not.
> +		 */
> +		pfn_array[pfn] = slow_virt_to_phys((void *)kbuffer +
> +					i * HV_HYP_PAGE_SIZE) >> HV_HYP_PAGE_SHIFT;
>   		pfn++;
>   
>   		if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) {
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 2787826..f5d6cec 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -784,7 +784,13 @@ static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long
>   		hdr->end_entry = i;
>   
>   		if (is_vmalloc_addr((void *)vaddr)) {
> -			pfn = vmalloc_to_pfn((void *)vaddr);
> +			/*
> +			 * Use slow_virt_to_phys() because the PRESENT bit has been

Since __set_pages_state() is called by more than one path, please update 
this comment to add something like "because when called via the 
set_memory_encrypted() or set_memory_decrypted() path the PRESENT bit..."

Thanks,
Tom

> +			 * temporarily cleared in the PTEs.  slow_virt_to_phys() works
> +			 * without the PRESENT bit while vmalloc_to_pfn() or similar
> +			 * does not.
> +			 */
> +			pfn = slow_virt_to_phys((void *)vaddr) >> PAGE_SHIFT;
>   			use_large_entry = false;
>   		} else {
>   			pfn = __pa(vaddr) >> PAGE_SHIFT;
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index bda9f12..8e19796 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -755,10 +755,15 @@ pmd_t *lookup_pmd_address(unsigned long address)
>    * areas on 32-bit NUMA systems.  The percpu areas can
>    * end up in this kind of memory, for instance.
>    *
> - * This could be optimized, but it is only intended to be
> - * used at initialization time, and keeping it
> - * unoptimized should increase the testing coverage for
> - * the more obscure platforms.
> + * It is also used in callbacks for CoCo VM page transitions between private
> + * and shared because it works when the PRESENT bit is not set in the leaf
> + * PTE. In such cases, the state of the PTEs, including the PFN, is otherwise
> + * known to be valid, so the returned physical address is correct. The similar
> + * function vmalloc_to_pfn() can't be used because it requires the PRESENT bit.
> + *
> + * This could be optimized, but it is only used in paths that are not perf
> + * sensitive, and keeping it unoptimized should increase the testing coverage
> + * for the more obscure platforms.
>    */
>   phys_addr_t slow_virt_to_phys(void *__virt_addr)
>   {
  

Patch

diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index c1088d3..084fab6 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -524,7 +524,14 @@  static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bo
 		return false;
 
 	for (i = 0, pfn = 0; i < pagecount; i++) {
-		pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE);
+		/*
+		 * Use slow_virt_to_phys() because the PRESENT bit has been
+		 * temporarily cleared in the PTEs.  slow_virt_to_phys() works
+		 * without the PRESENT bit while virt_to_hvpfn() or similar
+		 * does not.
+		 */
+		pfn_array[pfn] = slow_virt_to_phys((void *)kbuffer +
+					i * HV_HYP_PAGE_SIZE) >> HV_HYP_PAGE_SHIFT;
 		pfn++;
 
 		if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) {
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 2787826..f5d6cec 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -784,7 +784,13 @@  static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long
 		hdr->end_entry = i;
 
 		if (is_vmalloc_addr((void *)vaddr)) {
-			pfn = vmalloc_to_pfn((void *)vaddr);
+			/*
+			 * Use slow_virt_to_phys() because the PRESENT bit has been
+			 * temporarily cleared in the PTEs.  slow_virt_to_phys() works
+			 * without the PRESENT bit while vmalloc_to_pfn() or similar
+			 * does not.
+			 */
+			pfn = slow_virt_to_phys((void *)vaddr) >> PAGE_SHIFT;
 			use_large_entry = false;
 		} else {
 			pfn = __pa(vaddr) >> PAGE_SHIFT;
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index bda9f12..8e19796 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -755,10 +755,15 @@  pmd_t *lookup_pmd_address(unsigned long address)
  * areas on 32-bit NUMA systems.  The percpu areas can
  * end up in this kind of memory, for instance.
  *
- * This could be optimized, but it is only intended to be
- * used at initialization time, and keeping it
- * unoptimized should increase the testing coverage for
- * the more obscure platforms.
+ * It is also used in callbacks for CoCo VM page transitions between private
+ * and shared because it works when the PRESENT bit is not set in the leaf
+ * PTE. In such cases, the state of the PTEs, including the PFN, is otherwise
+ * known to be valid, so the returned physical address is correct. The similar
+ * function vmalloc_to_pfn() can't be used because it requires the PRESENT bit.
+ *
+ * This could be optimized, but it is only used in paths that are not perf
+ * sensitive, and keeping it unoptimized should increase the testing coverage
+ * for the more obscure platforms.
  */
 phys_addr_t slow_virt_to_phys(void *__virt_addr)
 {
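
For readers unfamiliar with the loop touched in hv_vtom_set_host_visibility(), the following user-space sketch shows the batching pattern visible in that hunk: each page address is translated, shifted down to a frame number, and the accumulated array is handed off when it is full or the last page has been reached. Everything named `sketch_*` or `SKETCH_*` is hypothetical. The batch limit of 4 merely stands in for HV_MAX_MODIFY_GPA_REP_COUNT, the translate callback stands in for slow_virt_to_phys(), and the reset of the index after each flush is an assumption, since that part of the loop body lies outside the diff context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_MAX_BATCH  4	/* hypothetical stand-in for the real limit */

typedef uint64_t (*sketch_translate_fn)(uint64_t vaddr);
typedef int (*sketch_flush_fn)(const uint64_t *pfns, size_t count, void *ctx);

/* Translate each page, collect PFNs, and flush them in bounded batches. */
static int sketch_set_visibility(uint64_t kbuffer, int pagecount,
				 sketch_translate_fn translate,
				 sketch_flush_fn flush, void *ctx)
{
	uint64_t pfn_array[SKETCH_MAX_BATCH];
	size_t pfn = 0;

	for (int i = 0; i < pagecount; i++) {
		uint64_t vaddr = kbuffer + (uint64_t)i * SKETCH_PAGE_SIZE;

		pfn_array[pfn++] = translate(vaddr) >> SKETCH_PAGE_SHIFT;

		if (pfn == SKETCH_MAX_BATCH || i == pagecount - 1) {
			if (flush(pfn_array, pfn, ctx))
				return -1;
			pfn = 0;	/* assumed: start a fresh batch */
		}
	}
	return 0;
}

/* Demo helpers: identity translation and a flush that only counts. */
struct sketch_stats {
	size_t calls;
	size_t pages;
	uint64_t first_pfn;
};

static uint64_t sketch_identity_translate(uint64_t vaddr)
{
	return vaddr;
}

static int sketch_count_flush(const uint64_t *pfns, size_t count, void *ctx)
{
	struct sketch_stats *st = ctx;

	if (st->calls == 0)
		st->first_pfn = pfns[0];
	st->calls++;
	st->pages += count;
	return 0;
}
```

With ten pages and a batch limit of four, the flush callback would be invoked three times (4, 4, and 2 entries). The translation step is a callback here only to make the sketch testable; in the patch it is the direct call to slow_virt_to_phys().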