[v2,5/5] x86/kasan: Populate shadow for shared chunk of the CPU entry area

Message ID 20221110203504.1985010-6-seanjc@google.com
State New
Series x86/kasan: Bug fixes for recent CEA changes

Commit Message

Sean Christopherson Nov. 10, 2022, 8:35 p.m. UTC
Populate the shadow for the shared portion of the CPU entry area, i.e.
the read-only IDT mapping, during KASAN initialization.  A recent change
modified KASAN to map the per-CPU areas on-demand, but forgot to keep a
shadow for the common area that is shared amongst all CPUs.

Map the common area in KASAN init instead of letting idt_map_in_cea() do
the dirty work so that it Just Works in the unlikely event more shared
data is shoved into the CPU entry area.

The bug manifests as a not-present #PF when software attempts to look up
an IDT entry, e.g. when KVM is handling IRQs on Intel CPUs (KVM performs
a direct CALL to the IRQ handler to avoid the overhead of INTn):

 BUG: unable to handle page fault for address: fffffbc0000001d8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 16c03a067 P4D 16c03a067 PUD 0
 Oops: 0000 [#1] PREEMPT SMP KASAN
 CPU: 5 PID: 901 Comm: repro Tainted: G        W          6.1.0-rc3+ #410
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
 RIP: 0010:kasan_check_range+0xdf/0x190
  vmx_handle_exit_irqoff+0x152/0x290 [kvm_intel]
  vcpu_run+0x1d89/0x2bd0 [kvm]
  kvm_arch_vcpu_ioctl_run+0x3ce/0xa70 [kvm]
  kvm_vcpu_ioctl+0x349/0x900 [kvm]
  __x64_sys_ioctl+0xb8/0xf0
  do_syscall_64+0x2b/0x50
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Reported-by: syzbot+8cdd16fd5a6c0565e227@syzkaller.appspotmail.com
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/mm/kasan_init_64.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
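
As a rough cross-check of the fault address in the splat above, the KASAN
shadow arithmetic can be replayed in userspace.  This is only a sketch, not
kernel code: the values of KASAN_SHADOW_OFFSET and CPU_ENTRY_AREA_BASE and the
16-byte gate size are assumed x86-64 defaults for 4-level paging (none of them
is stated in the patch), and idt_vector_of() is a hypothetical helper.

```c
#include <stdint.h>

/* Assumed x86-64 defaults with 4-level paging; not taken from the patch. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3
#define CPU_ENTRY_AREA_BASE      0xfffffe0000000000ULL /* read-only IDT page */
#define IDT_ENTRY_SIZE           16ULL                 /* bytes per 64-bit gate */

/* Same arithmetic as kasan_mem_to_shadow(): one shadow byte per 8 bytes. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Inverse mapping: first byte of memory tracked by a given shadow byte. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Hypothetical helper: IDT vector whose gate contains the given address. */
static unsigned int idt_vector_of(uint64_t addr)
{
	return (unsigned int)((addr - CPU_ENTRY_AREA_BASE) / IDT_ENTRY_SIZE);
}
```

Under those assumptions, mem_to_shadow(CPU_ENTRY_AREA_BASE) is
fffffbc000000000, and the faulting address fffffbc0000001d8 maps back to
offset 0xec0 in the read-only IDT page, i.e. the gate for vector 0xec.  That
is consistent with the fault being a KASAN shadow read for an IDT entry.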
  

Comments

Andrey Ryabinin Nov. 14, 2022, 2:44 p.m. UTC | #1
On 11/10/22 23:35, Sean Christopherson wrote:

>  
> +	/*
> +	 * Populate the shadow for the shared portion of the CPU entry area.
> +	 * Shadows for the per-CPU areas are mapped on-demand, as each CPU's
> +	 * area is randomly placed somewhere in the 512GiB range and mapping
> +	 * the entire 512GiB range is prohibitively expensive.
> +	 */
> +	kasan_populate_early_shadow((void *)shadow_cea_begin,
> +				    (void *)shadow_cea_per_cpu_begin);
> +

I know I suggested using "early" here, but I just realized that this might be a problem.
This will actually map a shadow page covering 8 pages (1 << KASAN_SHADOW_SCALE_SHIFT) of the original memory.
If some per-CPU entry area starts right at CPU_ENTRY_AREA_PER_CPU, its shadow will be
backed by kasan_early_shadow_page instead of a regular shadow page.

So we need to go back to your v1 patch, or alternatively we can round up CPU_ENTRY_AREA_PER_CPU:
#define CPU_ENTRY_AREA_PER_CPU		(CPU_ENTRY_AREA_RO_IDT + (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT))

Such a change would also require fixing up the max_cea calculation in init_cea_offsets().


Going back to kasan_populate_shadow() seems like the safer and easier choice. The only disadvantage
is that we might waste 1 page, which is not much compared to the KASAN memory overhead.



>  	kasan_populate_early_shadow((void *)shadow_cea_end,
>  			kasan_mem_to_shadow((void *)__START_KERNEL_map));
>
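
The granularity concern raised in this reply can be checked with a small
userspace sketch.  It assumes 4KiB pages, the x86-64 default
KASAN_SHADOW_OFFSET, and a 4-level-paging CPU_ENTRY_AREA_BASE (all
assumptions, not values from the thread); PAGE_SIZE_4K and
per_cpu_bytes_covered() are hypothetical names, and the align helper only
mirrors what kasan_mem_to_shadow_align_up() is understood to do.

```c
#include <stdint.h>

/* Assumed values for illustration; not taken from the thread. */
#define PAGE_SIZE_4K             4096ULL
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3
#define CPU_ENTRY_AREA_BASE      0xfffffe0000000000ULL
#define CPU_ENTRY_AREA_PER_CPU   (CPU_ENTRY_AREA_BASE + PAGE_SIZE_4K)

static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

static uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Shadow address of 'addr', rounded up to a page boundary. */
static uint64_t shadow_align_up(uint64_t addr)
{
	return (mem_to_shadow(addr) + PAGE_SIZE_4K - 1) & ~(PAGE_SIZE_4K - 1);
}

/*
 * Bytes of per-CPU entry-area memory whose shadow lands inside the shadow
 * range that ends at shadow_align_up(per_cpu_start), i.e. memory that would
 * end up sharing the shadow page populated for the common area.
 */
static uint64_t per_cpu_bytes_covered(uint64_t per_cpu_start)
{
	return shadow_to_mem(shadow_align_up(per_cpu_start)) - per_cpu_start;
}
```

With the per-CPU areas starting one page after the IDT, the single shadow
page also covers the first 7 pages of per-CPU memory.  If the start is instead
rounded up to CPU_ENTRY_AREA_BASE + (PAGE_SIZE_4K << KASAN_SHADOW_SCALE_SHIFT),
the overlap drops to zero, which is the alternative suggested above.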
  
Peter Zijlstra Nov. 14, 2022, 3:12 p.m. UTC | #2
On Mon, Nov 14, 2022 at 05:44:00PM +0300, Andrey Ryabinin wrote:
> Going back to kasan_populate_shadow() seems like the safer and easier
> choice. The only disadvantage is that we might waste 1 page, which is
> not much compared to the KASAN memory overhead.

So the below delta?

---
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -388,7 +388,7 @@ void __init kasan_init(void)
 	shadow_cea_end = kasan_mem_to_shadow_align_up(CPU_ENTRY_AREA_BASE +
 						      CPU_ENTRY_AREA_MAP_SIZE);
 
-	kasan_populate_early_shadow(
+	kasan_populate_shadow(
 		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
 		kasan_mem_to_shadow((void *)VMALLOC_START));
  
Sean Christopherson Nov. 14, 2022, 5:53 p.m. UTC | #3
On Mon, Nov 14, 2022, Peter Zijlstra wrote:
> On Mon, Nov 14, 2022 at 05:44:00PM +0300, Andrey Ryabinin wrote:
> > Going back to kasan_populate_shadow() seems like the safer and easier
> > choice. The only disadvantage is that we might waste 1 page, which is
> > not much compared to the KASAN memory overhead.
> 
> So the below delta?
> 
> ---
> --- a/arch/x86/mm/kasan_init_64.c
> +++ b/arch/x86/mm/kasan_init_64.c
> @@ -388,7 +388,7 @@ void __init kasan_init(void)
>  	shadow_cea_end = kasan_mem_to_shadow_align_up(CPU_ENTRY_AREA_BASE +
>  						      CPU_ENTRY_AREA_MAP_SIZE);
>  
> -	kasan_populate_early_shadow(
> +	kasan_populate_shadow(
>  		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
>  		kasan_mem_to_shadow((void *)VMALLOC_START));

Wrong one, that's the existing mapping.  To get back to v1:

diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index af82046348a0..0302491d799d 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -416,8 +416,8 @@ void __init kasan_init(void)
         * area is randomly placed somewhere in the 512GiB range and mapping
         * the entire 512GiB range is prohibitively expensive.
         */
-       kasan_populate_early_shadow((void *)shadow_cea_begin,
-                                   (void *)shadow_cea_per_cpu_begin);
+       kasan_populate_shadow(shadow_cea_begin,
+                             shadow_cea_per_cpu_begin, 0);
 
        kasan_populate_early_shadow((void *)shadow_cea_end,
                        kasan_mem_to_shadow((void *)__START_KERNEL_map));
  
Peter Zijlstra Nov. 14, 2022, 9:46 p.m. UTC | #4
On Mon, Nov 14, 2022 at 05:53:43PM +0000, Sean Christopherson wrote:

> Wrong one, that's the existing mapping.  To get back to v1:
> 
> diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
> index af82046348a0..0302491d799d 100644
> --- a/arch/x86/mm/kasan_init_64.c
> +++ b/arch/x86/mm/kasan_init_64.c
> @@ -416,8 +416,8 @@ void __init kasan_init(void)
>          * area is randomly placed somewhere in the 512GiB range and mapping
>          * the entire 512GiB range is prohibitively expensive.
>          */
> -       kasan_populate_early_shadow((void *)shadow_cea_begin,
> -                                   (void *)shadow_cea_per_cpu_begin);
> +       kasan_populate_shadow(shadow_cea_begin,
> +                             shadow_cea_per_cpu_begin, 0);
>  
>         kasan_populate_early_shadow((void *)shadow_cea_end,
>                         kasan_mem_to_shadow((void *)__START_KERNEL_map));

OK. It now looks like so:

  https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=x86/mm&id=14ca169feec3cb442ef4d322f8f65ba360f42784

If the robots don't hate on it because I fat-fingered it or something
stupid, I'll go push it out tomorrow.
  

Patch

diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index afc5e129ca7b..af82046348a0 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -341,7 +341,7 @@  void __init kasan_populate_shadow_for_vaddr(void *va, size_t size, int nid)
 
 void __init kasan_init(void)
 {
-	unsigned long shadow_cea_begin, shadow_cea_end;
+	unsigned long shadow_cea_begin, shadow_cea_per_cpu_begin, shadow_cea_end;
 	int i;
 
 	memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
@@ -384,6 +384,7 @@  void __init kasan_init(void)
 	}
 
 	shadow_cea_begin = kasan_mem_to_shadow_align_down(CPU_ENTRY_AREA_BASE);
+	shadow_cea_per_cpu_begin = kasan_mem_to_shadow_align_up(CPU_ENTRY_AREA_PER_CPU);
 	shadow_cea_end = kasan_mem_to_shadow_align_up(CPU_ENTRY_AREA_BASE +
 						      CPU_ENTRY_AREA_MAP_SIZE);
 
@@ -409,6 +410,15 @@  void __init kasan_init(void)
 		kasan_mem_to_shadow((void *)VMALLOC_END + 1),
 		(void *)shadow_cea_begin);
 
+	/*
+	 * Populate the shadow for the shared portion of the CPU entry area.
+	 * Shadows for the per-CPU areas are mapped on-demand, as each CPU's
+	 * area is randomly placed somewhere in the 512GiB range and mapping
+	 * the entire 512GiB range is prohibitively expensive.
+	 */
+	kasan_populate_early_shadow((void *)shadow_cea_begin,
+				    (void *)shadow_cea_per_cpu_begin);
+
 	kasan_populate_early_shadow((void *)shadow_cea_end,
 			kasan_mem_to_shadow((void *)__START_KERNEL_map));