[4/5] KVM: x86/mmu: Don't install TDP MMU SPTE if SP has unexpected level

Message ID 20221213033030.83345-5-seanjc@google.com
State New
Headers
Series KVM: x86/mmu: TDP MMU fixes for 6.2 |

Commit Message

Sean Christopherson Dec. 13, 2022, 3:30 a.m. UTC
  Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
match the target level of the fault, and instead have the vCPU retry the
faulting instruction after warning.  Continuing on is completely
unnecessary as the absolute worst case scenario of retrying is DoSing
the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.

  ------------[ cut here ]------------
  kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559!
  invalid opcode: 0000 [#1] SMP
  CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G        W          6.1.0-rc4+ #64
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__handle_changed_spte.cold+0x95/0x9c
  RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246
  RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027
  RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
  RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0
  R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3
  R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01
  FS:  00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   kvm_tdp_mmu_map+0x3b0/0x510
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>
  Modules linked in: kvm_intel
  ---[ end trace 0000000000000000 ]---

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

David Matlack Dec. 13, 2022, 5:59 p.m. UTC | #1
On Mon, Dec 12, 2022 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
> match the target level of the fault, and instead have the vCPU retry the
> faulting instruction after warning.  Continuing on is completely
> unnecessary as the absolute worst case scenario of retrying is DoSing
> the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.

Would it make sense to kill the VM instead via KVM_BUG()?
  
Sean Christopherson Dec. 13, 2022, 6:15 p.m. UTC | #2
On Tue, Dec 13, 2022, David Matlack wrote:
> On Mon, Dec 12, 2022 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
> > match the target level of the fault, and instead have the vCPU retry the
> > faulting instruction after warning.  Continuing on is completely
> > unnecessary as the absolute worst case scenario of retrying is DoSing
> > the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.
> 
> Would it make sense to kill the VM instead via KVM_BUG()?

No, because if bug that hits this escapes to a release, odds are quite high that
retrying will succeed.  E.g. the fix earlier in this series is for a rare corner
case that I was able to hit consistently only by hacking KVM to effectively
synchronize the page fault and zap.  Other than an extra page fault, no harm has
been done to the guest, e.g. there's no need to kill the VM to protect it from
data corruption.
  
David Matlack Dec. 20, 2022, 6:24 p.m. UTC | #3
On Tue, Dec 13, 2022 at 06:15:56PM +0000, Sean Christopherson wrote:
> On Tue, Dec 13, 2022, David Matlack wrote:
> > On Mon, Dec 12, 2022 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
> > > match the target level of the fault, and instead have the vCPU retry the
> > > faulting instruction after warning.  Continuing on is completely
> > > unnecessary as the absolute worst case scenario of retrying is DoSing
> > > the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.
> > 
> > Would it make sense to kill the VM instead via KVM_BUG()?
> 
> No, because if bug that hits this escapes to a release, odds are quite high that
> retrying will succeed.  E.g. the fix earlier in this series is for a rare corner
> case that I was able to hit consistently only by hacking KVM to effectively
> synchronize the page fault and zap.  Other than an extra page fault, no harm has
> been done to the guest, e.g. there's no need to kill the VM to protect it from
> data corruption.

Good points, agreed!
  

Patch

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index fd4ae99790d7..cc1fb9a65620 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1063,7 +1063,9 @@  static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	int ret = RET_PF_FIXED;
 	bool wrprot = false;
 
-	WARN_ON(sp->role.level != fault->goal_level);
+	if (WARN_ON_ONCE(sp->role.level != fault->goal_level))
+		return RET_PF_RETRY;
+
 	if (unlikely(!fault->slot))
 		new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
 	else