[2/2] KVM: arm64: selftests: Disable single-step without relying on ucall()

Message ID 20221117002350.2178351-3-seanjc@google.com
State New
Headers
Series KVM: arm64: selftests: Fixes for single-step test |

Commit Message

Sean Christopherson Nov. 17, 2022, 12:23 a.m. UTC
  Automatically disable single-step when the guest reaches the end of the
verified section instead of using an explicit ucall() to ask userspace to
disable single-step.  An upcoming change to implement a pool-based scheme
for ucall() will add an atomic operation (bit test and set) in the guest
ucall code, and if the compiler generate "old school" atomics, e.g.

  40e57c:       c85f7c20        ldxr    x0, [x1]
  40e580:       aa100011        orr     x17, x0, x16
  40e584:       c80ffc31        stlxr   w15, x17, [x1]
  40e588:       35ffffaf        cbnz    w15, 40e57c <__aarch64_ldset8_sync+0x1c>

the guest will hang as the local exclusive monitor is reset by eret,
i.e. the stlxr will always fail due to the VM-Exit for the debug
exception.

Link: https://lore.kernel.org/all/20221006003409.649993-8-seanjc@google.com
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/aarch64/debug-exceptions.c  | 28 ++++++++++---------
 1 file changed, 15 insertions(+), 13 deletions(-)
  

Comments

Oliver Upton Nov. 17, 2022, 12:53 a.m. UTC | #1
On Thu, Nov 17, 2022 at 12:23:50AM +0000, Sean Christopherson wrote:
> Automatically disable single-step when the guest reaches the end of the
> verified section instead of using an explicit ucall() to ask userspace to
> disable single-step.  An upcoming change to implement a pool-based scheme
> for ucall() will add an atomic operation (bit test and set) in the guest
> ucall code, and if the compiler generate "old school" atomics, e.g.

Off topic, but I didn't ask when we were discussing this issue. What is
the atomic used for in the pool-based ucall implementation?

>   40e57c:       c85f7c20        ldxr    x0, [x1]
>   40e580:       aa100011        orr     x17, x0, x16
>   40e584:       c80ffc31        stlxr   w15, x17, [x1]
>   40e588:       35ffffaf        cbnz    w15, 40e57c <__aarch64_ldset8_sync+0x1c>
> 
> the guest will hang as the local exclusive monitor is reset by eret,
> i.e. the stlxr will always fail due to the VM-Exit for the debug
> exception.

... due to the debug exception taken to EL2.

What's a VM-Exit anyways? ;-)

> Link: https://lore.kernel.org/all/20221006003409.649993-8-seanjc@google.com
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

With the commit message nit:

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>

--
Thanks,
Oliver
  
Sean Christopherson Nov. 17, 2022, 1:20 a.m. UTC | #2
On Thu, Nov 17, 2022, Oliver Upton wrote:
> On Thu, Nov 17, 2022 at 12:23:50AM +0000, Sean Christopherson wrote:
> > Automatically disable single-step when the guest reaches the end of the
> > verified section instead of using an explicit ucall() to ask userspace to
> > disable single-step.  An upcoming change to implement a pool-based scheme
> > for ucall() will add an atomic operation (bit test and set) in the guest
> > ucall code, and if the compiler generate "old school" atomics, e.g.
> 
> Off topic, but I didn't ask when we were discussing this issue. What is
> the atomic used for in the pool-based ucall implementation?

To avoid having to plumb an "id" into the guest, vCPUs grab a ucall entry from
the pool on a first-come first-serve basis, and then release the entry when the
ucall is complete.  The current implementation is a bitmap, e.g. every possible
entry has a bit in the map, and vCPUs do an atomic bit-test-and-set to claim an
entry.

Ugh.  And there's a bug.  Of course I notice it after sending the pull request.
Depsite being defined in atomic.h, and despite clear_bit() being atomic in the
kernel, tools' clear_bit() isn't actually atomic.  Grr.

Doesn't cause problems because there are so few multi-vCPU selftests, but that
needs to be fixed.  Best thing would be to fix clear_bit() itself.
  
Reiji Watanabe Nov. 17, 2022, 7:07 a.m. UTC | #3
On Wed, Nov 16, 2022 at 4:24 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Automatically disable single-step when the guest reaches the end of the
> verified section instead of using an explicit ucall() to ask userspace to
> disable single-step.  An upcoming change to implement a pool-based scheme
> for ucall() will add an atomic operation (bit test and set) in the guest
> ucall code, and if the compiler generate "old school" atomics, e.g.
>
>   40e57c:       c85f7c20        ldxr    x0, [x1]
>   40e580:       aa100011        orr     x17, x0, x16
>   40e584:       c80ffc31        stlxr   w15, x17, [x1]
>   40e588:       35ffffaf        cbnz    w15, 40e57c <__aarch64_ldset8_sync+0x1c>
>
> the guest will hang as the local exclusive monitor is reset by eret,
> i.e. the stlxr will always fail due to the VM-Exit for the debug
> exception.
>
> Link: https://lore.kernel.org/all/20221006003409.649993-8-seanjc@google.com
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Reiji Watanabe <reijiw@google.com>
  
Sean Christopherson Nov. 17, 2022, 3:19 p.m. UTC | #4
On Thu, Nov 17, 2022, Sean Christopherson wrote:
> On Thu, Nov 17, 2022, Oliver Upton wrote:
> > On Thu, Nov 17, 2022 at 12:23:50AM +0000, Sean Christopherson wrote:
> > > Automatically disable single-step when the guest reaches the end of the
> > > verified section instead of using an explicit ucall() to ask userspace to
> > > disable single-step.  An upcoming change to implement a pool-based scheme
> > > for ucall() will add an atomic operation (bit test and set) in the guest
> > > ucall code, and if the compiler generate "old school" atomics, e.g.
> > 
> > Off topic, but I didn't ask when we were discussing this issue. What is
> > the atomic used for in the pool-based ucall implementation?
> 
> To avoid having to plumb an "id" into the guest, vCPUs grab a ucall entry from
> the pool on a first-come first-serve basis, and then release the entry when the
> ucall is complete.  The current implementation is a bitmap, e.g. every possible
> entry has a bit in the map, and vCPUs do an atomic bit-test-and-set to claim an
> entry.
> 
> Ugh.  And there's a bug.  Of course I notice it after sending the pull request.
> Depsite being defined in atomic.h, and despite clear_bit() being atomic in the
> kernel, tools' clear_bit() isn't actually atomic.  Grr.
> 
> Doesn't cause problems because there are so few multi-vCPU selftests, but that
> needs to be fixed.  Best thing would be to fix clear_bit() itself.

Ha!  And I bet when clear_bit() is fixed, this test will start failing again
because the ucall() to activate single-step needs to release the entry _after_
exiting to the host, i.e. single-step will be enabled across the atomic region
again.
  
Sean Christopherson Nov. 17, 2022, 10:32 p.m. UTC | #5
On Thu, Nov 17, 2022, Sean Christopherson wrote:
> On Thu, Nov 17, 2022, Sean Christopherson wrote:
> > On Thu, Nov 17, 2022, Oliver Upton wrote:
> > > On Thu, Nov 17, 2022 at 12:23:50AM +0000, Sean Christopherson wrote:
> > > > Automatically disable single-step when the guest reaches the end of the
> > > > verified section instead of using an explicit ucall() to ask userspace to
> > > > disable single-step.  An upcoming change to implement a pool-based scheme
> > > > for ucall() will add an atomic operation (bit test and set) in the guest
> > > > ucall code, and if the compiler generate "old school" atomics, e.g.
> > > 
> > > Off topic, but I didn't ask when we were discussing this issue. What is
> > > the atomic used for in the pool-based ucall implementation?
> > 
> > To avoid having to plumb an "id" into the guest, vCPUs grab a ucall entry from
> > the pool on a first-come first-serve basis, and then release the entry when the
> > ucall is complete.  The current implementation is a bitmap, e.g. every possible
> > entry has a bit in the map, and vCPUs do an atomic bit-test-and-set to claim an
> > entry.
> > 
> > Ugh.  And there's a bug.  Of course I notice it after sending the pull request.
> > Depsite being defined in atomic.h, and despite clear_bit() being atomic in the
> > kernel, tools' clear_bit() isn't actually atomic.  Grr.
> > 
> > Doesn't cause problems because there are so few multi-vCPU selftests, but that
> > needs to be fixed.  Best thing would be to fix clear_bit() itself.
> 
> Ha!  And I bet when clear_bit() is fixed, this test will start failing again
> because the ucall() to activate single-step needs to release the entry _after_
> exiting to the host, i.e. single-step will be enabled across the atomic region
> again.

LOL, yep.  Test gets stuck in __aarch64_ldclr8_sync().
  

Patch

diff --git a/tools/testing/selftests/kvm/aarch64/debug-exceptions.c b/tools/testing/selftests/kvm/aarch64/debug-exceptions.c
index 91f55b45dfc7..65cd7e4f07fa 100644
--- a/tools/testing/selftests/kvm/aarch64/debug-exceptions.c
+++ b/tools/testing/selftests/kvm/aarch64/debug-exceptions.c
@@ -241,7 +241,6 @@  static void guest_svc_handler(struct ex_regs *regs)
 
 enum single_step_op {
 	SINGLE_STEP_ENABLE = 0,
-	SINGLE_STEP_DISABLE = 1,
 };
 
 static void guest_code_ss(int test_cnt)
@@ -258,7 +257,7 @@  static void guest_code_ss(int test_cnt)
 		GUEST_SYNC(SINGLE_STEP_ENABLE);
 
 		/*
-		 * The userspace will veriry that the pc is as expected during
+		 * The userspace will verify that the pc is as expected during
 		 * single step execution between iter_ss_begin and iter_ss_end.
 		 */
 		asm volatile("iter_ss_begin:nop\n");
@@ -268,11 +267,9 @@  static void guest_code_ss(int test_cnt)
 		bvr = read_sysreg(dbgbvr0_el1);
 		wvr = read_sysreg(dbgwvr0_el1);
 
+		/* Userspace disables Single Step when the end is nigh. */
 		asm volatile("iter_ss_end:\n");
 
-		/* Disable Single Step execution */
-		GUEST_SYNC(SINGLE_STEP_DISABLE);
-
 		GUEST_ASSERT(bvr == w_bvr);
 		GUEST_ASSERT(wvr == w_wvr);
 	}
@@ -364,15 +361,12 @@  void test_single_step_from_userspace(int test_cnt)
 			TEST_ASSERT(cmd == UCALL_SYNC,
 				    "Unexpected ucall cmd 0x%lx", cmd);
 
-			if (uc.args[1] == SINGLE_STEP_ENABLE) {
-				debug.control = KVM_GUESTDBG_ENABLE |
-						KVM_GUESTDBG_SINGLESTEP;
-				ss_enable = true;
-			} else {
-				debug.control = KVM_GUESTDBG_ENABLE;
-				ss_enable = false;
-			}
+			TEST_ASSERT(uc.args[1] == SINGLE_STEP_ENABLE,
+				    "Unexpected ucall action 0x%lx", uc.args[1]);
 
+			debug.control = KVM_GUESTDBG_ENABLE |
+					KVM_GUESTDBG_SINGLESTEP;
+			ss_enable = true;
 			vcpu_guest_debug_set(vcpu, &debug);
 			continue;
 		}
@@ -385,6 +379,14 @@  void test_single_step_from_userspace(int test_cnt)
 			    "Unexpected pc 0x%lx (expected 0x%lx)",
 			    pc, test_pc);
 
+		if ((pc + 4) == (uint64_t)&iter_ss_end) {
+			test_pc = 0;
+			debug.control = KVM_GUESTDBG_ENABLE;
+			ss_enable = false;
+			vcpu_guest_debug_set(vcpu, &debug);
+			continue;
+		}
+
 		/*
 		 * If the current pc is between iter_ss_bgin and
 		 * iter_ss_end, the pc for the next KVM_EXIT_DEBUG should