[04/13] KVM: nSVM: clean up copying of int_ctl fields back to vmcb01/vmcb12

Message ID 20221117143242.102721-5-mlevitsk@redhat.com
State New
Headers
Series SVM: vNMI (with my fixes) |

Commit Message

Maxim Levitsky Nov. 17, 2022, 2:32 p.m. UTC
  Clean up the nested_sync_int_ctl_from_vmcb02:

1. The comment about preservation of V_IRQ is wrong: when the L2 doesn't
   use virtual interrupt masking, then the field just doesn't exist in
   vmcb12 thus it should not be touched at all.
   Since it is unused in this case, touching it doesn't matter that much,
   so the bug is theoretical.

2. When the L2 doesn't use virtual interrupt masking, then in the *theory*
   if KVM uses the feature, it should copy the changes to V_IRQ* bits from
   vmcb02 to vmcb01.

   In practise, KVM only uses it for detection of the interrupt window,
   and it happens to re-open it on each nested VM exit because
   kvm_set_rflags happens to raise the KVM_REQ_EVENT.
   Do this explicitly.

3. Add comment on why we don't need to copy V_GIF from vmcb02 to vmcb01
   when nested guest doesn't use nested V_GIF (and thus L1's GIF is in
   vmcb02 while nested), even though it can in theory affect L1's GIF.

4. Add support code to also copy some bits of int_ctl from
   vmcb02 to vmcb01.
   Currently there are none.

No (visible) functional change is intended.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 47 ++++++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 15 deletions(-)
  

Comments

Sean Christopherson Nov. 17, 2022, 8:15 p.m. UTC | #1
On Thu, Nov 17, 2022, Maxim Levitsky wrote:
> Clean up the nested_sync_int_ctl_from_vmcb02:
> 
> 1. The comment about preservation of V_IRQ is wrong: when the L2 doesn't
>    use virtual interrupt masking, then the field just doesn't exist in
>    vmcb12 thus it should not be touched at all.
>    Since it is unused in this case, touching it doesn't matter that much,
>    so the bug is theoretical.
> 
> 2. When the L2 doesn't use virtual interrupt masking, then in the *theory*
>    if KVM uses the feature, it should copy the changes to V_IRQ* bits from
>    vmcb02 to vmcb01.
> 
>    In practise, KVM only uses it for detection of the interrupt window,
>    and it happens to re-open it on each nested VM exit because
>    kvm_set_rflags happens to raise the KVM_REQ_EVENT.
>    Do this explicitly.
> 
> 3. Add comment on why we don't need to copy V_GIF from vmcb02 to vmcb01
>    when nested guest doesn't use nested V_GIF (and thus L1's GIF is in
>    vmcb02 while nested), even though it can in theory affect L1's GIF.
> 
> 4. Add support code to also copy some bits of int_ctl from
>    vmcb02 to vmcb01.
>    Currently there are none.

Unless it's impossible for whatever reason, this patch should be split into
multiple patches.  IIUC, there are at least 2 different functional changes being
made, they just happen to not have any actual impact on things.

> No (visible) functional change is intended.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/nested.c | 47 ++++++++++++++++++++++++++-------------
>  1 file changed, 32 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 54eb152e2b60b6..1f2b8492c8782f 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -410,28 +410,45 @@ void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
>  static void nested_sync_int_ctl_from_vmcb02(struct vcpu_svm *svm,
>  					    struct vmcb *vmcb12)
>  {
> -	u32 mask;
> +	struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
> +	struct vmcb *vmcb01 = svm->vmcb01.ptr;
> +
> +	/* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb12*/
> +	u32 l2_to_l1_mask = 0;
> +	/* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb01*/
> +	u32 l2_to_l0_mask = 0;
>  
> -	/* Only a few fields of int_ctl are written by the processor.  */

Can this comment be kept in some form?  I found it super useful when reading this
code just now.

> -	mask = V_IRQ_MASK | V_TPR_MASK;
> -	if (!(svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK) &&
> -	    svm_is_intercept(svm, INTERCEPT_VINTR)) {
> +	if (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK)
> +		l2_to_l1_mask |= V_IRQ_MASK | V_TPR_MASK;
> +	else {
>  		/*
> -		 * In order to request an interrupt window, L0 is usurping
> -		 * svm->vmcb->control.int_ctl and possibly setting V_IRQ
> -		 * even if it was clear in L1's VMCB.  Restoring it would be
> -		 * wrong.  However, in this case V_IRQ will remain true until
> -		 * interrupt_window_interception calls svm_clear_vintr and
> -		 * restores int_ctl.  We can just leave it aside.
> +		 * If IRQ window was opened while in L2, it must be reopened
> +		 * after the VM exit
> +		 *
> +		 * vTPR value doesn't need to be copied from vmcb02 to vmcb01
> +		 * because it is synced from/to apic registers on each VM exit
>  		 */
> -		mask &= ~V_IRQ_MASK;
> +		if (vmcb02->control.int_ctl & V_IRQ_MASK)
> +			kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
>  	}
>  
>  	if (nested_vgif_enabled(svm))
> -		mask |= V_GIF_MASK;
> +		l2_to_l1_mask |= V_GIF_MASK;
> +	else
> +		/* There is no need to sync V_GIF from vmcb02 to vmcb01
> +		 * because GIF is cleared on VMexit, thus even though
> +		 * nested guest can control host's GIF, on VM exit
> +		 * its set value is lost
> +		 */
> +		;

The "else ... ;" is unnecessary, just throw the block comment above the nested
vGIF if-statment, e.g. if I'm understanding everything, this?

	/*
	 * If nested vGIF is not enabled, L2 has access to L1's "real" GIF.  In
	 * this case, there's no need to sync V_GIF from vmcb02 to vmcb01
	 * because GIF is cleared on VM-Exit, thus any changes made by L2 are
	 * overwritten on VM-Exit to L1.
	 */
	if (nested_vgif_enabled(svm))
		l2_to_l1_mask |= V_GIF_MASK;

> +
> +	vmcb12->control.int_ctl =
> +		(svm->nested.ctl.int_ctl & ~l2_to_l1_mask) |
> +		(vmcb02->control.int_ctl & l2_to_l1_mask);
>  
> -	vmcb12->control.int_ctl        &= ~mask;
> -	vmcb12->control.int_ctl        |= svm->vmcb->control.int_ctl & mask;
> +	vmcb01->control.int_ctl =
> +		(vmcb01->control.int_ctl & ~l2_to_l0_mask) |
> +		(vmcb02->control.int_ctl & l2_to_l0_mask);

No need for wrapping immediately after the "=", these all fit under the soft limit:

	vmcb12->control.int_ctl = (svm->nested.ctl.int_ctl & ~l2_to_l1_mask) |
				  (vmcb02->control.int_ctl & l2_to_l1_mask);

	vmcb01->control.int_ctl = (vmcb01->control.int_ctl & ~l2_to_l0_mask) |
				  (vmcb02->control.int_ctl & l2_to_l0_mask);
  
Maxim Levitsky Nov. 21, 2022, 11:10 a.m. UTC | #2
On Thu, 2022-11-17 at 20:15 +0000, Sean Christopherson wrote:
> On Thu, Nov 17, 2022, Maxim Levitsky wrote:
> > Clean up the nested_sync_int_ctl_from_vmcb02:
> > 
> > 1. The comment about preservation of V_IRQ is wrong: when the L2 doesn't
> >    use virtual interrupt masking, then the field just doesn't exist in
> >    vmcb12 thus it should not be touched at all.
> >    Since it is unused in this case, touching it doesn't matter that much,
> >    so the bug is theoretical.
> > 
> > 2. When the L2 doesn't use virtual interrupt masking, then in the *theory*
> >    if KVM uses the feature, it should copy the changes to V_IRQ* bits from
> >    vmcb02 to vmcb01.
> > 
> >    In practise, KVM only uses it for detection of the interrupt window,
> >    and it happens to re-open it on each nested VM exit because
> >    kvm_set_rflags happens to raise the KVM_REQ_EVENT.
> >    Do this explicitly.
> > 
> > 3. Add comment on why we don't need to copy V_GIF from vmcb02 to vmcb01
> >    when nested guest doesn't use nested V_GIF (and thus L1's GIF is in
> >    vmcb02 while nested), even though it can in theory affect L1's GIF.
> > 
> > 4. Add support code to also copy some bits of int_ctl from
> >    vmcb02 to vmcb01.
> >    Currently there are none.
> 
> Unless it's impossible for whatever reason, this patch should be split into
> multiple patches.  IIUC, there are at least 2 different functional changes being
> made, they just happen to not have any actual impact on things.

No objection to this.

> 
> > No (visible) functional change is intended.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/svm/nested.c | 47 ++++++++++++++++++++++++++-------------
> >  1 file changed, 32 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index 54eb152e2b60b6..1f2b8492c8782f 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -410,28 +410,45 @@ void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
> >  static void nested_sync_int_ctl_from_vmcb02(struct vcpu_svm *svm,
> >                                             struct vmcb *vmcb12)
> >  {
> > -       u32 mask;
> > +       struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
> > +       struct vmcb *vmcb01 = svm->vmcb01.ptr;
> > +
> > +       /* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb12*/
> > +       u32 l2_to_l1_mask = 0;
> > +       /* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb01*/
> > +       u32 l2_to_l0_mask = 0;
> >  
> > -       /* Only a few fields of int_ctl are written by the processor.  */
> 
> Can this comment be kept in some form?  I found it super useful when reading this
> code just now.

No problem.

> 
> > -       mask = V_IRQ_MASK | V_TPR_MASK;
> > -       if (!(svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK) &&
> > -           svm_is_intercept(svm, INTERCEPT_VINTR)) {
> > +       if (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK)
> > +               l2_to_l1_mask |= V_IRQ_MASK | V_TPR_MASK;
> > +       else {
> >                 /*
> > -                * In order to request an interrupt window, L0 is usurping
> > -                * svm->vmcb->control.int_ctl and possibly setting V_IRQ
> > -                * even if it was clear in L1's VMCB.  Restoring it would be
> > -                * wrong.  However, in this case V_IRQ will remain true until
> > -                * interrupt_window_interception calls svm_clear_vintr and
> > -                * restores int_ctl.  We can just leave it aside.
> > +                * If IRQ window was opened while in L2, it must be reopened
> > +                * after the VM exit
> > +                *
> > +                * vTPR value doesn't need to be copied from vmcb02 to vmcb01
> > +                * because it is synced from/to apic registers on each VM exit
> >                  */
> > -               mask &= ~V_IRQ_MASK;
> > +               if (vmcb02->control.int_ctl & V_IRQ_MASK)
> > +                       kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
> >         }
> >  
> >         if (nested_vgif_enabled(svm))
> > -               mask |= V_GIF_MASK;
> > +               l2_to_l1_mask |= V_GIF_MASK;
> > +       else
> > +               /* There is no need to sync V_GIF from vmcb02 to vmcb01
> > +                * because GIF is cleared on VMexit, thus even though
> > +                * nested guest can control host's GIF, on VM exit
> > +                * its set value is lost
> > +                */
> > +               ;
> 
> The "else ... ;" is unnecessary, just throw the block comment above the nested
> vGIF if-statment, e.g. if I'm understanding everything, this?
Yes.

> 
>         /*
>          * If nested vGIF is not enabled, L2 has access to L1's "real" GIF.  In
>          * this case, there's no need to sync V_GIF from vmcb02 to vmcb01
>          * because GIF is cleared on VM-Exit, thus any changes made by L2 are
>          * overwritten on VM-Exit to L1.
>          */
>         if (nested_vgif_enabled(svm))
>                 l2_to_l1_mask |= V_GIF_MASK;
> 
> > +
> > +       vmcb12->control.int_ctl =
> > +               (svm->nested.ctl.int_ctl & ~l2_to_l1_mask) |
> > +               (vmcb02->control.int_ctl & l2_to_l1_mask);
> >  
> > -       vmcb12->control.int_ctl        &= ~mask;
> > -       vmcb12->control.int_ctl        |= svm->vmcb->control.int_ctl & mask;
> > +       vmcb01->control.int_ctl =
> > +               (vmcb01->control.int_ctl & ~l2_to_l0_mask) |
> > +               (vmcb02->control.int_ctl & l2_to_l0_mask);
> 
> No need for wrapping immediately after the "=", these all fit under the soft limit:
> 
>         vmcb12->control.int_ctl = (svm->nested.ctl.int_ctl & ~l2_to_l1_mask) |
>                                   (vmcb02->control.int_ctl & l2_to_l1_mask);
> 
>         vmcb01->control.int_ctl = (vmcb01->control.int_ctl & ~l2_to_l0_mask) |
>                                   (vmcb02->control.int_ctl & l2_to_l0_mask);

OK.


Best regards,
	Maxim Levitsky

>
  

Patch

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 54eb152e2b60b6..1f2b8492c8782f 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -410,28 +410,45 @@  void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
 static void nested_sync_int_ctl_from_vmcb02(struct vcpu_svm *svm,
 					    struct vmcb *vmcb12)
 {
-	u32 mask;
+	struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
+	struct vmcb *vmcb01 = svm->vmcb01.ptr;
+
+	/* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb12*/
+	u32 l2_to_l1_mask = 0;
+	/* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb01*/
+	u32 l2_to_l0_mask = 0;
 
-	/* Only a few fields of int_ctl are written by the processor.  */
-	mask = V_IRQ_MASK | V_TPR_MASK;
-	if (!(svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK) &&
-	    svm_is_intercept(svm, INTERCEPT_VINTR)) {
+	if (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK)
+		l2_to_l1_mask |= V_IRQ_MASK | V_TPR_MASK;
+	else {
 		/*
-		 * In order to request an interrupt window, L0 is usurping
-		 * svm->vmcb->control.int_ctl and possibly setting V_IRQ
-		 * even if it was clear in L1's VMCB.  Restoring it would be
-		 * wrong.  However, in this case V_IRQ will remain true until
-		 * interrupt_window_interception calls svm_clear_vintr and
-		 * restores int_ctl.  We can just leave it aside.
+		 * If IRQ window was opened while in L2, it must be reopened
+		 * after the VM exit
+		 *
+		 * vTPR value doesn't need to be copied from vmcb02 to vmcb01
+		 * because it is synced from/to apic registers on each VM exit
 		 */
-		mask &= ~V_IRQ_MASK;
+		if (vmcb02->control.int_ctl & V_IRQ_MASK)
+			kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 	}
 
 	if (nested_vgif_enabled(svm))
-		mask |= V_GIF_MASK;
+		l2_to_l1_mask |= V_GIF_MASK;
+	else
+		/* There is no need to sync V_GIF from vmcb02 to vmcb01
+		 * because GIF is cleared on VMexit, thus even though
+		 * nested guest can control host's GIF, on VM exit
+		 * its set value is lost
+		 */
+		;
+
+	vmcb12->control.int_ctl =
+		(svm->nested.ctl.int_ctl & ~l2_to_l1_mask) |
+		(vmcb02->control.int_ctl & l2_to_l1_mask);
 
-	vmcb12->control.int_ctl        &= ~mask;
-	vmcb12->control.int_ctl        |= svm->vmcb->control.int_ctl & mask;
+	vmcb01->control.int_ctl =
+		(vmcb01->control.int_ctl & ~l2_to_l0_mask) |
+		(vmcb02->control.int_ctl & l2_to_l0_mask);
 }
 
 /*