[v2,03/11] KVM: nSVM: explicitly raise KVM_REQ_EVENT on nested VM exit if L1 doesn't intercept interrupts

Message ID 20221129193717.513824-4-mlevitsk@redhat.com
State New
Headers
Series SVM: vNMI (with my fixes) |

Commit Message

Maxim Levitsky Nov. 29, 2022, 7:37 p.m. UTC
  If the L2 doesn't intercept interrupts, then the KVM will use vmcb02's
V_IRQ for L1 (to detect the interrupt window)

In this case on the nested VM exit KVM might need to copy the V_IRQ bit
from the vmcb02 to the vmcb01, to continue waiting for the
interrupt window.

To make it simple, just raise the KVM_REQ_EVENT request, which
execution will lead to the reenabling of the interrupt
window if needed.

Note that this is a theoretical bug because the KVM already does raise
the KVM_REQ_EVENT request one each nested VM exit because the nested
VM exit resets RFLAGS and the kvm_set_rflags() raises the
KVM_REQ_EVENT request in the response.

However raising this request explicitly, together with
documenting why this is needed, is still preferred.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
  

Comments

Sean Christopherson Jan. 28, 2023, 12:56 a.m. UTC | #1
Shortlog is too long, maybe this?

  KVM: nSVM: Raise event on nested VM exit if L1 doesn't intercept IRQs

On Tue, Nov 29, 2022, Maxim Levitsky wrote:
> If the L2 doesn't intercept interrupts, then the KVM will use vmcb02's

s/the L2/L2, though don't you mean L1?

> V_IRQ for L1 (to detect the interrupt window)

"an interrupt window", i.e. there's not just one window.

> In this case on the nested VM exit KVM might need to copy the V_IRQ bit

s/the nested/nested

> from the vmcb02 to the vmcb01, to continue waiting for the
> interrupt window.
> 
> To make it simple, just raise the KVM_REQ_EVENT request, which
> execution will lead to the reenabling of the interrupt
> window if needed.
> 
> Note that this is a theoretical bug because the KVM already does raise

s/the KVM/KVM

> the KVM_REQ_EVENT request one each nested VM exit because the nested

s/the KVM_REQ_EVENT/KVM_REQ_EVENT, and s/one/on

> VM exit resets RFLAGS and the kvm_set_rflags() raises the
> KVM_REQ_EVENT request in the response.
> 
> However raising this request explicitly, together with
> documenting why this is needed, is still preferred.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/nested.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index aad3145b2f62fe..e891318595113e 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1016,6 +1016,31 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>  
>  	svm_switch_vmcb(svm, &svm->vmcb01);
>  
> +	/* Note about synchronizing some of int_ctl bits from vmcb02 to vmcb01:

	/*
	 * Preferred block comment style...

> +	 *
> +	 * - V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:

Drop the "-" to be consistent with the rest of the paragraphs.

> +	 * If the L2 doesn't intercept interrupts, then
> +	 * (even if the L2 does use virtual interrupt masking),

KVM uses "L2" to refer to the thing running at L2.  I think what you are referring
to here is vmcb12?  And that's controlled by L1.

> +	 * KVM will use the vmcb02's V_INTR to detect interrupt window.

s/the vmcb02/vmcb02

Which of the V_INTR fields does this refer to?  Oooh, you're saying the KVM injects
a virtual interrupt into L2 using vmcb02 in order to determine when L2 has IRQs
enabled.

Why does KVM do that?  Why not pend the actual IRQ directly?  

> +	 *
> +	 * In this case, the KVM raises the KVM_REQ_EVENT to ensure that interrupt window

s/the KVM_REQ_EVENT/KVM_REQ_EVENT

> +	 * is not lost and this implicitly copies these bits from vmcb02 to vmcb01

Too many pronouns.  What do "this" and "these bits" refer to?

> +	 *
> +	 * V_TPR:
> +	 * If the L2 doesn't use virtual interrupt masking, then the L1's vTPR
> +	 * is stored in the vmcb02 but its value doesn't need to be copied from/to
> +	 * vmcb01 because it is copied from/to the TPR APIC's register on
> +	 * each VM entry/exit.
> +	 *
> +	 * V_GIF:
> +	 * - If the nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's V_GIF,

Drop this "-" too.

> +	 * however, the L1 vGIF is reset to false on each VM exit, thus
> +	 * there is no need to copy it from vmcb02 to vmcb01.
> +	 */
> +
> +	if (!nested_exit_on_intr(svm))
> +		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
> +
>  	if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
>  		svm_copy_lbrs(vmcb12, vmcb02);
>  		svm_update_lbrv(vcpu);
> -- 
> 2.26.3
>
  
Sean Christopherson Jan. 30, 2023, 6:41 p.m. UTC | #2
On Sat, Jan 28, 2023, Sean Christopherson wrote:
> > +	 * If the L2 doesn't intercept interrupts, then
> > +	 * (even if the L2 does use virtual interrupt masking),
> 
> KVM uses "L2" to refer to the thing running at L2.  I think what you are referring
> to here is vmcb12?  And that's controlled by L1.
> 
> > +	 * KVM will use the vmcb02's V_INTR to detect interrupt window.
> 
> s/the vmcb02/vmcb02
> 
> Which of the V_INTR fields does this refer to?  Oooh, you're saying the KVM injects
> a virtual interrupt into L2 using vmcb02 in order to determine when L2 has IRQs
> enabled.
> 
> Why does KVM do that?  Why not pend the actual IRQ directly?

Duh, because KVM needs to gain control in if there are multiple pending events.
  

Patch

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index aad3145b2f62fe..e891318595113e 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1016,6 +1016,31 @@  int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	svm_switch_vmcb(svm, &svm->vmcb01);
 
+	/* Note about synchronizing some of int_ctl bits from vmcb02 to vmcb01:
+	 *
+	 * - V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:
+	 * If the L2 doesn't intercept interrupts, then
+	 * (even if the L2 does use virtual interrupt masking),
+	 * KVM will use the vmcb02's V_INTR to detect interrupt window.
+	 *
+	 * In this case, the KVM raises the KVM_REQ_EVENT to ensure that interrupt window
+	 * is not lost and this implicitly copies these bits from vmcb02 to vmcb01
+	 *
+	 * V_TPR:
+	 * If the L2 doesn't use virtual interrupt masking, then the L1's vTPR
+	 * is stored in the vmcb02 but its value doesn't need to be copied from/to
+	 * vmcb01 because it is copied from/to the TPR APIC's register on
+	 * each VM entry/exit.
+	 *
+	 * V_GIF:
+	 * - If the nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's V_GIF,
+	 * however, the L1 vGIF is reset to false on each VM exit, thus
+	 * there is no need to copy it from vmcb02 to vmcb01.
+	 */
+
+	if (!nested_exit_on_intr(svm))
+		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
+
 	if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
 		svm_copy_lbrs(vmcb12, vmcb02);
 		svm_update_lbrv(vcpu);