[38/44] KVM: Disable CPU hotplug during hardware enabling

Message ID 20221102231911.3107438-39-seanjc@google.com
State New
Headers
Series KVM: Rework kvm_init() and hardware enabling |

Commit Message

Sean Christopherson Nov. 2, 2022, 11:19 p.m. UTC
  From: Chao Gao <chao.gao@intel.com>

Disable CPU hotplug during hardware_enable_all() to prevent the corner
case where if the following sequence occurs:

  1. A hotplugged CPU marks itself online in cpu_online_mask
  2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE
     callback
  3  hardware_enable_all() is invoked on another CPU right

the hotplugged CPU will be included in on_each_cpu() and thus get sent
through hardware_enable_nolock() before kvm_online_cpu() is called.

        start_secondary { ...
                set_cpu_online(smp_processor_id(), true); <- 1
                ...
                local_irq_enable();  <- 2
                ...
                cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); <- 3
        }

KVM currently fudges around this race by keeping track of which CPUs have
done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which
cpus have virtualization enabled"), but that's an inefficient, convoluted,
and hacky solution.

Signed-off-by: Chao Gao <chao.gao@intel.com>
[sean: split to separate patch, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c  |  8 +++++++-
 virt/kvm/kvm_main.c | 10 ++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)
  

Comments

Kai Huang Nov. 10, 2022, 1:08 a.m. UTC | #1
On Wed, 2022-11-02 at 23:19 +0000, Sean Christopherson wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Disable CPU hotplug during hardware_enable_all() to prevent the corner
> case where if the following sequence occurs:
> 
>   1. A hotplugged CPU marks itself online in cpu_online_mask
>   2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE
>      callback
>   3  hardware_enable_all() is invoked on another CPU right
> 
> the hotplugged CPU will be included in on_each_cpu() and thus get sent
> through hardware_enable_nolock() before kvm_online_cpu() is called.
> 
>         start_secondary { ...
>                 set_cpu_online(smp_processor_id(), true); <- 1
>                 ...
>                 local_irq_enable();  <- 2
>                 ...
>                 cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); <- 3
>         }
> 
> KVM currently fudges around this race by keeping track of which CPUs have
> done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which
> cpus have virtualization enabled"), but that's an inefficient, convoluted,
> and hacky solution.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> [sean: split to separate patch, write changelog]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c  |  8 +++++++-
>  virt/kvm/kvm_main.c | 10 ++++++++++
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a7b1d916ecb2..a15e54ba0471 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9283,7 +9283,13 @@ static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
>  	int cpu = smp_processor_id();
>  	struct cpuinfo_x86 *c = &cpu_data(cpu);
>  
> -	WARN_ON(!irqs_disabled());
> +	/*
> +	 * Compatibility checks are done when loading KVM and when enabling
> +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> +	 * compatible, i.e. KVM should never perform a compatibility check on
> +	 * an offline CPU.
> +	 */
> +	WARN_ON(!irqs_disabled() && cpu_active(cpu));

Comment doesn't match with the code?

"KVM should never perform a compatibility check on on offline CPU" should be
something like:

	WARN_ON(!cpu_online(cpu));

So, should the comment be something like below?

"KVM compatibility check happens before CPU is marked as active".

>  
>  	if (__cr4_reserved_bits(cpu_has, c) !=
>  	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index fd9e39c85549..4e765ef9f4bd 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5088,6 +5088,15 @@ static int hardware_enable_all(void)
>  {
>  	int r = 0;
>  
> +	/*
> +	 * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
> +	 * is called, and so on_each_cpu() between them includes the CPU that
> +	 * is being onlined.  As a result, hardware_enable_nolock() may get
> +	 * invoked before kvm_online_cpu().
> +	 *
> +	 * Disable CPU hotplug to prevent scenarios where KVM sees
> +	 */

The above sentence is broken.

I think below comment Quoted from Isaku's series should be OK?

	/*
	 * During onlining a CPU, cpu_online_mask is set before
kvm_online_cpu()
	 * is called. on_each_cpu() between them includes the CPU. As a result,
	 * hardware_enable_nolock() may get invoked before kvm_online_cpu().
	 * This would enable hardware virtualization on that cpu without
	 * compatibility checks, which can potentially crash system or break
	 * running VMs.
	 *
	 * Disable CPU hotplug to prevent this case from happening.
	 */

> +	cpus_read_lock();
>  	raw_spin_lock(&kvm_count_lock);
>  
>  	kvm_usage_count++;
> @@ -5102,6 +5111,7 @@ static int hardware_enable_all(void)
>  	}
>  
>  	raw_spin_unlock(&kvm_count_lock);
> +	cpus_read_unlock();
>  
>  	return r;
>  }
  
Kai Huang Nov. 10, 2022, 1:33 a.m. UTC | #2
On Wed, 2022-11-02 at 23:19 +0000, Sean Christopherson wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Disable CPU hotplug during hardware_enable_all() to prevent the corner
> case where if the following sequence occurs:
> 
>   1. A hotplugged CPU marks itself online in cpu_online_mask
>   2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE
>      callback
>   3  hardware_enable_all() is invoked on another CPU right
> 
> the hotplugged CPU will be included in on_each_cpu() and thus get sent
> through hardware_enable_nolock() before kvm_online_cpu() is called.
> 
>         start_secondary { ...
>                 set_cpu_online(smp_processor_id(), true); <- 1
>                 ...
>                 local_irq_enable();  <- 2
>                 ...
>                 cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); <- 3
>         }
> 
> KVM currently fudges around this race by keeping track of which CPUs have
> done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which
> cpus have virtualization enabled"), but that's an inefficient, convoluted,
> and hacky solution.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> [sean: split to separate patch, write changelog]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c  |  8 +++++++-
>  virt/kvm/kvm_main.c | 10 ++++++++++
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a7b1d916ecb2..a15e54ba0471 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9283,7 +9283,13 @@ static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
>  	int cpu = smp_processor_id();
>  	struct cpuinfo_x86 *c = &cpu_data(cpu);
>  
> -	WARN_ON(!irqs_disabled());
> +	/*
> +	 * Compatibility checks are done when loading KVM and when enabling
> +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> +	 * compatible, i.e. KVM should never perform a compatibility check on
> +	 * an offline CPU.
> +	 */
> +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
>  

Also, the logic of:

	!irqs_disabled() && cpu_active(cpu)

is quite weird.

The original "WARN(!irqs_disabled())" is reasonable because in STARTING section
the IRQ is indeed disabled.

But this doesn't make sense anymore after we move to ONLINE section, in which
IRQ has already been enabled (see start_secondary()).  IIUC the WARN_ON()
doesn't get exploded is purely because there's an additional cpu_active(cpu)
check.

So, a more reasonable check should be something like:

	WARN_ON(irqs_disabled() || cpu_active(cpu) || !cpu_online(cpu));

Or we can simply do:

	WARN_ON(!cpu_online(cpu) || cpu_active(cpu));

(because I don't know whether it's possible IRQ can somehow get disabled in
ONLINE section).

Btw above is purely based on code analysis, but I haven't done any test.
  
Kai Huang Nov. 10, 2022, 2:11 a.m. UTC | #3
On Thu, 2022-11-10 at 01:33 +0000, Huang, Kai wrote:
> > @@ -9283,7 +9283,13 @@ static int
> > kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> >  	int cpu = smp_processor_id();
> >  	struct cpuinfo_x86 *c = &cpu_data(cpu);
> >  
> > -	WARN_ON(!irqs_disabled());
> > +	/*
> > +	 * Compatibility checks are done when loading KVM and when enabling
> > +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> > +	 * compatible, i.e. KVM should never perform a compatibility check
> > on
> > +	 * an offline CPU.
> > +	 */
> > +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
> >  
> 
> Also, the logic of:
> 
> 	!irqs_disabled() && cpu_active(cpu)
> 
> is quite weird.
> 
> The original "WARN(!irqs_disabled())" is reasonable because in STARTING
> section
> the IRQ is indeed disabled.
> 
> But this doesn't make sense anymore after we move to ONLINE section, in which
> IRQ has already been enabled (see start_secondary()).  IIUC the WARN_ON()
> doesn't get exploded is purely because there's an additional cpu_active(cpu)
> check.
> 
> So, a more reasonable check should be something like:
> 
> 	WARN_ON(irqs_disabled() || cpu_active(cpu) || !cpu_online(cpu));
> 
> Or we can simply do:
> 
> 	WARN_ON(!cpu_online(cpu) || cpu_active(cpu));
> 
> (because I don't know whether it's possible IRQ can somehow get disabled in
> ONLINE section).
> 
> Btw above is purely based on code analysis, but I haven't done any test.

Hmm.. I wasn't thinking thoroughly.  I forgot CPU compatibility check also
happens on all online cpus when loading KVM.  For this case, IRQ is disabled and
cpu_active() is true.  For the hotplug case, IRQ is enabled but  cpu_active() is
false.

So WARN_ON(!irqs_disabled() && cpu_active(cpu)) looks reasonable.  Sorry for the
noise.  Just needed some time to connect the comment with the code.
  
Kai Huang Nov. 10, 2022, 2:20 a.m. UTC | #4
On Thu, 2022-11-10 at 01:08 +0000, Huang, Kai wrote:
> > -	WARN_ON(!irqs_disabled());
> > +	/*
> > +	 * Compatibility checks are done when loading KVM and when enabling
> > +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> > +	 * compatible, i.e. KVM should never perform a compatibility check
> > on
> > +	 * an offline CPU.
> > +	 */
> > +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
> 
> Comment doesn't match with the code?
> 
> "KVM should never perform a compatibility check on on offline CPU" should be
> something like:
> 
> 	WARN_ON(!cpu_online(cpu));
> 
> So, should the comment be something like below?
> 
> "KVM compatibility check happens before CPU is marked as active".

Also ignore this one as I only thought about hotplug case.
  
Sean Christopherson Nov. 10, 2022, 4:58 p.m. UTC | #5
On Thu, Nov 10, 2022, Huang, Kai wrote:
> On Thu, 2022-11-10 at 01:33 +0000, Huang, Kai wrote:
> > > @@ -9283,7 +9283,13 @@ static int
> > > kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> > >  	int cpu = smp_processor_id();
> > >  	struct cpuinfo_x86 *c = &cpu_data(cpu);
> > >  
> > > -	WARN_ON(!irqs_disabled());
> > > +	/*
> > > +	 * Compatibility checks are done when loading KVM and when enabling
> > > +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> > > +	 * compatible, i.e. KVM should never perform a compatibility check
> > > on
> > > +	 * an offline CPU.
> > > +	 */
> > > +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
> > >  
> > 
> > Also, the logic of:
> > 
> > 	!irqs_disabled() && cpu_active(cpu)
> > 
> > is quite weird.
> > 
> > The original "WARN(!irqs_disabled())" is reasonable because in STARTING
> > section
> > the IRQ is indeed disabled.
> > 
> > But this doesn't make sense anymore after we move to ONLINE section, in which
> > IRQ has already been enabled (see start_secondary()).  IIUC the WARN_ON()
> > doesn't get exploded is purely because there's an additional cpu_active(cpu)
> > check.
> > 
> > So, a more reasonable check should be something like:
> > 
> > 	WARN_ON(irqs_disabled() || cpu_active(cpu) || !cpu_online(cpu));
> > 
> > Or we can simply do:
> > 
> > 	WARN_ON(!cpu_online(cpu) || cpu_active(cpu));
> > 
> > (because I don't know whether it's possible IRQ can somehow get disabled in
> > ONLINE section).
> > 
> > Btw above is purely based on code analysis, but I haven't done any test.
> 
> Hmm.. I wasn't thinking thoroughly.  I forgot CPU compatibility check also
> happens on all online cpus when loading KVM.  For this case, IRQ is disabled and
> cpu_active() is true.  For the hotplug case, IRQ is enabled but  cpu_active() is
> false.
> 
> So WARN_ON(!irqs_disabled() && cpu_active(cpu)) looks reasonable.  Sorry for the
> noise.  Just needed some time to connect the comment with the code.

No worries, more than once while working through this code, I've considered setting
up one of those evidence boards from the movies with string and push pins to help
connect all the dots.
  
Sean Christopherson Nov. 15, 2022, 8:16 p.m. UTC | #6
On Thu, Nov 10, 2022, Huang, Kai wrote:
> On Thu, 2022-11-10 at 01:33 +0000, Huang, Kai wrote:
> > > @@ -9283,7 +9283,13 @@ static int
> > > kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> > >  	int cpu = smp_processor_id();
> > >  	struct cpuinfo_x86 *c = &cpu_data(cpu);
> > >  
> > > -	WARN_ON(!irqs_disabled());
> > > +	/*
> > > +	 * Compatibility checks are done when loading KVM and when enabling
> > > +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> > > +	 * compatible, i.e. KVM should never perform a compatibility check
> > > on
> > > +	 * an offline CPU.
> > > +	 */
> > > +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
> > >  
> > 
> > Also, the logic of:
> > 
> > 	!irqs_disabled() && cpu_active(cpu)
> > 
> > is quite weird.
> > 
> > The original "WARN(!irqs_disabled())" is reasonable because in STARTING
> > section
> > the IRQ is indeed disabled.
> > 
> > But this doesn't make sense anymore after we move to ONLINE section, in which
> > IRQ has already been enabled (see start_secondary()).  IIUC the WARN_ON()
> > doesn't get exploded is purely because there's an additional cpu_active(cpu)
> > check.
> > 
> > So, a more reasonable check should be something like:
> > 
> > 	WARN_ON(irqs_disabled() || cpu_active(cpu) || !cpu_online(cpu));
> > 
> > Or we can simply do:
> > 
> > 	WARN_ON(!cpu_online(cpu) || cpu_active(cpu));
> > 
> > (because I don't know whether it's possible IRQ can somehow get disabled in
> > ONLINE section).
> > 
> > Btw above is purely based on code analysis, but I haven't done any test.
> 
> Hmm.. I wasn't thinking thoroughly.  I forgot CPU compatibility check also
> happens on all online cpus when loading KVM.  For this case, IRQ is disabled and
> cpu_active() is true.  For the hotplug case, IRQ is enabled but  cpu_active() is
> false.

Actually, you're right (and wrong).  You're right in that the WARN is flawed.  And
the reason for that is because you're wrong about the hotplug case.  In this version
of things, the compatibility checks are routed through hardware enabling, i.e. this
flow is used only when loading KVM.  This helper should only be called via SMP function
call, which means that IRQs should always be disabled.
  
Sean Christopherson Nov. 15, 2022, 8:21 p.m. UTC | #7
On Tue, Nov 15, 2022, Sean Christopherson wrote:
> On Thu, Nov 10, 2022, Huang, Kai wrote:
> > On Thu, 2022-11-10 at 01:33 +0000, Huang, Kai wrote:
> > > > @@ -9283,7 +9283,13 @@ static int
> > > > kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> > > >  	int cpu = smp_processor_id();
> > > >  	struct cpuinfo_x86 *c = &cpu_data(cpu);
> > > >  
> > > > -	WARN_ON(!irqs_disabled());
> > > > +	/*
> > > > +	 * Compatibility checks are done when loading KVM and when enabling
> > > > +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> > > > +	 * compatible, i.e. KVM should never perform a compatibility check
> > > > on
> > > > +	 * an offline CPU.
> > > > +	 */
> > > > +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
> > > >  
> > > 
> > > Also, the logic of:
> > > 
> > > 	!irqs_disabled() && cpu_active(cpu)
> > > 
> > > is quite weird.
> > > 
> > > The original "WARN(!irqs_disabled())" is reasonable because in STARTING
> > > section
> > > the IRQ is indeed disabled.
> > > 
> > > But this doesn't make sense anymore after we move to ONLINE section, in which
> > > IRQ has already been enabled (see start_secondary()).  IIUC the WARN_ON()
> > > doesn't get exploded is purely because there's an additional cpu_active(cpu)
> > > check.
> > > 
> > > So, a more reasonable check should be something like:
> > > 
> > > 	WARN_ON(irqs_disabled() || cpu_active(cpu) || !cpu_online(cpu));
> > > 
> > > Or we can simply do:
> > > 
> > > 	WARN_ON(!cpu_online(cpu) || cpu_active(cpu));
> > > 
> > > (because I don't know whether it's possible IRQ can somehow get disabled in
> > > ONLINE section).
> > > 
> > > Btw above is purely based on code analysis, but I haven't done any test.
> > 
> > Hmm.. I wasn't thinking thoroughly.  I forgot CPU compatibility check also
> > happens on all online cpus when loading KVM.  For this case, IRQ is disabled and
> > cpu_active() is true.  For the hotplug case, IRQ is enabled but  cpu_active() is
> > false.
> 
> Actually, you're right (and wrong).  You're right in that the WARN is flawed.  And
> the reason for that is because you're wrong about the hotplug case.  In this version
> of things, the compatibility checks are routed through hardware enabling, i.e. this
> flow is used only when loading KVM.  This helper should only be called via SMP function
> call, which means that IRQs should always be disabled.

Grr, but not routing through this helper is flawed in that KVM doesn't do the
CR4 checks in the hardware enabling case.  Don't think that changes the WARN, but
other patches in this series need tweaks.
  
Kai Huang Nov. 16, 2022, 12:23 p.m. UTC | #8
On Tue, 2022-11-15 at 20:16 +0000, Sean Christopherson wrote:
> On Thu, Nov 10, 2022, Huang, Kai wrote:
> > On Thu, 2022-11-10 at 01:33 +0000, Huang, Kai wrote:
> > > > @@ -9283,7 +9283,13 @@ static int
> > > > kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> > > >  	int cpu = smp_processor_id();
> > > >  	struct cpuinfo_x86 *c = &cpu_data(cpu);
> > > >  
> > > > -	WARN_ON(!irqs_disabled());
> > > > +	/*
> > > > +	 * Compatibility checks are done when loading KVM and when enabling
> > > > +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> > > > +	 * compatible, i.e. KVM should never perform a compatibility check
> > > > on
> > > > +	 * an offline CPU.
> > > > +	 */
> > > > +	WARN_ON(!irqs_disabled() && cpu_active(cpu));
> > > >  
> > > 
> > > Also, the logic of:
> > > 
> > > 	!irqs_disabled() && cpu_active(cpu)
> > > 
> > > is quite weird.
> > > 
> > > The original "WARN(!irqs_disabled())" is reasonable because in STARTING
> > > section
> > > the IRQ is indeed disabled.
> > > 
> > > But this doesn't make sense anymore after we move to ONLINE section, in which
> > > IRQ has already been enabled (see start_secondary()).  IIUC the WARN_ON()
> > > doesn't get exploded is purely because there's an additional cpu_active(cpu)
> > > check.
> > > 
> > > So, a more reasonable check should be something like:
> > > 
> > > 	WARN_ON(irqs_disabled() || cpu_active(cpu) || !cpu_online(cpu));
> > > 
> > > Or we can simply do:
> > > 
> > > 	WARN_ON(!cpu_online(cpu) || cpu_active(cpu));
> > > 
> > > (because I don't know whether it's possible IRQ can somehow get disabled in
> > > ONLINE section).
> > > 
> > > Btw above is purely based on code analysis, but I haven't done any test.
> > 
> > Hmm.. I wasn't thinking thoroughly.  I forgot CPU compatibility check also
> > happens on all online cpus when loading KVM.  For this case, IRQ is disabled and
> > cpu_active() is true.  For the hotplug case, IRQ is enabled but  cpu_active() is
> > false.
> 
> Actually, you're right (and wrong).  You're right in that the WARN is flawed.  And
> the reason for that is because you're wrong about the hotplug case.  In this version
> of things, the compatibility checks are routed through hardware enabling, i.e. this
> flow is used only when loading KVM.  This helper should only be called via SMP function
> call, which means that IRQs should always be disabled.

Did you mean below code change in later patch "[PATCH 39/44] KVM: Drop
kvm_count_lock and instead protect kvm_usage_count with kvm_lock"?

 	/*
 	 * Abort the CPU online process if hardware virtualization cannot
 	 * be enabled. Otherwise running VMs would encounter unrecoverable
@@ -5039,13 +5039,16 @@ static int kvm_online_cpu(unsigned int cpu)
 	if (kvm_usage_count) {
 		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
 
+		local_irq_save(flags);
 		hardware_enable_nolock(NULL);
+		local_irq_restore(flags);
+
  
Sean Christopherson Nov. 16, 2022, 5:11 p.m. UTC | #9
On Wed, Nov 16, 2022, Huang, Kai wrote:
> On Tue, 2022-11-15 at 20:16 +0000, Sean Christopherson wrote:
> > On Thu, Nov 10, 2022, Huang, Kai wrote:
> > > On Thu, 2022-11-10 at 01:33 +0000, Huang, Kai wrote:
> > > Hmm.. I wasn't thinking thoroughly.  I forgot CPU compatibility check also
> > > happens on all online cpus when loading KVM.  For this case, IRQ is disabled and
> > > cpu_active() is true.  For the hotplug case, IRQ is enabled but  cpu_active() is
> > > false.
> > 
> > Actually, you're right (and wrong).  You're right in that the WARN is flawed.  And
> > the reason for that is because you're wrong about the hotplug case.  In this version
> > of things, the compatibility checks are routed through hardware enabling, i.e. this
> > flow is used only when loading KVM.  This helper should only be called via SMP function
> > call, which means that IRQs should always be disabled.
> 
> Did you mean below code change in later patch "[PATCH 39/44] KVM: Drop
> kvm_count_lock and instead protect kvm_usage_count with kvm_lock"?
> 
>  	/*
>  	 * Abort the CPU online process if hardware virtualization cannot
>  	 * be enabled. Otherwise running VMs would encounter unrecoverable
> @@ -5039,13 +5039,16 @@ static int kvm_online_cpu(unsigned int cpu)
>  	if (kvm_usage_count) {
>  		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
>  
> +		local_irq_save(flags);
>  		hardware_enable_nolock(NULL);
> +		local_irq_restore(flags);

Sort of.  What I was saying is that in this v1, the compatibility checks that are
done during harware enabling are initiated from vendor code, i.e. VMX and SVM call
{svm,vmx}_check_processor_compat() directly.  As a result, the compat checks that
are handled in common code:

	if (__cr4_reserved_bits(cpu_has, c) !=
	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
		return -EIO;

are skipped.  And if that's fixed, then the above hardware_enable_nolock() call
will bounce through kvm_x86_check_processor_compatibility() with IRQs enabled
once the KVM hotplug hook is moved to the ONLINE section.

As above, the simple "fix" would be to disable IRQs, but that's not actually
necessary.  The only requirement is that preemption is disabled so that the checks
are done on the current CPU.  The "IRQs disabled" check was a deliberately
agressive WARN that was added to guard against doing compatibility checks from
the "wrong" location.

E.g. this is what I ended up with for a changelog to drop the irqs_disabled()
check and for the end code (though it's not tested yet...)

    Drop kvm_x86_check_processor_compatibility()'s WARN that IRQs are
    disabled, as the ONLINE section runs with IRQs disabled.  The WARN wasn't
    intended to be a requirement, e.g. disabling preemption is sufficient,
    the IRQ thing was purely an aggressive sanity check since the helper was
    only ever invoked via SMP function call.


static int kvm_x86_check_processor_compatibility(void)
{
        int cpu = smp_processor_id();
        struct cpuinfo_x86 *c = &cpu_data(cpu);

        /*
         * Compatibility checks are done when loading KVM and when enabling
         * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
         * compatible, i.e. KVM should never perform a compatibility check on
         * an offline CPU.
         */
        WARN_ON(!cpu_online(cpu));

        if (__cr4_reserved_bits(cpu_has, c) !=
            __cr4_reserved_bits(cpu_has, &boot_cpu_data))
                return -EIO;

        return static_call(kvm_x86_check_processor_compatibility)();
}


int kvm_arch_hardware_enable(void)
{
        struct kvm *kvm;
        struct kvm_vcpu *vcpu;
        unsigned long i;
        int ret;
        u64 local_tsc;
        u64 max_tsc = 0;
        bool stable, backwards_tsc = false;

        kvm_user_return_msr_cpu_online();

        ret = kvm_x86_check_processor_compatibility();
        if (ret)
                return ret;

        ret = static_call(kvm_x86_hardware_enable)();
        if (ret != 0)
                return ret;


	....
}
  
Kai Huang Nov. 17, 2022, 1:39 a.m. UTC | #10
On Wed, 2022-11-16 at 17:11 +0000, Sean Christopherson wrote:
> On Wed, Nov 16, 2022, Huang, Kai wrote:
> > On Tue, 2022-11-15 at 20:16 +0000, Sean Christopherson wrote:
> > > On Thu, Nov 10, 2022, Huang, Kai wrote:
> > > > On Thu, 2022-11-10 at 01:33 +0000, Huang, Kai wrote:
> > > > Hmm.. I wasn't thinking thoroughly.  I forgot CPU compatibility check also
> > > > happens on all online cpus when loading KVM.  For this case, IRQ is disabled and
> > > > cpu_active() is true.  For the hotplug case, IRQ is enabled but  cpu_active() is
> > > > false.
> > > 
> > > Actually, you're right (and wrong).  You're right in that the WARN is flawed.  And
> > > the reason for that is because you're wrong about the hotplug case.  In this version
> > > of things, the compatibility checks are routed through hardware enabling, i.e. this
> > > flow is used only when loading KVM.  This helper should only be called via SMP function
> > > call, which means that IRQs should always be disabled.
> > 
> > Did you mean below code change in later patch "[PATCH 39/44] KVM: Drop
> > kvm_count_lock and instead protect kvm_usage_count with kvm_lock"?
> > 
> >  	/*
> >  	 * Abort the CPU online process if hardware virtualization cannot
> >  	 * be enabled. Otherwise running VMs would encounter unrecoverable
> > @@ -5039,13 +5039,16 @@ static int kvm_online_cpu(unsigned int cpu)
> >  	if (kvm_usage_count) {
> >  		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
> >  
> > +		local_irq_save(flags);
> >  		hardware_enable_nolock(NULL);
> > +		local_irq_restore(flags);
> 
> Sort of.  What I was saying is that in this v1, the compatibility checks that are
> done during harware enabling are initiated from vendor code, i.e. VMX and SVM call
> {svm,vmx}_check_processor_compat() directly.  As a result, the compat checks that
> are handled in common code:
> 
> 	if (__cr4_reserved_bits(cpu_has, c) !=
> 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> 		return -EIO;
> 
> are skipped.  And if that's fixed, then the above hardware_enable_nolock() call
> will bounce through kvm_x86_check_processor_compatibility() with IRQs enabled
> once the KVM hotplug hook is moved to the ONLINE section.

Oh I see.  So you still want the kvm_x86_ops->check_processor_compatibility(),
in order to avoid duplicating the above code in SVM and VMX.

> 
> As above, the simple "fix" would be to disable IRQs, but that's not actually
> necessary.  The only requirement is that preemption is disabled so that the checks
> are done on the current CPU.  
> 

Probably even preemption is allowed, as long as the compatibility check is not
scheduled to another cpu.


> The "IRQs disabled" check was a deliberately
> agressive WARN that was added to guard against doing compatibility checks from
> the "wrong" location.
> 
> E.g. this is what I ended up with for a changelog to drop the irqs_disabled()
> check and for the end code (though it's not tested yet...)
> 
>     Drop kvm_x86_check_processor_compatibility()'s WARN that IRQs are
>     disabled, as the ONLINE section runs with IRQs disabled.  The WARN wasn't
						     ^
						     enabled.

>     intended to be a requirement, e.g. disabling preemption is sufficient,
>     the IRQ thing was purely an aggressive sanity check since the helper was
>     only ever invoked via SMP function call.
> 
> 
> static int kvm_x86_check_processor_compatibility(void)
> {
>         int cpu = smp_processor_id();
>         struct cpuinfo_x86 *c = &cpu_data(cpu);
> 
>         /*
>          * Compatibility checks are done when loading KVM and when enabling
>          * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
>          * compatible, i.e. KVM should never perform a compatibility check on
>          * an offline CPU.
>          */
>         WARN_ON(!cpu_online(cpu));

Looks good to me.  Perhaps this also can be removed, though.

And IMHO the removing of WARN_ON(!irq_disabled()) should be folded to the patch
"[PATCH 37/44] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section". 
Because moving from STARTING section to ONLINE section changes the IRQ status
when the compatibility check is called.

> 
>         if (__cr4_reserved_bits(cpu_has, c) !=
>             __cr4_reserved_bits(cpu_has, &boot_cpu_data))
>                 return -EIO;
> 
>         return static_call(kvm_x86_check_processor_compatibility)();
> }
> 
> 
> int kvm_arch_hardware_enable(void)
> {
>         struct kvm *kvm;
>         struct kvm_vcpu *vcpu;
>         unsigned long i;
>         int ret;
>         u64 local_tsc;
>         u64 max_tsc = 0;
>         bool stable, backwards_tsc = false;
> 
>         kvm_user_return_msr_cpu_online();
> 
>         ret = kvm_x86_check_processor_compatibility();
>         if (ret)
>                 return ret;
> 
>         ret = static_call(kvm_x86_hardware_enable)();
>         if (ret != 0)
>                 return ret;
> 
> 
> 	....
> }
  
Sean Christopherson Nov. 17, 2022, 3:16 p.m. UTC | #11
On Thu, Nov 17, 2022, Huang, Kai wrote:
> On Wed, 2022-11-16 at 17:11 +0000, Sean Christopherson wrote:
> > static int kvm_x86_check_processor_compatibility(void)
> > {
> >         int cpu = smp_processor_id();
> >         struct cpuinfo_x86 *c = &cpu_data(cpu);
> > 
> >         /*
> >          * Compatibility checks are done when loading KVM and when enabling
> >          * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> >          * compatible, i.e. KVM should never perform a compatibility check on
> >          * an offline CPU.
> >          */
> >         WARN_ON(!cpu_online(cpu));
> 
> Looks good to me.  Perhaps this also can be removed, though.

Hmm, it's a bit superfluous, but I think it could fire if KVM messed up CPU
hotplug again, e.g. if the for_each_online_cpu() => IPI raced with CPU unplug.

> And IMHO the removing of WARN_ON(!irq_disabled()) should be folded to the patch
> "[PATCH 37/44] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section". 
> Because moving from STARTING section to ONLINE section changes the IRQ status
> when the compatibility check is called.

Yep, that's what I have coded up, just smushed it all together here.
  

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a7b1d916ecb2..a15e54ba0471 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9283,7 +9283,13 @@  static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
 	int cpu = smp_processor_id();
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-	WARN_ON(!irqs_disabled());
+	/*
+	 * Compatibility checks are done when loading KVM and when enabling
+	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
+	 * compatible, i.e. KVM should never perform a compatibility check on
+	 * an offline CPU.
+	 */
+	WARN_ON(!irqs_disabled() && cpu_active(cpu));
 
 	if (__cr4_reserved_bits(cpu_has, c) !=
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fd9e39c85549..4e765ef9f4bd 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5088,6 +5088,15 @@  static int hardware_enable_all(void)
 {
 	int r = 0;
 
+	/*
+	 * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
+	 * is called, and so on_each_cpu() between them includes the CPU that
+	 * is being onlined.  As a result, hardware_enable_nolock() may get
+	 * invoked before kvm_online_cpu().
+	 *
+	 * Disable CPU hotplug to prevent scenarios where KVM sees
+	 */
+	cpus_read_lock();
 	raw_spin_lock(&kvm_count_lock);
 
 	kvm_usage_count++;
@@ -5102,6 +5111,7 @@  static int hardware_enable_all(void)
 	}
 
 	raw_spin_unlock(&kvm_count_lock);
+	cpus_read_unlock();
 
 	return r;
 }