[0/2] KVM: x86: Propagate AMD-specific IBRS bits to guests

Message ID 20230227210526.83182-1-itazur@amazon.com
Headers
Series KVM: x86: Propagate AMD-specific IBRS bits to guests |

Message

Takahiro Itazuri Feb. 27, 2023, 9:05 p.m. UTC
  VMMs retrieve supported CPUID features via KVM_GET_SUPPORTED_CPUID to
construct CPUID information to be passed to KVM_SET_CPUID2. Most CPUID
feature bits related to speculative attacks are propagated from host
CPUID. But AMD processors have AMD-specific IBRS related bits in CPUID
Fn8000_0008_EBX (ref: AMD64 Architecture Programmer's Manual Volume 3:
General-Purpose and System Instructions) and some bits are not
propagated to guests.

Enable propagation of these bits to guests, so that VMMs don't have to
enable them explicitly based on host CPUID.

Takahiro Itazuri (2):
  x86/cpufeatures: Add AMD-specific IBRS bits
  KVM: x86: Propagate AMD-specific IBRS related bits

 arch/x86/include/asm/cpufeatures.h | 3 +++
 arch/x86/kvm/cpuid.c               | 5 +++--
 2 files changed, 6 insertions(+), 2 deletions(-)
  

Comments

Borislav Petkov Feb. 27, 2023, 9:40 p.m. UTC | #1
On Mon, Feb 27, 2023 at 09:05:24PM +0000, Takahiro Itazuri wrote:
> VMMs retrieve supported CPUID features via KVM_GET_SUPPORTED_CPUID to
> construct CPUID information to be passed to KVM_SET_CPUID2. Most CPUID
> feature bits related to speculative attacks are propagated from host
> CPUID. But AMD processors have AMD-specific IBRS related bits in CPUID
> Fn8000_0008_EBX (ref: AMD64 Architecture Programmer's Manual Volume 3:
> General-Purpose and System Instructions) and some bits are not
> propagated to guests.
> 
> Enable propagation of these bits to guests, so that VMMs don't have to
> enable them explicitly based on host CPUID.

How hard is it for the VMMs to enable them?
  
Takahiro Itazuri Feb. 28, 2023, 6:13 p.m. UTC | #2
Date:   Mon, 27 Feb 2023 22:40:21 +0100
From:   Borislav Petkov <bp@alien8.de>
> On Mon, Feb 27, 2023 at 09:05:24PM +0000, Takahiro Itazuri wrote:
> > VMMs retrieve supported CPUID features via KVM_GET_SUPPORTED_CPUID to
> > construct CPUID information to be passed to KVM_SET_CPUID2. Most CPUID
> > feature bits related to speculative attacks are propagated from host
> > CPUID. But AMD processors have AMD-specific IBRS related bits in CPUID
> > Fn8000_0008_EBX (ref: AMD64 Architecture Programmer's Manual Volume 3:
> > General-Purpose and System Instructions) and some bits are not
> > propagated to guests.
> >
> > Enable propagation of these bits to guests, so that VMMs don't have to
> > enable them explicitly based on host CPUID.
> 
> How hard is it for the VMMs to enable them?

Actually it is not so hard. What VMMs need to do is:
1. Get host CPUID value.
2. Check if these bits are set.
3. Modify the return value of KVM_GET_SUPPORTED_CPUID based on step 2.
4. Pass it to KVM_SET_CPUID2.

If these bits are propagated to guests same as other bits, VMMs can
skip the above process.

https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt
> This ioctl returns x86 cpuid features which are supported by both the
> hardware and kvm in its default configuration.  Userspace can use the
> information returned by this ioctl to construct cpuid information (for
> KVM_SET_CPUID2) that is consistent with hardware, kernel, and
> userspace capabilities, and with user requirements (for example, the
> user may wish to constrain cpuid to emulate older hardware, or for
> feature consistency across a cluster).

VMMs trust to some extent that KVM_GET_SUPPORTED_CPUID returns cpuid
information consistent with hardware, although they should not for some
leaves (like CPU topoligy). IMHO, propagating these bits without VMM
actions would be helpful since guests come to know IBRS related
information of processors by default and applies mitigations properly
based on that information.

Best regards,
Takahiro Itazuri
  
Borislav Petkov Feb. 28, 2023, 7:24 p.m. UTC | #3
On Tue, Feb 28, 2023 at 06:13:45PM +0000, Takahiro Itazuri wrote:
> VMMs trust to some extent that KVM_GET_SUPPORTED_CPUID returns cpuid
> information consistent with hardware, although they should not for some
> leaves (like CPU topoligy). IMHO, propagating these bits without VMM
> actions would be helpful since guests come to know IBRS related
> information of processors by default and applies mitigations properly
> based on that information.

I'd prefer if VMMs did supply whatever they prefer to the guests
instead. None of those bits are used in the kernel for mitigations, as
you've realized.

Thx.
  
Takahiro Itazuri Feb. 28, 2023, 7:41 p.m. UTC | #4
Date:   Tue, 28 Feb 2023 20:24:12 +0100
From:   Borislav Petkov <bp@alien8.de>
> I'd prefer if VMMs did supply whatever they prefer to the guests
> instead. None of those bits are used in the kernel for mitigations, as
> you've realized.

It is true that the kernel does not use those bits at all, but any
codes could be run inside guests.

One of examples is the following spectre/meltdown checker scipt used as
de facto standard.
https://github.com/speed47/spectre-meltdown-checker/blob/master/spectre-meltdown-checker.sh#L2768

Best regards,
Takahiro Itazuri
  
Borislav Petkov Feb. 28, 2023, 8:45 p.m. UTC | #5
On Tue, Feb 28, 2023 at 07:41:53PM +0000, Takahiro Itazuri wrote:
> It is true that the kernel does not use those bits at all, but any
> codes could be run inside guests.

So you mean we should stick *all* CPUID leafs in there just because
anything can run in guests?

What is the hypervisor then for?

> One of examples is the following spectre/meltdown checker scipt used as
> de facto standard.

Really? Says who?

$ grep -r . /sys/devices/system/cpu/vulnerabilities/

gives you all you need to know.

And if something's missing, then I'm listening.

Thx.
  
Takahiro Itazuri Feb. 28, 2023, 10:24 p.m. UTC | #6
Date:   Tue, 28 Feb 2023 21:45:56 +0100
From:   Borislav Petkov <bp@alien8.de>
> So you mean we should stick *all* CPUID leafs in there just because
> anything can run in guests?
> 
> What is the hypervisor then for?

I'm still a kernel newbie and I don't have a strong opinion for that.
I just thought it would be helpful if the KVM_GET_SUPPORTED_CPUID API
returns the same security information as the host, as long as it is
harmless. I'm inclined to withdraw this patch if it is not worth
enough.

> Really? Says who?
> 
> $ grep -r . /sys/devices/system/cpu/vulnerabilities/
> 
> gives you all you need to know.
> 
> And if something's missing, then I'm listening.

"De facto standard" was too much. I apologize for my incorrect
expression and poor English. What I wanted to say is that the script
was introduced as a useful tool by Intel and SLES and it provides some
additional information (IBRS always-on in this case).
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/secure-coding/spectre-and-meltdown-checker-script.html
https://documentation.suse.com/sles/15-SP1/html/SLES-all/cha-spectre.html

Best regards,
Takahiro Itazuri
  
Borislav Petkov Feb. 28, 2023, 10:50 p.m. UTC | #7
On Tue, Feb 28, 2023 at 10:24:16PM +0000, Takahiro Itazuri wrote:
> I'm still a kernel newbie and I don't have a strong opinion for that.
> I just thought it would be helpful if the KVM_GET_SUPPORTED_CPUID API
> returns the same security information as the host, as long as it is
> harmless.

Not harmless - cpufeatures.h should contain flags which the kernel uses
and not *every* CPUID bit out there.

If you want to advertize flags to guests, see
arch/x86/kvm/reverse_cpuid.h and the KVM-only feature flags. You can add
them there.

> https://documentation.suse.com/sles/15-SP1/html/SLES-all/cha-spectre.html

Well, I was against adding that to the documentation when I was at SUSE
but ...
  
Sean Christopherson March 6, 2023, 9:16 p.m. UTC | #8
On Tue, Feb 28, 2023, Borislav Petkov wrote:
> On Tue, Feb 28, 2023 at 10:24:16PM +0000, Takahiro Itazuri wrote:
> > I'm still a kernel newbie and I don't have a strong opinion for that.
> > I just thought it would be helpful if the KVM_GET_SUPPORTED_CPUID API
> > returns the same security information as the host, as long as it is
> > harmless.
> 
> Not harmless - cpufeatures.h should contain flags which the kernel uses
> and not *every* CPUID bit out there.

I thought that the consensus was that adding unused-by-the-kernel flags to
cpufeatures.h is ok so long as the feature is hidden from /proc/cpuinfo and the
kernel already dedicates a word to the CPUID leaf?
  
Paolo Bonzini March 6, 2023, 9:25 p.m. UTC | #9
On 3/6/23 22:16, Sean Christopherson wrote:
>> Not harmless - cpufeatures.h should contain flags which the kernel uses
>> and not*every*  CPUID bit out there.
>
> I thought that the consensus was that adding unused-by-the-kernel flags to
> cpufeatures.h is ok so long as the feature is hidden from /proc/cpuinfo and the
> kernel already dedicates a word to the CPUID leaf?
Yeah, I understand adding no new CPUID leaf just for KVM, but you don't 
gain anything really from not having X86_FEATURE_* defines.

Paolo
  
Paolo Bonzini March 6, 2023, 9:31 p.m. UTC | #10
On 2/27/23 22:40, Borislav Petkov wrote:
> On Mon, Feb 27, 2023 at 09:05:24PM +0000, Takahiro Itazuri wrote:
>> VMMs retrieve supported CPUID features via KVM_GET_SUPPORTED_CPUID to
>> construct CPUID information to be passed to KVM_SET_CPUID2. Most CPUID
>> feature bits related to speculative attacks are propagated from host
>> CPUID. But AMD processors have AMD-specific IBRS related bits in CPUID
>> Fn8000_0008_EBX (ref: AMD64 Architecture Programmer's Manual Volume 3:
>> General-Purpose and System Instructions) and some bits are not
>> propagated to guests.
>>
>> Enable propagation of these bits to guests, so that VMMs don't have to
>> enable them explicitly based on host CPUID.
>
> How hard is it for the VMMs to enable them?

Let me rephrase the second paragraph of Takahiro's commit message:

"Tell the VMMs that they can pass the bits to the guests, instead of 
having to second-guess that the hypervisor does not have to do anything 
to support these bits".

In general, userspace should not second guess the hypervisor.  There are 
some rare cases in which QEMU (and probably the proprietary hypervisors 
at Google and Amazon) does that, but in general you want it to trust 
information coming from the kernel.  New CPUID bits are quite frequent, 
and sometimes also stupidly difficult to get right, so if filtering 
CPUID can be done in the kernel you won't have to do the same change N 
times in _all_ userspaces that use KVM.

Paolo
  
Borislav Petkov March 6, 2023, 9:44 p.m. UTC | #11
On Mon, Mar 06, 2023 at 01:16:25PM -0800, Sean Christopherson wrote:
> I thought that the consensus was that adding unused-by-the-kernel flags to
> cpufeatures.h is ok so long as the feature is hidden from /proc/cpuinfo and the
> kernel already dedicates a word to the CPUID leaf?

I guess we should finally write it down in Documentation/x86/cpuinfo.rst

And in case there's no dedicated word, it should be resorted to KVM-only
feature flags.

In any case, I'd like for baremetal CPUID stuff to be decoupled from
KVM's machinery as far as possible as both have different goals wrt
feature flags.

Thx.
  
Paolo Bonzini March 6, 2023, 9:47 p.m. UTC | #12
On 3/6/23 22:44, Borislav Petkov wrote:
>> I thought that the consensus was that adding unused-by-the-kernel flags to
>> cpufeatures.h is ok so long as the feature is hidden from /proc/cpuinfo and the
>> kernel already dedicates a word to the CPUID leaf?
> I guess we should finally write it down in Documentation/x86/cpuinfo.rst
> 
> And in case there's no dedicated word, it should be resorted to KVM-only
> feature flags.
> 
> In any case, I'd like for baremetal CPUID stuff to be decoupled from
> KVM's machinery as far as possible as both have different goals wrt
> feature flags.

It's very rare that KVM can provide a CPUID feature if the kernel has 
masked it, so if the kernel needs to know about a feature word than KVM 
most likely needs to know what kind of massaging the kernel has done.

Paolo
  
Borislav Petkov March 6, 2023, 9:54 p.m. UTC | #13
On Mon, Mar 06, 2023 at 10:47:18PM +0100, Paolo Bonzini wrote:
> It's very rare that KVM can provide a CPUID feature if the kernel has
> masked it,

I'm talking about pure hw feature bits which don't need any enablement.
Like AVX512 insns subset support or something else which the hw does
without the need for the kernel.

Those should be KVM-only if baremetal doesn't use them.
  
Sean Christopherson March 7, 2023, 6:49 p.m. UTC | #14
On Mon, Mar 06, 2023, Borislav Petkov wrote:
> On Mon, Mar 06, 2023 at 10:47:18PM +0100, Paolo Bonzini wrote:
> > It's very rare that KVM can provide a CPUID feature if the kernel has
> > masked it,
> 
> I'm talking about pure hw feature bits which don't need any enablement.
> Like AVX512 insns subset support or something else which the hw does
> without the need for the kernel.
> 
> Those should be KVM-only if baremetal doesn't use them.

I don't see what such a rule buys us beyond complexity and, IMO, unnecessary
maintenance burden.  As Paolo pointed out, when there's an existing word, the
only "cost" is the existence of the #define.  The bit is still present in the
capabilities, and KVM relies on this!  And as mentioned in the tangent about
reworking cpufeatures[*], I get a _lot_ of value out of cpufeatures.h being fully
populated for known bits (in defined words).

Forcing KVM to #define bits in reverse_cpuid.h just because the kernel doesn't
need the macro will inevitably lead to confusion for KVM developers, both when
writing new code and when reading existing code.

[*] https://lore.kernel.org/all/Y8nhtjFcsB63UsmQ@google.com
  
Borislav Petkov March 7, 2023, 6:58 p.m. UTC | #15
On Tue, Mar 07, 2023 at 10:49:22AM -0800, Sean Christopherson wrote:
> I don't see what such a rule buys us beyond complexity and, IMO, unnecessary
> maintenance burden.  As Paolo pointed out, when there's an existing word, the

Maybe I wasn't clear enough - I don't mind existing words. What I mind
is adding new ones only for KVM's sake.
  
Sean Christopherson March 7, 2023, 7:28 p.m. UTC | #16
On Tue, Mar 07, 2023, Borislav Petkov wrote:
> On Tue, Mar 07, 2023 at 10:49:22AM -0800, Sean Christopherson wrote:
> > I don't see what such a rule buys us beyond complexity and, IMO, unnecessary
> > maintenance burden.  As Paolo pointed out, when there's an existing word, the
> 
> Maybe I wasn't clear enough - I don't mind existing words. What I mind
> is adding new ones only for KVM's sake.

Ah, gotcha.  We're on the same page then.  Thanks!
  
Borislav Petkov March 7, 2023, 7:55 p.m. UTC | #17
On Tue, Mar 07, 2023 at 11:28:26AM -0800, Sean Christopherson wrote:
> Ah, gotcha.  We're on the same page then.  Thanks!

Yeah, now someone needs to document it.. :-)

I'll put it on my TODO.