KVM: x86: Ignore MSR_AMD64_BU_CFG access

Message ID 0ffde769702c6cdf6b6c18e1dcb28b25309af7f7.1695659717.git.maciej.szmigiero@oracle.com
State New
Headers
Series KVM: x86: Ignore MSR_AMD64_BU_CFG access |

Commit Message

Maciej S. Szmigiero Sept. 25, 2023, 4:36 p.m. UTC
  From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen
since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED +
STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP
in the guest kernel).

This is because Windows tries to set bit 8 in MSR_AMD64_BU_CFG and can't
handle receiving a #GP when doing so.

Give this MSR the same treatment that commit 2e32b7190641
("x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs") gave
MSR_AMD64_BU_CFG2 under justification that this MSR is baremetal-relevant
only.
Although apparently it was then needed for Linux guests, not Windows as in
this case.

With this change, the aforementioned guest setup is able to finish booting
successfully.

This issue can be reproduced either on a Summit Ridge Ryzen (with
just "-cpu host") or on a Naples EPYC (with "-cpu host,stepping=1" since
EPYC is ordinarily stepping 2).

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 arch/x86/include/asm/msr-index.h | 1 +
 arch/x86/kvm/x86.c               | 2 ++
 2 files changed, 3 insertions(+)
  

Comments

Sean Christopherson Sept. 25, 2023, 6:30 p.m. UTC | #1
On Mon, Sep 25, 2023, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
> 
> Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen
> since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED +
> STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP
> in the guest kernel).
> 
> This is because Windows tries to set bit 8 in MSR_AMD64_BU_CFG and can't
> handle receiving a #GP when doing so.

Any idea why?

> Give this MSR the same treatment that commit 2e32b7190641
> ("x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs") gave
> MSR_AMD64_BU_CFG2 under justification that this MSR is baremetal-relevant
> only.

Ugh, that commit set a terrible example.  The kernel change should have been
conditioned on !X86_FEATURE_HYPERVISOR if the MSR only has meaning for bare metal.

> Although apparently it was then needed for Linux guests, not Windows as in
> this case.
> 
> With this change, the aforementioned guest setup is able to finish booting
> successfully.
> 
> This issue can be reproduced either on a Summit Ridge Ryzen (with
> just "-cpu host") or on a Naples EPYC (with "-cpu host,stepping=1" since
> EPYC is ordinarily stepping 2).

This seems like it needs to be tagged for stable?

> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> ---
>  arch/x86/include/asm/msr-index.h | 1 +
>  arch/x86/kvm/x86.c               | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 1d111350197f..c80a5cea80c4 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -553,6 +553,7 @@
>  #define MSR_AMD64_CPUID_FN_1		0xc0011004
>  #define MSR_AMD64_LS_CFG		0xc0011020
>  #define MSR_AMD64_DC_CFG		0xc0011022
> +#define MSR_AMD64_BU_CFG		0xc0011023

What document actually defines this MSR?  All of the PPRs I can find for Family 17h
list it as:

   MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)

>  #define MSR_AMD64_DE_CFG		0xc0011029
>  #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT	 1
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9f18b06bbda6..2f3cdd798185 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3639,6 +3639,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_UCODE_WRITE:
>  	case MSR_VM_HSAVE_PA:
>  	case MSR_AMD64_PATCH_LOADER:
> +	case MSR_AMD64_BU_CFG:

I am sorely tempted to say that this should be solved in userspace via MSR
filtering.  IIUC, the MSR truly is model specific, and I don't love the idea of
effectively ignoring accesses to unknown MSRs.  And I really, really don't want
KVM to pivot on FMS.

Paolo, is punting to userspace reasonable, or should we just bite the bullet in
KVM and commit to ignoring MSRs like this?

>  	case MSR_AMD64_BU_CFG2:
>  	case MSR_AMD64_DC_CFG:
>  	case MSR_F15H_EX_CFG:
> @@ -4062,6 +4063,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_K8_INT_PENDING_MSG:
>  	case MSR_AMD64_NB_CFG:
>  	case MSR_FAM10H_MMIO_CONF_BASE:
> +	case MSR_AMD64_BU_CFG:
>  	case MSR_AMD64_BU_CFG2:
>  	case MSR_IA32_PERF_CTL:
>  	case MSR_AMD64_DC_CFG:
  
Maciej S. Szmigiero Sept. 25, 2023, 6:53 p.m. UTC | #2
On 25.09.2023 20:30, Sean Christopherson wrote:
> On Mon, Sep 25, 2023, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen
>> since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED +
>> STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP
>> in the guest kernel).
>>
>> This is because Windows tries to set bit 8 in MSR_AMD64_BU_CFG and can't
>> handle receiving a #GP when doing so.
> 
> Any idea why?

I guess it is trying to set some chicken bit?

By the way, I tested Windows Server 2019 now - it has the same problem.

So likely Windows 11 and newer version of Windows 10 have it, too.

>> Give this MSR the same treatment that commit 2e32b7190641
>> ("x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs") gave
>> MSR_AMD64_BU_CFG2 under justification that this MSR is baremetal-relevant
>> only.
> 
> Ugh, that commit set a terrible example.  The kernel change should have been
> conditioned on !X86_FEATURE_HYPERVISOR if the MSR only has meaning for bare metal.

You are right with respect to the original guest kernel change that
triggered the later KVM commit ignoring MSR_AMD64_BU_CFG2.

This doesn't help Windows guests, however.

>> Although apparently it was then needed for Linux guests, not Windows as in
>> this case.
>>
>> With this change, the aforementioned guest setup is able to finish booting
>> successfully.
>>
>> This issue can be reproduced either on a Summit Ridge Ryzen (with
>> just "-cpu host") or on a Naples EPYC (with "-cpu host,stepping=1" since
>> EPYC is ordinarily stepping 2).
> 
> This seems like it needs to be tagged for stable?

Like with just "Cc: stable@vger.kernel.org", but without "Fixes:" tag?
Can do.

>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>> ---
>>   arch/x86/include/asm/msr-index.h | 1 +
>>   arch/x86/kvm/x86.c               | 2 ++
>>   2 files changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 1d111350197f..c80a5cea80c4 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -553,6 +553,7 @@
>>   #define MSR_AMD64_CPUID_FN_1		0xc0011004
>>   #define MSR_AMD64_LS_CFG		0xc0011020
>>   #define MSR_AMD64_DC_CFG		0xc0011022
>> +#define MSR_AMD64_BU_CFG		0xc0011023
> 
> What document actually defines this MSR?  All of the PPRs I can find for Family 17h
> list it as:
> 
>     MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)

It's partially documented in various AMD BKDGs, however I couldn't find
any definition for this particular bit (8) - other than that it is reserved.

>>   #define MSR_AMD64_DE_CFG		0xc0011029
>>   #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT	 1
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 9f18b06bbda6..2f3cdd798185 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3639,6 +3639,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>   	case MSR_IA32_UCODE_WRITE:
>>   	case MSR_VM_HSAVE_PA:
>>   	case MSR_AMD64_PATCH_LOADER:
>> +	case MSR_AMD64_BU_CFG:
> 
> I am sorely tempted to say that this should be solved in userspace via MSR
> filtering.  IIUC, the MSR truly is model specific, and I don't love the idea of
> effectively ignoring accesses to unknown MSRs.  And I really, really don't want
> KVM to pivot on FMS.
> 
> Paolo, is punting to userspace reasonable, or should we just bite the bullet in
> KVM and commit to ignoring MSRs like this?
> 

Waiting for Paolo's decision here then.

Thanks,
Maciej
  
Sean Christopherson Sept. 25, 2023, 7:16 p.m. UTC | #3
+Tom

On Mon, Sep 25, 2023, Maciej S. Szmigiero wrote:
> On 25.09.2023 20:30, Sean Christopherson wrote:
> >>
> >> Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen
> >> since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED +
> >> STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP
> >> in the guest kernel).
> >>
> >> This is because Windows tries to set bit 8 in MSR_AMD64_BU_CFG and can't
> >> handle receiving a #GP when doing so.
> >
> > Any idea why?
>
> I guess it is trying to set some chicken bit?
>
> By the way, I tested Windows Server 2019 now - it has the same problem.
>
> So likely Windows 11 and newer version of Windows 10 have it, too.

...

> > > diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> > > index 1d111350197f..c80a5cea80c4 100644
> > > --- a/arch/x86/include/asm/msr-index.h
> > > +++ b/arch/x86/include/asm/msr-index.h
> > > @@ -553,6 +553,7 @@
> > >   #define MSR_AMD64_CPUID_FN_1		0xc0011004
> > >   #define MSR_AMD64_LS_CFG		0xc0011020
> > >   #define MSR_AMD64_DC_CFG		0xc0011022
> > > +#define MSR_AMD64_BU_CFG		0xc0011023
> > 
> > What document actually defines this MSR?  All of the PPRs I can find for Family 17h
> > list it as:
> > 
> >     MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
> 
> It's partially documented in various AMD BKDGs, however I couldn't find
> any definition for this particular bit (8) - other than that it is reserved.

I found it as MSR_AMD64_BU_CFG for Model 16h, but that's Jaguar/Puma, not Zen1.
My guess is that Windows is trying to write this thing:

  MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
  Read-write. Reset: 0000_0000_0000_0000h.
  _lthree0_core[3,1]; MSRC001_1023

  Bits   Description
  63:50  Reserved.
  49     TwCfgCombineCr0Cd: combine CR0_CD for both threads of a core. Read-write. Reset: 0. Init: BIOS,1.
         1=The host Cr0_Cd values from the two threads are OR'd together and used by both threads.
  48:0   Reserved.

Though that still doesn't explain bit 8...  Perhaps a chicken-bit related to yet
another speculation bug?

Boris or Tom, any idea what Windows is doing?  I doubt it changes our options in
terms of "fixing" this in KVM, but having a somewhat accurate/helpful changelog
would be nice.
  
Tom Lendacky Sept. 25, 2023, 10:25 p.m. UTC | #4
On 9/25/23 14:16, Sean Christopherson wrote:
> +Tom
> 
> On Mon, Sep 25, 2023, Maciej S. Szmigiero wrote:
>> On 25.09.2023 20:30, Sean Christopherson wrote:
>>>>
>>>> Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen
>>>> since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED +
>>>> STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP
>>>> in the guest kernel).
>>>>
>>>> This is because Windows tries to set bit 8 in MSR_AMD64_BU_CFG and can't
>>>> handle receiving a #GP when doing so.
>>>
>>> Any idea why?
>>
>> I guess it is trying to set some chicken bit?
>>
>> By the way, I tested Windows Server 2019 now - it has the same problem.
>>
>> So likely Windows 11 and newer version of Windows 10 have it, too.
> 
> ...
> 
>>>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>>>> index 1d111350197f..c80a5cea80c4 100644
>>>> --- a/arch/x86/include/asm/msr-index.h
>>>> +++ b/arch/x86/include/asm/msr-index.h
>>>> @@ -553,6 +553,7 @@
>>>>    #define MSR_AMD64_CPUID_FN_1		0xc0011004
>>>>    #define MSR_AMD64_LS_CFG		0xc0011020
>>>>    #define MSR_AMD64_DC_CFG		0xc0011022
>>>> +#define MSR_AMD64_BU_CFG		0xc0011023
>>>
>>> What document actually defines this MSR?  All of the PPRs I can find for Family 17h
>>> list it as:
>>>
>>>      MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
>>
>> It's partially documented in various AMD BKDGs, however I couldn't find
>> any definition for this particular bit (8) - other than that it is reserved.
> 
> I found it as MSR_AMD64_BU_CFG for Model 16h, but that's Jaguar/Puma, not Zen1.
> My guess is that Windows is trying to write this thing:
> 
>    MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
>    Read-write. Reset: 0000_0000_0000_0000h.
>    _lthree0_core[3,1]; MSRC001_1023
> 
>    Bits   Description
>    63:50  Reserved.
>    49     TwCfgCombineCr0Cd: combine CR0_CD for both threads of a core. Read-write. Reset: 0. Init: BIOS,1.
>           1=The host Cr0_Cd values from the two threads are OR'd together and used by both threads.
>    48:0   Reserved.
> 
> Though that still doesn't explain bit 8...  Perhaps a chicken-bit related to yet
> another speculation bug?
> 
> Boris or Tom, any idea what Windows is doing?  I doubt it changes our options in
> terms of "fixing" this in KVM, but having a somewhat accurate/helpful changelog
> would be nice.

It's definitely not related to a speculation bug, but I'm unsure what was 
told to Microsoft that has them performing that WRMSR. The patch does the 
proper thing, though, as a guest shouldn't be updating that setting.

And TW_CFG is the proper name of that MSR for Zen.

Thanks,
Tom
  
Maciej S. Szmigiero Oct. 2, 2023, 4:32 p.m. UTC | #5
On 26.09.2023 00:25, Tom Lendacky wrote:
> On 9/25/23 14:16, Sean Christopherson wrote:
>> +Tom
>>
>> On Mon, Sep 25, 2023, Maciej S. Szmigiero wrote:
>>> On 25.09.2023 20:30, Sean Christopherson wrote:
>>>>>
>>>>> Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen
>>>>> since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED +
>>>>> STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP
>>>>> in the guest kernel).
>>>>>
>>>>> This is because Windows tries to set bit 8 in MSR_AMD64_BU_CFG and can't
>>>>> handle receiving a #GP when doing so.
>>>>
>>>> Any idea why?
>>>
>>> I guess it is trying to set some chicken bit?
>>>
>>> By the way, I tested Windows Server 2019 now - it has the same problem.
>>>
>>> So likely Windows 11 and newer version of Windows 10 have it, too.
>>
>> ...
>>
>>>>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>>>>> index 1d111350197f..c80a5cea80c4 100644
>>>>> --- a/arch/x86/include/asm/msr-index.h
>>>>> +++ b/arch/x86/include/asm/msr-index.h
>>>>> @@ -553,6 +553,7 @@
>>>>>    #define MSR_AMD64_CPUID_FN_1        0xc0011004
>>>>>    #define MSR_AMD64_LS_CFG        0xc0011020
>>>>>    #define MSR_AMD64_DC_CFG        0xc0011022
>>>>> +#define MSR_AMD64_BU_CFG        0xc0011023
>>>>
>>>> What document actually defines this MSR?  All of the PPRs I can find for Family 17h
>>>> list it as:
>>>>
>>>>      MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
>>>
>>> It's partially documented in various AMD BKDGs, however I couldn't find
>>> any definition for this particular bit (8) - other than that it is reserved.
>>
>> I found it as MSR_AMD64_BU_CFG for Model 16h, but that's Jaguar/Puma, not Zen1.
>> My guess is that Windows is trying to write this thing:
>>
>>    MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
>>    Read-write. Reset: 0000_0000_0000_0000h.
>>    _lthree0_core[3,1]; MSRC001_1023
>>
>>    Bits   Description
>>    63:50  Reserved.
>>    49     TwCfgCombineCr0Cd: combine CR0_CD for both threads of a core. Read-write. Reset: 0. Init: BIOS,1.
>>           1=The host Cr0_Cd values from the two threads are OR'd together and used by both threads.
>>    48:0   Reserved.
>>
>> Though that still doesn't explain bit 8...  Perhaps a chicken-bit related to yet
>> another speculation bug?
>>
>> Boris or Tom, any idea what Windows is doing?  I doubt it changes our options in
>> terms of "fixing" this in KVM, but having a somewhat accurate/helpful changelog
>> would be nice.
> 
> It's definitely not related to a speculation bug, but I'm unsure what was told to Microsoft that has them performing that WRMSR. The patch does the proper thing, though, as a guest shouldn't be updating that setting.
> 
> And TW_CFG is the proper name of that MSR for Zen.


So, should I prepare v2 with MSR_AMD64_BU_CFG -> MSR_AMD64_TW_CFG change?

Thanks,
Maciej
  
Sean Christopherson Oct. 5, 2023, 12:10 a.m. UTC | #6
On Mon, Oct 02, 2023, Maciej S. Szmigiero wrote:
> On 26.09.2023 00:25, Tom Lendacky wrote:
> > > > It's partially documented in various AMD BKDGs, however I couldn't find
> > > > any definition for this particular bit (8) - other than that it is reserved.
> > > 
> > > I found it as MSR_AMD64_BU_CFG for Model 16h, but that's Jaguar/Puma, not Zen1.
> > > My guess is that Windows is trying to write this thing:
> > > 
> > >    MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
> > >    Read-write. Reset: 0000_0000_0000_0000h.
> > >    _lthree0_core[3,1]; MSRC001_1023
> > > 
> > >    Bits   Description
> > >    63:50  Reserved.
> > >    49     TwCfgCombineCr0Cd: combine CR0_CD for both threads of a core. Read-write. Reset: 0. Init: BIOS,1.
> > >           1=The host Cr0_Cd values from the two threads are OR'd together and used by both threads.
> > >    48:0   Reserved.
> > > 
> > > Though that still doesn't explain bit 8...  Perhaps a chicken-bit related to yet
> > > another speculation bug?
> > > 
> > > Boris or Tom, any idea what Windows is doing?  I doubt it changes our options in
> > > terms of "fixing" this in KVM, but having a somewhat accurate/helpful changelog
> > > would be nice.
> > 
> > It's definitely not related to a speculation bug, but I'm unsure what was
> > told to Microsoft that has them performing that WRMSR. The patch does the
> > proper thing, though, as a guest shouldn't be updating that setting.
> > 
> > And TW_CFG is the proper name of that MSR for Zen.
> 
> So, should I prepare v2 with MSR_AMD64_BU_CFG -> MSR_AMD64_TW_CFG change?

If we can get Paolo's attention, I'd like to get his thoughts on punting this
to QEMU/userspace.  I'm worried that "handling" uarch specific MSRs in KVM is
going to paint us into a corner and force KVM to check guest F/M/S someday, which
I want to avoid at pretty much all costs.
  
Maciej S. Szmigiero Oct. 5, 2023, 10:50 a.m. UTC | #7
On 5.10.2023 02:10, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, Maciej S. Szmigiero wrote:
>> On 26.09.2023 00:25, Tom Lendacky wrote:
>>>>> It's partially documented in various AMD BKDGs, however I couldn't find
>>>>> any definition for this particular bit (8) - other than that it is reserved.
>>>>
>>>> I found it as MSR_AMD64_BU_CFG for Model 16h, but that's Jaguar/Puma, not Zen1.
>>>> My guess is that Windows is trying to write this thing:
>>>>
>>>>     MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
>>>>     Read-write. Reset: 0000_0000_0000_0000h.
>>>>     _lthree0_core[3,1]; MSRC001_1023
>>>>
>>>>     Bits   Description
>>>>     63:50  Reserved.
>>>>     49     TwCfgCombineCr0Cd: combine CR0_CD for both threads of a core. Read-write. Reset: 0. Init: BIOS,1.
>>>>            1=The host Cr0_Cd values from the two threads are OR'd together and used by both threads.
>>>>     48:0   Reserved.
>>>>
>>>> Though that still doesn't explain bit 8...  Perhaps a chicken-bit related to yet
>>>> another speculation bug?
>>>>
>>>> Boris or Tom, any idea what Windows is doing?  I doubt it changes our options in
>>>> terms of "fixing" this in KVM, but having a somewhat accurate/helpful changelog
>>>> would be nice.
>>>
>>> It's definitely not related to a speculation bug, but I'm unsure what was
>>> told to Microsoft that has them performing that WRMSR. The patch does the
>>> proper thing, though, as a guest shouldn't be updating that setting.
>>>
>>> And TW_CFG is the proper name of that MSR for Zen.
>>
>> So, should I prepare v2 with MSR_AMD64_BU_CFG -> MSR_AMD64_TW_CFG change?
> 
> If we can get Paolo's attention, I'd like to get his thoughts on punting this
> to QEMU/userspace.  I'm worried that "handling" uarch specific MSRs in KVM is
> going to paint us into a corner and force KVM to check guest F/M/S someday, which
> I want to avoid at pretty much all costs.

We already do similar ignoring in KVM for MSR_AMD64_BU_CFG2, MSR_AMD64_DC_CFG
and MSR_F15H_EX_CFG, so doing this {BU_CFG2,TW_CFG} MSR filtering in QEMU would
be inconsistent with these.

But let's wait for Paolo's opinion then.

Thanks,
Maciej
  
Sean Christopherson Oct. 6, 2023, 12:44 a.m. UTC | #8
On Thu, Oct 05, 2023, Maciej S. Szmigiero wrote:
> On 5.10.2023 02:10, Sean Christopherson wrote:
> > On Mon, Oct 02, 2023, Maciej S. Szmigiero wrote:
> > > On 26.09.2023 00:25, Tom Lendacky wrote:
> > > > > > It's partially documented in various AMD BKDGs, however I couldn't find
> > > > > > any definition for this particular bit (8) - other than that it is reserved.
> > > > > 
> > > > > I found it as MSR_AMD64_BU_CFG for Model 16h, but that's Jaguar/Puma, not Zen1.
> > > > > My guess is that Windows is trying to write this thing:
> > > > > 
> > > > >     MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)
> > > > >     Read-write. Reset: 0000_0000_0000_0000h.
> > > > >     _lthree0_core[3,1]; MSRC001_1023
> > > > > 
> > > > >     Bits   Description
> > > > >     63:50  Reserved.
> > > > >     49     TwCfgCombineCr0Cd: combine CR0_CD for both threads of a core. Read-write. Reset: 0. Init: BIOS,1.
> > > > >            1=The host Cr0_Cd values from the two threads are OR'd together and used by both threads.
> > > > >     48:0   Reserved.
> > > > > 
> > > > > Though that still doesn't explain bit 8...  Perhaps a chicken-bit related to yet
> > > > > another speculation bug?
> > > > > 
> > > > > Boris or Tom, any idea what Windows is doing?  I doubt it changes our options in
> > > > > terms of "fixing" this in KVM, but having a somewhat accurate/helpful changelog
> > > > > would be nice.
> > > > 
> > > > It's definitely not related to a speculation bug, but I'm unsure what was
> > > > told to Microsoft that has them performing that WRMSR. The patch does the
> > > > proper thing, though, as a guest shouldn't be updating that setting.
> > > > 
> > > > And TW_CFG is the proper name of that MSR for Zen.
> > > 
> > > So, should I prepare v2 with MSR_AMD64_BU_CFG -> MSR_AMD64_TW_CFG change?
> > 
> > If we can get Paolo's attention, I'd like to get his thoughts on punting this
> > to QEMU/userspace.  I'm worried that "handling" uarch specific MSRs in KVM is
> > going to paint us into a corner and force KVM to check guest F/M/S someday, which
> > I want to avoid at pretty much all costs.
> 
> We already do similar ignoring in KVM for MSR_AMD64_BU_CFG2, MSR_AMD64_DC_CFG
> and MSR_F15H_EX_CFG, so doing this {BU_CFG2,TW_CFG} MSR filtering in QEMU would
> be inconsistent with these.

Not if QEMU filters those too. :-)

The MSR filter mechanism wasn't a thing back when KVM added "support" for those
MSR, so I don't feel that punting to userspace would be inconsistent.  It's more
along the lines of asking/requiring userspace to utilize a new tool to solve a
problem that is best solved in userspace, with a few outliers that got
grandfathered in.

Anyways, yeah, let's get Paolo's feedback.
  
Paolo Bonzini Oct. 19, 2023, 5:37 p.m. UTC | #9
On Fri, Oct 6, 2023 at 2:44 AM Sean Christopherson <seanjc@google.com> wrote:
> > We already do similar ignoring in KVM for MSR_AMD64_BU_CFG2, MSR_AMD64_DC_CFG
> > and MSR_F15H_EX_CFG, so doing this {BU_CFG2,TW_CFG} MSR filtering in QEMU would
> > be inconsistent with these.
>
> Not if QEMU filters those too. :-)
>
> The MSR filter mechanism wasn't a thing back when KVM added "support" for those
> MSR, so I don't feel that punting to userspace would be inconsistent.  It's more
> along the lines of asking/requiring userspace to utilize a new tool to solve a
> problem that is best solved in userspace, with a few outliers that got
> grandfathered in.

As long as it is enough to ignore the value and read as zero, IMO there
is an advantage in doing so in KVM, because it centralizes the update to
one place instead of requiring changes to all userspace implementations
(those that can run Windows can be counted on one hand, granted, but
still).

If we get into giving semantics to really-model-specific registers, the
disadvantages would be way more pronounced, however. I don't think we
should get any of those MSRs into KVM_GET_MSR_INDEX_LIST ever, and
we shouldn't implement them in KVM if KVM_SET_MSR has to do anything.
Also, we probably shouldn't implement them in KVM if KVM_GET_MSR
has to ever return anything but zero.

We almost have one of those, a "feature MSR" bit that is used to pass
down information on whether LFENCE serializes execution; but it's not
supported by KVM_GET_MSR/KVM_SET_MSR. I'd retcon this happily as
"we don't want any stateful chicken bit MSRs in the kernel", and I'm
in favor of removing it if there are no complaints, now that AMD has
defined a bit in 0x80000021 for this purpose.

We have the microcode revision, which we had to add because
of the side-channel mess of a few years ago.
And we also have MSR_AMD64_OSVW_ID_LENGTH and
MSR_AMD64_OSVW_STATUS, but those technically are vendor-
specific but not model-specific and have an associated CPUID bit. I
think those are as close as we get to being a mistake, but still (barely
so) on the acceptable side.

Paolo
  

Patch

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 1d111350197f..c80a5cea80c4 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -553,6 +553,7 @@ 
 #define MSR_AMD64_CPUID_FN_1		0xc0011004
 #define MSR_AMD64_LS_CFG		0xc0011020
 #define MSR_AMD64_DC_CFG		0xc0011022
+#define MSR_AMD64_BU_CFG		0xc0011023
 
 #define MSR_AMD64_DE_CFG		0xc0011029
 #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT	 1
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9f18b06bbda6..2f3cdd798185 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3639,6 +3639,7 @@  int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_UCODE_WRITE:
 	case MSR_VM_HSAVE_PA:
 	case MSR_AMD64_PATCH_LOADER:
+	case MSR_AMD64_BU_CFG:
 	case MSR_AMD64_BU_CFG2:
 	case MSR_AMD64_DC_CFG:
 	case MSR_F15H_EX_CFG:
@@ -4062,6 +4063,7 @@  int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_K8_INT_PENDING_MSG:
 	case MSR_AMD64_NB_CFG:
 	case MSR_FAM10H_MMIO_CONF_BASE:
+	case MSR_AMD64_BU_CFG:
 	case MSR_AMD64_BU_CFG2:
 	case MSR_IA32_PERF_CTL:
 	case MSR_AMD64_DC_CFG: