perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS

Message ID 20230517133808.67885-1-likexu@tencent.com
State New
Headers
Series perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS |

Commit Message

Like Xu May 17, 2023, 1:38 p.m. UTC
  From: Like Xu <likexu@tencent.com>

After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
PEBS_DATA_CFG"), the cpuc->pebs_data_cfg may save some bits that are not
supported by real hardware, such as PEBS_UPDATE_DS_SW. This would cause
the VMX hardware MSR switching mechanism to save/restore invalid values
for PEBS_DATA_CFG MSR, thus crashing the host when PEBS is used for guest.
Fix it by using the active host value from cpuc->active_pebs_data_cfg.

Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/events/intel/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Liang, Kan May 18, 2023, 4:31 p.m. UTC | #1
On 2023-05-17 9:38 a.m., Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
> PEBS_DATA_CFG"), the cpuc->pebs_data_cfg may save some bits that are not
> supported by real hardware, such as PEBS_UPDATE_DS_SW. This would cause
> the VMX hardware MSR switching mechanism to save/restore invalid values
> for PEBS_DATA_CFG MSR, thus crashing the host when PEBS is used for guest.

I believe we clear the SW bit when it takes effect.

+	if (cpuc->pebs_data_cfg & PEBS_UPDATE_DS_SW) {
+		cpuc->pebs_data_cfg = pebs_data_cfg;
+		pebs_update_threshold(cpuc);
+	}

I think the SW bit can only be seen in a shot period between add() and
enable(). Is it caused by a VM enter which just happens on the period?

> Fix it by using the active host value from cpuc->active_pebs_data_cfg.

I don't see a problem of using active_pebs_data_cfg, since it reflects
the current MSR setting. Just curious about how it's triggered.

> 
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>

Thanks,
Kan

>  arch/x86/events/intel/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 070cc4ef2672..89b9c1cebb61 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4074,7 +4074,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>  	if (x86_pmu.intel_cap.pebs_baseline) {
>  		arr[(*nr)++] = (struct perf_guest_switch_msr){
>  			.msr = MSR_PEBS_DATA_CFG,
> -			.host = cpuc->pebs_data_cfg,
> +			.host = cpuc->active_pebs_data_cfg,
>  			.guest = kvm_pmu->pebs_data_cfg,
>  		};
>  	}
  
Like Xu May 19, 2023, 7:40 a.m. UTC | #2
On 19/5/2023 12:31 am, Liang, Kan wrote:
> 
> 
> On 2023-05-17 9:38 a.m., Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
>> PEBS_DATA_CFG"), the cpuc->pebs_data_cfg may save some bits that are not
>> supported by real hardware, such as PEBS_UPDATE_DS_SW. This would cause
>> the VMX hardware MSR switching mechanism to save/restore invalid values
>> for PEBS_DATA_CFG MSR, thus crashing the host when PEBS is used for guest.
> 
> I believe we clear the SW bit when it takes effect.
> 
> +	if (cpuc->pebs_data_cfg & PEBS_UPDATE_DS_SW) {
> +		cpuc->pebs_data_cfg = pebs_data_cfg;
> +		pebs_update_threshold(cpuc);
> +	}
> 
> I think the SW bit can only be seen in a shot period between add() and
> enable(). Is it caused by a VM enter which just happens on the period?

What happens here is that when *intel_pmu_pebs_del()* is called,
the pebs_update_state() also triggers:
	cpuc->pebs_data_cfg |= PEBS_UPDATE_DS_SW;
and the new value will then be used for the next kvm_entry.

The KVM created pebs perf_event is not added/enabled at this point
and the cpuc->pebs_data_cfg strangely holds a non-zero value.

Perhaps there is more room for perf fixes here, but for guest pebs usages,
using active_pebs_data_cfg in intel_guest_get_msrs() is part of what is needed.

> 
>> Fix it by using the active host value from cpuc->active_pebs_data_cfg.
> 
> I don't see a problem of using active_pebs_data_cfg, since it reflects
> the current MSR setting. Just curious about how it's triggered.
> 
>>
>> Cc: Kan Liang <kan.liang@linux.intel.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
> 
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> 
> Thanks,
> Kan
> 
>>   arch/x86/events/intel/core.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 070cc4ef2672..89b9c1cebb61 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -4074,7 +4074,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>>   	if (x86_pmu.intel_cap.pebs_baseline) {
>>   		arr[(*nr)++] = (struct perf_guest_switch_msr){
>>   			.msr = MSR_PEBS_DATA_CFG,
>> -			.host = cpuc->pebs_data_cfg,
>> +			.host = cpuc->active_pebs_data_cfg,
>>   			.guest = kvm_pmu->pebs_data_cfg,
>>   		};
>>   	}
  
Liang, Kan May 19, 2023, 12:33 p.m. UTC | #3
On 2023-05-19 3:40 a.m., Like Xu wrote:
> On 19/5/2023 12:31 am, Liang, Kan wrote:
>>
>>
>> On 2023-05-17 9:38 a.m., Like Xu wrote:
>>> From: Like Xu <likexu@tencent.com>
>>>
>>> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when
>>> changing
>>> PEBS_DATA_CFG"), the cpuc->pebs_data_cfg may save some bits that are not
>>> supported by real hardware, such as PEBS_UPDATE_DS_SW. This would cause
>>> the VMX hardware MSR switching mechanism to save/restore invalid values
>>> for PEBS_DATA_CFG MSR, thus crashing the host when PEBS is used for
>>> guest.
>>
>> I believe we clear the SW bit when it takes effect.
>>
>> +    if (cpuc->pebs_data_cfg & PEBS_UPDATE_DS_SW) {
>> +        cpuc->pebs_data_cfg = pebs_data_cfg;
>> +        pebs_update_threshold(cpuc);
>> +    }
>>
>> I think the SW bit can only be seen in a shot period between add() and
>> enable(). Is it caused by a VM enter which just happens on the period?
> 
> What happens here is that when *intel_pmu_pebs_del()* is called,

Ah, yes. I think it's useless to set the bit for the removal. We may
need a cleanup patch to improve it.

Thanks,
Kan

> the pebs_update_state() also triggers:
>     cpuc->pebs_data_cfg |= PEBS_UPDATE_DS_SW;
> and the new value will then be used for the next kvm_entry.
> 
> The KVM created pebs perf_event is not added/enabled at this point
> and the cpuc->pebs_data_cfg strangely holds a non-zero value.
> 
> Perhaps there is more room for perf fixes here, but for guest pebs usages,
> using active_pebs_data_cfg in intel_guest_get_msrs() is part of what is
> needed.
> 
>>
>>> Fix it by using the active host value from cpuc->active_pebs_data_cfg.
>>
>> I don't see a problem of using active_pebs_data_cfg, since it reflects
>> the current MSR setting. Just curious about how it's triggered.
>>
>>>
>>> Cc: Kan Liang <kan.liang@linux.intel.com>
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Signed-off-by: Like Xu <likexu@tencent.com>
>>> ---
>>
>> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
>>
>> Thanks,
>> Kan
>>
>>>   arch/x86/events/intel/core.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>> index 070cc4ef2672..89b9c1cebb61 100644
>>> --- a/arch/x86/events/intel/core.c
>>> +++ b/arch/x86/events/intel/core.c
>>> @@ -4074,7 +4074,7 @@ static struct perf_guest_switch_msr
>>> *intel_guest_get_msrs(int *nr, void *data)
>>>       if (x86_pmu.intel_cap.pebs_baseline) {
>>>           arr[(*nr)++] = (struct perf_guest_switch_msr){
>>>               .msr = MSR_PEBS_DATA_CFG,
>>> -            .host = cpuc->pebs_data_cfg,
>>> +            .host = cpuc->active_pebs_data_cfg,
>>>               .guest = kvm_pmu->pebs_data_cfg,
>>>           };
>>>       }
  
Sean Christopherson June 12, 2023, 8:45 p.m. UTC | #4
+KVM

On Wed, May 17, 2023, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
> PEBS_DATA_CFG"), the cpuc->pebs_data_cfg may save some bits that are not
> supported by real hardware, such as PEBS_UPDATE_DS_SW. This would cause
> the VMX hardware MSR switching mechanism to save/restore invalid values
> for PEBS_DATA_CFG MSR, thus crashing the host when PEBS is used for guest.
> Fix it by using the active host value from cpuc->active_pebs_data_cfg.

In the future, please Cc: kvm@vger.kernel.org when posting fixes that obviously
affect KVM.  I wasted several hours bisecting these crashes.  In hindsight, I
should have searched all of lore sooner, but it really shouldn't have been that
hard for me to find this fix.

> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/events/intel/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 070cc4ef2672..89b9c1cebb61 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4074,7 +4074,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>  	if (x86_pmu.intel_cap.pebs_baseline) {
>  		arr[(*nr)++] = (struct perf_guest_switch_msr){
>  			.msr = MSR_PEBS_DATA_CFG,
> -			.host = cpuc->pebs_data_cfg,
> +			.host = cpuc->active_pebs_data_cfg,
>  			.guest = kvm_pmu->pebs_data_cfg,
>  		};
>  	}
> -- 
> 2.40.1
>
  

Patch

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 070cc4ef2672..89b9c1cebb61 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4074,7 +4074,7 @@  static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 	if (x86_pmu.intel_cap.pebs_baseline) {
 		arr[(*nr)++] = (struct perf_guest_switch_msr){
 			.msr = MSR_PEBS_DATA_CFG,
-			.host = cpuc->pebs_data_cfg,
+			.host = cpuc->active_pebs_data_cfg,
 			.guest = kvm_pmu->pebs_data_cfg,
 		};
 	}