[v6,03/20] KVM: x86/pmu: Don't enumerate arch events KVM doesn't support

Message ID 20231104000239.367005-4-seanjc@google.com
State New
Series KVM: x86/pmu: selftests: Fixes and new tests

Commit Message

Sean Christopherson Nov. 4, 2023, 12:02 a.m. UTC
Don't advertise support to userspace for architectural events that KVM
doesn't support, i.e. for "real" events that aren't listed in
intel_pmu_architectural_events.  On current hardware, this effectively
means "don't advertise support for Top Down Slots".

Mask off the associated "unavailable" bits, as said bits for undefined
events are reserved to zero.  Arguably the events _are_ defined, but from
a KVM perspective they might as well not exist, and there's absolutely no
reason to leave useless unavailable bits set.

Fixes: a6c06ed1a60a ("KVM: Expose the architectural performance monitoring CPUID leaf")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 9 +++++++++
 1 file changed, 9 insertions(+)
  

Comments

Jim Mattson Nov. 4, 2023, 12:41 p.m. UTC | #1
On Fri, Nov 3, 2023 at 5:02 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Don't advertise support to userspace for architectural events that KVM
> doesn't support, i.e. for "real" events that aren't listed in
> intel_pmu_architectural_events.  On current hardware, this effectively
> means "don't advertise support for Top Down Slots".

NR_REAL_INTEL_ARCH_EVENTS is only used in intel_hw_event_available().
As discussed (https://lore.kernel.org/kvm/ZUU12-TUR_1cj47u@google.com/),
intel_hw_event_available() should go away.

Aside from mapping fixed counters to event selector and unit mask
(fixed_pmc_events[]), KVM has no reason to know when a new
architectural event is defined.

The variable that this change "fixes" is only used to feed
CPUID.0AH:EBX in KVM_GET_SUPPORTED_CPUID, and kvm_pmu_cap.events_mask
is already constructed from what host perf advertises support for.

> Mask off the associated "unavailable" bits, as said bits for undefined
> events are reserved to zero.  Arguably the events _are_ defined, but from
> a KVM perspective they might as well not exist, and there's absolutely no
> reason to leave useless unavailable bits set.
>
> Fixes: a6c06ed1a60a ("KVM: Expose the architectural performance monitoring CPUID leaf")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/pmu_intel.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 3316fdea212a..8d545f84dc4a 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -73,6 +73,15 @@ static void intel_init_pmu_capability(void)
>         int i;
>
>         /*
> +        * Do not enumerate support for architectural events that KVM doesn't
> +        * support.  Clear unsupported events "unavailable" bit as well, as
> +        * architecturally such bits are reserved to zero.
> +        */
> +       kvm_pmu_cap.events_mask_len = min(kvm_pmu_cap.events_mask_len,
> +                                         NR_REAL_INTEL_ARCH_EVENTS);
> +       kvm_pmu_cap.events_mask &= GENMASK(kvm_pmu_cap.events_mask_len - 1, 0);
> +
> +        /*
>          * Perf may (sadly) back a guest fixed counter with a general purpose
>          * counter, and so KVM must hide fixed counters whose associated
>          * architectural event are unsupported.  On real hardware, this should
> --
> 2.42.0.869.gea05f2083d-goog
>
  
Mi, Dapeng Nov. 7, 2023, 7:14 a.m. UTC | #2
On 11/4/2023 8:41 PM, Jim Mattson wrote:
> On Fri, Nov 3, 2023 at 5:02 PM Sean Christopherson <seanjc@google.com> wrote:
>> Don't advertise support to userspace for architectural events that KVM
>> doesn't support, i.e. for "real" events that aren't listed in
>> intel_pmu_architectural_events.  On current hardware, this effectively
>> means "don't advertise support for Top Down Slots".
> NR_REAL_INTEL_ARCH_EVENTS is only used in intel_hw_event_available().
> As discussed (https://lore.kernel.org/kvm/ZUU12-TUR_1cj47u@google.com/),
> intel_hw_event_available() should go away.
>
> Aside from mapping fixed counters to event selector and unit mask
> (fixed_pmc_events[]), KVM has no reason to know when a new
> architectural event is defined.


Since intel_hw_event_available() would be removed, it looks like the enum 
intel_pmu_architectural_events and the intel_arch_events[] array become 
useless. We can simply modify the current fixed_pmc_events[] array and use 
it to store the fixed counter event codes and unit masks.


>
> The variable that this change "fixes" is only used to feed
> CPUID.0AH:EBX in KVM_GET_SUPPORTED_CPUID, and kvm_pmu_cap.events_mask
> is already constructed from what host perf advertises support for.
>
>> Mask off the associated "unavailable" bits, as said bits for undefined
>> events are reserved to zero.  Arguably the events _are_ defined, but from
>> a KVM perspective they might as well not exist, and there's absolutely no
>> reason to leave useless unavailable bits set.
>>
>> Fixes: a6c06ed1a60a ("KVM: Expose the architectural performance monitoring CPUID leaf")
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>> ---
>>   arch/x86/kvm/vmx/pmu_intel.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>> index 3316fdea212a..8d545f84dc4a 100644
>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> @@ -73,6 +73,15 @@ static void intel_init_pmu_capability(void)
>>          int i;
>>
>>          /*
>> +        * Do not enumerate support for architectural events that KVM doesn't
>> +        * support.  Clear unsupported events "unavailable" bit as well, as
>> +        * architecturally such bits are reserved to zero.
>> +        */
>> +       kvm_pmu_cap.events_mask_len = min(kvm_pmu_cap.events_mask_len,
>> +                                         NR_REAL_INTEL_ARCH_EVENTS);
>> +       kvm_pmu_cap.events_mask &= GENMASK(kvm_pmu_cap.events_mask_len - 1, 0);
>> +
>> +        /*
>>           * Perf may (sadly) back a guest fixed counter with a general purpose
>>           * counter, and so KVM must hide fixed counters whose associated
>>           * architectural event are unsupported.  On real hardware, this should
>> --
>> 2.42.0.869.gea05f2083d-goog
>>
  
Sean Christopherson Nov. 7, 2023, 5:27 p.m. UTC | #3
On Tue, Nov 07, 2023, Dapeng Mi wrote:
> 
> On 11/4/2023 8:41 PM, Jim Mattson wrote:
> > On Fri, Nov 3, 2023 at 5:02 PM Sean Christopherson <seanjc@google.com> wrote:
> > > Don't advertise support to userspace for architectural events that KVM
> > > doesn't support, i.e. for "real" events that aren't listed in
> > > intel_pmu_architectural_events.  On current hardware, this effectively
> > > means "don't advertise support for Top Down Slots".
> > NR_REAL_INTEL_ARCH_EVENTS is only used in intel_hw_event_available().
> > As discussed (https://lore.kernel.org/kvm/ZUU12-TUR_1cj47u@google.com/),
> > intel_hw_event_available() should go away.
> > 
> > Aside from mapping fixed counters to event selector and unit mask
> > (fixed_pmc_events[]), KVM has no reason to know when a new
> > architectural event is defined.
> 
> 
> Since intel_hw_event_available() would be removed, it looks like the enum
> intel_pmu_architectural_events and the intel_arch_events[] array become useless.
> We can simply modify the current fixed_pmc_events[] array and use it to
> store the fixed counter event codes and unit masks.

Yep, I came to the same conclusion.  This is what I ended up with yesterday:

/*
 * Map fixed counter events to architectural general purpose event encodings.
 * Perf doesn't provide APIs to allow KVM to directly program a fixed counter,
 * and so KVM instead programs the architectural event to effectively request
 * the fixed counter.  Perf isn't guaranteed to use a fixed counter and may
 * instead program the encoding into a general purpose counter, e.g. if a
 * different perf_event is already utilizing the requested counter, but the end
 * result is the same (ignoring the fact that using a general purpose counter
 * will likely exacerbate counter contention).
 *
 * Note, reference cycles is counted using a perf-defined "pseudo-encoding",
 * as there is no architectural general purpose encoding for reference TSC cycles.
 */
static u64 intel_get_fixed_pmc_eventsel(int index)
{
        const struct {
                u8 eventsel;
                u8 unit_mask;
        } fixed_pmc_events[] = {
                [0] = { 0xc0, 0x00 }, /* Instructions Retired / PERF_COUNT_HW_INSTRUCTIONS. */
                [1] = { 0x3c, 0x00 }, /* CPU Cycles / PERF_COUNT_HW_CPU_CYCLES. */
                [2] = { 0x00, 0x03 }, /* Reference TSC Cycles / PERF_COUNT_HW_REF_CPU_CYCLES. */
        };

        BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);

        return (fixed_pmc_events[index].unit_mask << 8) |
               fixed_pmc_events[index].eventsel;
}

...

static void intel_pmu_init(struct kvm_vcpu *vcpu)
{
        int i;
        struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
        struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);

        for (i = 0; i < KVM_INTEL_PMC_MAX_GENERIC; i++) {
                pmu->gp_counters[i].type = KVM_PMC_GP;
                pmu->gp_counters[i].vcpu = vcpu;
                pmu->gp_counters[i].idx = i;
                pmu->gp_counters[i].current_config = 0;
        }

        for (i = 0; i < KVM_PMC_MAX_FIXED; i++) {
                pmu->fixed_counters[i].type = KVM_PMC_FIXED;
                pmu->fixed_counters[i].vcpu = vcpu;
                pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED;
                pmu->fixed_counters[i].current_config = 0;
                pmu->fixed_counters[i].eventsel = intel_get_fixed_pmc_eventsel(i);
        }

        lbr_desc->records.nr = 0;
        lbr_desc->event = NULL;
        lbr_desc->msr_passthrough = false;
}
  

Patch

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 3316fdea212a..8d545f84dc4a 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -73,6 +73,15 @@  static void intel_init_pmu_capability(void)
 	int i;
 
 	/*
+	 * Do not enumerate support for architectural events that KVM doesn't
+	 * support.  Clear unsupported events' "unavailable" bits as well, as
+	 * architecturally such bits are reserved to zero.
+	 */
+	kvm_pmu_cap.events_mask_len = min(kvm_pmu_cap.events_mask_len,
+					  NR_REAL_INTEL_ARCH_EVENTS);
+	kvm_pmu_cap.events_mask &= GENMASK(kvm_pmu_cap.events_mask_len - 1, 0);
+
+	 /*
 	 * Perf may (sadly) back a guest fixed counter with a general purpose
 	 * counter, and so KVM must hide fixed counters whose associated
 	 * architectural event are unsupported.  On real hardware, this should