[1/2] perf/x86/rapl: Add support for Intel Meteor Lake

Message ID 20230104145831.25498-1-rui.zhang@intel.com
State New
Headers
Series [1/2] perf/x86/rapl: Add support for Intel Meteor Lake |

Commit Message

Zhang, Rui Jan. 4, 2023, 2:58 p.m. UTC
  Meteor Lake RAPL support is the same as previous Sky Lake.
Add Meteor Lake model for RAPL.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 arch/x86/events/rapl.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Dave Hansen Jan. 4, 2023, 4:55 p.m. UTC | #1
On 1/4/23 06:58, Zhang Rui wrote:
> @@ -807,6 +807,8 @@ static const struct x86_cpu_id rapl_model_match[] __initconst = {
>  	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE,		&model_skl),
>  	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P,	&model_skl),
>  	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_S,	&model_skl),
> +	X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE,		&model_skl),
> +	X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L,	&model_skl),
>  	{},
>  };

Any chance this will ever get architectural enumeration?  Or, are we
doomed to grow this model list until the end of time?
  
Zhang, Rui Jan. 5, 2023, 6:54 a.m. UTC | #2
On Wed, 2023-01-04 at 08:55 -0800, Dave Hansen wrote:
> On 1/4/23 06:58, Zhang Rui wrote:
> > @@ -807,6 +807,8 @@ static const struct x86_cpu_id
> > rapl_model_match[] __initconst = {
> >  	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE,		&model_skl)
> > ,
> >  	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P,	&model_skl),
> >  	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_S,	&model_skl),
> > +	X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE,		&model_skl)
> > ,
> > +	X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L,	&model_skl),
> >  	{},
> >  };
> 
> Any chance this will ever get architectural enumeration?

We will get rid of the non-architechtural MSR interface on future
server platforms. But for client, we still have to live with it so far.

>   Or, are we
> doomed to grow this model list until the end of time?

I thought of this before and got some ideas related.
Say, instead of maintaining the model list in a series of drivers, can
we have something similar to "cpu_feature" instead? The feature flags
can be set in a central place, when adding the new CPU ID. Then all
these drivers can just probe based on the feature flag.
There are a list of drivers that could benefit from this, say,
intel_rapl, RAPL PMU, turbostat, thermal tcc cooling, PMC, etc.
currently, all these drivers have their own model lists.

thanks,
rui
  
Borislav Petkov Jan. 5, 2023, 9:55 a.m. UTC | #3
On Thu, Jan 05, 2023 at 06:54:31AM +0000, Zhang, Rui wrote:
> I thought of this before and got some ideas related.
> Say, instead of maintaining the model list in a series of drivers, can
> we have something similar to "cpu_feature" instead?

Yes, you can define a synthetic X86_FEATURE flag and set it for each CPU model
which supports the feature in arch/x86/kernel/cpu/intel.c so that at least all
the model matching gunk is kept where it belongs, in the CPU init code and the
other code simply tests that flag.
  
Zhang, Rui Jan. 6, 2023, 6:05 a.m. UTC | #4
On Thu, 2023-01-05 at 10:55 +0100, Borislav Petkov wrote:
> On Thu, Jan 05, 2023 at 06:54:31AM +0000, Zhang, Rui wrote:
> > I thought of this before and got some ideas related.
> > Say, instead of maintaining the model list in a series of drivers,
> > can
> > we have something similar to "cpu_feature" instead?
> 
> Yes, you can define a synthetic X86_FEATURE flag and set it for each
> CPU model
> which supports the feature in arch/x86/kernel/cpu/intel.c so that at
> least all
> the model matching gunk is kept where it belongs, in the CPU init
> code and the
> other code simply tests that flag.

Great, thanks for this info.

But I still have a question.
Take RAPL feature for example, the feature is not architectural,
although 80% of the platforms may follow the same behavior, but there
are still cases that behave differently. And so far, there are 8
different behaviors based on different models.

In this case, can we have several different flags for the RAPL feature
and make the RAPL driver probe on different RAPL flags? Or else, a
model list is still needed.

thanks,
rui
  
Ingo Molnar Jan. 6, 2023, 10:14 a.m. UTC | #5
* Zhang, Rui <rui.zhang@intel.com> wrote:

> On Thu, 2023-01-05 at 10:55 +0100, Borislav Petkov wrote:
> > On Thu, Jan 05, 2023 at 06:54:31AM +0000, Zhang, Rui wrote:
> > > I thought of this before and got some ideas related.
> > > Say, instead of maintaining the model list in a series of drivers,
> > > can
> > > we have something similar to "cpu_feature" instead?
> > 
> > Yes, you can define a synthetic X86_FEATURE flag and set it for each 
> > CPU model which supports the feature in arch/x86/kernel/cpu/intel.c so 
> > that at least all the model matching gunk is kept where it belongs, in 
> > the CPU init code and the other code simply tests that flag.
> 
> Great, thanks for this info.
> 
> But I still have a question.
>
> Take RAPL feature for example, the feature is not architectural, although 
> 80% of the platforms may follow the same behavior, but there are still 
> cases that behave differently. And so far, there are 8 different 
> behaviors based on different models.
> 
> In this case, can we have several different flags for the RAPL feature 
> and make the RAPL driver probe on different RAPL flags? Or else, a model 
> list is still needed.

Introducing a synthethic CPU flag only makes sense for behavior that is 
near-100% identical among models - ie. if the only thing missing is the 
CPUID enumeration.

If RAPL details are continuously changing among CPU models, with no real 
architected compatibility guarantees, then it probably only makes sense to 
introduce the flag once it's enumerated at the CPUID level as well.

Thanks,

	Ingo
  
Borislav Petkov Jan. 6, 2023, 10:39 a.m. UTC | #6
On Fri, Jan 06, 2023 at 06:05:41AM +0000, Zhang, Rui wrote:
> But I still have a question.
> Take RAPL feature for example, the feature is not architectural,
> although 80% of the platforms may follow the same behavior, but there
> are still cases that behave differently. And so far, there are 8
> different behaviors based on different models.
> 
> In this case, can we have several different flags for the RAPL feature
> and make the RAPL driver probe on different RAPL flags? Or else, a
> model list is still needed.

Well, you asked about detecting CPUs supporting RAPL.

Now you're asking about different RAPL "subfunctionality" or whatever.

You could do the synthetic flag for feature detection because apparently giving
it a CPUID flag is soo expensive (/sarcastic eyeroll) and then you can pick
apart subfeatures in the RAPL code and do flags there, away from the x86 arch
code because no one should see that.

Thx.
  
Ingo Molnar Jan. 6, 2023, 10:56 a.m. UTC | #7
* Borislav Petkov <bp@alien8.de> wrote:

> On Fri, Jan 06, 2023 at 06:05:41AM +0000, Zhang, Rui wrote:
> > But I still have a question.
> > Take RAPL feature for example, the feature is not architectural,
> > although 80% of the platforms may follow the same behavior, but there
> > are still cases that behave differently. And so far, there are 8
> > different behaviors based on different models.
> > 
> > In this case, can we have several different flags for the RAPL feature
> > and make the RAPL driver probe on different RAPL flags? Or else, a
> > model list is still needed.
> 
> Well, you asked about detecting CPUs supporting RAPL.
> 
> Now you're asking about different RAPL "subfunctionality" or whatever.
> 
> You could do the synthetic flag for feature detection because apparently 
> giving it a CPUID flag is soo expensive (/sarcastic eyeroll) and then you 
> can pick apart subfeatures in the RAPL code and do flags there, away from 
> the x86 arch code because no one should see that.

It's a trade-off in any case: there's a point where quirk flags or even 
feature flags become harder to read and harder to maintain than cleanly 
separated per model driver functions.

Especially if internally at Intel RAPL functionality is not yet treated as 
an 'architected' feature, and new aspects are added in a less disciplined 
fashion ...

Sometimes the addition of an 'architected' CPU feature iterates the 
feature-set non-trivially - as people consider it a last-minute chance to 
get in their favorite features without having to deal with backwards 
compatibility ...

So I'm somewhat pessimistically leaning towards the current status quo of 
per model enumeration. Would be glad to be proven wrong too. :-)

Thanks,

	Ingo
  
Borislav Petkov Jan. 6, 2023, 11:11 a.m. UTC | #8
On Fri, Jan 06, 2023 at 11:56:18AM +0100, Ingo Molnar wrote:
> It's a trade-off in any case: there's a point where quirk flags or even 
> feature flags become harder to read and harder to maintain than cleanly 
> separated per model driver functions.

Yeah, no, singular: a synthetic feature *flag*: X86_FEATURE_RAPL.

cpu/intel.c can set it and driver can test it.

Everything else inside the driver.

Until Intel can get their act together and actually do a CPUID bit like AMD. :-P

But when you think about it, whether the model matching happens in the driver or
in cpu/intel.c doesn't matter a whole lot.

All that matters is, they should finally give it a CPUID bit.
  
Ingo Molnar Jan. 6, 2023, 11:33 a.m. UTC | #9
* Borislav Petkov <bp@alien8.de> wrote:

> On Fri, Jan 06, 2023 at 11:56:18AM +0100, Ingo Molnar wrote:
> > It's a trade-off in any case: there's a point where quirk flags or even 
> > feature flags become harder to read and harder to maintain than cleanly 
> > separated per model driver functions.
> 
> Yeah, no, singular: a synthetic feature *flag*: X86_FEATURE_RAPL.
> 
> cpu/intel.c can set it and driver can test it.
> 
> Everything else inside the driver.
> 
> Until Intel can get their act together and actually do a CPUID bit like AMD. :-P
> 
> But when you think about it, whether the model matching happens in the driver or
> in cpu/intel.c doesn't matter a whole lot.
> 
> All that matters is, they should finally give it a CPUID bit.

The other thing that matters here are the RAPL *incompatibilities* between 
model variants, which are significant AFAICS.

With a CPUID we get a kind of semi-compatible hardware interface with well 
defined semantics & expansion.

With 'non-architectural', per-model RAPL features we get very little of 
that...

Which is why it's a trade-off that is hard to judge in advance: maybe we 
can simplify the code via a synthethic CPUID[s], maybe it will just be 
another zoo of per-model feature flags...

Likely won't be able to tell for sure until we see patches.

Thanks,

	Ingo
  
Zhang, Rui Jan. 6, 2023, 2:38 p.m. UTC | #10
Hi, Boris,

On Fri, 2023-01-06 at 11:39 +0100, Borislav Petkov wrote:
> On Fri, Jan 06, 2023 at 06:05:41AM +0000, Zhang, Rui wrote:
> > But I still have a question.
> > Take RAPL feature for example, the feature is not architectural,
> > although 80% of the platforms may follow the same behavior, but
> > there
> > are still cases that behave differently. And so far, there are 8
> > different behaviors based on different models.
> > 
> > In this case, can we have several different flags for the RAPL
> > feature
> > and make the RAPL driver probe on different RAPL flags? Or else, a
> > model list is still needed.
> 
> Well, you asked about detecting CPUs supporting RAPL.
> 
> Now you're asking about different RAPL "subfunctionality" or
> whatever.
> 
Sorry that I was not clear enough.

My original proposal is that, instead of maintaining model lists in a
series of different drivers, can we use feature flags instead, and
maintain them in a central place instead of different drivers. say,
something like

static const struct x86_cpu_id intel_pm_features[] __initconst = {
        X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L,           X86_FEATURE_RAPL | X86_FEATURE_TCC_COOLING),
        X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X,           X86_FEATURE_RAPL | X86_FEATURE_UNCORE_FREQ),
        ... 
        X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE,           X86_FEATURE_RAPL | X86_FEATURE_TCC_COOLING),
        X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X,    X86_FEATURE_RAPL | X86_FEATURE_UNCORE_FREQ),
        ...
        {},
};
And then set the feature flags based on this, and make the drivers test
the feature flags.

The goal of this is to do model list update in one place instead of 4
or more different drivers when a new model comes.

If yes, then, the second question is that,  there are cases like RAPL
which has model specific behavior. To make the driver totally clean, I'
m wondering if we can have different flags for one feature so that we
don't need to maintain the exceptions in the driver. Say, something
like

static const struct x86_cpu_id intel_pm_features[] __initconst = {
     
   X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L,           X86_FEATURE_RAPL_DEF
AULT | X86_FEATURE_TCC_COOLING),
        X86_MATCH_INTEL_FAM6_MODEL(SKYL
AKE_X,           X86_FEATURE_RAPL_FIX_DRAM | X86_FEATURE_UNCORE_FREQ),
 
... 
     
  X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE,           X86_FEATURE_RAPL_DEFA
ULT | X86_FEATURE_TCC_COOLING),
        X86_MATCH_INTEL_FAM6_MODEL(SAPPH
IRERAPIDS_X,    X86_FEATURE_RAPL_FIX_PSYS | X86_FEATURE_UNCORE_FREQ),
  
...
        {},
};

> You could do the synthetic flag for feature detection because
> apparently giving
> it a CPUID flag is soo expensive (/sarcastic eyeroll) and then you
> can pick
> apart subfeatures in the RAPL code and do flags there, away from the
> x86 arch
> code because no one should see that.

I got your point.

thanks,
rui
  
Zhang, Rui Jan. 6, 2023, 2:45 p.m. UTC | #11
On Fri, 2023-01-06 at 12:33 +0100, Ingo Molnar wrote:
> * Borislav Petkov <bp@alien8.de> wrote:
> 
> > On Fri, Jan 06, 2023 at 11:56:18AM +0100, Ingo Molnar wrote:
> > > It's a trade-off in any case: there's a point where quirk flags
> > > or even 
> > > feature flags become harder to read and harder to maintain than
> > > cleanly 
> > > separated per model driver functions.
> > 
> > Yeah, no, singular: a synthetic feature *flag*: X86_FEATURE_RAPL.
> > 
> > cpu/intel.c can set it and driver can test it.
> > 
> > Everything else inside the driver.
> > 
> > Until Intel can get their act together and actually do a CPUID bit
> > like AMD. :-P
> > 
> > But when you think about it, whether the model matching happens in
> > the driver or
> > in cpu/intel.c doesn't matter a whole lot.
> > 
> > All that matters is, they should finally give it a CPUID bit.
> 
> The other thing that matters here are the RAPL *incompatibilities*
> between 
> model variants, which are significant AFAICS.
> 
> With a CPUID we get a kind of semi-compatible hardware interface with
> well 
> defined semantics & expansion.

Agreed.
> 
> With 'non-architectural', per-model RAPL features we get very little
> of 
> that...

Exactly.

The main purpose of the model list in RAPL PMU code and the intel_rapl
driver is to differentiate the model-specific behavior, say,
some models use standard energy unit retrieved from MSR
some models use a fixed energy unit for Dram Domain
and
some models use a fixed energy unit for Psys Domain
etc.

> 
> Which is why it's a trade-off that is hard to judge in advance: maybe
> we 
> can simplify the code via a synthethic CPUID[s], maybe it will just
> be 
> another zoo of per-model feature flags...

Agreed.

> Likely won't be able to tell for sure until we see patches.
> 
Yeah, let me cook up a RFC series later and we can continue with that.

thanks,
rui
  
Borislav Petkov Jan. 6, 2023, 2:49 p.m. UTC | #12
On Fri, Jan 06, 2023 at 02:38:10PM +0000, Zhang, Rui wrote:
> And then set the feature flags based on this, and make the drivers test
> the feature flags.

That would be the purpose of synthetic flags.

> The goal of this is to do model list update in one place instead of 4
> or more different drivers when a new model comes.

Do you really have to update 4 different places each time?

As said before, you have to do the model matching *somewhere*. If you have to do
model matching in a lot of drivers - and it looks like you do - judging by

$ git grep X86_MATCH_INTEL_FAM6

output, then doing the matching once in cpu/intel.c and setting a synthetic flag
does make sense because all that matching code will disappear from all the
drivers but be concentrated in one place.

Thx.
  
Dave Hansen Jan. 6, 2023, 2:50 p.m. UTC | #13
On 1/6/23 06:38, Zhang, Rui wrote:
> My original proposal is that, instead of maintaining model lists in a
> series of different drivers, can we use feature flags instead, and
> maintain them in a central place instead of different drivers. say,
> something like
> 
> static const struct x86_cpu_id intel_pm_features[] __initconst = {
>         X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L,           X86_FEATURE_RAPL | X86_FEATURE_TCC_COOLING),
>         X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X,           X86_FEATURE_RAPL | X86_FEATURE_UNCORE_FREQ),
>         ...
>         X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE,           X86_FEATURE_RAPL | X86_FEATURE_TCC_COOLING),
>         X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X,    X86_FEATURE_RAPL | X86_FEATURE_UNCORE_FREQ),
>         ...
>         {},
> };
> And then set the feature flags based on this, and make the drivers test
> the feature flags.

That works if you have very few features.  SKYLAKE_X looks to have on
the order of 15 model-specific features, or at least references in the code.

That means that the

	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, ...

list goes on for 15 features.  It's even worse than that because you'd
*like* to be able to scan up and down the list looking for, say, "all
the CPUs that support RAPL".  But, if you do that, you actually need a
table -- a really wide table -- for *all* the features and a column for
each.

What we have now isn't bad.  The only real way to fix this is to have
the features enumerated *properly*, aka. architecturally.

I get it, Intel doesn't want to dedicate CPUID bits and architecture to
one-offs.  But, at the point that there are a dozen CPU models across
three or four different CPU generations, it's time to revisit it.  Could
you help our colleagues revisit it, please?
  
Zhang, Rui Jan. 7, 2023, 2:07 p.m. UTC | #14
On Fri, 2023-01-06 at 06:50 -0800, Dave Hansen wrote:
> On 1/6/23 06:38, Zhang, Rui wrote:
> > My original proposal is that, instead of maintaining model lists in
> > a
> > series of different drivers, can we use feature flags instead, and
> > maintain them in a central place instead of different drivers. say,
> > something like
> > 
> > static const struct x86_cpu_id intel_pm_features[] __initconst = {
> >         X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L,           X86_FEATURE
> > _RAPL | X86_FEATURE_TCC_COOLING),
> >         X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X,           X86_FEATURE
> > _RAPL | X86_FEATURE_UNCORE_FREQ),
> >         ...
> >         X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE,           X86_FEATURE
> > _RAPL | X86_FEATURE_TCC_COOLING),
> >         X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X,    X86_FEATURE
> > _RAPL | X86_FEATURE_UNCORE_FREQ),
> >         ...
> >         {},
> > };
> > And then set the feature flags based on this, and make the drivers
> > test
> > the feature flags.
> 
> That works if you have very few features.  SKYLAKE_X looks to have on
> the order of 15 model-specific features, or at least references in
> the code.
> 
> That means that the
> 
> 	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, ...
> 
> list goes on for 15 features.  It's even worse than that because
> you'd
> *like* to be able to scan up and down the list looking for, say, "all
> the CPUs that support RAPL".  But, if you do that, you actually need
> a
> table -- a really wide table -- for *all* the features and a column
> for
> each.

That's true.

> 
> What we have now isn't bad.  The only real way to fix this is to have
> the features enumerated *properly*, aka. architecturally.
> 
> I get it, Intel doesn't want to dedicate CPUID bits and architecture
> to
> one-offs.

> But, at the point that there are a dozen CPU models across
> three or four different CPU generations, it's time to revisit
> it.  Could
> you help our colleagues revisit it, please?

For this RAPL case, I think the biggest problem is the RAPL
*incompatibilities* between model variants as Ingo pointed out.
So a CPUID bit can not solve all the problems.

But given that the biggest inconsistency is the energy unit used on
different generations, I can also check with our colleagues if there is
a software visible way to get the "fixed" energy units rather than
hardcoding it in the driver using a model list.

thanks,
rui
  

Patch

diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index a829492bca4c..0ef255602aa8 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -807,6 +807,8 @@  static const struct x86_cpu_id rapl_model_match[] __initconst = {
 	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE,		&model_skl),
 	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P,	&model_skl),
 	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_S,	&model_skl),
+	X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE,		&model_skl),
+	X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L,	&model_skl),
 	{},
 };
 MODULE_DEVICE_TABLE(x86cpu, rapl_model_match);