[0/4] amd_pstate: Add guided autonomous mode support

Message ID 20221207154648.233759-1-wyes.karny@amd.com
Headers
Series amd_pstate: Add guided autonomous mode support |

Message

Wyes Karny Dec. 7, 2022, 3:46 p.m. UTC
  From ACPI spec[1] below 3 modes for CPPC can be defined:
1. Non autonomous: OS scaling governor specifies operating frequency/
   performance level through `Desired Performance` register and PMFW
follows that.
2. Guided autonomous: OS scaling governor specifies min and max
   frequencies/ performance levels through `Minimum Performance` and
`Maximum Performance` register, and PMFW can autonomously select an
operating frequency in this range.
3. Fully autonomous: OS only hints (via EPP) to PMFW for the required
   energy performance preference for the workload and PMFW autonomously
scales the frequency.

Currently (1) is supported by amd_pstate as passive mode, and (3) is
implemented by EPP support[2]. This change is to support (2).

In guided autonomous mode the min_perf is based on the input from the
scaling governor. For example, in case of schedutil this value depends
on the current utilization. And max_perf is set to max capacity.

To activate guided auto mode ``amd_pstate=guided`` command line
parameter has to be passed in the kernel.

Below are the results (normalized) of benchmarks with this patch:
System: Genoa 96C 192T
Kernel: v6.1-rc6 + patch
Scaling governor: schedutil

================ tbench  ================
tbench result comparison: (higher the better)
Here results are throughput (MB/s)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   1.16 (16.00 pct)	   2.20 (120.00 pct)
    2	   1.97 (0.00 pct)	   2.29 (16.24 pct)	   4.38 (122.33 pct)
    4	   3.95 (0.00 pct)	   4.51 (14.17 pct)	   8.50 (115.18 pct)
    8	   7.83 (0.00 pct)	   8.89 (13.53 pct)	  16.62 (112.26 pct)
   16	  15.28 (0.00 pct)	  16.81 (10.01 pct)	  31.02 (103.01 pct)
   32	  41.64 (0.00 pct)	  30.67 (-26.34 pct)	  55.63 (33.59 pct)
   64	  91.29 (0.00 pct)	  79.67 (-12.72 pct)	  91.74 (0.49 pct)
  128	 118.06 (0.00 pct)	 122.34 (3.62 pct)	 122.04 (3.37 pct)
  256	 260.47 (0.00 pct)	 264.31 (1.47 pct)	 264.49 (1.54 pct)
  512	 254.16 (0.00 pct)	 245.25 (-3.50 pct)	 245.50 (-3.40 pct)
tbench power comparison: (lower the better)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   1.00 (0.00 pct)	   1.15 (15.00 pct)
    2	   0.99 (0.00 pct)	   1.00 (1.01 pct)	   1.17 (18.18 pct)
    4	   1.01 (0.00 pct)	   1.02 (0.99 pct)	   1.24 (22.77 pct)
    8	   1.05 (0.00 pct)	   1.06 (0.95 pct)	   1.36 (29.52 pct)
   16	   1.15 (0.00 pct)	   1.13 (-1.73 pct)	   1.58 (37.39 pct)
   32	   1.71 (0.00 pct)	   1.30 (-23.97 pct)	   1.96 (14.61 pct)
   64	   2.35 (0.00 pct)	   2.15 (-8.51 pct)	   2.36 (0.42 pct)
  128	   2.77 (0.00 pct)	   2.77 (0.00 pct)	   2.78 (0.36 pct)
  256	   3.39 (0.00 pct)	   3.41 (0.58 pct)	   3.43 (1.17 pct)
  512	   3.42 (0.00 pct)	   3.40 (-0.58 pct)	   3.41 (-0.29 pct)

================ dbench  ================
dbench result comparison: (higher the better)
Here results are throughput (MB/s)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   0.96 (-4.00 pct)	   1.02 (2.00 pct)
    2	   1.89 (0.00 pct)	   1.90 (0.52 pct)	   1.91 (1.05 pct)
    4	   3.39 (0.00 pct)	   3.31 (-2.35 pct)	   3.38 (-0.29 pct)
    8	   5.56 (0.00 pct)	   5.46 (-1.79 pct)	   5.60 (0.71 pct)
   16	   7.25 (0.00 pct)	   7.90 (8.96 pct)	   8.29 (14.34 pct)
   32	  10.85 (0.00 pct)	  10.00 (-7.83 pct)	  10.40 (-4.14 pct)
   64	  12.30 (0.00 pct)	  11.94 (-2.92 pct)	  11.82 (-3.90 pct)
  128	  12.56 (0.00 pct)	  12.30 (-2.07 pct)	  12.98 (3.34 pct)
  256	   6.55 (0.00 pct)	   6.54 (-0.15 pct)	   7.38 (12.67 pct)
  512	   1.61 (0.00 pct)	   1.58 (-1.86 pct)	   1.95 (21.11 pct)
dbench power comparison: (lower the better)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
    1	   1.00 (0.00 pct)	   1.01 (1.00 pct)	   1.05 (5.00 pct)
    2	   1.07 (0.00 pct)	   1.07 (0.00 pct)	   1.09 (1.86 pct)
    4	   1.15 (0.00 pct)	   1.15 (0.00 pct)	   1.16 (0.86 pct)
    8	   1.26 (0.00 pct)	   1.26 (0.00 pct)	   1.27 (0.79 pct)
   16	   1.39 (0.00 pct)	   1.41 (1.43 pct)	   1.43 (2.87 pct)
   32	   1.60 (0.00 pct)	   1.56 (-2.50 pct)	   1.59 (-0.62 pct)
   64	   1.75 (0.00 pct)	   1.75 (0.00 pct)	   1.74 (-0.57 pct)
  128	   1.90 (0.00 pct)	   1.91 (0.52 pct)	   1.93 (1.57 pct)
  256	   1.76 (0.00 pct)	   1.77 (0.56 pct)	   1.85 (5.11 pct)
  512	   1.55 (0.00 pct)	   1.49 (-3.87 pct)	   1.73 (11.61 pct)

================ git-source  ================
git-source result comparison: (higher the better)
Here results are throughput (compilations per 1000 sec)
Threads 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
  192	 1000.00 (0.00 pct)	 943.00 (-5.70 pct)	 1000.00 (0.00 pct)
git-source power comparison: (lower the better)
Threads 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
  192	   1.00 (0.00 pct)	   1.03 (3.00 pct)	   1.02 (2.00 pct)

================ kernbench  ================
kernbench result comparison: (higher the better)
Here results are throughput (compilations per 1000 sec)
Load 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
32	   1.00 (0.00 pct)	   0.94 (-6.00 pct)	   1.02 (2.00 pct)
48	   1.24 (0.00 pct)	   1.16 (-6.45 pct)	   1.24 (0.00 pct)
64	   1.35 (0.00 pct)	   1.30 (-3.70 pct)	   1.39 (2.96 pct)
96	   1.42 (0.00 pct)	   1.28 (-9.85 pct)	   1.48 (4.22 pct)
128	   1.39 (0.00 pct)	   1.29 (-7.19 pct)	   1.41 (1.43 pct)
192	   1.32 (0.00 pct)	   1.18 (-10.60 pct)	   1.32 (0.00 pct)
256	   1.28 (0.00 pct)	   1.14 (-10.93 pct)	   1.29 (0.78 pct)
384	   1.28 (0.00 pct)	   1.13 (-11.71 pct)	   1.27 (-0.78 pct)
git-source power comparison: (lower the better)
Clients 	acpi-cpufreq			amd_pst+passive			amd_pst+guided
   32	   1.00 (0.00 pct)	   1.04 (4.00 pct)	   0.95 (-5.00 pct)
   48	   0.83 (0.00 pct)	   0.90 (8.43 pct)	   0.82 (-1.20 pct)
   64	   0.80 (0.00 pct)	   0.82 (2.50 pct)	   0.75 (-6.25 pct)
   96	   0.77 (0.00 pct)	   0.81 (5.19 pct)	   0.75 (-2.59 pct)
  128	   0.78 (0.00 pct)	   0.82 (5.12 pct)	   0.75 (-3.84 pct)
  192	   0.84 (0.00 pct)	   0.89 (5.95 pct)	   0.83 (-1.19 pct)
  256	   0.84 (0.00 pct)	   0.89 (5.95 pct)	   0.84 (0.00 pct)
  384	   0.84 (0.00 pct)	   0.90 (7.14 pct)	   0.84 (0.00 pct)

[1]: https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
[2]: https://lore.kernel.org/lkml/20221110175847.3098728-1-Perry.Yuan@amd.com/

Wyes Karny (4):
  cpufreq: amd_pstate: Add guided autonomous mode
  Documentation: amd_pstate: Move amd_pstate param to alphabetical order
  cpufreq: amd_pstate: Expose sysfs interface to control state
  Documentation: amd_pstate: Add amd_pstate state sysfs file

 .../admin-guide/kernel-parameters.txt         |  26 ++-
 Documentation/admin-guide/pm/amd-pstate.rst   |  11 +
 drivers/cpufreq/amd-pstate.c                  | 202 +++++++++++++++++-
 3 files changed, 217 insertions(+), 22 deletions(-)
  

Comments

Rafael J. Wysocki Dec. 8, 2022, 11:48 a.m. UTC | #1
On Wed, Dec 7, 2022 at 4:47 PM Wyes Karny <wyes.karny@amd.com> wrote:
>
> From ACPI spec[1] below 3 modes for CPPC can be defined:
> 1. Non autonomous: OS scaling governor specifies operating frequency/
>    performance level through `Desired Performance` register and PMFW
> follows that.
> 2. Guided autonomous: OS scaling governor specifies min and max
>    frequencies/ performance levels through `Minimum Performance` and
> `Maximum Performance` register, and PMFW can autonomously select an
> operating frequency in this range.
> 3. Fully autonomous: OS only hints (via EPP) to PMFW for the required
>    energy performance preference for the workload and PMFW autonomously
> scales the frequency.
>
> Currently (1) is supported by amd_pstate as passive mode, and (3) is
> implemented by EPP support[2]. This change is to support (2).

Well, can you guys please agree on priorities?  Like which one is more
important and so it should go in first?

At this point I'm not sure what the ordering assumptions with respect
to the EPP series are.

Also this submission is late for 6.2, so I will only regard it as 6.3
material on the condition the above gets clarified.  In the meantime,
please address review comments and consider resending after 6.2-rc1 is
out.

Thanks!
  
Wyes Karny Dec. 9, 2022, 2:44 p.m. UTC | #2
Hi Rafael,

On 12/8/2022 5:18 PM, Rafael J. Wysocki wrote:
> Well, can you guys please agree on priorities?  Like which one is more
> important and so it should go in first?
> 
> At this point I'm not sure what the ordering assumptions with respect
> to the EPP series are.

I am working with Perry on this and we will post a new patchset.
Meanwhile, any review comments on this patchset is welcome.

> 
> Also this submission is late for 6.2, so I will only regard it as 6.3
> material on the condition the above gets clarified.  In the meantime,
> please address review comments and consider resending after 6.2-rc1 is
> out.
Sure, Thanks!