Message ID | 20231009103621.374412-7-vincent.guittot@linaro.org |
---|---|
State | New |
Headers |
From: Vincent Guittot <vincent.guittot@linaro.org>
To: linux@armlinux.org.uk, catalin.marinas@arm.com, will@kernel.org, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, sudeep.holla@arm.com, gregkh@linuxfoundation.org, rafael@kernel.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, viresh.kumar@linaro.org, lukasz.luba@arm.com, ionela.voinescu@arm.com, pierre.gondois@arm.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, linux-pm@vger.kernel.org
Cc: conor.dooley@microchip.com, suagrfillet@gmail.com, ajones@ventanamicro.com, lftan@kernel.org, Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH v2 6/6] cpufreq/cppc: set the frequency used for capacity computation
Date: Mon, 9 Oct 2023 12:36:21 +0200
Message-Id: <20231009103621.374412-7-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231009103621.374412-1-vincent.guittot@linaro.org>
References: <20231009103621.374412-1-vincent.guittot@linaro.org>
|
Series |
consolidate and cleanup CPU capacity
|
|
Commit Message
Vincent Guittot
Oct. 9, 2023, 10:36 a.m. UTC
The cppc cpufreq driver can register an artificial energy model. In that case,
it also has to register the frequency that is used to define the CPU
capacity.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
drivers/cpufreq/cppc_cpufreq.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
Comments
Hello Vincent,

On 10/9/23 12:36, Vincent Guittot wrote:
> The cppc cpufreq driver can register an artificial energy model. In that case,
> it also has to register the frequency that is used to define the CPU
> capacity.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  drivers/cpufreq/cppc_cpufreq.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index fe08ca419b3d..24c6ba349f01 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -636,6 +636,21 @@ static int populate_efficiency_class(void)
>  	return 0;
>  }
>
> +
> +static void cppc_cpufreq_set_capacity_ref_freq(struct cpufreq_policy *policy)
> +{
> +	struct cppc_perf_caps *perf_caps;
> +	struct cppc_cpudata *cpu_data;
> +	unsigned int ref_freq;
> +
> +	cpu_data = policy->driver_data;
> +	perf_caps = &cpu_data->perf_caps;
> +
> +	ref_freq = cppc_cpufreq_perf_to_khz(cpu_data, perf_caps->highest_perf);
> +
> +	per_cpu(capacity_ref_freq, policy->cpu) = ref_freq;

'capacity_ref_freq' seems to be updated only if CONFIG_ENERGY_MODEL is set. However, in
[1], get_capacity_ref_freq() relies on 'capacity_ref_freq'. The cpufreq_schedutil governor
should have a valid 'capacity_ref_freq' value set if the CPPC cpufreq driver is used
without an energy model, I believe.

Also, 'capacity_ref_freq' seems to be set only for 'policy->cpu'. I believe it should
be set for the whole perf domain in case this 'policy->cpu' goes offline.

Another thing, related to my comments on [1] and [2]: for CPPC the max capacity matches
the boosting frequency. We have:
   'non-boosted max capacity' < 'boosted max capacity'.
- If boosting is not enabled, the CPU utilization can still go above the 'non-boosted max
capacity'. The overutilization of the system seems to be triggered by comparing the CPU
util to the 'boosted max capacity'. So systems might not be detected as overutilized.

For the EAS energy computation, em_cpu_energy() tries to predict the frequency that will
be used. It is currently unknown to the function that the frequency request will be
clamped by __resolve_freq():
get_next_freq()
\-cpufreq_driver_resolve_freq()
   \-__resolve_freq()
This means that the energy computation might use boosting frequencies, which are not
available.

Regards,
Pierre

[1]: [PATCH v2 4/6] cpufreq/schedutil: use a fixed reference frequency
[2]: https://lore.kernel.org/lkml/20230905113308.GF28319@noisy.programming.kicks-ass.net/

> +}
> +
>  static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
>  {
>  	struct cppc_cpudata *cpu_data;
> @@ -643,6 +658,9 @@ static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
>  		EM_ADV_DATA_CB(cppc_get_cpu_power, cppc_get_cpu_cost);
>
>  	cpu_data = policy->driver_data;
> +
> +	cppc_cpufreq_set_capacity_ref_freq(policy);
> +
>  	em_dev_register_perf_domain(get_cpu_device(policy->cpu),
>  			get_perf_level_count(policy), &em_cb,
>  			cpu_data->shared_cpu_map, 0);
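Pierre's overutilization concern can be illustrated with a small userspace sketch. It is not kernel code: the roughly 20% headroom rule is modelled on the scheduler's fits_capacity() check (treat the exact constants as an assumption), and the capacity values 1024 (boosted) and 819 (non-boosted) are hypothetical.

```c
#include <stdbool.h>

/*
 * Simplified model of the scheduler's capacity-fit rule: a utilization
 * "fits" when it leaves about 20% headroom below the capacity.
 * The 1280/1024 constants are assumed, not taken from this thread.
 */
static bool fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/* A CPU is treated as overutilized when its utilization no longer fits. */
static bool cpu_overutilized(unsigned long util, unsigned long capacity)
{
	return !fits_capacity(util, capacity);
}
```

With a utilization of 700, the CPU is not flagged when compared against the boosted capacity of 1024, but it is flagged when compared against the non-boosted capacity of 819. That gap is the situation described in the comment above.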
On Wed, 11 Oct 2023 at 12:27, Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> Hello Vincent,
>
> On 10/9/23 12:36, Vincent Guittot wrote:
[...]
> > +	ref_freq = cppc_cpufreq_perf_to_khz(cpu_data, perf_caps->highest_perf);
> > +
> > +	per_cpu(capacity_ref_freq, policy->cpu) = ref_freq;
>
> 'capacity_ref_freq' seems to be updated only if CONFIG_ENERGY_MODEL is set. However, in
> [1], get_capacity_ref_freq() relies on 'capacity_ref_freq'. The cpufreq_schedutil governor
> should have a valid 'capacity_ref_freq' value set if the CPPC cpufreq driver is used
> without an energy model, I believe.

We can disable it by setting capacity_ref_freq to 0 so that it falls
back on cpuinfo, like Intel and AMD, which use the default
SCHED_CAPACITY_SCALE capacity.

Could you provide me with more details about your platform? I am still
trying to understand how the CPU compute capacity is set up on your
system. How do you set the per_cpu cpu_scale variable? We should set the
ref freq at the same time.

>
> Also, 'capacity_ref_freq' seems to be set only for 'policy->cpu'. I believe it should
> be set for the whole perf domain in case this 'policy->cpu' goes offline.
>
> Another thing, related to my comments on [1] and [2]: for CPPC the max capacity matches
> the boosting frequency. We have:
>    'non-boosted max capacity' < 'boosted max capacity'.
> - If boosting is not enabled, the CPU utilization can still go above the 'non-boosted max
> capacity'. The overutilization of the system seems to be triggered by comparing the CPU
> util to the 'boosted max capacity'. So systems might not be detected as overutilized.

As Peter mentioned, we have to decide what the original compute
capacity of your CPUs is, which is usually the sustainable max compute
capacity, especially when using EAS and EM.

>
> For the EAS energy computation, em_cpu_energy() tries to predict the frequency that will
> be used. It is currently unknown to the function that the frequency request will be
> clamped by __resolve_freq():
> get_next_freq()
> \-cpufreq_driver_resolve_freq()
>    \-__resolve_freq()
> This means that the energy computation might use boosting frequencies, which are not
> available.
>
> Regards,
> Pierre
>
> [1]: [PATCH v2 4/6] cpufreq/schedutil: use a fixed reference frequency
> [2]: https://lore.kernel.org/lkml/20230905113308.GF28319@noisy.programming.kicks-ass.net/
[...]
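The fallback Vincent describes (a capacity_ref_freq of 0 meaning "use cpuinfo instead") could look roughly like the following userspace sketch. The struct and function names are hypothetical stand-ins, not the kernel's get_capacity_ref_freq(), and the real function may consider additional cases.

```c
/* Hypothetical stand-in for the part of struct cpufreq_policy used here. */
struct policy_model {
	unsigned int cpuinfo_max_freq_khz; /* models policy->cpuinfo.max_freq */
};

/*
 * Return the reference frequency for capacity computation.
 * A per-CPU value of 0 means "not provided", in which case the
 * policy's cpuinfo maximum frequency is used as the fallback.
 */
static unsigned int get_ref_freq_khz(unsigned int capacity_ref_freq_khz,
				     const struct policy_model *policy)
{
	if (capacity_ref_freq_khz)
		return capacity_ref_freq_khz;

	return policy->cpuinfo_max_freq_khz;
}
```

In this model, a driver that does not know the frequency matching full capacity leaves the value at 0 and the governor keeps its previous cpuinfo-based behaviour.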
Hi both,

On Wednesday 11 Oct 2023 at 16:25:46 (+0200), Vincent Guittot wrote:
> On Wed, 11 Oct 2023 at 12:27, Pierre Gondois <pierre.gondois@arm.com> wrote:
[...]
> > 'capacity_ref_freq' seems to be updated only if CONFIG_ENERGY_MODEL is set. However, in
> > [1], get_capacity_ref_freq() relies on 'capacity_ref_freq'. The cpufreq_schedutil governor
> > should have a valid 'capacity_ref_freq' value set if the CPPC cpufreq driver is used
> > without an energy model, I believe.
>
> We can disable it by setting capacity_ref_freq to 0 so that it falls
> back on cpuinfo, like Intel and AMD, which use the default
> SCHED_CAPACITY_SCALE capacity.
>
> Could you provide me with more details about your platform? I am still
> trying to understand how the CPU compute capacity is set up on your
> system. How do you set the per_cpu cpu_scale variable? We should set the
> ref freq at the same time.
>

Yes, the best place to set it would be in:
drivers/base/arch_topology.c: topology_init_cpu_capacity_cppc()

But:
 - That function reuses topology_normalize_cpu_scale() and when called
   it needs to have capacity_ref_freq = 1. So either capacity_ref_freq
   needs to be set for each CPU after topology_normalize_cpu_scale() is
   called or we should not call topology_normalize_cpu_scale() here and
   just unpack a CPPC specific version of it in
   topology_init_cpu_capacity_cppc(). The latter is probably better as
   we avoid iterating through all CPUs a couple of times.

 - When set, capacity_ref_freq needs to be a "frequency" (at least
   in reference to the reference frequencies provided by CPPC). So
   cppc_cpufreq_khz_to_perf() and cppc_cpufreq_perf_to_khz() would need
   to move to drivers/acpi/cppc_acpi.c. They don't have any dependency
   on cpufreq (policies) so that should be alright.

topology_init_cpu_capacity_cppc() is a better place to set
capacity_ref_freq because one can do it for each CPU, and it not only
caters for the EAS case but also for frequency invariance, when
arch_set_freq_scale() is called, if no counters are supported.

When counters are supported, there are still two loose threads:
 - amu_fie_setup(): Vincent, would you mind completely removing
   cpufreq_get_hw_max_freq() and reusing arch_scale_freq_ref() here?

 - It would be nice if cppc_scale_freq_workfn() would use
   arch_scale_freq_ref() as well, for consistency. But it would need
   to be converted back to performance before use, so that would mean
   extra work on the tick, which is not ideal.

Basically it would be good if what gets used for capacity
(arch_scale_freq_ref()) gets used for frequency invariance as well,
in all locations.

Thanks,
Ionela.

> >
> > Also, 'capacity_ref_freq' seems to be set only for 'policy->cpu'. I believe it should
> > be set for the whole perf domain in case this 'policy->cpu' goes offline.
[...]
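Ionela's suggestion of a CPPC-specific normalization that also records a per-CPU reference frequency can be sketched in userspace as below. Everything here is an illustrative assumption: the struct layouts are hypothetical, the raw capacity is taken to be highest_perf, and the perf-to-kHz conversion is a plain linear mapping through the nominal point, which is simpler than the driver's cppc_cpufreq_perf_to_khz().

```c
#include <stddef.h>
#include <stdint.h>

#define SCHED_CAPACITY_SCALE 1024u

/* Hypothetical, simplified view of the CPPC performance capabilities. */
struct perf_caps_model {
	uint32_t highest_perf;     /* abstract CPPC performance units */
	uint32_t nominal_perf;     /* abstract CPPC performance units */
	uint32_t nominal_freq_khz; /* frequency matching nominal_perf */
};

struct cpu_model {
	struct perf_caps_model caps;
	uint32_t capacity;          /* normalized to SCHED_CAPACITY_SCALE */
	uint32_t capacity_ref_freq; /* kHz */
};

/* Assumed linear perf -> kHz mapping through the nominal point. */
static uint32_t perf_to_khz(const struct perf_caps_model *caps, uint32_t perf)
{
	if (!caps->nominal_perf)
		return 0;

	return (uint32_t)((uint64_t)perf * caps->nominal_freq_khz /
			  caps->nominal_perf);
}

/* Normalize capacities and set the reference frequency of every CPU. */
static void init_cpu_capacity(struct cpu_model *cpus, size_t nr_cpus)
{
	uint32_t max_perf = 0;

	/* First pass: find the highest performance level in the system. */
	for (size_t i = 0; i < nr_cpus; i++)
		if (cpus[i].caps.highest_perf > max_perf)
			max_perf = cpus[i].caps.highest_perf;

	if (!max_perf)
		return;

	/* Second pass: scale each CPU and record its reference frequency. */
	for (size_t i = 0; i < nr_cpus; i++) {
		const struct perf_caps_model *caps = &cpus[i].caps;

		cpus[i].capacity = (uint32_t)((uint64_t)caps->highest_perf *
					      SCHED_CAPACITY_SCALE / max_perf);
		cpus[i].capacity_ref_freq = perf_to_khz(caps, caps->highest_perf);
	}
}
```

Because the capacity and the reference frequency are written in the same loop, each CPU gets both values regardless of which CPU currently owns the cpufreq policy.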
Hi Ionela,

On Mon, 16 Oct 2023 at 14:13, Ionela Voinescu <ionela.voinescu@arm.com> wrote:
>
> Hi both,
>
[...]
> Yes, the best place to set it would be in:
> drivers/base/arch_topology.c: topology_init_cpu_capacity_cppc()

Thanks. I didn't notice it.

>
> But:
>  - That function reuses topology_normalize_cpu_scale() and when called
>    it needs to have capacity_ref_freq = 1. So either capacity_ref_freq
>    needs to be set for each CPU after topology_normalize_cpu_scale() is
>    called or we should not call topology_normalize_cpu_scale() here and
>    just unpack a CPPC specific version of it in
>    topology_init_cpu_capacity_cppc(). The latter is probably better as
>    we avoid iterating through all CPUs a couple of times.
>
>  - When set, capacity_ref_freq needs to be a "frequency" (at least
>    in reference to the reference frequencies provided by CPPC). So
>    cppc_cpufreq_khz_to_perf() and cppc_cpufreq_perf_to_khz() would need
>    to move to drivers/acpi/cppc_acpi.c. They don't have any dependency
>    on cpufreq (policies) so that should be alright.
>
> topology_init_cpu_capacity_cppc() is a better place to set
> capacity_ref_freq because one can do it for each CPU, and it not only

I agree, topology_init_cpu_capacity_cppc() is the best place to set
capacity_ref_freq.

> caters for the EAS case but also for frequency invariance, when
> arch_set_freq_scale() is called, if no counters are supported.
>
> When counters are supported, there are still two loose threads:
>  - amu_fie_setup(): Vincent, would you mind completely removing
>    cpufreq_get_hw_max_freq() and reusing arch_scale_freq_ref() here?

I wonder if we can have an ordering dependency problem, as both
init_cpu_capacity_notifier() and init_amu_fie_notifier() are
registered for the same CPUFREQ_POLICY_NOTIFIER event and I'm not sure
they will be called in the right order.

>
>  - It would be nice if cppc_scale_freq_workfn() would use
>    arch_scale_freq_ref() as well, for consistency. But it would need
>    to be converted back to performance before use, so that would mean
>    extra work on the tick, which is not ideal.

This one seems more complex as it involves other architectures that are
not using arch_topology.c and would need more rework, so I would prefer
to make it a separate patchset.

Thanks
Vincent

>
> Basically it would be good if what gets used for capacity
> (arch_scale_freq_ref()) gets used for frequency invariance as well,
> in all locations.
>
> Thanks,
> Ionela.
[...]
Hi, On Monday 16 Oct 2023 at 17:32:03 (+0200), Vincent Guittot wrote: > Hi Ionela, > > On Mon, 16 Oct 2023 at 14:13, Ionela Voinescu <ionela.voinescu@arm.com> wrote: > > > > Hi both, > > > > On Wednesday 11 Oct 2023 at 16:25:46 (+0200), Vincent Guittot wrote: > > > On Wed, 11 Oct 2023 at 12:27, Pierre Gondois <pierre.gondois@arm.com> wrote: > > > > > > > > Hello Vincent, > > > > > > > > On 10/9/23 12:36, Vincent Guittot wrote: > > > > > cppc cpufreq driver can register an artificial energy model. In such case, > > > > > it also have to register the frequency that is used to define the CPU > > > > > capacity > > > > > > > > > > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> > > > > > --- > > > > > drivers/cpufreq/cppc_cpufreq.c | 18 ++++++++++++++++++ > > > > > 1 file changed, 18 insertions(+) > > > > > > > > > > diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c > > > > > index fe08ca419b3d..24c6ba349f01 100644 > > > > > --- a/drivers/cpufreq/cppc_cpufreq.c > > > > > +++ b/drivers/cpufreq/cppc_cpufreq.c > > > > > @@ -636,6 +636,21 @@ static int populate_efficiency_class(void) > > > > > return 0; > > > > > } > > > > > > > > > > + > > > > > +static void cppc_cpufreq_set_capacity_ref_freq(struct cpufreq_policy *policy) > > > > > +{ > > > > > + struct cppc_perf_caps *perf_caps; > > > > > + struct cppc_cpudata *cpu_data; > > > > > + unsigned int ref_freq; > > > > > + > > > > > + cpu_data = policy->driver_data; > > > > > + perf_caps = &cpu_data->perf_caps; > > > > > + > > > > > + ref_freq = cppc_cpufreq_perf_to_khz(cpu_data, perf_caps->highest_perf); > > > > > + > > > > > + per_cpu(capacity_ref_freq, policy->cpu) = ref_freq; > > > > > > > > 'capacity_ref_freq' seems to be updated only if CONFIG_ENERGY_MODEL is set. However in > > > > [1], get_capacity_ref_freq() relies on 'capacity_ref_freq'. 
The cpufreq_schedutil governor > > > > should have a valid 'capacity_ref_freq' value set if the CPPC cpufreq driver is used > > > > without energy model I believe. > > > > > > we can disable it by setting capacity_ref_freq to 0 so it will > > > fallback on cpuinfo like intel and amd which uses default > > > SCHED_CAPACITY_SCALE capacity > > > > > > Could you provide me with more details about your platform ? I still > > > try to understand how the cpu compute capacity is set up on your > > > system. How do you set per_cpu cpu_scale variable ? we should set the > > > ref freq at the same time > > > > > > > Yes, the best place to set it would be in: > > drivers/base/arch_topology.c: topology_init_cpu_capacity_cppc() > > Thanks. I didn't notice it > > > > > But: > > - That function reuses topology_normalize_cpu_scale() and when called > > it needs to have capacity_ref_freq = 1. So either capacity_ref_freq > > needs to be set for each CPU after topology_normalize_cpu_scale() is > > called or we should not call topology_normalize_cpu_scale() here and > > just unpack a CPPC specific version of it in > > topology_init_cpu_capacity_cppc(). The latter is probably better as > > we avoid iterating through all CPUs a couple of times. > > > > - When set, capacity_ref_freq needs to be a "frequency" (at least > > in reference to the reference frequencies provided by CPPC). So > > cppc_cpufreq_khz_to_perf() and cppc_cpufreq_perf_to_khz() would need > > to move to drivers/acpi/cppc_acpi.c. They don't have any dependency > > on cpufreq (policies) so that should be alright. > > > > topology_init_cpu_capacity_cppc() is a better place to set > > capacity_ref_freq because one can do it for each CPU, and it not only > > I agree, topology_init_cpu_capacity_cppc() is the best place to set > capacity_ref_freq() > > > caters for the EAS case but also for frequency invariance, when > > arch_set_freq_scale() is called, if no counters are supported. 
> > > > When counters are supported, there are still two loose threads: > > - amu_fie_setup(): Vincent, would you mind completely removing > > cpufreq_get_hw_max_freq() and reusing arch_scale_freq_ref() here? > > I wonder if we can have a ordering dependency problem as both > init_cpu_capacity_notifier() and init_amu_fie_notifier() are > registered for the same CPUFREQ_POLICY_NOTIFIER event and I'm not sure > it will happen in the right ordering Yes, you are right, this would be a problem for DT systems. With the implementation above, ACPI systems would obtain capacity_ref_freq on processor probe so it should be then available at policy creation when amu_fie_setup() would be called. Initially I thought the only solution might be to move freq_inv_set_max_ratio() in the arch topology driver to the same callback that initialises capacity, but that quickly becomes ugly with making it support both DT and ACPI systems. And then there's the question on whether it belongs there. But I think the better option is to wrap policy->cpuinfo.max_freq in another getter function which can be used in both amu_fie_setup() and init_cpu_capacity_callback(). This can be implemented in the arch topology driver and exposed to the architecture specific topology files. I'm not sure if this might be worth leaving for another patchset as well. Let us know if you'd like us to help on theses ones. Thanks, Ionela. > > > > > - It would be nice if cppc_scale_freq_workfn() would use > > arch_scale_freq_ref() as well, for consistency. But it would need > > to be converted back to performance before use, so that would mean > > extra work on the tick, which is not ideal. 
>
> This once seems more complex as it implies other arch that are not
> using arch_topology.c and would need more rework so I would prefer to
> make it a separate patchset
>
> Thanks
> Vincent
>
> >
> > Basically it would be good if what gets used for capacity
> > (arch_scale_freq_ref()) gets used for frequency invariance as well,
> > in all locations.
> >
> > Thanks,
> > Ionela.
> >
> > > >
> > > > Also 'capacity_ref_freq' seems to be set only for 'policy->cpu'. I believe it should
> > > > be set for the whole perf domain in case this 'policy->cpu' goes offline.
> > > >
> > > > Another thing, related my comment to [1] and to [2], for CPPC the max capacity matches
> > > > the boosting frequency. We have:
> > > > 'non-boosted max capacity' < 'boosted max capacity'.
> > > > -
> > > > If boosting is not enabled, the CPU utilization can still go above the 'non-boosted max
> > > > capacity'. The overutilization of the system seems to be triggered by comparing the CPU
> > > > util to the 'boosted max capacity'. So systems might not be detected as overutilized.
> > >
> > > As Peter mentioned, we have to decide what is the original compute
> > > capacity of your CPUs which is usually the sustainable max compute
> > > capacity, especially when using EAS and EM
> > >
> > > >
> > > > For the EAS energy computation, em_cpu_energy() tries to predict the frequency that will
> > > > be used. It is currently unknown to the function that the frequency request will be
> > > > clamped by __resolve_freq():
> > > > get_next_freq()
> > > > \-cpufreq_driver_resolve_freq()
> > > >   \-__resolve_freq()
> > > > This means that the energy computation might use boosting frequencies, which are not
> > > > available.
> > > >
> > > > Regards,
> > > > Pierre
> > > >
> > > > [1]: [PATCH v2 4/6] cpufreq/schedutil: use a fixed reference frequency
> > > > [2]: https://lore.kernel.org/lkml/20230905113308.GF28319@noisy.programming.kicks-ass.net/
> > > >
> > > > > +}
> > > > > +
> > > > >  static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
> > > > >  {
> > > > >  	struct cppc_cpudata *cpu_data;
> > > > > @@ -643,6 +658,9 @@ static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
> > > > >  		EM_ADV_DATA_CB(cppc_get_cpu_power, cppc_get_cpu_cost);
> > > > >
> > > > >  	cpu_data = policy->driver_data;
> > > > > +
> > > > > +	cppc_cpufreq_set_capacity_ref_freq(policy);
> > > > > +
> > > > >  	em_dev_register_perf_domain(get_cpu_device(policy->cpu),
> > > > >  			get_perf_level_count(policy), &em_cb,
> > > > >  			cpu_data->shared_cpu_map, 0);
On Tue, 17 Oct 2023 at 11:02, Ionela Voinescu <ionela.voinescu@arm.com> wrote:
>
> Hi,
>
> On Monday 16 Oct 2023 at 17:32:03 (+0200), Vincent Guittot wrote:
> > Hi Ionela,
> >
> > On Mon, 16 Oct 2023 at 14:13, Ionela Voinescu <ionela.voinescu@arm.com> wrote:
> > >
> > > Hi both,
> > >
> > > On Wednesday 11 Oct 2023 at 16:25:46 (+0200), Vincent Guittot wrote:
> > > > On Wed, 11 Oct 2023 at 12:27, Pierre Gondois <pierre.gondois@arm.com> wrote:
> > > > >
> > > > > Hello Vincent,
> > > > >
> > > > > On 10/9/23 12:36, Vincent Guittot wrote:
> > > > > > cppc cpufreq driver can register an artificial energy model. In such case,
> > > > > > it also have to register the frequency that is used to define the CPU
> > > > > > capacity
> > > > > >
> > > > > > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > > > > > ---
> > > > > >  drivers/cpufreq/cppc_cpufreq.c | 18 ++++++++++++++++++
> > > > > >  1 file changed, 18 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> > > > > > index fe08ca419b3d..24c6ba349f01 100644
> > > > > > --- a/drivers/cpufreq/cppc_cpufreq.c
> > > > > > +++ b/drivers/cpufreq/cppc_cpufreq.c
> > > > > > @@ -636,6 +636,21 @@ static int populate_efficiency_class(void)
> > > > > >  	return 0;
> > > > > >  }
> > > > > >
> > > > > > +
> > > > > > +static void cppc_cpufreq_set_capacity_ref_freq(struct cpufreq_policy *policy)
> > > > > > +{
> > > > > > +	struct cppc_perf_caps *perf_caps;
> > > > > > +	struct cppc_cpudata *cpu_data;
> > > > > > +	unsigned int ref_freq;
> > > > > > +
> > > > > > +	cpu_data = policy->driver_data;
> > > > > > +	perf_caps = &cpu_data->perf_caps;
> > > > > > +
> > > > > > +	ref_freq = cppc_cpufreq_perf_to_khz(cpu_data, perf_caps->highest_perf);
> > > > > > +
> > > > > > +	per_cpu(capacity_ref_freq, policy->cpu) = ref_freq;
> > > > >
> > > > > 'capacity_ref_freq' seems to be updated only if CONFIG_ENERGY_MODEL is set.
> > > > > However in [1], get_capacity_ref_freq() relies on 'capacity_ref_freq'. The
> > > > > cpufreq_schedutil governor should have a valid 'capacity_ref_freq' value set
> > > > > if the CPPC cpufreq driver is used without energy model I believe.
> > > >
> > > > we can disable it by setting capacity_ref_freq to 0 so it will
> > > > fallback on cpuinfo like intel and amd which uses default
> > > > SCHED_CAPACITY_SCALE capacity
> > > >
> > > > Could you provide me with more details about your platform ? I still
> > > > try to understand how the cpu compute capacity is set up on your
> > > > system. How do you set per_cpu cpu_scale variable ? we should set the
> > > > ref freq at the same time
> > > >
> > >
> > > Yes, the best place to set it would be in:
> > > drivers/base/arch_topology.c: topology_init_cpu_capacity_cppc()
> >
> > Thanks. I didn't notice it
> >
> > >
> > > But:
> > > - That function reuses topology_normalize_cpu_scale() and when called
> > >   it needs to have capacity_ref_freq = 1. So either capacity_ref_freq
> > >   needs to be set for each CPU after topology_normalize_cpu_scale() is
> > >   called or we should not call topology_normalize_cpu_scale() here and
> > >   just unpack a CPPC specific version of it in
> > >   topology_init_cpu_capacity_cppc(). The latter is probably better as
> > >   we avoid iterating through all CPUs a couple of times.
> > >
> > > - When set, capacity_ref_freq needs to be a "frequency" (at least
> > >   in reference to the reference frequencies provided by CPPC). So
> > >   cppc_cpufreq_khz_to_perf() and cppc_cpufreq_perf_to_khz() would need
> > >   to move to drivers/acpi/cppc_acpi.c. They don't have any dependency
> > >   on cpufreq (policies) so that should be alright.
> > >
> > > topology_init_cpu_capacity_cppc() is a better place to set
> > > capacity_ref_freq because one can do it for each CPU, and it not only
> >
> > I agree, topology_init_cpu_capacity_cppc() is the best place to set
> > capacity_ref_freq()
> >
> > > caters for the EAS case but also for frequency invariance, when
> > > arch_set_freq_scale() is called, if no counters are supported.
> > >
> > > When counters are supported, there are still two loose threads:
> > > - amu_fie_setup(): Vincent, would you mind completely removing
> > >   cpufreq_get_hw_max_freq() and reusing arch_scale_freq_ref() here?
> >
> > I wonder if we can have a ordering dependency problem as both
> > init_cpu_capacity_notifier() and init_amu_fie_notifier() are
> > registered for the same CPUFREQ_POLICY_NOTIFIER event and I'm not sure
> > it will happen in the right ordering
>
> Yes, you are right, this would be a problem for DT systems. With the
> implementation above, ACPI systems would obtain capacity_ref_freq on
> processor probe so it should be then available at policy creation when
> amu_fie_setup() would be called.

yes. the problem is only for DT

>
> Initially I thought the only solution might be to move
> freq_inv_set_max_ratio() in the arch topology driver to the same
> callback that initialises capacity, but that quickly becomes ugly with
> making it support both DT and ACPI systems. And then there's the
> question on whether it belongs there.

The goal would be to update the ratio while initializing everything
else. But this means that we must initialize the ratio in such a way
that amu will return by default SCHED_CAPACITY_SCALE until
arch_topology.c initializes it. I will make it a try to check how ugly
it will be

>
> But I think the better option is to wrap policy->cpuinfo.max_freq in
> another getter function which can be used in both amu_fie_setup() and
> init_cpu_capacity_callback().
> This can be implemented in the arch topology driver and exposed to the
> architecture specific topology files.
>
> I'm not sure if this might be worth leaving for another patchset as
> well. Let us know if you'd like us to help on theses ones.
>
> Thanks,
> Ionela.
>
> >
> > >
> > > - It would be nice if cppc_scale_freq_workfn() would use
> > >   arch_scale_freq_ref() as well, for consistency. But it would need
> > >   to be converted back to performance before use, so that would mean
> > >   extra work on the tick, which is not ideal.
> >
> > This once seems more complex as it implies other arch that are not
> > using arch_topology.c and would need more rework so I would prefer to
> > make it a separate patchset
> >
> > Thanks
> > Vincent
> >
> > >
> > > Basically it would be good if what gets used for capacity
> > > (arch_scale_freq_ref()) gets used for frequency invariance as well,
> > > in all locations.
> > >
> > > Thanks,
> > > Ionela.
> > >
> > > > >
> > > > > Also 'capacity_ref_freq' seems to be set only for 'policy->cpu'. I believe it should
> > > > > be set for the whole perf domain in case this 'policy->cpu' goes offline.
> > > > >
> > > > > Another thing, related my comment to [1] and to [2], for CPPC the max capacity matches
> > > > > the boosting frequency. We have:
> > > > > 'non-boosted max capacity' < 'boosted max capacity'.
> > > > > -
> > > > > If boosting is not enabled, the CPU utilization can still go above the 'non-boosted max
> > > > > capacity'. The overutilization of the system seems to be triggered by comparing the CPU
> > > > > util to the 'boosted max capacity'. So systems might not be detected as overutilized.
> > > >
> > > > As Peter mentioned, we have to decide what is the original compute
> > > > capacity of your CPUs which is usually the sustainable max compute
> > > > capacity, especially when using EAS and EM
> > > >
> > > > >
> > > > > For the EAS energy computation, em_cpu_energy() tries to predict the frequency that will
> > > > > be used. It is currently unknown to the function that the frequency request will be
> > > > > clamped by __resolve_freq():
> > > > > get_next_freq()
> > > > > \-cpufreq_driver_resolve_freq()
> > > > >   \-__resolve_freq()
> > > > > This means that the energy computation might use boosting frequencies, which are not
> > > > > available.
> > > > >
> > > > > Regards,
> > > > > Pierre
> > > > >
> > > > > [1]: [PATCH v2 4/6] cpufreq/schedutil: use a fixed reference frequency
> > > > > [2]: https://lore.kernel.org/lkml/20230905113308.GF28319@noisy.programming.kicks-ass.net/
> > > > >
> > > > > > +}
> > > > > > +
> > > > > >  static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
> > > > > >  {
> > > > > >  	struct cppc_cpudata *cpu_data;
> > > > > > @@ -643,6 +658,9 @@ static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
> > > > > >  		EM_ADV_DATA_CB(cppc_get_cpu_power, cppc_get_cpu_cost);
> > > > > >
> > > > > >  	cpu_data = policy->driver_data;
> > > > > > +
> > > > > > +	cppc_cpufreq_set_capacity_ref_freq(policy);
> > > > > > +
> > > > > >  	em_dev_register_perf_domain(get_cpu_device(policy->cpu),
> > > > > >  			get_perf_level_count(policy), &em_cb,
> > > > > >  			cpu_data->shared_cpu_map, 0);
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index fe08ca419b3d..24c6ba349f01 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -636,6 +636,21 @@ static int populate_efficiency_class(void)
 	return 0;
 }
 
+
+static void cppc_cpufreq_set_capacity_ref_freq(struct cpufreq_policy *policy)
+{
+	struct cppc_perf_caps *perf_caps;
+	struct cppc_cpudata *cpu_data;
+	unsigned int ref_freq;
+
+	cpu_data = policy->driver_data;
+	perf_caps = &cpu_data->perf_caps;
+
+	ref_freq = cppc_cpufreq_perf_to_khz(cpu_data, perf_caps->highest_perf);
+
+	per_cpu(capacity_ref_freq, policy->cpu) = ref_freq;
+}
+
 static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
 {
 	struct cppc_cpudata *cpu_data;
@@ -643,6 +658,9 @@ static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
 		EM_ADV_DATA_CB(cppc_get_cpu_power, cppc_get_cpu_cost);
 
 	cpu_data = policy->driver_data;
+
+	cppc_cpufreq_set_capacity_ref_freq(policy);
+
 	em_dev_register_perf_domain(get_cpu_device(policy->cpu),
 			get_perf_level_count(policy), &em_cb,
 			cpu_data->shared_cpu_map, 0);
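
Two points from the review can be modelled in plain user-space C: setting
'capacity_ref_freq' for every CPU of the perf domain rather than only for
'policy->cpu', and treating a value of 0 as "not set" so that a lookup falls
back to the cpuinfo maximum. What follows is only a sketch under stated
assumptions, not the kernel implementation: the array stands in for the
per-CPU variable, the bitmask stands in for cpu_data->shared_cpu_map, and
both helper names (set_capacity_ref_freq(), get_ref_freq()) are hypothetical.

```c
#define NR_CPUS 8

/* User-space stand-in for the per-CPU 'capacity_ref_freq' variable (kHz). */
static unsigned int capacity_ref_freq[NR_CPUS];

/*
 * Hypothetical helper (not from the patch): store ref_freq for every CPU
 * whose bit is set in shared_cpus, a plain bitmask standing in for
 * cpu_data->shared_cpu_map.
 */
static void set_capacity_ref_freq(unsigned long shared_cpus,
				  unsigned int ref_freq)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (shared_cpus & (1UL << cpu))
			capacity_ref_freq[cpu] = ref_freq;
	}
}

/*
 * Hypothetical lookup: a stored value of 0 means "not set", in which case
 * the caller-provided cpuinfo maximum frequency is returned instead.
 */
static unsigned int get_ref_freq(int cpu, unsigned int cpuinfo_max_freq)
{
	unsigned int freq = capacity_ref_freq[cpu];

	return freq ? freq : cpuinfo_max_freq;
}
```

With a domain mask covering CPUs 2 and 3, both CPUs report the same reference
frequency even if one of them later goes offline, while CPUs outside the mask
keep returning the cpuinfo value passed by the caller.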