Message ID: 20221105174225.28673-2-rui.zhang@intel.com
State: New
Headers:
From: Zhang Rui <rui.zhang@intel.com>
To: rjw@rjwysocki.net, daniel.lezcano@linaro.org
Cc: linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 2/3] cpuidle: ladder: Tune promotion/demotion threshold
Date: Sun, 6 Nov 2022 01:42:24 +0800
Message-Id: <20221105174225.28673-2-rui.zhang@intel.com>
In-Reply-To: <20221105174225.28673-1-rui.zhang@intel.com>
References: <20221105174225.28673-1-rui.zhang@intel.com>
Series: [RFC,1/3] cpuidle: ladder: Fix bogus comparison between s64 and u64
Commit Message
Zhang, Rui
Nov. 5, 2022, 5:42 p.m. UTC
After fixing the bogus comparison between u64 and s64, the ladder
governor stops making promotion decisions erroneously.
However, after this fix, it turns out that the ladder governor demotes
much more easily than it promotes.
Below is data captured using turbostat after 30 seconds of runtime idle.
Without the previous patch:
Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
0.30 2373 0 0 0 4 9 25 122 326 2857 0.36 0.04 0.57 98.73 1.48
With the previous patch:
Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
0.42 3071 0 771 838 447 327 336 382 299 344 34.18 16.21 17.69 31.51 2.00
This is caused by the imbalance between PROMOTION_COUNT and
DEMOTION_COUNT.
With this patch:
Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
0.39 2436 0 1 72 177 51 194 243 799 1883 0.50 0.32 0.35 98.45 1.53
Note that this is an experimental patch to illustrate the problem,
and it has been checked with the idle scenario only for now.
I will try to evaluate more scenarios, and if someone can help
evaluate more scenarios in parallel and provide data on the benefit
of different PROMOTION_COUNT/DEMOTION_COUNT values, that would be
great.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
drivers/cpuidle/governors/ladder.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Comments
On Sat, Nov 5, 2022 at 6:40 PM Zhang Rui <rui.zhang@intel.com> wrote:
>
> After fixing the bogus comparison between u64 and s64, the ladder
> governor stops making promotion decisions errornously.
>
> However, after this, it is found that the ladder governor demotes much
> easier than promotes.

"After fixing an error related to using signed and unsigned integers
in the ladder governor in a previous patch, that governor turns out to
demote much easier than promote"

> Below is captured using turbostat after a 30 seconds runtime idle,
>
> Without previous patch,
> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> 0.30 2373 0 0 0 4 9 25 122 326 2857 0.36 0.04 0.57 98.73 1.48

Why is the above relevant?

> With previous patch,
> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> 0.42 3071 0 771 838 447 327 336 382 299 344 34.18 16.21 17.69 31.51 2.00
>
> And this is caused by the imbalanced PROMOTION_COUNT/DEMOTION_COUNT.

I would explain why/how the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
imbalance causes this.

I guess more residency in the deeper idle state is expected? Or desired??

> With this patch,
> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> 0.39 2436 0 1 72 177 51 194 243 799 1883 0.50 0.32 0.35 98.45 1.53
>
> Note that this is an experimental patch to illustrate the problem,
> and it is checked with idle scenario only for now.
> I will try to evaluate with more scenarios, and if someone can help
> evaluate with more scenarios at the same time and provide data for the
> benefit with different PROMOTION_COUNT/DEMOTION_COUNT values, that
> would be great.

So yes, this requires more work.

Overall, I think that you are concerned that the previous change might
be regarded as a regression and are trying to compensate for it with a
PROMOTION_COUNT/DEMOTION_COUNT change.

I'm not sure I can agree with that approach, because the shallower
idle states might be preferred by the original ladder design
intentionally, for performance reasons.

> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> ---
> drivers/cpuidle/governors/ladder.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c
> index fb61118aef37..4b47aa0a4da9 100644
> --- a/drivers/cpuidle/governors/ladder.c
> +++ b/drivers/cpuidle/governors/ladder.c
> @@ -20,8 +20,8 @@
> #include <asm/io.h>
> #include <linux/uaccess.h>
>
> -#define PROMOTION_COUNT 4
> -#define DEMOTION_COUNT 1
> +#define PROMOTION_COUNT 2
> +#define DEMOTION_COUNT 4
>
> struct ladder_device_state {
> struct {
> --
On 2022.11.23 09:50 Rafael wrote:
> On Sat, Nov 5, 2022 at 6:40 PM Zhang Rui <rui.zhang@intel.com> wrote:
>>
>> After fixing the bogus comparison between u64 and s64, the ladder
>> governor stops making promotion decisions errornously.
>>
>> However, after this, it is found that the ladder governor demotes much
>> easier than promotes.
>
> "After fixing an error related to using signed and unsigned integers
> in the ladder governor in a previous patch, that governor turns out to
> demote much easier than promote"
>
>> Below is captured using turbostat after a 30 seconds runtime idle,
>>
>> Without previous patch,
>> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
>> 0.30 2373 0 0 0 4 9 25 122 326 2857 0.36 0.04 0.57 98.73 1.48
>
> Why is the above relevant?
>
>> With previous patch,
>> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
>> 0.42 3071 0 771 838 447 327 336 382 299 344 34.18 16.21 17.69 31.51 2.00
>>
>> And this is caused by the imbalanced PROMOTION_COUNT/DEMOTION_COUNT.
>
> I would explain why/how the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
> imbalance causes this.
>
> I guess more residency in the deeper idle state is expected? Or desired??
>
>> With this patch,
>> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
>> 0.39 2436 0 1 72 177 51 194 243 799 1883 0.50 0.32 0.35 98.45 1.53
>>
>> Note that this is an experimental patch to illustrate the problem,
>> and it is checked with idle scenario only for now.
>> I will try to evaluate with more scenarios, and if someone can help
>> evaluate with more scenarios at the same time and provide data for the
>> benefit with different PROMOTION_COUNT/DEMOTION_COUNT values, that
>> would be great.
>
> So yes, this requires more work.
>
> Overall, I think that you are concerned that the previous change might
> be regarded as a regression and are trying to compensate for it with a
> PROMOTION_COUNT/DEMOTION_COUNT change.
>
> I'm not sure I can agree with that approach, because the shallower
> idle states might be preferred by the original ladder design
> intentionally, for performance reasons.

Hi All,

Because I was continuing to test the teo governor with the util patch
version 4, it was fairly easy for me to test this patch set also.
However, I have had difficulties having enough time to write up my
results.

The best improvement was for a slow speed ping-pong (I did 3 speeds:
fast, medium, and slow), 2 pairs of ping-pongs, not forced CPU
affinity, schedutil CPU scaling governor, intel_cpufreq CPU scaling
driver, HWP disabled. The menu governor was considered the master
reference:

Old ladder was 44% slower.
New ladder was 5.9% slower.

Just for reference:

Old teo was 29% slower.
teo util V4 was 13% slower.

The worst degradation was for a fast speed ping-pong, 2 pairs of
ping-pongs, not forced CPU affinity, schedutil CPU scaling governor,
intel_cpufreq CPU scaling driver, HWP disabled. The menu governor was
considered the master reference:

Old ladder was 64% slower.
New ladder was 71% slower.

Interestingly, the old ladder governor outperformed the menu governor
on the slow 2 pair ping-pong with the performance governor:

Old ladder was 0.56% faster.
New ladder was 0.81% slower.

Disclaimer: Test results using the schedutil CPU scaling governor are
noisy, with questionable repeatability.

I'll try to get all the test results written up soon.

... Doug
Hi, Rafael,

Thanks for reviewing the patch series.

On Wed, 2022-11-23 at 18:50 +0100, Rafael J. Wysocki wrote:
> On Sat, Nov 5, 2022 at 6:40 PM Zhang Rui <rui.zhang@intel.com> wrote:
> > After fixing the bogus comparison between u64 and s64, the ladder
> > governor stops making promotion decisions errornously.
> >
> > However, after this, it is found that the ladder governor demotes
> > much easier than promotes.
>
> "After fixing an error related to using signed and unsigned integers
> in the ladder governor in a previous patch, that governor turns out
> to demote much easier than promote"
>
> > Below is captured using turbostat after a 30 seconds runtime idle,
> >
> > Without previous patch,
> > Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> > 0.30 2373 0 0 0 4 9 25 122 326 2857 0.36 0.04 0.57 98.73 1.48
>
> Why is the above relevant?

Just for comparison purposes.

For a pure idle scenario (Busy% < 0.5), with the ladder governor, we
used to have 99% CPU%c7, but now, with patch 1/3:

CPU%c1 CPU%c3 CPU%c6 CPU%c7
34.18 16.21 17.69 31.51

This does not look like the correct behavior for any cpuidle governor.

> > With previous patch,
> > Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> > 0.42 3071 0 771 838 447 327 336 382 299 344 34.18 16.21 17.69 31.51 2.00
> >
> > And this is caused by the imbalanced
> > PROMOTION_COUNT/DEMOTION_COUNT.
>
> I would explain why/how the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
> imbalance causes this.

Sure, how about something like the below?

PROMOTION_COUNT/DEMOTION_COUNT are used as the thresholds between the
ladder governor detecting that it "should promote/demote" and the
ladder governor doing a real promotion/demotion.

Currently, PROMOTION_COUNT is set to 4 and DEMOTION_COUNT is set to 1.
This means that the ladder governor does a real demotion immediately
when it "should demote", but it does a real promotion only if it
"should promote" 4 times in a row, without a single "should demote"
occurring in between.

As a result, this lowers the chance of doing a real promotion, and the
ladder governor is more likely to choose a shallower state.

> I guess more residency in the deeper idle state is expected? Or
> desired??
>
> > With this patch,
> > Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> > 0.39 2436 0 1 72 177 51 194 243 799 1883 0.50 0.32 0.35 98.45 1.53
> >
> > Note that this is an experimental patch to illustrate the problem,
> > and it is checked with idle scenario only for now.
> > I will try to evaluate with more scenarios, and if someone can help
> > evaluate with more scenarios at the same time and provide data for
> > the benefit with different PROMOTION_COUNT/DEMOTION_COUNT values,
> > that would be great.
>
> So yes, this requires more work.
>
> Overall, I think that you are concerned that the previous change
> might be regarded as a regression and are trying to compensate for it
> with a PROMOTION_COUNT/DEMOTION_COUNT change.

Exactly.

> I'm not sure I can agree with that approach, because the shallower
> idle states might be preferred by the original ladder design
> intentionally, for performance reasons.

Hmmm, even if there is only 30% c7/c8/c9/c10 residency in a pure idle
scenario?

Furthermore, since the imbalanced PROMOTION_COUNT/DEMOTION_COUNT and
the unsigned/signed integer problem have both been there since the
first day the ladder governor was introduced, in commit 4f86d3a8e297
("cpuidle: consolidate 2.6.22 cpuidle branch into one patch"), my
guess is that the unsigned/signed integer problem introduces a lot of
pseudo promotions, and the PROMOTION_COUNT/DEMOTION_COUNT imbalance
was introduced to work around this so that the ladder governor doesn't
get stuck in a deep idle state.

I don't have solid proof for this. But at least for the pure idle
scenario, I don't think 30% deep idle residency is the right behavior,
and it needs to be tuned anyway.

thanks,
rui

> > Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> > ---
> > drivers/cpuidle/governors/ladder.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpuidle/governors/ladder.c
> > b/drivers/cpuidle/governors/ladder.c
> > index fb61118aef37..4b47aa0a4da9 100644
> > --- a/drivers/cpuidle/governors/ladder.c
> > +++ b/drivers/cpuidle/governors/ladder.c
> > @@ -20,8 +20,8 @@
> > #include <asm/io.h>
> > #include <linux/uaccess.h>
> >
> > -#define PROMOTION_COUNT 4
> > -#define DEMOTION_COUNT 1
> > +#define PROMOTION_COUNT 2
> > +#define DEMOTION_COUNT 4
> >
> > struct ladder_device_state {
> > struct {
> > --
On Fri, Nov 25, 2022 at 7:39 AM Zhang Rui <rui.zhang@intel.com> wrote:
>
> Hi, Rafael,
>
> thanks for reviewing the patch series.
>
> On Wed, 2022-11-23 at 18:50 +0100, Rafael J. Wysocki wrote:
> > On Sat, Nov 5, 2022 at 6:40 PM Zhang Rui <rui.zhang@intel.com> wrote:
> > > After fixing the bogus comparison between u64 and s64, the ladder
> > > governor stops making promotion decisions errornously.
> > >
> > > However, after this, it is found that the ladder governor demotes
> > > much easier than promotes.
> >
> > "After fixing an error related to using signed and unsigned integers
> > in the ladder governor in a previous patch, that governor turns out
> > to demote much easier than promote"
> >
> > > Below is captured using turbostat after a 30 seconds runtime idle,
> > >
> > > Without previous patch,
> > > Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> > > 0.30 2373 0 0 0 4 9 25 122 326 2857 0.36 0.04 0.57 98.73 1.48
> >
> > Why is the above relevant?
>
> Just for comparison purpose.
> For a pure idle scenario (Busy% < 0.5), with ladder governor, we used
> to have 99% CPU%c7, but now, with patch 1/3,
>
> CPU%c1 CPU%c3 CPU%c6 CPU%c7
> 34.18 16.21 17.69 31.51
>
> This does not look like the correct behavior for any cpuidle governor.

It all depends on what the design goal was and I don't really know
that in this particular case.

It looks like the plan was to make it promote less often than demote
or the counts would have been chosen differently.

> > > With previous patch,
> > > Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> > > 0.42 3071 0 771 838 447 327 336 382 299 344 34.18 16.21 17.69 31.51 2.00
> > >
> > > And this is caused by the imbalanced
> > > PROMOTION_COUNT/DEMOTION_COUNT.
> >
> > I would explain why/how the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
> > imbalance causes this.
>
> sure, how about something below.
>
> The PROMOTION_COUNT/DEMOTION_COUNT are used as the threshold between
> the ladder governor detects it "should promote/demote", and the ladder
> governor does a real promotion/demotion.
>
> Currently, PROMOTION_COUNT is set to 4 and DEMOTION_COUNT is set to 1.
> This means that the ladder governor does real demotion immediately when
> it "should demote", but it does real promotion only if it "should
> promote" 4 times in a row, without a single "should demote" occur in
> between.
>
> As a result, this lower the chance to do real promotion and the ladder
> governor is more likely to choose a shallower state.

Sounds good and now the question is what's the behavior expected by
users. Do we have any data?

> > I guess more residency in the deeper idle state is expected? Or
> > desired??
> >
> > > With this patch,
> > > Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> > > 0.39 2436 0 1 72 177 51 194 243 799 1883 0.50 0.32 0.35 98.45 1.53
> > >
> > > Note that this is an experimental patch to illustrate the problem,
> > > and it is checked with idle scenario only for now.
> > > I will try to evaluate with more scenarios, and if someone can help
> > > evaluate with more scenarios at the same time and provide data for
> > > the benefit with different PROMOTION_COUNT/DEMOTION_COUNT values,
> > > that would be great.
> >
> > So yes, this requires more work.
> >
> > Overall, I think that you are concerned that the previous change
> > might be regarded as a regression and are trying to compensate for it
> > with a PROMOTION_COUNT/DEMOTION_COUNT change.
>
> Exactly.
>
> > I'm not sure I can agree with that approach, because the shallower
> > idle states might be preferred by the original ladder design
> > intentionally, for performance reasons.
>
> hmmm, even if there is only 30% c7/c8/c9/c10 residency in a pure idle
> scenario?

Yes, even in that case. All boils down to the question regarding user
expectations.

> And further more, since the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
> and the unsigned/signed integers problem are both there since the first
> day the ladder governor was introduced, commit 4f86d3a8e297 ("cpuidle:
> consolidate 2.6.22 cpuidle branch into one patch"),
>
> my guess is that
>
> the unsigned/signed integers problem introduces a lot of pseudo
> promotions, and the PROMOTION_COUNT/DEMOTION_COUNT is introduced to
> workaround this so that the ladder governor doesn't get stuck at deep
> idle state.

That may be a good guess, so I would add it to the changelog of the
patch.

> I don't have a solid proof for this. But at least for the pure idle
> scenario, I don't think 30% deep idle residency is the right behavior,
> and it needs to be tuned anyway.

Well, have you checked what happens if the counts are set to the same
value, e.g. 2?
> > I don't have a solid proof for this. But at least for the pure idle
> > scenario, I don't think 30% deep idle residency is the right
> > behavior, and it needs to be tuned anyway.
>
> Well, have you checked what happens if the counts are set to the same
> value, e.g. 2?

Well, this is embarrassing. I found a problem with my previous data
when I re-evaluated following your suggestion.

In short:

1. The 30% deep idle residency problem was observed when I added some
trace_printk() calls in ladder_select_state().
2. Without those trace_printk() calls, after patch 1, the ladder
governor can still get 98% CPU%c7 in the pure idle scenario.

Currently, my understanding is that trace_printk() can call
__schedule(), and this increased the chance that call_cpuidle()
returns immediately. When this happens, dev->last_residency_ns is set
to 0, which results in a real demotion next time.

Anyway, you are right to question this approach, because this seems to
be a different problem, or even a false alarm.

So, I think I will submit patches 1/3 and 3/3 as they are bug fixes,
drop this patch for now, and leave the tuning work, if there is any,
for the real ladder governor users.

What do you think?

thanks,
rui
diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c
index fb61118aef37..4b47aa0a4da9 100644
--- a/drivers/cpuidle/governors/ladder.c
+++ b/drivers/cpuidle/governors/ladder.c
@@ -20,8 +20,8 @@
 #include <asm/io.h>
 #include <linux/uaccess.h>
 
-#define PROMOTION_COUNT 4
-#define DEMOTION_COUNT 1
+#define PROMOTION_COUNT 2
+#define DEMOTION_COUNT 4
 
 struct ladder_device_state {
 	struct {