[RFC] x86: override prefer_mwait_c1_over_halt to avoid loading cpuidle-haltpoll driver
Message ID | 1669952252-32710-1-git-send-email-lirongqing@baidu.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp633764wrr; Thu, 1 Dec 2022 20:02:22 -0800 (PST) X-Google-Smtp-Source: AA0mqf7wk0jMP1rxU7EodcD8sby+br/DYZ/CtCpjnbXxoqoGcLaV16Hy7U+XGtlxiJm4Gi3F4ilC X-Received: by 2002:a17:903:3314:b0:189:a208:d13d with SMTP id jk20-20020a170903331400b00189a208d13dmr14694347plb.144.1669953742353; Thu, 01 Dec 2022 20:02:22 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669953742; cv=none; d=google.com; s=arc-20160816; b=DdpoLlEjKSrd/MWdI/7+Uz6XG6845l3VgDM4JXPOYdU3snz/kTbAaFBryy1cWQnsXh QHQvZlFhSvCicdpdJk0KMfw0KD6QjnS2qtNejb+aSfNc7BQZwAfmF++Lm1VhdJ+kDhNc yME6hqyegUPja7CByGLg875nl8l4E0kTA/5cEZ5n0sXAnB5IB8sC4by8tsoPCezWR+s7 rHzW2QcVzJEHm3SgT628gIfBT8D6Pt6LC9cuZCm7UeFwj6yveh3PnOBuq2Q4GreU+R/Z 4Mo37MFnxQiBhDPGFIgfE3JC05i6ECREyiuufxqQ49ByHF4xRjD9R4EJoeeFG1rWCBEw L3IA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:message-id:date:subject:to:from; bh=K3D3PiCRDu5wedC9m7/3OBFGpjWoIobNym58SqS+i2k=; b=WXFG2gF7SjpmAAO5DZjg4IKpE4gWcUw8t1CGPI+5qZZMo7S+M0Cw9K58PgDSFFlTRS Fzk+5TthmPudVK3bZEK3XVwpbFjQRpwgNxzi4LwhLmirzK01nxvZ98NPn3gl5WBWz+WJ veQ7BnkQb2dNltiyCJe1EhAhE0rAS+El13zBSkAI2c4XD+1dks+DrS6CeGORozY/qDyS c6oCixISWCDHkQ1ZSUvp78Wa77D62yjqmRqC6WUd3g6uWjJtl8rB7iICloWaZiQrYrpw Inyt3USVI5L/fiLaTjL3BN20ZSzaYli3uDfc/L7BPP3EjsCfP/eH4q/M5Noi+kyFMa/C NrMA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p7-20020a634f47000000b00477f41d51f1si6079522pgl.779.2022.12.01.20.02.07; Thu, 01 Dec 2022 20:02:22 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231562AbiLBD5L (ORCPT <rfc822;heyuhang3455@gmail.com> + 99 others); Thu, 1 Dec 2022 22:57:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230382AbiLBD5J (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 1 Dec 2022 22:57:09 -0500 X-Greylist: delayed 600 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Thu, 01 Dec 2022 19:57:06 PST Received: from mx417.baidu.com (mx417.baidu.com [124.64.200.157]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 42113CFE63 for <linux-kernel@vger.kernel.org>; Thu, 1 Dec 2022 19:57:06 -0800 (PST) Received: from bjhw-sys-rpm015653cc5.bjhw.baidu.com (bjhw-sys-rpm015653cc5.bjhw.baidu.com [10.227.53.39]) by mx417.baidu.com (Postfix) with ESMTP id 99CED19B80054; Fri, 2 Dec 2022 11:37:32 +0800 (CST) Received: from localhost (localhost [127.0.0.1]) by bjhw-sys-rpm015653cc5.bjhw.baidu.com (Postfix) with ESMTP id 940AED9932; Fri, 2 Dec 2022 11:37:32 +0800 (CST) From: lirongqing@baidu.com To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, peterz@infradead.org, tony.luck@intel.com, wyes.karny@amd.com, linux-kernel@vger.kernel.org, rafael.j.wysocki@intel.com Subject: [PATCH][RFC] x86: override prefer_mwait_c1_over_halt to avoid loading cpuidle-haltpoll driver Date: Fri, 2 Dec 2022 11:37:32 +0800 Message-Id: <1669952252-32710-1-git-send-email-lirongqing@baidu.com> X-Mailer: git-send-email 1.7.1 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1751073415756292488?= X-GMAIL-MSGID: =?utf-8?q?1751073415756292488?= |
Series |
[RFC] x86: override prefer_mwait_c1_over_halt to avoid loading cpuidle-haltpoll driver
|
|
Commit Message
Li,Rongqing
Dec. 2, 2022, 3:37 a.m. UTC
From: Li RongQing <lirongqing@baidu.com> x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will cause performance degradation, so override prefer_mwait_c1_over_halt to a new value, aviod loading cpuidle-haltpoll driver Signed-off-by: Li RongQing <lirongqing@baidu.com> --- arch/x86/include/asm/processor.h | 2 +- arch/x86/kernel/process.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
Comments
Li! On Fri, Dec 02 2022 at 11:37, lirongqing@baidu.com wrote: > From: Li RongQing <lirongqing@baidu.com> > > x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will > cause performance degradation, so override prefer_mwait_c1_over_halt > to a new value, aviod loading cpuidle-haltpoll driver Neither the subject line nor the above makes any sense to me. prefer_mwait_c1_over_halt() is a function which is invoked and when it returns true then the execution ends up in the code path you are patching. > @@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c) > } else if (prefer_mwait_c1_over_halt(c)) { > pr_info("using mwait in idle threads\n"); > x86_idle = mwait_idle; > + boot_option_idle_override = IDLE_PREF_MWAIT; What you do is setting boot_option_idle_override to a new value, but that has nothing to do with prefer_mwait_c1_over_halt() at all. So how are you overriding that function to a new value? But that's just a word smithing problem. The real and way worse problem is that you pick a variable, which has the purpose to capture the idle override on the kernel command line, and modify it as you see fit, just to prevent that driver from loading. select_idle_routine() reads it to check whether there was a command line override or not. But it is not supposed to write it. Why? Have you checked what else evaluates that variable? Obviously not, because a simple grep would have told you: drivers/cpuidle/cpuidle-haltpoll.c: if (boot_option_idle_override != IDLE_NO_OVERRIDE) drivers/idle/intel_idle.c: if (boot_option_idle_override != IDLE_NO_OVERRIDE) Congratulations! Your patch breaks the default setup of every recent Intel system on the planet because it not only prevents the cpuidle-haltpoll, but also the intel_idle driver from loading. Seriously. It's not too much asked to do at least a simple grep and look at all _nine_ places which evaluate boot_option_idle_override. Thanks, tglx
> -----Original Message----- > From: Thomas Gleixner <tglx@linutronix.de> > Sent: Saturday, December 3, 2022 2:48 AM > To: Li,Rongqing <lirongqing@baidu.com>; mingo@redhat.com; bp@alien8.de; > dave.hansen@linux.intel.com; x86@kernel.org; peterz@infradead.org; > tony.luck@intel.com; wyes.karny@amd.com; linux-kernel@vger.kernel.org; > rafael.j.wysocki@intel.com > Subject: Re: [PATCH][RFC] x86: override prefer_mwait_c1_over_halt to avoid > loading cpuidle-haltpoll driver > > Li! > > On Fri, Dec 02 2022 at 11:37, lirongqing@baidu.com wrote: > > From: Li RongQing <lirongqing@baidu.com> > > > > x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will > > cause performance degradation, so override prefer_mwait_c1_over_halt > > to a new value, aviod loading cpuidle-haltpoll driver > > Neither the subject line nor the above makes any sense to me. > > prefer_mwait_c1_over_halt() is a function which is invoked and when it returns > true then the execution ends up in the code path you are patching. > > > @@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c) > > } else if (prefer_mwait_c1_over_halt(c)) { > > pr_info("using mwait in idle threads\n"); > > x86_idle = mwait_idle; > > + boot_option_idle_override = IDLE_PREF_MWAIT; > > What you do is setting boot_option_idle_override to a new value, but that has > nothing to do with prefer_mwait_c1_over_halt() at all. > > So how are you overriding that function to a new value? > > But that's just a word smithing problem. > > The real and way worse problem is that you pick a variable, which has the > purpose to capture the idle override on the kernel command line, and modify it > as you see fit, just to prevent that driver from loading. > > select_idle_routine() reads it to check whether there was a command line > override or not. But it is not supposed to write it. Why? > > Have you checked what else evaluates that variable? Obviously not, because a > simple grep would have told you: > > drivers/cpuidle/cpuidle-haltpoll.c: if (boot_option_idle_override != > IDLE_NO_OVERRIDE) > drivers/idle/intel_idle.c: if (boot_option_idle_override != > IDLE_NO_OVERRIDE) > > Congratulations! > > Your patch breaks the default setup of every recent Intel system on the planet > because it not only prevents the cpuidle-haltpoll, but also the intel_idle driver > from loading. > > Seriously. It's not too much asked to do at least a simple grep and look at all > _nine_ places which evaluate boot_option_idle_override. > Sorry for the careless Thanks for the review, I will send a new version, which export a function to tell haltpoll driver whether or not mwait is used, like below diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 67c9d73..159ef33 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -862,4 +862,6 @@ bool arch_is_platform_page(u64 paddr); #define arch_is_platform_page arch_is_platform_page #endif +bool is_mwait_idle(void); + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index c21b734..330972c 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -896,6 +896,12 @@ void select_idle_routine(const struct cpuinfo_x86 *c) x86_idle = default_idle; } +bool is_mwait_idle(void) +{ + return x86_idle == mwait_idle; +} +EXPORT_SYMBOL_GPL(is_mwait_idle); + void amd_e400_c1e_apic_setup(void) { if (boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E)) { diff --git a/drivers/cpuidle/cpuidle-haltpoll.c b/drivers/cpuidle/cpuidle-haltpoll.c index 3a39a7f..8cf1ddf 100644 --- a/drivers/cpuidle/cpuidle-haltpoll.c +++ b/drivers/cpuidle/cpuidle-haltpoll.c @@ -17,6 +17,7 @@ #include <linux/sched/idle.h> #include <linux/kvm_para.h> #include <linux/cpuidle_haltpoll.h> +#include <linux/processor.h> static bool force __read_mostly; module_param(force, bool, 0444); @@ -111,6 +112,9 @@ static int __init haltpoll_init(void) if (!kvm_para_available() || !haltpoll_want()) return -ENODEV; + if (is_mwait_idle()) + return -ENODEV; + cpuidle_poll_state_init(drv); ret = cpuidle_register_driver(drv); Thanks -Li
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 67c9d73..6bc94fd 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -658,7 +658,7 @@ extern void amd_e400_c1e_apic_setup(void); extern unsigned long boot_option_idle_override; enum idle_boot_override {IDLE_NO_OVERRIDE=0, IDLE_HALT, IDLE_NOMWAIT, - IDLE_POLL}; + IDLE_POLL, IDLE_PREF_MWAIT}; extern void enable_sep_cpu(void); extern int sysenter_setup(void); diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index c21b734..a16985c 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c) } else if (prefer_mwait_c1_over_halt(c)) { pr_info("using mwait in idle threads\n"); x86_idle = mwait_idle; + boot_option_idle_override = IDLE_PREF_MWAIT; } else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) { pr_info("using TDX aware idle routine\n"); x86_idle = tdx_safe_halt;