Message ID | 20231115151325.6262-4-frederic@kernel.org |
---|---|
State | New |
Headers |
From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, Peter Zijlstra <peterz@infradead.org>, x86@kernel.org, "Rafael J. Wysocki" <rafael@kernel.org>
Subject: [PATCH 3/4] x86: Remove __current_clr_polling() from mwait_idle()
Date: Wed, 15 Nov 2023 10:13:24 -0500
Message-ID: <20231115151325.6262-4-frederic@kernel.org>
In-Reply-To: <20231115151325.6262-1-frederic@kernel.org>
References: <20231115151325.6262-1-frederic@kernel.org>
Series |
x86/cpuidle fixes and optimization
|
|
Commit Message
Frederic Weisbecker
Nov. 15, 2023, 3:13 p.m. UTC
mwait_idle() is only ever called by cpuidle, either from
default_idle_call() or from cpuidle_enter(). In any case
cpuidle_idle_call() sets TIF_POLLING_NRFLAG again after calling it, so
there is no point in this atomic operation upon idle exit.

Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 arch/x86/kernel/process.c | 1 -
 1 file changed, 1 deletion(-)
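[Editorial note: the redundancy the commit message describes can be illustrated with a small stand-alone sketch. The helper names below are simplified stand-ins for the kernel's `__current_set_polling()`/`__current_clr_polling()`, and the bit position is purely illustrative, not the kernel's actual layout.]

```c
#include <assert.h>
#include <stdatomic.h>

#define TIF_POLLING_NRFLAG (1UL << 21)  /* illustrative bit position, not the real one */

static atomic_ulong ti_flags;  /* stand-in for current_thread_info()->flags */

static void set_polling(void) { atomic_fetch_or(&ti_flags, TIF_POLLING_NRFLAG); }
static void clr_polling(void) { atomic_fetch_and(&ti_flags, ~TIF_POLLING_NRFLAG); }
static int  polling(void)     { return !!(atomic_load(&ti_flags) & TIF_POLLING_NRFLAG); }

/* Driver idle-exit path before the patch: an extra atomic clear ... */
static void driver_exit_with_clear(void)    { clr_polling(); }
/* ... and after the patch: the driver leaves the flag alone. */
static void driver_exit_without_clear(void) { }

/*
 * Either way, the generic idle loop (cpuidle_idle_call()) sets the flag
 * again right after the driver returns, so the final state is identical
 * and the driver-side clear is a wasted atomic operation.
 */
static int idle_exit(void (*driver_exit)(void))
{
	driver_exit();
	set_polling();  /* generic code re-sets TIF_POLLING_NRFLAG */
	return polling();
}
```

Both exit paths leave the flag set, which is the commit's whole argument for dropping the clear.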
Comments
On Wed, Nov 15, 2023 at 10:13:24AM -0500, Frederic Weisbecker wrote:
> mwait_idle() is only ever called by cpuidle, either from
> default_idle_call() or from cpuidle_enter(). In any case
> cpuidle_idle_call() sets TIF_POLLING_NRFLAG again after calling it, so
> there is no point in this atomic operation upon idle exit.
>
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  arch/x86/kernel/process.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index b6f4e8399fca..fc7a38084606 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -930,7 +930,6 @@ static __cpuidle void mwait_idle(void)
>  			raw_local_irq_disable();
>  		}
>  	}
> -	__current_clr_polling();
>  }
>
>  void select_idle_routine(const struct cpuinfo_x86 *c)

Urgh at this and the next one... That is, yes we can do this, but it
makes these functions asymmetric and doesn't actually solve the
underlying problem that all of the polling stuff is inside-out.

Idle loop sets polling, then clears polling because it assumes all
arch/driver idle loops are non-polling, then individual drivers re-set
polling, and to be symmetric (above) clear it again, for the generic
code to set it again, only to clear it again when leaving idle.

Follow that? ;-)

Anyway, drivers ought to tell up-front if they're polling and then we
can avoid the whole dance and everything is better.

Something like the very crude below.
---
 arch/x86/include/asm/mwait.h  |  3 +--
 arch/x86/kernel/process.c     |  3 +--
 drivers/acpi/processor_idle.c |  2 +-
 drivers/idle/intel_idle.c     |  3 +++
 include/linux/cpuidle.h       |  6 ++++++
 include/linux/sched/idle.h    |  7 ++++++-
 kernel/sched/idle.c           | 22 +++++++++++++++++-----
 7 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 778df05f8539..c0c60fde7a4d 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -107,7 +107,7 @@ static __always_inline void __sti_mwait(unsigned long eax, unsigned long ecx)
  */
 static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
 {
-	if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
+	if (static_cpu_has_bug(X86_BUG_MONITOR) || !need_resched()) {
 		if (static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)) {
 			mb();
 			clflush((void *)&current_thread_info()->flags);
@@ -118,7 +118,6 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
 		if (!need_resched())
 			__mwait(eax, ecx);
 	}
-	current_clr_polling();
 }

 /*
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b6f4e8399fca..73baa82584c2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -917,7 +917,7 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
  */
 static __cpuidle void mwait_idle(void)
 {
-	if (!current_set_polling_and_test()) {
+	if (!need_resched()) {
 		if (this_cpu_has(X86_BUG_CLFLUSH_MONITOR)) {
 			mb(); /* quirk */
 			clflush((void *)&current_thread_info()->flags);
@@ -930,7 +930,6 @@ static __cpuidle void mwait_idle(void)
 			raw_local_irq_disable();
 		}
 	}
-	__current_clr_polling();
 }

 void select_idle_routine(const struct cpuinfo_x86 *c)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 3a34a8c425fe..58765557b1b8 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1219,7 +1219,7 @@ static int acpi_processor_setup_lpi_states(struct acpi_processor *pr)
 		state->target_residency = lpi->min_residency;
 		state->flags |= arch_get_idle_state_flags(lpi->arch_flags);
 		if (i != 0 && lpi->entry_method == ACPI_CSTATE_FFH)
-			state->flags |= CPUIDLE_FLAG_RCU_IDLE;
+			state->flags |= CPUIDLE_FLAG_RCU_IDLE | CPUIDLE_FLAG_NO_IPI;
 		state->enter = acpi_idle_lpi_enter;
 		drv->safe_state_index = i;
 	}
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index dcda0afecfc5..496b6a309f4f 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1569,6 +1569,7 @@ static void __init intel_idle_init_cstates_acpi(struct cpuidle_driver *drv)
 			state->target_residency *= 3;

 		state->flags = MWAIT2flg(cx->address);
+		state->flags |= CPUIDLE_FLAG_NO_IPI;
 		if (cx->type > ACPI_STATE_C2)
 			state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;

@@ -1841,6 +1842,8 @@ static bool __init intel_idle_verify_cstate(unsigned int mwait_hint)

 static void state_update_enter_method(struct cpuidle_state *state, int cstate)
 {
+	state->flags |= CPUIDLE_FLAG_NO_IPI;
+
 	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) {
 		/*
 		 * Combining with XSTATE with IBRS or IRQ_ENABLE flags
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3183aeb7f5b4..623d88bd7658 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -85,6 +85,7 @@ struct cpuidle_state {
 #define CPUIDLE_FLAG_OFF		BIT(4) /* disable this state by default */
 #define CPUIDLE_FLAG_TLB_FLUSHED	BIT(5) /* idle-state flushes TLBs */
 #define CPUIDLE_FLAG_RCU_IDLE	BIT(6) /* idle-state takes care of RCU */
+#define CPUIDLE_FLAG_NO_IPI		BIT(7) /* has TIF_POLLING_NRFLAG */

 struct cpuidle_device_kobj;
 struct cpuidle_state_kobj;
@@ -168,6 +169,11 @@ struct cpuidle_driver {
 };

 #ifdef CONFIG_CPU_IDLE
+bool cpuidle_state_needs_ipi(struct cpuidle_driver *drv, int state)
+{
+	return !(drv->states[state].flags & CPUIDLE_FLAG_NO_IPI);
+}
+
 extern void disable_cpuidle(void);
 extern bool cpuidle_not_available(struct cpuidle_driver *drv,
				  struct cpuidle_device *dev);
diff --git a/include/linux/sched/idle.h b/include/linux/sched/idle.h
index 478084f9105e..24b29bb5d43b 100644
--- a/include/linux/sched/idle.h
+++ b/include/linux/sched/idle.h
@@ -68,6 +68,8 @@ static __always_inline bool __must_check current_set_polling_and_test(void)

 static __always_inline bool __must_check current_clr_polling_and_test(void)
 {
+	bool ret;
+
 	__current_clr_polling();

 	/*
@@ -76,7 +78,10 @@ static __always_inline bool __must_check current_clr_polling_and_test(void)
	 */
	smp_mb__after_atomic();

-	return unlikely(tif_need_resched());
+	ret = unlikely(tif_need_resched());
+	if (ret)
+		__current_set_polling();
+	return ret;
 }

 #else
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 565f8374ddbb..65e2474cb82b 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -94,11 +94,12 @@ void __cpuidle default_idle_call(void)

 		stop_critical_timings();
 		ct_cpuidle_enter();
-		arch_cpu_idle();
+		arch_cpu_idle(); // XXX assumes !polling
 		ct_cpuidle_exit();
 		start_critical_timings();

 		trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+		__current_set_polling();
 	}
 	local_irq_enable();
 	instrumentation_end();
@@ -107,20 +108,27 @@ void __cpuidle default_idle_call(void)
 static int call_cpuidle_s2idle(struct cpuidle_driver *drv,
			       struct cpuidle_device *dev)
 {
+	int ret;
+
 	if (current_clr_polling_and_test())
 		return -EBUSY;

-	return cpuidle_enter_s2idle(drv, dev);
+	ret = cpuidle_enter_s2idle(drv, dev);
+	__current_set_polling();
+	return ret;
 }

 static int call_cpuidle(struct cpuidle_driver *drv, struct cpuidle_device *dev,
			int next_state)
 {
+	bool needs_ipi = cpuidle_state_needs_ipi(drv, next_state);
+	int ret;
+
	/*
	 * The idle task must be scheduled, it is pointless to go to idle, just
	 * update no idle residency and return.
	 */
-	if (current_clr_polling_and_test()) {
+	if (needs_ipi && current_clr_polling_and_test()) {
		dev->last_residency_ns = 0;
		local_irq_enable();
		return -EBUSY;
	}
@@ -131,7 +139,12 @@ static int call_cpuidle(struct cpuidle_driver *drv, struct cpuidle_device *dev,
	 * This function will block until an interrupt occurs and will take
	 * care of re-enabling the local interrupts
	 */
-	return cpuidle_enter(drv, dev, next_state);
+	ret = cpuidle_enter(drv, dev, next_state);
+
+	if (needs_ipi)
+		__current_set_polling();
+
+	return ret;
 }

 /**
@@ -220,7 +233,6 @@ static void cpuidle_idle_call(void)
 	}

 exit_idle:
-	__current_set_polling();

	/*
	 * It is up to the idle functions to reenable local interrupts
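[Editorial note: the pivotal piece of the sketch above is the `CPUIDLE_FLAG_NO_IPI` query. A minimal stand-alone version of that check, with the flag values and helper name taken from the patch but a heavily simplified `struct cpuidle_state`/`struct cpuidle_driver`, behaves like this:]

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1U << (n))

#define CPUIDLE_FLAG_RCU_IDLE	BIT(6)	/* idle-state takes care of RCU */
#define CPUIDLE_FLAG_NO_IPI	BIT(7)	/* state keeps polling TIF_POLLING_NRFLAG */

/* Simplified stand-ins for the kernel structures, for illustration only. */
struct cpuidle_state  { unsigned int flags; };
struct cpuidle_driver { struct cpuidle_state states[8]; };

/*
 * Mirrors cpuidle_state_needs_ipi() from the patch: a state that keeps
 * polling its flag word is woken by the flag write alone, so the waker
 * can skip the reschedule IPI.
 */
static bool cpuidle_state_needs_ipi(struct cpuidle_driver *drv, int state)
{
	return !(drv->states[state].flags & CPUIDLE_FLAG_NO_IPI);
}
```

With this, call_cpuidle() only pays for the polling-flag clear/set dance on states that genuinely need an IPI.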
On Thu, Nov 16, 2023 at 04:13:16PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 15, 2023 at 10:13:24AM -0500, Frederic Weisbecker wrote:
> > mwait_idle() is only ever called by cpuidle, either from
> > default_idle_call() or from cpuidle_enter(). In any case
> > cpuidle_idle_call() sets TIF_POLLING_NRFLAG again after calling it, so
> > there is no point in this atomic operation upon idle exit.
> >
> > Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  arch/x86/kernel/process.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index b6f4e8399fca..fc7a38084606 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -930,7 +930,6 @@ static __cpuidle void mwait_idle(void)
> >  			raw_local_irq_disable();
> >  		}
> >  	}
> > -	__current_clr_polling();
> >  }
> >
> >  void select_idle_routine(const struct cpuinfo_x86 *c)
>
> Urgh at this and the next one... That is, yes we can do this, but it
> makes these functions asymmetric and doesn't actually solve the
> underlying problem that all of the polling stuff is inside-out.
>
> Idle loop sets polling, then clears polling because it assumes all
> arch/driver idle loops are non-polling, then individual drivers re-set
> polling, and to be symmetric (above) clear it again, for the generic
> code to set it again, only to clear it again when leaving idle.
>
> Follow that? ;-)

That's right :-)

> Anyway, drivers ought to tell up-front if they're polling and then we
> can avoid the whole dance and everything is better.
>
> Something like the very crude below.

Yeah, that makes perfect sense (can I use your SoB right away?)

Though I sometimes wonder why we even bother with setting
TIF_POLLING_NRFLAG for some short parts of the generic idle loop, even
on !mwait and !cpuidle-state-polling states.

For example, why do we bother with setting TIF_POLLING_NRFLAG for just
the portion of the generic idle loop that looks up the cpuidle state and
stops the tick, only to clear TIF_POLLING_NRFLAG before calling wfi on
ARM? Or maybe it's a frequent pattern to have a remote wakeup happening
while entering the idle loop?

Thanks.
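[Editorial note: the "remote wakeup while entering the idle loop" case asked about above is exactly what TIF_POLLING_NRFLAG buys. Below is a single-threaded sketch of the waker-side decision, loosely modelled on the kernel's resched path; the helper names and bit positions are hypothetical, for illustration only.]

```c
#include <assert.h>
#include <stdbool.h>

#define TIF_NEED_RESCHED	(1U << 3)	/* illustrative bit positions */
#define TIF_POLLING_NRFLAG	(1U << 21)

static unsigned int ti_flags;	/* flag word of the (simulated) idle task */
static int ipis_sent;

/*
 * Waker side: always set need-resched, but only send a reschedule IPI
 * when the target is NOT polling its flag word. A polling target (e.g.
 * one about to enter MWAIT on &thread_info->flags) notices the write
 * by itself, so the IPI can be skipped.
 */
static void wake_idle_cpu(void)
{
	ti_flags |= TIF_NEED_RESCHED;
	if (!(ti_flags & TIF_POLLING_NRFLAG))
		ipis_sent++;
}

/* Returns how many IPIs a wakeup costs for a given target state. */
static int wakeup_cost(bool target_is_polling)
{
	ti_flags = target_is_polling ? TIF_POLLING_NRFLAG : 0;
	ipis_sent = 0;
	wake_idle_cpu();
	return ipis_sent;
}
```

So keeping the flag set even during the short state-lookup/tick-stop window lets a concurrent remote wakeup get away with a plain flag write instead of an IPI, which may be why the generic loop bothers.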
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b6f4e8399fca..fc7a38084606 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -930,7 +930,6 @@ static __cpuidle void mwait_idle(void)
 			raw_local_irq_disable();
 		}
 	}
-	__current_clr_polling();
 }

 void select_idle_routine(const struct cpuinfo_x86 *c)