Message ID | 20221104145737.71236-6-anna-maria@linutronix.de |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp456672wru; Fri, 4 Nov 2022 08:03:36 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5kAKNOQumG6PMIfIcMOlS/AZkiGB2wgNhRINrCymjKTaSbg22e+PyjClss47QPKUWmFdRf X-Received: by 2002:a17:906:6a27:b0:7a6:c537:ba4 with SMTP id qw39-20020a1709066a2700b007a6c5370ba4mr33318777ejc.517.1667574216258; Fri, 04 Nov 2022 08:03:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667574216; cv=none; d=google.com; s=arc-20160816; b=gA9EY4q2ohtYlT59CW9tIWCCo2YR6Toh6ZmJCQC5mM8YdnSOzZf25LvRO15SQ2IjQp JlGMo2149FAqnYhP/zH7C/DlMNOeBFU0Q7dgaDYyJrinxxxHJzHxPkRAkbBAI45LEE4+ 5Z5Dx8jRiSanb9m67VHtlj+2SrwRHwRiYOTjVd0r5cFNpPAYNMuWeyNeOxtjuYUyK2I8 /fBAl7SVa2J8adsnd2dSWGhdIpIkzRua15Dp1yEibK6ye5eOy0p95TsL7xG7JU2qFs97 yH/CrTxLGRPCyhSG4a5SG9SCfBcUlK+8/9SJlRT8et+VxJm3hXDh7oj2gXw+cPTALXbD RODw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=SJQXT73M5EqpgN/OaScjUbemA9h45EuN/DjSPRp7+B8=; b=lko5gW1klxKHxhWOJGl/AD5tWg7nLGTOP75gst9z1vPZUpUhg4dXBnmiyPHkAi+MJ8 4zlzwNwq4EpYWQMzZjcA9GENnyYkKqfOf8QB0xMpiD89vghh/2OFRCNWcrdv6s5kzJDz tuSWpHTz5ptm40eWDraZT86ngzXiq0KAAbVXLEKBnb5JbIOEDSZ7SAl/RIoVmCNiK2pQ f3x70mUlB4m0v+DFRMLiOLBfr9TnuEVs+B3nHTFCkUDKPNECmzZrdtSR1qpI0LvvVs4L gT1njcIDAFp5psn/GclxtYMOUWwlm0sbW3GaP8rQUvlkCy0HwNzzo0rGj8V57QrObaOA 2W7g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=opSSMs8+; dkim=neutral (no key) header.i=@linutronix.de header.b=Qw00fyiu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d14-20020a50f68e000000b004608c0b9a8asi4956463edn.201.2022.11.04.08.03.10; Fri, 04 Nov 2022 08:03:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=opSSMs8+; dkim=neutral (no key) header.i=@linutronix.de header.b=Qw00fyiu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232298AbiKDO6t (ORCPT <rfc822;jimliu8233@gmail.com> + 99 others); Fri, 4 Nov 2022 10:58:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232263AbiKDO6C (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 4 Nov 2022 10:58:02 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C7B52ED53 for <linux-kernel@vger.kernel.org>; Fri, 4 Nov 2022 07:57:58 -0700 (PDT) From: Anna-Maria Behnsen <anna-maria@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1667573875; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SJQXT73M5EqpgN/OaScjUbemA9h45EuN/DjSPRp7+B8=; b=opSSMs8+INQWcYQu5UHb3U05ouat0+yID7SNf3MMsxOGnyiItb/TVF8NFamERP/PRs9WS/ QWiFzEGdPN/gDJwLu1w2iZ2XRZ8GRyJn3I3oCyzc/Nl/PYbJMmTnoanSIQTN+fBTorjgoK dbJVSy4yNQxYIFHfHhUIbtR+XyXGAKEyBdg+DvZLEXBIPBgMXpta3mEiCbv2hUXANnRDxm +ZJ264kASAAprhV9amckn3jET3X8I/cL32JGSblkbgX6pHJ4hb5j3SHni7EPbfMn6tA5PZ 88Tk1AUB2GexmeMPAEmKJpEL57n6qdZFJHtdUx4rlzq7frqsvT/zcOLSnB2Xsg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1667573875; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SJQXT73M5EqpgN/OaScjUbemA9h45EuN/DjSPRp7+B8=; b=Qw00fyiuSOq+8nMDyzrrq/BLOY+V56inCGAMwhxbouDigpsmgPMJJjh8IN+Ez30tZDJ+3x c55XuXSl6NCE/HCg== To: linux-kernel@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org>, John Stultz <jstultz@google.com>, Thomas Gleixner <tglx@linutronix.de>, Eric Dumazet <edumazet@google.com>, "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>, Arjan van de Ven <arjan@infradead.org>, "Paul E . McKenney" <paulmck@kernel.org>, Frederic Weisbecker <fweisbec@gmail.com>, Rik van Riel <riel@surriel.com>, Anna-Maria Behnsen <anna-maria@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>, "Theodore Ts'o" <tytso@mit.edu>, "Jason A. Donenfeld" <Jason@zx2c4.com>, Stephen Boyd <sboyd@kernel.org>, Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com> Subject: [PATCH v4 05/16] add_timer_on(): Make sure callers have TIMER_PINNED flag Date: Fri, 4 Nov 2022 15:57:26 +0100 Message-Id: <20221104145737.71236-6-anna-maria@linutronix.de> In-Reply-To: <20221104145737.71236-1-anna-maria@linutronix.de> References: <20221104145737.71236-1-anna-maria@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748578301556558906?= X-GMAIL-MSGID: =?utf-8?q?1748578301556558906?= |
Series |
timer: Move from a push remote at enqueue to a pull at expiry model
|
|
Commit Message
Anna-Maria Behnsen
Nov. 4, 2022, 2:57 p.m. UTC
The implementation of the hierachical timer pull model will change the
timer bases per CPU. Timers, that have to expire on a specific CPU, require
the TIMER_PINNED flag. Otherwise they will be queued on the dedicated CPU
but in global timer base and those timers could also expire on other
CPUs. Timers with TIMER_DEFERRABLE flag end up in a separate base anyway
and are executed on the local CPU only.
Therefore add the missing TIMER_PINNED flag for those callers who use
add_timer_on() without the flag. No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: John Stultz <jstultz@google.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
---
v4:
- Move patch before local and global base are introduced
- Add missing user (drivers/char/random.c) of add_timer_on() without
TIMER_PINNED flag (kernel test robot)
---
arch/x86/kernel/tsc_sync.c | 3 ++-
drivers/char/random.c | 2 +-
kernel/time/clocksource.c | 2 +-
kernel/workqueue.c | 7 +++++--
4 files changed, 9 insertions(+), 5 deletions(-)
Comments
On Fri, Nov 04, 2022 at 03:57:26PM +0100, Anna-Maria Behnsen wrote: > The implementation of the hierachical timer pull model will change the > timer bases per CPU. Timers, that have to expire on a specific CPU, require > the TIMER_PINNED flag. Otherwise they will be queued on the dedicated CPU > but in global timer base and those timers could also expire on other > CPUs. Timers with TIMER_DEFERRABLE flag end up in a separate base anyway > and are executed on the local CPU only. > > Therefore add the missing TIMER_PINNED flag for those callers who use > add_timer_on() without the flag. No functional change. You're fixing the current callers but what about the future ones? add_timer_on() should always guarantee that a timer runs on the right destination, which is not the case after your patchset if the timer hasn't been set to TIMER_PINNED. Therefore I think we should either have: * add_timer_on() enforce TIMER_PINNED (doesn't work because if the timer is later called with mod_timer(), we should expect it to run anywhere) or * add_timer_on() warns if !TIMER_PINNED or * have an internal flag TIMER_LOCAL, that is turned on when add_timer_on() is called or add_timer()/mod_timer() is called on a TIMER_PINNED. Otherwise it is turned off. The last solution should work with existing API and you don't need to chase the current and future users of add_timer_on(). Thanks.
On Fri, 4 Nov 2022, Frederic Weisbecker wrote: > On Fri, Nov 04, 2022 at 03:57:26PM +0100, Anna-Maria Behnsen wrote: > > The implementation of the hierachical timer pull model will change the > > timer bases per CPU. Timers, that have to expire on a specific CPU, require > > the TIMER_PINNED flag. Otherwise they will be queued on the dedicated CPU > > but in global timer base and those timers could also expire on other > > CPUs. Timers with TIMER_DEFERRABLE flag end up in a separate base anyway > > and are executed on the local CPU only. > > > > Therefore add the missing TIMER_PINNED flag for those callers who use > > add_timer_on() without the flag. No functional change. > > You're fixing the current callers but what about the future ones? > > add_timer_on() should always guarantee that a timer runs on the > right destination, which is not the case after your patchset if the > timer hasn't been set to TIMER_PINNED. > > Therefore I think we should either have: > > * add_timer_on() enforce TIMER_PINNED (doesn't work because if the timer is > later called with mod_timer(), we should expect it to run anywhere) > > or > > * add_timer_on() warns if !TIMER_PINNED This is already part of the last patch of the queue where also the crystalball logic is removed. But the patch where I added the WARN_ONCE() might be the wrong patch, it should be better part of the next patch where the new timer bases are introduced. > or > > * have an internal flag TIMER_LOCAL, that is turned on when > add_timer_on() is called or add_timer()/mod_timer() is called > on a TIMER_PINNED. Otherwise it is turned off. > > The last solution should work with existing API and you don't need to > chase the current and future users of add_timer_on(). With the last approach it doesn't matter how the timer is setup. Everything is done by timer code implicitly. When a future caller uses add_timer_on() and wants to modfiy this "implicitly pinned timer", he will call mod_timer() and the timer is no longer pinned (if it do not end up in the same bucket it was before). For a user this does not seems to be very obvious, or am I wrong? But if the caller sets up the timer correctly we do not need this extra timer flag. With the WARN_ONCE() in place, callers need to do the timer setup properly and it is more clear to the caller what should be done. BTW, the hunk in this patch for the workqueue is also not a final fix in my opinion. I'm preparing a cleanup queue (it's part of the deferrable cleanup queue), where I want to set the timer flags properly when initializing/defining the workers. I should have added a comment here... Thanks, Anna-Maria
On Mon, Nov 07, 2022 at 09:11:11AM +0100, Anna-Maria Behnsen wrote: > On Fri, 4 Nov 2022, Frederic Weisbecker wrote: > > > On Fri, Nov 04, 2022 at 03:57:26PM +0100, Anna-Maria Behnsen wrote: > > > The implementation of the hierachical timer pull model will change the > > > timer bases per CPU. Timers, that have to expire on a specific CPU, require > > > the TIMER_PINNED flag. Otherwise they will be queued on the dedicated CPU > > > but in global timer base and those timers could also expire on other > > > CPUs. Timers with TIMER_DEFERRABLE flag end up in a separate base anyway > > > and are executed on the local CPU only. > > > > > > Therefore add the missing TIMER_PINNED flag for those callers who use > > > add_timer_on() without the flag. No functional change. > > > > You're fixing the current callers but what about the future ones? > > > > add_timer_on() should always guarantee that a timer runs on the > > right destination, which is not the case after your patchset if the > > timer hasn't been set to TIMER_PINNED. > > > > Therefore I think we should either have: > > > > * add_timer_on() enforce TIMER_PINNED (doesn't work because if the timer is > > later called with mod_timer(), we should expect it to run anywhere) > > > > or > > > > * add_timer_on() warns if !TIMER_PINNED > > This is already part of the last patch of the queue where also the > crystalball logic is removed. But the patch where I added the WARN_ONCE() > might be the wrong patch, it should be better part of the next patch where > the new timer bases are introduced. Ok. > > > or > > > > * have an internal flag TIMER_LOCAL, that is turned on when > > add_timer_on() is called or add_timer()/mod_timer() is called > > on a TIMER_PINNED. Otherwise it is turned off. > > > > The last solution should work with existing API and you don't need to > > chase the current and future users of add_timer_on(). > > With the last approach it doesn't matter how the timer is setup. Everything > is done by timer code implicitly. When a future caller uses add_timer_on() > and wants to modfiy this "implicitly pinned timer", he will call > mod_timer() and the timer is no longer pinned (if it do not end up in the > same bucket it was before). For a user this does not seems to be very > obvious, or am I wrong? That's right indeed. > > But if the caller sets up the timer correctly we do not need this extra > timer flag. With the WARN_ONCE() in place, callers need to do the timer > setup properly and it is more clear to the caller what should be done. Yeah that sounds better. > BTW, the hunk in this patch for the workqueue is also not a final fix in my > opinion. I'm preparing a cleanup queue (it's part of the deferrable cleanup > queue), where I want to set the timer flags properly when > initializing/defining the workers. I should have added a comment here... Ok, if we have some pinned initializers such as DECLARE_DELAYED_WORK_PINNED() and DECLARE_DEFERRABKE_WORK_PINNED() then I think that cleans the situation. Sounds good, thanks! > > Thanks, > > Anna-Maria >
diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c index 9452dc9664b5..eab827288e0f 100644 --- a/arch/x86/kernel/tsc_sync.c +++ b/arch/x86/kernel/tsc_sync.c @@ -110,7 +110,8 @@ static int __init start_sync_check_timer(void) if (!cpu_feature_enabled(X86_FEATURE_TSC_ADJUST) || tsc_clocksource_reliable) return 0; - timer_setup(&tsc_sync_check_timer, tsc_sync_check_timer_fn, 0); + timer_setup(&tsc_sync_check_timer, tsc_sync_check_timer_fn, + TIMER_PINNED); tsc_sync_check_timer.expires = jiffies + SYNC_CHECK_INTERVAL; add_timer(&tsc_sync_check_timer); diff --git a/drivers/char/random.c b/drivers/char/random.c index 69754155300e..2cae98dc86dc 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -949,7 +949,7 @@ static DEFINE_PER_CPU(struct fast_pool, irq_randomness) = { #define FASTMIX_PERM HSIPHASH_PERMUTATION .pool = { HSIPHASH_CONST_0, HSIPHASH_CONST_1, HSIPHASH_CONST_2, HSIPHASH_CONST_3 }, #endif - .mix = __TIMER_INITIALIZER(mix_interrupt_randomness, 0) + .mix = __TIMER_INITIALIZER(mix_interrupt_randomness, TIMER_PINNED) }; /* diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index 8058bec87ace..f8c310e62758 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -523,7 +523,7 @@ static inline void clocksource_start_watchdog(void) { if (watchdog_running || !watchdog || list_empty(&watchdog_list)) return; - timer_setup(&watchdog_timer, clocksource_watchdog, 0); + timer_setup(&watchdog_timer, clocksource_watchdog, TIMER_PINNED); watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL; add_timer_on(&watchdog_timer, cpumask_first(cpu_online_mask)); watchdog_running = 1; diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 7cd5f5e7e0a1..a0f7bf7be6f2 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -1670,10 +1670,13 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq, dwork->cpu = cpu; timer->expires = jiffies + delay; - if (unlikely(cpu != WORK_CPU_UNBOUND)) + if (unlikely(cpu != WORK_CPU_UNBOUND)) { + timer->flags |= TIMER_PINNED; add_timer_on(timer, cpu); - else + } else { + timer->flags &= ~TIMER_PINNED; add_timer(timer); + } } /**