From patchwork Wed Mar 1 14:17:27 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62913
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen,
    Frederic Weisbecker
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen , Frederic Weisbecker Subject: [PATCH v5 01/18] tick-sched: Warn when next tick seems to be in the past Date: Wed, 1 Mar 2023 15:17:27 +0100 Message-Id: <20230301141744.16063-2-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175552761275168?= X-GMAIL-MSGID: =?utf-8?q?1759175552761275168?= When the next tick is in the past, the delta between basemono and the next tick gets negativ. But the next tick should never be in the past. The negative effect of a wrong next tick might be a stop of the tick and timers might expire late. To prevent expensive debugging when changing underlying code, add a WARN_ON_ONCE into this code path. Signed-off-by: Anna-Maria Behnsen Reviewed-by: Thomas Gleixner Reviewed-by: Frederic Weisbecker --- kernel/time/tick-sched.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index b0e3c9205946..7ffdc7ba19b4 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -826,6 +826,8 @@ static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu) * If the tick is due in the next period, keep it ticking or * force prod the timer. */ + WARN_ON_ONCE(basemono > next_tick); + delta = next_tick - basemono; if (delta <= (u64)TICK_NSEC) { /* From patchwork Wed Mar 1 14:17:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62912 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3658251wrd; Wed, 1 Mar 2023 06:22:19 -0800 (PST) X-Google-Smtp-Source: AK7set8xlR025lRj+WT+165eST/r0MbyBG0L6SZvb9cBXge83F3hu6l3Rm5MuseUezcR4W7XwcCl X-Received: by 2002:a17:907:2bc2:b0:87f:546d:7cb4 with SMTP id gv2-20020a1709072bc200b0087f546d7cb4mr6872682ejc.64.1677680539093; Wed, 01 Mar 2023 06:22:19 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680539; cv=none; d=google.com; s=arc-20160816; b=kWpJoutmLi7AmokoOOFuSfaj/Svd5/yGYGpDzOG2RiBzE0WkBXZJc59QH/DAGqrqQn GDAldbM1LGrs7F6z54i8H4tlkV0yu9l3ViFQ3CV3QjhDEu2WJ92LpoxNNJYW59I+liFT 6Yz5QXMkbwtP0RkGyLQW9+IRGJhai3E0O2CXXNDP0DkAlTzcBg15Jf2rurgsWLE/g1uC atzoB2cKexL+bKsKmUAGz6kBiYocza4SRobU47j69OqC5NWlODzc6ICO7ks3Chdqu/RT 1KM6ykVgirI78UZSpqu/lx63GigqiBLhxJrD3qFqX2MWDH1jqslKAdZBqLu+ROgp0res J23Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=dI0Osk/OAmUYMH0JGdEpZhhzt84bkmaFLBo02yUJopk=; b=js3tmh/4hcmDRriZ81sP8zUh/micJMJeJbBDtql0wHh2WGBILxfDbUds1hFRVOojfD azJRjjjg6bdIV1JNrCreLOovPfw6h+MsOPDVfIWOal9HfeHqAVFnyFCuy8g7iKNvQIHj DU4sza30al6/ZAtOYxmhFoMNHhry+JjJ5dkeXMj2TaqCWKNCUeZ2ganL1mqV7VZ13mh2 V5OxKeNAX38nsmgZ817GepfTC7bgEvx73GSMa22qxJ17PgwAkhU6M4+lVoBjY08ryIqD 5k/Tk19zZo8nCQDs8Dc9kK8ThlgKJpl77aoHbzwwn8wQ4rRhl594j5d7zf3p4AW4c0fR 

From patchwork Wed Mar 1 14:17:28 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62912
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v5 02/18] timer: Add comment to get_next_timer_interrupt() description Date: Wed, 1 Mar 2023 15:17:28 +0100 Message-Id: <20230301141744.16063-3-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175548889804637?= X-GMAIL-MSGID: =?utf-8?q?1759175548889804637?= get_next_timer_interrupt() does more than simply getting the next timer interrupt. The timer bases are forwarded and also marked as idle whenever possible. To get not confused, add a comment to function description. Signed-off-by: Anna-Maria Behnsen Reviewed-by: Frederic Weisbecker --- v5: New patch, which adds only a comment to get_next_timer_interrupt() instead of changing the function name --- kernel/time/timer.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 63a8ce7177dd..ffb94bc3852f 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1915,6 +1915,10 @@ static u64 cmp_next_hrtimer_event(u64 basem, u64 expires) * @basej: base time jiffies * @basem: base time clock monotonic * + * If required, base->clk is forwarded and base is also marked as + * idle. Idle handling of timer bases is allowed only to be done by CPU + * itself. + * * Returns the tick aligned clock monotonic time of the next pending * timer or KTIME_MAX if no timer is pending. 
  */

From patchwork Wed Mar 1 14:17:29 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62911
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v5 03/18] timer: Move store of next event into __next_timer_interrupt() Date: Wed, 1 Mar 2023 15:17:29 +0100 Message-Id: <20230301141744.16063-4-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175514985238018?= X-GMAIL-MSGID: =?utf-8?q?1759175514985238018?= Both call sides of __next_timer_interrupt() store return value directly in base->next_expiry. Move the store into __next_timer_interrupt() and to make purpose more clear, rename function to next_expiry_recalc(). No functional change. Signed-off-by: Anna-Maria Behnsen Reviewed-by: Thomas Gleixner Reviewed-by: Frederic Weisbecker --- kernel/time/timer.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index ffb94bc3852f..08e855727ff8 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1803,8 +1803,10 @@ static int next_pending_bucket(struct timer_base *base, unsigned offset, /* * Search the first expiring timer in the various clock levels. Caller must * hold base->lock. + * + * Store next expiry time in base->next_expiry. */ -static unsigned long __next_timer_interrupt(struct timer_base *base) +static void next_expiry_recalc(struct timer_base *base) { unsigned long clk, next, adj; unsigned lvl, offset = 0; @@ -1870,10 +1872,11 @@ static unsigned long __next_timer_interrupt(struct timer_base *base) clk += adj; } + base->next_expiry = next; base->next_expiry_recalc = false; base->timers_pending = !(next == base->clk + NEXT_TIMER_MAX_DELTA); - return next; + return; } #ifdef CONFIG_NO_HZ_COMMON @@ -1937,7 +1940,7 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) raw_spin_lock(&base->lock); if (base->next_expiry_recalc) - base->next_expiry = __next_timer_interrupt(base); + next_expiry_recalc(base); nextevt = base->next_expiry; /* @@ -2020,7 +2023,7 @@ static inline void __run_timers(struct timer_base *base) WARN_ON_ONCE(!levels && !base->next_expiry_recalc && base->timers_pending); base->clk++; - base->next_expiry = __next_timer_interrupt(base); + next_expiry_recalc(base); while (levels--) expire_timers(base, heads + levels); From patchwork Wed Mar 1 14:17:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62914 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3658384wrd; Wed, 1 Mar 2023 06:22:32 -0800 (PST) X-Google-Smtp-Source: AK7set9jaSC5AUYEijQ96csDKldgWwC3NkJdVjCX5YICSr7vYxQsOEiHtX8/RImySbCubl3iOoSs X-Received: by 2002:a17:906:859:b0:872:84dd:8903 with SMTP id f25-20020a170906085900b0087284dd8903mr6078371ejd.59.1677680552501; Wed, 01 Mar 2023 06:22:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680552; cv=none; d=google.com; s=arc-20160816; b=rC4j7fu87X/sHt24lV5g/MZkqCfmGXhq1Asb4PwPffQioyBJY0HgPrZ5BISV+9P1gs 
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen,
    Frederic Weisbecker
Subject: [PATCH v5 04/18] timer: Split next timer interrupt logic
Date: Wed, 1 Mar 2023 15:17:30 +0100
Message-Id: <20230301141744.16063-5-anna-maria@linutronix.de>
In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de>
References: <20230301141744.16063-1-anna-maria@linutronix.de>

Split the logic for getting the next timer interrupt (regardless of whether
it is recalculated or already stored in base->next_expiry) into a separate
function, next_timer_interrupt(), to make it available for new call sites.

No functional change.

Signed-off-by: Anna-Maria Behnsen
Reviewed-by: Thomas Gleixner
Reviewed-by: Frederic Weisbecker
---
 kernel/time/timer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 08e855727ff8..9e6c2058889b 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1913,6 +1913,14 @@ static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
 	return DIV_ROUND_UP_ULL(nextevt, TICK_NSEC) * TICK_NSEC;
 }
 
+static unsigned long next_timer_interrupt(struct timer_base *base)
+{
+	if (base->next_expiry_recalc)
+		next_expiry_recalc(base);
+
+	return base->next_expiry;
+}
+
 /**
  * get_next_timer_interrupt - return the time (clock mono) of the next timer
  * @basej: base time jiffies
@@ -1939,9 +1947,8 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 		return expires;
 
 	raw_spin_lock(&base->lock);
-	if (base->next_expiry_recalc)
-		next_expiry_recalc(base);
-	nextevt = base->next_expiry;
+
+	nextevt = next_timer_interrupt(base);
 
 	/*
 	 * We have a fresh next event. Check whether we can forward the

From patchwork Wed Mar 1 14:17:31 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62915
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen,
    Frederic Weisbecker
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen , Frederic Weisbecker Subject: [PATCH v5 05/18] timer: Rework idle logic Date: Wed, 1 Mar 2023 15:17:31 +0100 Message-Id: <20230301141744.16063-6-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175578159019315?= X-GMAIL-MSGID: =?utf-8?q?1759175578159019315?= From: Thomas Gleixner To improve readability of the code, split base->idle calculation and expires calculation into separate parts. Thereby the following subtle change happens if the next event is just one jiffy ahead and the tick was already stopped: Originally base->is_idle remains true in this situation. Now base->is_idle turns to false. This may spare an IPI if a timer is enqueued remotely to an idle CPU that is going to tick on the next jiffy. Signed-off-by: Thomas Gleixner Signed-off-by: Anna-Maria Behnsen Reviewed-by: Frederic Weisbecker --- kernel/time/timer.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 9e6c2058889b..d74d538e06a2 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1962,21 +1962,20 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) base->clk = nextevt; } - if (time_before_eq(nextevt, basej)) { - expires = basem; - base->is_idle = false; - } else { - if (base->timers_pending) - expires = basem + (u64)(nextevt - basej) * TICK_NSEC; - /* - * If we expect to sleep more than a tick, mark the base idle. - * Also the tick is stopped so any added timer must forward - * the base clk itself to keep granularity small. This idle - * logic is only maintained for the BASE_STD base, deferrable - * timers may still see large granularity skew (by design). - */ - if ((expires - basem) > TICK_NSEC) - base->is_idle = true; + /* + * Base is idle if the next event is more than a tick away. Also + * the tick is stopped so any added timer must forward the base clk + * itself to keep granularity small. This idle logic is only + * maintained for the BASE_STD base, deferrable timers may still + * see large granularity skew (by design). 
+	 */
+	base->is_idle = time_after(nextevt, basej + 1);
+
+	if (base->timers_pending) {
+		/* If we missed a tick already, force 0 delta */
+		if (time_before(nextevt, basej))
+			nextevt = basej;
+		expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
 	}
 
 	raw_spin_unlock(&base->lock);
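
A userspace sketch (not part of the patch) of the new is_idle condition,
using a simplified copy of time_after() from include/linux/jiffies.h and
made-up jiffies values. It shows that a next event on the very next jiffy
no longer marks the base idle, while an event further out still does:

#include <stdio.h>

/* simplified version of time_after() from include/linux/jiffies.h */
#define time_after(a, b)	((long)((b) - (a)) < 0)

int main(void)
{
	unsigned long basej = 1000;	/* made-up current jiffy */

	/* next event on the next jiffy: base is NOT idle (prints 0) */
	printf("nextevt = basej + 1 -> is_idle = %d\n",
	       time_after(basej + 1, basej + 1));

	/* next event two jiffies away: base is idle (prints 1) */
	printf("nextevt = basej + 2 -> is_idle = %d\n",
	       time_after(basej + 2, basej + 1));
	return 0;
}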

From patchwork Wed Mar 1 14:17:32 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62916
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen,
    Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org,
    "H. Peter Anvin", "Theodore Ts'o", "Jason A. Donenfeld",
    Stephen Boyd, Tejun Heo, Lai Jiangshan
Donenfeld" , Stephen Boyd , Tejun Heo , Lai Jiangshan Subject: [PATCH v5 06/18] add_timer_on(): Make sure callers have TIMER_PINNED flag Date: Wed, 1 Mar 2023 15:17:32 +0100 Message-Id: <20230301141744.16063-7-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175583888852093?= X-GMAIL-MSGID: =?utf-8?q?1759175583888852093?= The implementation of the hierachical timer pull model will change the timer bases per CPU. Timers, that have to expire on a specific CPU, require the TIMER_PINNED flag. Otherwise they will be queued on the dedicated CPU but in global timer base and those timers could also expire on other CPUs. Timers with TIMER_DEFERRABLE flag end up in a separate base anyway and are executed on the local CPU only. Therefore add the missing TIMER_PINNED flag for those callers who use add_timer_on() without the flag. No functional change. Signed-off-by: Anna-Maria Behnsen Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: x86@kernel.org Cc: "H. Peter Anvin" Cc: "Theodore Ts'o" Cc: "Jason A. Donenfeld" Cc: John Stultz Cc: Stephen Boyd Cc: Tejun Heo Cc: Lai Jiangshan --- v5: Add comment in workqueue.c that it's only a workaround for now --- arch/x86/kernel/tsc_sync.c | 3 ++- drivers/char/random.c | 2 +- kernel/time/clocksource.c | 2 +- kernel/workqueue.c | 15 +++++++++++++-- 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c index 9452dc9664b5..eab827288e0f 100644 --- a/arch/x86/kernel/tsc_sync.c +++ b/arch/x86/kernel/tsc_sync.c @@ -110,7 +110,8 @@ static int __init start_sync_check_timer(void) if (!cpu_feature_enabled(X86_FEATURE_TSC_ADJUST) || tsc_clocksource_reliable) return 0; - timer_setup(&tsc_sync_check_timer, tsc_sync_check_timer_fn, 0); + timer_setup(&tsc_sync_check_timer, tsc_sync_check_timer_fn, + TIMER_PINNED); tsc_sync_check_timer.expires = jiffies + SYNC_CHECK_INTERVAL; add_timer(&tsc_sync_check_timer); diff --git a/drivers/char/random.c b/drivers/char/random.c index ce3ccd172cc8..db6a7c0695de 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1007,7 +1007,7 @@ static DEFINE_PER_CPU(struct fast_pool, irq_randomness) = { #define FASTMIX_PERM HSIPHASH_PERMUTATION .pool = { HSIPHASH_CONST_0, HSIPHASH_CONST_1, HSIPHASH_CONST_2, HSIPHASH_CONST_3 }, #endif - .mix = __TIMER_INITIALIZER(mix_interrupt_randomness, 0) + .mix = __TIMER_INITIALIZER(mix_interrupt_randomness, TIMER_PINNED) }; /* diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index 91836b727cef..e982c119e3c9 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -561,7 +561,7 @@ static inline void clocksource_start_watchdog(void) { if (watchdog_running || !watchdog || list_empty(&watchdog_list)) return; - timer_setup(&watchdog_timer, clocksource_watchdog, 0); + timer_setup(&watchdog_timer, clocksource_watchdog, TIMER_PINNED); watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL; add_timer_on(&watchdog_timer, cpumask_first(cpu_online_mask)); 
 	watchdog_running = 1;
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b8b541caed48..a428d94084ee 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1677,10 +1677,21 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
 	dwork->cpu = cpu;
 	timer->expires = jiffies + delay;
 
-	if (unlikely(cpu != WORK_CPU_UNBOUND))
+	if (unlikely(cpu != WORK_CPU_UNBOUND)) {
+		/*
+		 * TODO: Setting the flag is a workaround for now; needs to
+		 * be cleaned up with new work initializers and defines
+		 */
+		timer->flags |= TIMER_PINNED;
 		add_timer_on(timer, cpu);
-	else
+	} else {
+		/*
+		 * TODO: Resetting the flag is a workaround for now; needs
+		 * to be cleaned up with new work initializers and defines
+		 */
+		timer->flags &= ~TIMER_PINNED;
 		add_timer(timer);
+	}
 }
 
 /**
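
A hypothetical module sketch (not from this series) of the calling
convention the patch establishes: a timer that is going to be queued with
add_timer_on() carries TIMER_PINNED from timer_setup() on, which is what
the upcoming timer pull model relies on to keep the expiry on the chosen
CPU:

#include <linux/module.h>
#include <linux/timer.h>
#include <linux/jiffies.h>
#include <linux/smp.h>

static struct timer_list demo_timer;

static void demo_timer_fn(struct timer_list *t)
{
	pr_info("pinned demo timer expired on CPU %d\n", smp_processor_id());
}

static int __init demo_init(void)
{
	/* Pinned from the start, matching the add_timer_on() call below */
	timer_setup(&demo_timer, demo_timer_fn, TIMER_PINNED);
	demo_timer.expires = jiffies + HZ;
	add_timer_on(&demo_timer, 0);		/* expire on CPU 0 */
	return 0;
}

static void __exit demo_exit(void)
{
	del_timer_sync(&demo_timer);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");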

From patchwork Wed Mar 1 14:17:33 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62919
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v5 07/18] timers: Ease code in run_local_timers() Date: Wed, 1 Mar 2023 15:17:33 +0100 Message-Id: <20230301141744.16063-8-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175600500108288?= X-GMAIL-MSGID: =?utf-8?q?1759175600500108288?= The logic for raising a softirq the way it is implemented right now, is readable for two timer bases. When increasing numbers of timer bases, code gets harder to read. With the introduction of the timer migration hierarchy, there will be three timer bases. Therefore ease the code. No functional change. Signed-off-by: Anna-Maria Behnsen Reviewed-by: Frederic Weisbecker --- v5: New patch to decrease patch size of follow up patches --- kernel/time/timer.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index d74d538e06a2..d3e1776b505b 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -2058,16 +2058,14 @@ static void run_local_timers(void) struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); hrtimer_run_queues(); - /* Raise the softirq only if required. */ - if (time_before(jiffies, base->next_expiry)) { - if (!IS_ENABLED(CONFIG_NO_HZ_COMMON)) - return; - /* CPU is awake, so check the deferrable base. */ - base++; - if (time_before(jiffies, base->next_expiry)) + + for (int i = 0; i < NR_BASES; i++, base++) { + /* Raise the softirq only if required. 
+		if (time_after_eq(jiffies, base->next_expiry)) {
+			raise_softirq(TIMER_SOFTIRQ);
 			return;
+		}
 	}
-	raise_softirq(TIMER_SOFTIRQ);
 }
 
 /*
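
A userspace sketch (not part of the patch; expiry values are made up) of
the resulting loop shape: the softirq is requested as soon as any base's
next_expiry is due, no matter how many bases there are:

#include <stdbool.h>
#include <stdio.h>

/* simplified time_after_eq() from include/linux/jiffies.h */
#define time_after_eq(a, b)	((long)((a) - (b)) >= 0)
#define NR_BASES		2	/* BASE_STD and BASE_DEF in this sketch */

int main(void)
{
	unsigned long jiffies = 1000;				/* made-up */
	unsigned long next_expiry[NR_BASES] = { 1005, 998 };	/* made-up */
	bool raise = false;

	for (int i = 0; i < NR_BASES; i++) {
		if (time_after_eq(jiffies, next_expiry[i])) {
			raise = true;	/* would be raise_softirq(TIMER_SOFTIRQ) */
			break;
		}
	}
	printf("raise TIMER_SOFTIRQ: %d\n", raise);	/* prints 1 */
	return 0;
}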

From patchwork Wed Mar 1 14:17:34 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62910
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet,
    "Rafael J. Wysocki", Arjan van de Ven, "Paul E. McKenney",
    Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v5 08/18] timers: Create helper function to forward timer base clk Date: Wed, 1 Mar 2023 15:17:34 +0100 Message-Id: <20230301141744.16063-9-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175478227832784?= X-GMAIL-MSGID: =?utf-8?q?1759175478227832784?= The logic for forwarding timer base clock is splitted into a separte function to make it accessible for other call sites. No functional change. Signed-off-by: Anna-Maria Behnsen Reviewed-by: Frederic Weisbecker --- v5: New patch to simplify next patch --- kernel/time/timer.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index d3e1776b505b..1629ccf24dd0 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1921,6 +1921,21 @@ static unsigned long next_timer_interrupt(struct timer_base *base) return base->next_expiry; } +/* + * Forward base clock is done only when @basej is past base->clk, otherwise + * base-clk might be rewind. + */ +static void forward_base_clk(struct timer_base *base, unsigned long nextevt, + unsigned long basej) +{ + if (time_after(basej, base->clk)) { + if (time_after(nextevt, basej)) + base->clk = basej; + else if (time_after(nextevt, base->clk)) + base->clk = nextevt; + } +} + /** * get_next_timer_interrupt - return the time (clock mono) of the next timer * @basej: base time jiffies @@ -1952,15 +1967,9 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) /* * We have a fresh next event. Check whether we can forward the - * base. We can only do that when @basej is past base->clk - * otherwise we might rewind base->clk. + * base. */ - if (time_after(basej, base->clk)) { - if (time_after(nextevt, basej)) - base->clk = basej; - else if (time_after(nextevt, base->clk)) - base->clk = nextevt; - } + forward_base_clk(base, nextevt, basej); /* * Base is idle if the next event is more than a tick away. 
From patchwork Wed Mar 1 14:17:35 2023
X-Patchwork-Id: 62918
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet, Rafael J. Wysocki, Arjan van de Ven, Paul E. McKenney, Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen, Richard Cochran
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen , Richard Cochran , Frederic Weisbecker Subject: [PATCH v5 09/18] timer: Keep the pinned timers separate from the others Date: Wed, 1 Mar 2023 15:17:35 +0100 Message-Id: <20230301141744.16063-10-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175586692863004?= X-GMAIL-MSGID: =?utf-8?q?1759175586692863004?= Separate the storage space for pinned timers. Deferrable timers (doesn't matter if pinned or non pinned) are still enqueued into their own base. This is preparatory work for changing the NOHZ timer placement from a push at enqueue time to a pull at expiry time model. When a timer is added via add_timer_on(), TIMER_PINNED flag is required to ensure it expires on the specified CPU. Otherwise it will be enqueued in the global timer base which could be expired by a remote CPU. WARN_ONCE() is added to prevent misuse. Beside of that no functional change because all callers of add_timer_on() already use TIMER_PINNED flag. Originally-by: Richard Cochran (linutronix GmbH) Signed-off-by: Anna-Maria Behnsen Reviewed-by: Frederic Weisbecker --- v5: - Add WARN_ONCE() in add_timer_on() - Decrease patch size by splitting into three patches (this patch and the two before) --- kernel/time/timer.c | 91 +++++++++++++++++++++++++++++++++------------ 1 file changed, 68 insertions(+), 23 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 1629ccf24dd0..7656eab1bf20 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -187,12 +187,18 @@ EXPORT_SYMBOL(jiffies_64); #define WHEEL_SIZE (LVL_SIZE * LVL_DEPTH) #ifdef CONFIG_NO_HZ_COMMON -# define NR_BASES 2 -# define BASE_STD 0 -# define BASE_DEF 1 +/* + * If multiple bases need to be locked, use the base ordering for lock + * nesting, i.e. lowest number first. + */ +# define NR_BASES 3 +# define BASE_LOCAL 0 +# define BASE_GLOBAL 1 +# define BASE_DEF 2 #else # define NR_BASES 1 -# define BASE_STD 0 +# define BASE_LOCAL 0 +# define BASE_GLOBAL 0 # define BASE_DEF 0 #endif @@ -902,7 +908,10 @@ static int detach_if_pending(struct timer_list *timer, struct timer_base *base, static inline struct timer_base *get_timer_cpu_base(u32 tflags, u32 cpu) { - struct timer_base *base = per_cpu_ptr(&timer_bases[BASE_STD], cpu); + int index = tflags & TIMER_PINNED ? BASE_LOCAL : BASE_GLOBAL; + struct timer_base *base; + + base = per_cpu_ptr(&timer_bases[index], cpu); /* * If the timer is deferrable and NO_HZ_COMMON is set then we need @@ -915,7 +924,10 @@ static inline struct timer_base *get_timer_cpu_base(u32 tflags, u32 cpu) static inline struct timer_base *get_timer_this_cpu_base(u32 tflags) { - struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); + int index = tflags & TIMER_PINNED ? 
BASE_LOCAL : BASE_GLOBAL; + struct timer_base *base; + + base = this_cpu_ptr(&timer_bases[index]); /* * If the timer is deferrable and NO_HZ_COMMON is set then we need @@ -1264,6 +1276,12 @@ void add_timer_on(struct timer_list *timer, int cpu) if (WARN_ON_ONCE(timer_pending(timer))) return; + WARN_ONCE(!(timer->flags & TIMER_PINNED), "TIMER_PINNED flag for " + "add_timer_on() is missing: timer=%p function=%ps", + timer, timer->function); + /* Make sure timer flags have TIMER_PINNED flag set */ + timer->flags |= TIMER_PINNED; + new_base = get_timer_cpu_base(timer->flags, cpu); /* @@ -1950,9 +1968,10 @@ static void forward_base_clk(struct timer_base *base, unsigned long nextevt, */ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) { - struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); + unsigned long nextevt, nextevt_local, nextevt_global; + struct timer_base *base_local, *base_global; + bool local_first, is_idle; u64 expires = KTIME_MAX; - unsigned long nextevt; /* * Pretend that there is no timer pending if the cpu is offline. @@ -1961,32 +1980,57 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) if (cpu_is_offline(smp_processor_id())) return expires; - raw_spin_lock(&base->lock); + base_local = this_cpu_ptr(&timer_bases[BASE_LOCAL]); + base_global = this_cpu_ptr(&timer_bases[BASE_GLOBAL]); - nextevt = next_timer_interrupt(base); + raw_spin_lock(&base_local->lock); + raw_spin_lock_nested(&base_global->lock, SINGLE_DEPTH_NESTING); + + nextevt_local = next_timer_interrupt(base_local); + nextevt_global = next_timer_interrupt(base_global); /* * We have a fresh next event. Check whether we can forward the * base. */ - forward_base_clk(base, nextevt, basej); + forward_base_clk(base_local, nextevt_local, basej); + forward_base_clk(base_global, nextevt_global, basej); /* - * Base is idle if the next event is more than a tick away. Also + * Check whether the local event is expiring before or at the same + * time as the global event. + * + * Note, that nextevt_global and nextevt_local might be based on + * different base->clk values. So it's not guaranteed that + * comparing with empty bases results in a correct local_first. + */ + if (base_local->timers_pending && base_global->timers_pending) + local_first = time_before_eq(nextevt_local, nextevt_global); + else + local_first = base_local->timers_pending; + + nextevt = local_first ? nextevt_local : nextevt_global; + + /* + * Bases are idle if the next event is more than a tick away. Also * the tick is stopped so any added timer must forward the base clk * itself to keep granularity small. This idle logic is only - * maintained for the BASE_STD base, deferrable timers may still - * see large granularity skew (by design). + * maintained for the BASE_LOCAL and BASE_GLOBAL base, deferrable + * timers may still see large granularity skew (by design). 
*/ - base->is_idle = time_after(nextevt, basej + 1); + is_idle = time_after(nextevt, basej + 1); + + /* We need to mark both bases in sync */ + base_local->is_idle = base_global->is_idle = is_idle; - if (base->timers_pending) { + if (base_local->timers_pending || base_global->timers_pending) { /* If we missed a tick already, force 0 delta */ if (time_before(nextevt, basej)) nextevt = basej; expires = basem + (u64)(nextevt - basej) * TICK_NSEC; } - raw_spin_unlock(&base->lock); + raw_spin_unlock(&base_global->lock); + raw_spin_unlock(&base_local->lock); return cmp_next_hrtimer_event(basem, expires); } @@ -1998,15 +2042,14 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) */ void timer_clear_idle(void) { - struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); - /* * We do this unlocked. The worst outcome is a remote enqueue sending * a pointless IPI, but taking the lock would just make the window for * sending the IPI a few instructions smaller for the cost of taking * the lock in the exit from idle path. */ - base->is_idle = false; + __this_cpu_write(timer_bases[BASE_LOCAL].is_idle, false); + __this_cpu_write(timer_bases[BASE_GLOBAL].is_idle, false); } #endif @@ -2052,11 +2095,13 @@ static inline void __run_timers(struct timer_base *base) */ static __latent_entropy void run_timer_softirq(struct softirq_action *h) { - struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); + struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]); __run_timers(base); - if (IS_ENABLED(CONFIG_NO_HZ_COMMON)) + if (IS_ENABLED(CONFIG_NO_HZ_COMMON)) { + __run_timers(this_cpu_ptr(&timer_bases[BASE_GLOBAL])); __run_timers(this_cpu_ptr(&timer_bases[BASE_DEF])); + } } /* @@ -2064,7 +2109,7 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h) */ static void run_local_timers(void) { - struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); + struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]); hrtimer_run_queues(); From patchwork Wed Mar 1 14:17:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62909 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3657660wrd; Wed, 1 Mar 2023 06:21:11 -0800 (PST) X-Google-Smtp-Source: AK7set8j/HAJ1lu2yqCYOMyHGUdTIAN8WXRa7Ki/n040ED+wd0Q4O+jFGRc122zREIITitXgGjMp X-Received: by 2002:a17:906:3699:b0:8b2:7567:9c30 with SMTP id a25-20020a170906369900b008b275679c30mr9573985ejc.59.1677680471533; Wed, 01 Mar 2023 06:21:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680471; cv=none; d=google.com; s=arc-20160816; b=nugsbdKix3szCPJM8yKUliKLXVL+LynplphBfCbbIWx7g+7E+bXY+A5G6xzP2fRlYz 8zqsZPbaHa1R+zA+3NVtUjAohicZJhhuSQtVI7DbkJaoJj0Z15XEu4fVAe7A6CgTIxRP RZXqGCSk2w9pbBdCaWid1YQpHUv0rbgYJQe/6mtJDWoriW/gZx2JX6454GTr9NHrQd5R wiCjmmZlX1oi6Up759z8En4GYxEUEt4delBgW3TBwxOwEAekaxCTCWUYknkda08vgaH3 x9Gyb1nRfexoLFA7dRIYqcSXG6tIlr3MzaaZWGEh7BLALpi5AhXzdjLNFQQzwyhkMJlK CTog== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=7LmHc98G0hhyTlVp1eBaBtU+iOdxru2/lJNxgEN+R0I=; b=TuLK/4w3bRvJCZnuY90aQCkT2AOhsURsiaJQ1YN/dxJ3bv8UhQtEYPUuvKuLFf1jQW opa90DAOhgtRTia3ev8OwE51/LF75thntpy5t93B+n2cyvLWr6kFaHqSae9tRhWQCovQ qo9X2OymiXhrlMPNJWo78sBtButnVgqaeSDM84MOiIYVSzufTbAZr+GNctM3c833y2NG 
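A kernel-style sketch of the calling convention this patch enforces (a hypothetical
caller, not part of the series): a timer armed with add_timer_on() now has to carry
TIMER_PINNED, otherwise the new WARN_ONCE() fires and the flag is set anyway.

#include <linux/timer.h>
#include <linux/jiffies.h>

static struct timer_list my_timer;	/* hypothetical example timer */

static void my_timer_fn(struct timer_list *t)
{
	/* runs on the CPU the timer was pinned to */
}

static void arm_timer_on(unsigned int cpu)
{
	/* TIMER_PINNED keeps the timer in the per-CPU BASE_LOCAL base */
	timer_setup(&my_timer, my_timer_fn, TIMER_PINNED);
	my_timer.expires = jiffies + HZ;
	add_timer_on(&my_timer, cpu);
}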
From patchwork Wed Mar 1 14:17:36 2023
X-Patchwork-Id: 62909
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet, Rafael J. Wysocki, Arjan van de Ven, Paul E. McKenney, Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen, Richard Cochran
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen , Richard Cochran , Frederic Weisbecker Subject: [PATCH v5 10/18] timer: Retrieve next expiry of pinned/non-pinned timers seperately Date: Wed, 1 Mar 2023 15:17:36 +0100 Message-Id: <20230301141744.16063-11-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175477655007947?= X-GMAIL-MSGID: =?utf-8?q?1759175477655007947?= For the conversion of the NOHZ timer placement to a pull at expiry time model it's required to have seperate expiry times for the pinned and the non-pinned (movable) timers. Therefore struct timer_events is introduced. No functional change Originally-by: Richard Cochran (linutronix GmbH) Signed-off-by: Anna-Maria Behnsen Reviewed-by: Frederic Weisbecker --- kernel/time/timer.c | 45 ++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 40 insertions(+), 5 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 7656eab1bf20..ff41d978cb22 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -221,6 +221,11 @@ struct timer_base { static DEFINE_PER_CPU(struct timer_base, timer_bases[NR_BASES]); +struct timer_events { + u64 local; + u64 global; +}; + #ifdef CONFIG_NO_HZ_COMMON static DEFINE_STATIC_KEY_FALSE(timers_nohz_active); @@ -1968,17 +1973,17 @@ static void forward_base_clk(struct timer_base *base, unsigned long nextevt, */ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) { + struct timer_events tevt = { .local = KTIME_MAX, .global = KTIME_MAX }; unsigned long nextevt, nextevt_local, nextevt_global; struct timer_base *base_local, *base_global; bool local_first, is_idle; - u64 expires = KTIME_MAX; /* * Pretend that there is no timer pending if the cpu is offline. * Possible pending timers will be migrated later to an active cpu. */ if (cpu_is_offline(smp_processor_id())) - return expires; + return tevt.local; base_local = this_cpu_ptr(&timer_bases[BASE_LOCAL]); base_global = this_cpu_ptr(&timer_bases[BASE_GLOBAL]); @@ -2023,16 +2028,46 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) /* We need to mark both bases in sync */ base_local->is_idle = base_global->is_idle = is_idle; - if (base_local->timers_pending || base_global->timers_pending) { + /* + * If the bases are not marked idle, i.e one of the events is at + * max. one tick away, then the CPU can't go into a NOHZ idle + * sleep. Use the earlier event of both and store it in the local + * expiry value. The next global event is irrelevant in this case + * and can be left as KTIME_MAX. CPU will wakeup on time. + */ + if (!is_idle) { /* If we missed a tick already, force 0 delta */ if (time_before(nextevt, basej)) nextevt = basej; - expires = basem + (u64)(nextevt - basej) * TICK_NSEC; + tevt.local = basem + (u64)(nextevt - basej) * TICK_NSEC; + goto unlock; } + + /* + * If the bases are marked idle, i.e. the next event on both the + * local and the global queue are farther away than a tick, + * evaluate both bases. 
No need to check whether one of the bases + * has an already expired timer as this is caught by the !is_idle + * condition above. + */ + if (base_local->timers_pending) + tevt.local = basem + (u64)(nextevt_local - basej) * TICK_NSEC; + + /* + * If the local queue expires first, then the global event can be + * ignored. The CPU wakes up before that. If the global queue is + * empty, nothing to do either. + */ + if (!local_first && base_global->timers_pending) + tevt.global = basem + (u64)(nextevt_global - basej) * TICK_NSEC; + +unlock: raw_spin_unlock(&base_global->lock); raw_spin_unlock(&base_local->lock); - return cmp_next_hrtimer_event(basem, expires); + tevt.local = min_t(u64, tevt.local, tevt.global); + + return cmp_next_hrtimer_event(basem, tevt.local); } /** From patchwork Wed Mar 1 14:17:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62921 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3658817wrd; Wed, 1 Mar 2023 06:23:28 -0800 (PST) X-Google-Smtp-Source: AK7set//E0V/W+M8yYZuuYbMAxSR8YESZPyMIQgsWT77HcGTy9M4GQfncoA2U9ECfgXhEEuTbDJt X-Received: by 2002:aa7:d748:0:b0:4af:5aa1:cfa8 with SMTP id a8-20020aa7d748000000b004af5aa1cfa8mr7639719eds.16.1677680608165; Wed, 01 Mar 2023 06:23:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680608; cv=none; d=google.com; s=arc-20160816; b=mZL6ADgfv9AsRM+VVEN94x++3iQ7sn1+/Y/EmUeT0nyXUFmo0KxuOVqcpsUKVmNPtS UFR+rCJLqI26ka3n4x7XdkNQ+4uOSkBgeJ6E/4bYhTo6e0AAuq6xE/rNAXUkpEjmtrF5 d04AdQcgfu+oTFkXd6KiVOUOdO/bxaBG7G7KJR1jix0+8jMJ5Hhpc/tPVV61QtXAqkjn ygMrxRg0Ugcl19/9AIaZVc7A+Tx7PF3MtXowE59xsiASBeZWfH267hcAlKJPDB40Lgc4 VPC4WceYrHLun7XPyjpykuHXySrcduw+XAKYBP0tgY9489D6lUTqdQGSm30keRKgPWrO JCqw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=K7TGxAEPdX981ks1AGMS1UlpllpfQeWHwCRWcpqbDH4=; b=ciIXSf/YIBZbksffQS6uoKvD26LC8C0Barn6SYfI4zWZqUv2bnNoMeEw0oCZY7aL18 wPZBMni4SWv7kdsi/xGrrX1UeqC821nNKGPr5UNQJDOOyKl7PQAgKZ2maQx1adHdhJ0o 7mvTCP14bSwsZ5ZH8kk/AdocKqWg5OnqamkbSfYEoSXfeKfe+t4qdJULVTgOIlk762CW QUuQUWgtqrhWj4h4cFuHRJzkIqJDfir5aJKRbF+/Dxy+p/59sRyJiitUQNF3lzDzL926 MyxLZs+R+ZdmShdpRxEvSdUi61sFCfKwDYiba+HYXfBZF0i6YtVp6/llJ2hkox65JrlB mBMg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=Vwz1hm8G; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=iGABhEL7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. 
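A standalone userspace model (simplified; TICK_NSEC and the jiffies values are
placeholders, not the kernel's) of how the separated expiry values end up being
combined: both queues are converted to clock-monotonic expiries and the CPU
programs the earlier one.

#include <stdio.h>
#include <stdint.h>

#define KTIME_MAX	INT64_MAX
#define TICK_NSEC	1000000ULL	/* placeholder: 1 ms tick for the model */

struct timer_events { uint64_t local; uint64_t global; };

static uint64_t to_mono(uint64_t basem, unsigned long nextevt, unsigned long basej)
{
	return basem + (uint64_t)(nextevt - basej) * TICK_NSEC;
}

int main(void)
{
	struct timer_events tevt = { .local = KTIME_MAX, .global = KTIME_MAX };
	unsigned long basej = 1000, nextevt_local = 1010, nextevt_global = 1005;
	uint64_t basem = 5000000, next;
	int local_first = nextevt_local <= nextevt_global;	/* both queues pending */

	tevt.local = to_mono(basem, nextevt_local, basej);
	if (!local_first)
		tevt.global = to_mono(basem, nextevt_global, basej);

	/* The CPU programs its next wake-up with the earlier of the two. */
	next = tevt.local < tevt.global ? tevt.local : tevt.global;
	printf("local=%llu global=%llu next=%llu\n",
	       (unsigned long long)tevt.local, (unsigned long long)tevt.global,
	       (unsigned long long)next);
	return 0;
}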
From patchwork Wed Mar 1 14:17:37 2023
X-Patchwork-Id: 62921
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet, Rafael J. Wysocki, Arjan van de Ven, Paul E. McKenney, Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v5 11/18] timer: Split out "get next timer interrupt" functionality Date: Wed, 1 Mar 2023 15:17:37 +0100 Message-Id: <20230301141744.16063-12-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175621166903850?= X-GMAIL-MSGID: =?utf-8?q?1759175621166903850?= The functionallity for getting the next timer interrupt in get_next_timer_interrupt() is splitted into a separate function fetch_next_timer_interrupt() to be usable by other callsides. This is preparatory work for the conversion of the NOHZ timer placement to a pull at expiry time model. No functional change. Signed-off-by: Anna-Maria Behnsen Reviewed-by: Frederic Weisbecker --- v5: Update commit message --- kernel/time/timer.c | 91 +++++++++++++++++++++++++-------------------- 1 file changed, 50 insertions(+), 41 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index ff41d978cb22..dfc744545159 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1944,6 +1944,46 @@ static unsigned long next_timer_interrupt(struct timer_base *base) return base->next_expiry; } +static unsigned long fetch_next_timer_interrupt(struct timer_base *base_local, + struct timer_base *base_global, + unsigned long basej, u64 basem, + struct timer_events *tevt) +{ + unsigned long nextevt_local, nextevt_global; + bool local_first; + + nextevt_local = next_timer_interrupt(base_local); + nextevt_global = next_timer_interrupt(base_global); + + /* + * Check whether the local event is expiring before or at the same + * time as the global event. + * + * Note, that nextevt_global and nextevt_local might be based on + * different base->clk values. So it's not guaranteed that + * comparing with empty bases results in a correct local_first. + */ + if (base_local->timers_pending && base_global->timers_pending) + local_first = time_before_eq(nextevt_local, nextevt_global); + else + local_first = base_local->timers_pending; + + /* + * Update tevt->* values: + * + * If the local queue expires first, then the global event can + * be ignored. If the global queue is empty, nothing to do + * either. + */ + if (!local_first && base_global->timers_pending) + tevt->global = basem + (u64)(nextevt_global - basej) * TICK_NSEC; + + if (base_local->timers_pending) + tevt->local = basem + (u64)(nextevt_local - basej) * TICK_NSEC; + + return local_first ? nextevt_local : nextevt_global; +} + /* * Forward base clock is done only when @basej is past base->clk, otherwise * base-clk might be rewind. @@ -1976,7 +2016,7 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) struct timer_events tevt = { .local = KTIME_MAX, .global = KTIME_MAX }; unsigned long nextevt, nextevt_local, nextevt_global; struct timer_base *base_local, *base_global; - bool local_first, is_idle; + bool is_idle; /* * Pretend that there is no timer pending if the cpu is offline. 
@@ -1991,8 +2031,11 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) raw_spin_lock(&base_local->lock); raw_spin_lock_nested(&base_global->lock, SINGLE_DEPTH_NESTING); - nextevt_local = next_timer_interrupt(base_local); - nextevt_global = next_timer_interrupt(base_global); + nextevt = fetch_next_timer_interrupt(base_local, base_global, + basej, basem, &tevt); + + nextevt_local = base_local->next_expiry; + nextevt_global = base_global->next_expiry; /* * We have a fresh next event. Check whether we can forward the @@ -2001,21 +2044,6 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) forward_base_clk(base_local, nextevt_local, basej); forward_base_clk(base_global, nextevt_global, basej); - /* - * Check whether the local event is expiring before or at the same - * time as the global event. - * - * Note, that nextevt_global and nextevt_local might be based on - * different base->clk values. So it's not guaranteed that - * comparing with empty bases results in a correct local_first. - */ - if (base_local->timers_pending && base_global->timers_pending) - local_first = time_before_eq(nextevt_local, nextevt_global); - else - local_first = base_local->timers_pending; - - nextevt = local_first ? nextevt_local : nextevt_global; - /* * Bases are idle if the next event is more than a tick away. Also * the tick is stopped so any added timer must forward the base clk @@ -2028,6 +2056,9 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) /* We need to mark both bases in sync */ base_local->is_idle = base_global->is_idle = is_idle; + raw_spin_unlock(&base_global->lock); + raw_spin_unlock(&base_local->lock); + /* * If the bases are not marked idle, i.e one of the events is at * max. one tick away, then the CPU can't go into a NOHZ idle @@ -2040,31 +2071,9 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) if (time_before(nextevt, basej)) nextevt = basej; tevt.local = basem + (u64)(nextevt - basej) * TICK_NSEC; - goto unlock; + tevt.global = KTIME_MAX; } - /* - * If the bases are marked idle, i.e. the next event on both the - * local and the global queue are farther away than a tick, - * evaluate both bases. No need to check whether one of the bases - * has an already expired timer as this is caught by the !is_idle - * condition above. - */ - if (base_local->timers_pending) - tevt.local = basem + (u64)(nextevt_local - basej) * TICK_NSEC; - - /* - * If the local queue expires first, then the global event can be - * ignored. The CPU wakes up before that. If the global queue is - * empty, nothing to do either. 
- */ - if (!local_first && base_global->timers_pending) - tevt.global = basem + (u64)(nextevt_global - basej) * TICK_NSEC; - -unlock: - raw_spin_unlock(&base_global->lock); - raw_spin_unlock(&base_local->lock); - tevt.local = min_t(u64, tevt.local, tevt.global); return cmp_next_hrtimer_event(basem, tevt.local); From patchwork Wed Mar 1 14:17:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62920 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3658716wrd; Wed, 1 Mar 2023 06:23:14 -0800 (PST) X-Google-Smtp-Source: AK7set8hjZh/H/7UnqKBzWM4kTsQ6DEu4iLPQk9eqOUaDEneH2y3wdGVHdRauuD914UKk50Yxuoc X-Received: by 2002:a17:906:3b82:b0:889:d998:1576 with SMTP id u2-20020a1709063b8200b00889d9981576mr6605672ejf.66.1677680594442; Wed, 01 Mar 2023 06:23:14 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680594; cv=none; d=google.com; s=arc-20160816; b=x/QKrUBTnPqiZvexaNVRFDUFVv9vp2J4c37LM1WXhEKv0S5ivCfyjXv1Y4c4AIWIpA RDv7zDrLeQmDfvsHs97+Ii+S+k7IhEqf2HkbpgGk77USaDHVMF7dP1cFNwbNielJl7yD Jr4UMc6CSaQVGfiG1KnF4OItkRIy32b0ebSHW4RPIxSWcV30ktb34vgp1kH4kntu6XEN Z6m4fYFpdIxlIl+Brlq8bCm2loYYTySyL6tQk3VmKPh/8PDTNDrJ+P67Rc43GDwbT8KM 1wwvCyeMzUE9zYApI10OSC49uYtR3Szh0geURvxojxbdRoSGQ3xeRbclpEiVHuJgsFV7 BdcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=VpDSCQnNc4s0GTZc/8uTJZlimFSIHsKz1UPfERjKnI8=; b=PjxgDtjcUZoktNY8NUNzl1JISg+IlU1U9km/gWso0j8eeQSvsJ1Lf/6NVIx9WoVGkH M88Zqfu17/69rGZ605mVMvwXGQe6UoJ4p+74fcX7vT6ELx1NwnlXB4goVh84qcl/MU06 M9RzMd8ygd5bQ5mz4g5Bl0Ax5m2woefsoGCQ+m23kvRG/RVG37+uWWl7ucPJfGRU1epQ kODuopHzMBb8trPXnSnqczM3YHMcaiVfBn9+WtR+fw5dfyq80JAPMhaa47Ho4pIQRdGL ymVujQCnC/Nf8mTK+UwQMCbwrZv85bJLhASMNkhNO5hIJuCpHLP8pnqi3CDintoE2ki6 KUcQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=Uyg0dL6E; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. 
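A userspace model of the local_first decision inside fetch_next_timer_interrupt():
the time_before_eq() comparison is only trusted when both bases have pending
timers, because an empty base may carry a stale next_expiry value.

#include <stdio.h>
#include <stdbool.h>

static bool pick_local_first(bool local_pending, bool global_pending,
			     unsigned long nextevt_local,
			     unsigned long nextevt_global)
{
	if (local_pending && global_pending)
		/* wraparound-safe time_before_eq() stand-in */
		return (long)(nextevt_local - nextevt_global) <= 0;
	return local_pending;
}

int main(void)
{
	/* Empty global base with a stale, earlier next_expiry: local still wins. */
	printf("%d\n", pick_local_first(true, false, 2000, 1500));	/* 1 */
	/* Both pending: the earlier expiry wins. */
	printf("%d\n", pick_local_first(true, true, 2000, 1500));	/* 0 */
	return 0;
}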
From patchwork Wed Mar 1 14:17:38 2023
X-Patchwork-Id: 62920
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet, Rafael J. Wysocki, Arjan van de Ven, Paul E. McKenney, Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v5 12/18] timer: Add get next timer interrupt functionality for remote CPUs Date: Wed, 1 Mar 2023 15:17:38 +0100 Message-Id: <20230301141744.16063-13-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175606693726209?= X-GMAIL-MSGID: =?utf-8?q?1759175606693726209?= To prepare for the conversion of the NOHZ timer placement to a pull at expiry time model it's required to have functionality available getting the next timer interrupt on a remote CPU. Signed-off-by: Anna-Maria Behnsen --- kernel/time/tick-internal.h | 8 +++++++ kernel/time/timer.c | 46 +++++++++++++++++++++++++++++++++---- 2 files changed, 49 insertions(+), 5 deletions(-) diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h index 649f2b48e8f0..28471c8f8c9c 100644 --- a/kernel/time/tick-internal.h +++ b/kernel/time/tick-internal.h @@ -8,6 +8,11 @@ #include "timekeeping.h" #include "tick-sched.h" +struct timer_events { + u64 local; + u64 global; +}; + #ifdef CONFIG_GENERIC_CLOCKEVENTS # define TICK_DO_TIMER_NONE -1 @@ -164,6 +169,9 @@ static inline void timers_update_nohz(void) { } DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases); extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem); +extern void fetch_next_timer_interrupt_remote(unsigned long basej, u64 basem, + struct timer_events *tevt, + unsigned int cpu); void timer_clear_idle(void); #define CLOCK_SET_WALL \ diff --git a/kernel/time/timer.c b/kernel/time/timer.c index dfc744545159..9daaef5d2f6f 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -221,11 +221,6 @@ struct timer_base { static DEFINE_PER_CPU(struct timer_base, timer_bases[NR_BASES]); -struct timer_events { - u64 local; - u64 global; -}; - #ifdef CONFIG_NO_HZ_COMMON static DEFINE_STATIC_KEY_FALSE(timers_nohz_active); @@ -1984,6 +1979,47 @@ static unsigned long fetch_next_timer_interrupt(struct timer_base *base_local, return local_first ? nextevt_local : nextevt_global; } +/** + * fetch_next_timer_interrupt_remote + * @basej: base time jiffies + * @basem: base time clock monotonic + * @tevt: Pointer to the storage for the expiry values + * @cpu: Remote CPU + * + * Stores the next pending local and global timer expiry values in the + * struct pointed to by @tevt. If a queue is empty the corresponding + * field is set to KTIME_MAX. If local event expires before global + * event, global event is set to KTIME_MAX as well. + */ +void fetch_next_timer_interrupt_remote(unsigned long basej, u64 basem, + struct timer_events *tevt, + unsigned int cpu) +{ + struct timer_base *base_local, *base_global; + unsigned long flags; + + /* Preset local / global events */ + tevt->local = tevt->global = KTIME_MAX; + + /* + * Pretend that there is no timer pending if the cpu is offline. + * Possible pending timers will be migrated later to an active cpu. 
+ */ + if (cpu_is_offline(cpu)) + return; + + base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu); + base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu); + + raw_spin_lock_irqsave(&base_local->lock, flags); + raw_spin_lock_nested(&base_global->lock, SINGLE_DEPTH_NESTING); + + fetch_next_timer_interrupt(base_local, base_global, basej, basem, tevt); + + raw_spin_unlock(&base_global->lock); + raw_spin_unlock_irqrestore(&base_local->lock, flags); +} + /* * Forward base clock is done only when @basej is past base->clk, otherwise * base-clk might be rewind. From patchwork Wed Mar 1 14:17:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62917 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3658537wrd; Wed, 1 Mar 2023 06:22:53 -0800 (PST) X-Google-Smtp-Source: AK7set+rThver9mwxeF3k17g/Zj4LxjEN5FpYkFhtPiajtrFHbiXRXw0TMBBqJl1nXKNsnynVAqa X-Received: by 2002:a17:906:c2c3:b0:878:42af:aa76 with SMTP id ch3-20020a170906c2c300b0087842afaa76mr7806925ejb.54.1677680573751; Wed, 01 Mar 2023 06:22:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680573; cv=none; d=google.com; s=arc-20160816; b=ciYt1hSEEy9kzA88BLrQ2SfN+UwHb5iyVFBXm7yJSDXEMtFHgr8RlEVSh/UPpn1w6a tx6Fv3dsnF27+DAZwEj4Df8J3dO/61hgCvnNhqQlzVCbyqtdwyU7TtTPmExphTS+/W2o Z6iOcXmTUokuC4PoQ69xQd7xGDssfFTOnFGMeZDrs7GoKdeWfSuf7+OnBdyl8WTDufou ZC87GhiToh6+2zcGca7FWNgYNIaghPqJbhFJZ01VNDVOhwT0nKaMKi2Ak4eaerTtdxRx 5KYQyzbmbV1KKVbxMGoUU4T82nW6B9PF0xCiYPIL0JEulFNNpgUsCG0/xEn3OTE0UWst A4Yw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=hYkRaj7fFuHhhF0p5MKMIEGo5VHWJduanTtTi0xnzx4=; b=cEs72Pgn3eFOog2T1U7HnjMuquDmjO+h8+Yj5ILlZHRao+QMOS3IRdbncsCEgG18i1 1teYSC8ehJYdwULVmgDq0PjTIdUC8B5lx/D01hJPi7Jj11b4fL5uvPv6DSaoMbdbp5Dy 4jFNVio5+mc4654bkEDl235A4hYQme9mNZyMJE/2frxI/baZGIOmr46CG9ufnR4zb1pa pPf2X8+YTsQNDjxLEcfLphoydeZsTy+Nnkt+SlJXNQcXyPs1NKmFJiyLj2Gd5Se/GzzE qbPf4E/xF08GMRRssINOufcCdPCZ+AbrbG2OT76uq9t+xILc6XuWwVWqNhK75hmULmD6 EjbQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=mVDzJUVS; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. 
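A kernel-style sketch of a possible caller of the new interface. This caller is
hypothetical and not part of the patch; in the series the basej/basem pair comes
from the last jiffies update snapshot (a helper for that appears later in the
series), which is simplified away here.

/* Hypothetical caller, for illustration only. */
#include <linux/jiffies.h>
#include <linux/ktime.h>
#include <linux/minmax.h>
#include "tick-internal.h"

static u64 next_remote_expiry(unsigned int cpu)
{
	struct timer_events tevt;
	unsigned long basej = jiffies;
	u64 basem = ktime_get();	/* simplified CLOCK_MONOTONIC snapshot */

	fetch_next_timer_interrupt_remote(basej, basem, &tevt, cpu);

	/* Earliest event of the remote CPU's local and global queue */
	return min_t(u64, tevt.local, tevt.global);
}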
From patchwork Wed Mar 1 14:17:39 2023
X-Patchwork-Id: 62917
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet, Rafael J. Wysocki, Arjan van de Ven, Paul E. McKenney, Frederic Weisbecker, Rik van Riel, Richard Cochran (linutronix GmbH), Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , "Richard Cochran (linutronix GmbH)" , Anna-Maria Behnsen Subject: [PATCH v5 13/18] timer: Restructure internal locking Date: Wed, 1 Mar 2023 15:17:39 +0100 Message-Id: <20230301141744.16063-14-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175585198659533?= X-GMAIL-MSGID: =?utf-8?q?1759175585198659533?= From: "Richard Cochran (linutronix GmbH)" Move the locking out from __run_timers() to the call sites, so the protected section can be extended at the call site. Preparatory patch for changing the NOHZ timer placement to a pull at expiry time model. No functional change. Signed-off-by: Richard Cochran (linutronix GmbH) Signed-off-by: Anna-Maria Behnsen --- kernel/time/timer.c | 31 +++++++++++++++++++++---------- 1 file changed, 21 insertions(+), 10 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 9daaef5d2f6f..be085e94afcc 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -2142,11 +2142,7 @@ static inline void __run_timers(struct timer_base *base) struct hlist_head heads[LVL_DEPTH]; int levels; - if (time_before(jiffies, base->next_expiry)) - return; - - timer_base_lock_expiry(base); - raw_spin_lock_irq(&base->lock); + lockdep_assert_held(&base->lock); while (time_after_eq(jiffies, base->clk) && time_after_eq(jiffies, base->next_expiry)) { @@ -2166,21 +2162,36 @@ static inline void __run_timers(struct timer_base *base) while (levels--) expire_timers(base, heads + levels); } +} + +static void __run_timer_base(struct timer_base *base) +{ + if (time_before(jiffies, base->next_expiry)) + return; + + timer_base_lock_expiry(base); + raw_spin_lock_irq(&base->lock); + __run_timers(base); raw_spin_unlock_irq(&base->lock); timer_base_unlock_expiry(base); } +static void run_timer_base(int index) +{ + struct timer_base *base = this_cpu_ptr(&timer_bases[index]); + + __run_timer_base(base); +} + /* * This function runs timers and the timer-tq in bottom half context. 
*/ static __latent_entropy void run_timer_softirq(struct softirq_action *h) { - struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]); - - __run_timers(base); + run_timer_base(BASE_LOCAL); if (IS_ENABLED(CONFIG_NO_HZ_COMMON)) { - __run_timers(this_cpu_ptr(&timer_bases[BASE_GLOBAL])); - __run_timers(this_cpu_ptr(&timer_bases[BASE_DEF])); + run_timer_base(BASE_GLOBAL); + run_timer_base(BASE_DEF); } } From patchwork Wed Mar 1 14:17:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62924 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3658980wrd; Wed, 1 Mar 2023 06:23:48 -0800 (PST) X-Google-Smtp-Source: AK7set/lh6ASGZLGIBFDPFnNy3/LxMD+5XA1WflYkyYBX304ZJD6SQ6lhM55lYXpT4k8DolIyh9Z X-Received: by 2002:aa7:ccd8:0:b0:4be:3918:9217 with SMTP id y24-20020aa7ccd8000000b004be39189217mr757853edt.8.1677680628737; Wed, 01 Mar 2023 06:23:48 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680628; cv=none; d=google.com; s=arc-20160816; b=karaLRkR8Dlh1tuhVDWPymaWoUqeMe2wT/ZCpyqG8kGcrXktE5WWxpYJLJLNeAX2+Y WS8/Ew90dQFjk9l4g2Gm/bI+ab2mZ0LJ0AKRv54yjch0OgCQHpzUQGwD9xUJyUZGng5a qhM6P0xHYcSvL0crM0g6dg4ewhZnsZZlUHZB1ig+mbQMX+t7SD8M8D5zblyGG3ghliHS WGRIbdT/SH/CXdi63WbRDLxmq/tp0PxEXVtHUsv/kHHAxOULOuRwTrmFGPytbdXvE/nI MYKWETvI+RYurKvhj8U9jP2t71NLwKKV4XavHvCU7W5Z6NpRWYWz983k1eeSBwV3fuaH Om7w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=u6oVUtVUJMlppty9lAAWCtxtvp5SfOd42zCaYGloWms=; b=Y15OKW5M29SbRzn6CfKY02hzdSt/4Q4uxpOQHooTbq3vFQ0Cua/BIv3lHB0B9VSa7C VTLz6AB1FWe7lXNwoBUbZyNd/CCzk+fjIjJYvNYQysN5EWkVtE8odutHKhFhCG1h+mwF V3ON7vajIjn34Zs3EDf3l0kccR9e6oUWwQzuz4xF939KRzp1VS4NXgfRuQCnijBuOVXG rokLbgEy2E1Ej4y/9skn90m3UdsAH5HSAvTOs7mRL8jnP6Mt/HaILfDAECTRcRCjikSP pTl2HIYQXYabCJeqxj1GgMj/s3vaYgHq+uCTBIFcYaxVjnFMmo6HKpK+ZNjmYNLV+8YD /iWQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=rJRFd6NO; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. 
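A standalone userspace sketch of the locking structure after this change, using a
pthread mutex and a plain flag as crude stand-ins for the raw spinlock and
lockdep_assert_held(): the inner helper only asserts that the lock is held, the
wrapper takes it, so a caller can extend the protected section.

#include <pthread.h>
#include <assert.h>
#include <stdio.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static bool lock_held;			/* stand-in for lockdep state */

static void __run_work(void)		/* caller must hold 'lock' */
{
	assert(lock_held);		/* lockdep_assert_held() stand-in */
	puts("expiring timers under the base lock");
}

static void run_work(void)
{
	pthread_mutex_lock(&lock);
	lock_held = true;
	__run_work();
	/* the caller can extend this critical section with more work */
	lock_held = false;
	pthread_mutex_unlock(&lock);
}

int main(void) { run_work(); return 0; }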
From patchwork Wed Mar 1 14:17:40 2023
X-Patchwork-Id: 62924
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Thomas Gleixner, Eric Dumazet, Rafael J. Wysocki, Arjan van de Ven, Paul E. McKenney, Frederic Weisbecker, Rik van Riel, Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v5 14/18] timer: Check if timers base is handled already Date: Wed, 1 Mar 2023 15:17:40 +0100 Message-Id: <20230301141744.16063-15-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175642592533246?= X-GMAIL-MSGID: =?utf-8?q?1759175642592533246?= Due to the conversion of the NOHZ timer placement to a pull at expiry time model, the per CPU timer bases with non pinned timers are no longer handled only by the local CPU. In case a remote CPU already expires the non pinned timers base of the local cpu, nothing more needs to be done by the local CPU. A check at the begin of the expire timers routine is required, because timer base lock is dropped before executing the timer callback function. This is a preparatory work, but has no functional impact right now. Signed-off-by: Anna-Maria Behnsen --- kernel/time/timer.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index be085e94afcc..9553da99e262 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -2144,6 +2144,9 @@ static inline void __run_timers(struct timer_base *base) lockdep_assert_held(&base->lock); + if (!!base->running_timer) + return; + while (time_after_eq(jiffies, base->clk) && time_after_eq(jiffies, base->next_expiry)) { levels = collect_expired_timers(base, heads); From patchwork Wed Mar 1 14:17:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62925 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3659261wrd; Wed, 1 Mar 2023 06:24:25 -0800 (PST) X-Google-Smtp-Source: AK7set/cuByPBnQIXq1j1R5Lpp8VmFmzzIm/kNbVdsGTc72rrQ/S/jrD1HXE8YxRYhL4+LH8/qwJ X-Received: by 2002:a62:3281:0:b0:5a8:b419:9a51 with SMTP id y123-20020a623281000000b005a8b4199a51mr5353909pfy.26.1677680664795; Wed, 01 Mar 2023 06:24:24 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680664; cv=none; d=google.com; s=arc-20160816; b=OZQM/yUu3sqKptxYGFl1B9XZway3geUUPG9RdLsyfLx0ntaLXq+E79ih0OtCSlYfbG bRroulOoQaY6uJ5QO1vSX8dj8fei7jtSqmYTkZQ8zKPEFU82paWAxVyDvM1TRTcglNAd Bw9sz+J8aSuV1szEBRU+o7FLDw71qkQDGbBF3fZkQmhioxFOTLNJalv6ou8GgED8Esoc CdsiuRYEkojMpvMiw5Hs3jUavYG7yXumzWrix7QviqXDdWluMazM646+AojnQBYyzSSr 9wVZAIs5RSTgxO1/l2Hi7N5DyoV+zFgxcjvqo5ciVCldNvQAVHys8rJbD4lDkB/o5xqk tCjg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=t+mcuo9SHWn+iEcKKGd8AZLU0jiixo7TxXcneH/7+TQ=; b=sX16c22AmVnYtOjlP487jNjF3VVSXttufulIJ+H9Tx+Igf0c9geHR25oj/YNolLCt1 7VNsgA1AEl7IT8sIuNekJ5dXxGQFd/pFff+SOe1Xwkkm+HiyLwq34GZUQZoII8Z5xwEq /oO6PmhwaMonxLNarQQ1q7vL8HL0e8LNL3iN/oyy9PvuVc4YPt6qjhuWGNv1zj1TDJW5 YX5equXc4NYVm4MuhpBTH2d4FhWmmE6NzsNqzAn0g13rPaUxDRTqUdmT5aUnYguzjiaZ 
From patchwork Wed Mar 1 14:17:41 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62925
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra , John Stultz , Thomas Gleixner , Eric Dumazet , "Rafael J . Wysocki" , Arjan van de Ven , "Paul E . McKenney" , Frederic Weisbecker , Rik van Riel , "Richard Cochran (linutronix GmbH)" , Anna-Maria Behnsen
McKenney" , Frederic Weisbecker , Rik van Riel , "Richard Cochran (linutronix GmbH)" , Anna-Maria Behnsen Subject: [PATCH v5 15/18] tick/sched: Split out jiffies update helper function Date: Wed, 1 Mar 2023 15:17:41 +0100 Message-Id: <20230301141744.16063-16-anna-maria@linutronix.de> In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de> References: <20230301141744.16063-1-anna-maria@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759175680757906301?= X-GMAIL-MSGID: =?utf-8?q?1759175680757906301?= From: "Richard Cochran (linutronix GmbH)" The logic to get the time of the last jiffies update will be needed by the timer pull model as well. Move the code into a global funtion in anticipation of the new caller. No functional change. Signed-off-by: Richard Cochran (linutronix GmbH) Signed-off-by: Anna-Maria Behnsen --- kernel/time/tick-internal.h | 1 + kernel/time/tick-sched.c | 18 +++++++++++++++--- 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h index 28471c8f8c9c..296cd06bbb24 100644 --- a/kernel/time/tick-internal.h +++ b/kernel/time/tick-internal.h @@ -158,6 +158,7 @@ static inline void tick_nohz_init(void) { } #ifdef CONFIG_NO_HZ_COMMON extern unsigned long tick_nohz_active; extern void timers_update_nohz(void); +extern u64 get_jiffies_update(unsigned long *basej); # ifdef CONFIG_SMP extern struct static_key_false timers_migration_enabled; # endif diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 7ffdc7ba19b4..1075697f2aa8 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -782,18 +782,30 @@ static inline bool local_timer_softirq_pending(void) return local_softirq_pending() & BIT(TIMER_SOFTIRQ); } -static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu) +/* + * Read jiffies and the time when jiffies were updated last + */ +u64 get_jiffies_update(unsigned long *basej) { - u64 basemono, next_tick, delta, expires; unsigned long basejiff; unsigned int seq; + u64 basemono; - /* Read jiffies and the time when jiffies were updated last */ do { seq = read_seqcount_begin(&jiffies_seq); basemono = last_jiffies_update; basejiff = jiffies; } while (read_seqcount_retry(&jiffies_seq, seq)); + *basej = basejiff; + return basemono; +} + +static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu) +{ + u64 basemono, next_tick, delta, expires; + unsigned long basejiff; + + basemono = get_jiffies_update(&basejiff); ts->last_jiffies = basejiff; ts->timer_expires_base = basemono; From patchwork Wed Mar 1 14:17:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62927 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3660628wrd; Wed, 1 Mar 2023 06:27:09 -0800 (PST) X-Google-Smtp-Source: AK7set/8YhJ02mQSARf2CZ+AbtrV6SdeCQfUPO729lOlQ5pDFZ6s1KklP22yEJMfuO4YifqSDazN X-Received: by 2002:aa7:c612:0:b0:4ad:7224:ce9d with SMTP id h18-20020aa7c612000000b004ad7224ce9dmr7032570edq.17.1677680829563; Wed, 01 Mar 
From patchwork Wed Mar 1 14:17:42 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62927
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra , John Stultz , Thomas Gleixner , Eric Dumazet , "Rafael J . Wysocki" , Arjan van de Ven , "Paul E . McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen
Subject: [PATCH v5 16/18] timer: Implement the hierarchical pull model
Date: Wed, 1 Mar 2023 15:17:42 +0100
Message-Id: <20230301141744.16063-17-anna-maria@linutronix.de>
In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de>
References: <20230301141744.16063-1-anna-maria@linutronix.de>

Placing timers at enqueue time on a target CPU based on dubious heuristics does not make any sense:

1) Most timer wheel timers are canceled or rearmed before they expire.

2) The heuristics to predict which CPU will be busy when the timer expires are wrong by definition.

So placing the timers at enqueue time wastes precious cycles.

The proper solution to this problem is to always queue the timers on the local CPU and allow the non pinned timers to be pulled onto a busy CPU at expiry time.

Therefore split the timer storage into local pinned and global timers: Local pinned timers are always expired on the CPU on which they have been queued. Global timers can be expired on any CPU.

As long as a CPU is busy it expires both local and global timers. When a CPU goes idle it arms its timer for the first expiring local timer. If the first expiring pinned (local) timer is before the first expiring movable timer, then no action is required because the CPU will wake up before the first movable timer expires. If the first expiring movable timer is before the first expiring pinned (local) timer, then this timer is queued into an idle timerqueue and eventually expired by some other active CPU.

To avoid global locking the timerqueues are implemented as a hierarchy. The lowest level of the hierarchy holds the CPUs. The CPUs are associated to groups of 8, which are separated per node. If more than one CPU group exists, then a second level in the hierarchy collects the groups. Depending on the size of the system, more than two levels are required. Each group has a "migrator" which checks the timerqueue during the tick for remotely expirable timers.

If the last CPU in a group goes idle it reports the first expiring event in the group up to the next group(s) in the hierarchy. If the last CPU in the system goes idle it arms its timer for the first system wide expiring timer to ensure that no timer event is missed.
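To make the sizing rule above concrete, here is a small standalone sketch (editorial illustration, not part of the patch; all names and the main() harness are made up) that derives the number of hierarchy levels from the "groups of up to 8 children" layout. For the 2 node / 48 CPU example used in the file comment of timer_migration.c below, it yields the expected three levels:

#include <stdio.h>

#define CHILDS_PER_GROUP	8	/* mirrors TMIGR_CHILDS_PER_GROUP */

/* Number of levels needed to collapse 'count' entities into one group */
static unsigned int levels_for(unsigned int count)
{
	unsigned int lvls = 0;

	while (count > 1) {
		count = (count + CHILDS_PER_GROUP - 1) / CHILDS_PER_GROUP;
		lvls++;
	}
	return lvls;
}

int main(void)
{
	unsigned int nodes = 2, cpus_per_node = 24;

	/* 24 CPUs -> 3 groups -> 1 group per node: 2 levels */
	unsigned int cpulvl = levels_for(cpus_per_node);
	/* 2 per-node top groups -> 1 crossnode group: 1 level */
	unsigned int nodelvl = levels_for(nodes);

	printf("hierarchy levels: %u\n", cpulvl + nodelvl);	/* prints 3 */
	return 0;
}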
Signed-off-by: Anna-Maria Behnsen --- v5: - Review remarks of Frederic - Return nextevt when CPU is marked offline in timer migration hierarchy instead of KTIME_MAX - Fix update of group events issue, after remote expiring v4: - Fold typo fix in comment into proper patch "timer: Split out "get next timer interrupt" functionality" - Update wrong comment for tmigr_state union definition - Fix fallout of kernel test robot --- include/linux/cpuhotplug.h | 1 + kernel/time/Makefile | 3 + kernel/time/timer.c | 67 +- kernel/time/timer_migration.c | 1255 +++++++++++++++++++++++++++++++++ kernel/time/timer_migration.h | 123 ++++ 5 files changed, 1441 insertions(+), 8 deletions(-) create mode 100644 kernel/time/timer_migration.c create mode 100644 kernel/time/timer_migration.h diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index c6fab004104a..6d79a5f5560b 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -244,6 +244,7 @@ enum cpuhp_state { CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE, CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE, CPUHP_AP_PERF_CSKY_ONLINE, + CPUHP_AP_TMIGR_ONLINE, CPUHP_AP_WATCHDOG_ONLINE, CPUHP_AP_WORKQUEUE_ONLINE, CPUHP_AP_RANDOM_ONLINE, diff --git a/kernel/time/Makefile b/kernel/time/Makefile index 7e875e63ff3b..4af2a264a160 100644 --- a/kernel/time/Makefile +++ b/kernel/time/Makefile @@ -17,6 +17,9 @@ endif obj-$(CONFIG_GENERIC_SCHED_CLOCK) += sched_clock.o obj-$(CONFIG_TICK_ONESHOT) += tick-oneshot.o tick-sched.o obj-$(CONFIG_LEGACY_TIMER_TICK) += tick-legacy.o +ifeq ($(CONFIG_SMP),y) + obj-$(CONFIG_NO_HZ_COMMON) += timer_migration.o +endif obj-$(CONFIG_HAVE_GENERIC_VDSO) += vsyscall.o obj-$(CONFIG_DEBUG_FS) += timekeeping_debug.o obj-$(CONFIG_TEST_UDELAY) += test_udelay.o diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 9553da99e262..01e97342ad0d 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -53,6 +53,7 @@ #include #include "tick-internal.h" +#include "timer_migration.h" #define CREATE_TRACE_POINTS #include @@ -2044,8 +2045,11 @@ static void forward_base_clk(struct timer_base *base, unsigned long nextevt, * idle. Idle handling of timer bases is allowed only to be done by CPU * itself. * - * Returns the tick aligned clock monotonic time of the next pending - * timer or KTIME_MAX if no timer is pending. + * Returns the tick aligned clock monotonic time of the next pending timer + * or KTIME_MAX if no timer is pending. If timer of global base was queued + * into timer migration hierarchy, first global timer is not taken into + * account. If it was the last CPU of timer migration hierarchy going idle, + * first global event is taken into account. */ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) { @@ -2089,6 +2093,40 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) */ is_idle = time_after(nextevt, basej + 1); + if (is_idle) { + u64 next_tmigr; + + /* + * Enqueue first global timer into timer migration + * hierarchy, afterwards tevt.global is no longer used. + */ + next_tmigr = tmigr_cpu_deactivate(tevt.global); + + /* + * If CPU is the last going idle in timer migration + * hierarchy, make sure CPU will wake up in time to handle + * remote timers. next_tmigr == KTIME_MAX if other CPUs are + * still active. 
+ */ + if (next_tmigr < tevt.local) { + u64 tmp; + + /* If we missed a tick already, force 0 delta */ + if (next_tmigr < basem) + next_tmigr = basem; + + tmp = div_u64(next_tmigr - basem, TICK_NSEC); + + nextevt = basej + (unsigned long)tmp; + tevt.local = next_tmigr; + is_idle = time_after(nextevt, basej + 1); + } + /* + * Update of nextevt is not required in an else path, as it + * is revisited in !is_idle path only. + */ + } + /* We need to mark both bases in sync */ base_local->is_idle = base_global->is_idle = is_idle; @@ -2099,19 +2137,16 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) * If the bases are not marked idle, i.e one of the events is at * max. one tick away, then the CPU can't go into a NOHZ idle * sleep. Use the earlier event of both and store it in the local - * expiry value. The next global event is irrelevant in this case - * and can be left as KTIME_MAX. CPU will wakeup on time. + * expiry value. tevt.global update is superfluous and is + * ignored. CPU will wakeup on time. */ if (!is_idle) { /* If we missed a tick already, force 0 delta */ if (time_before(nextevt, basej)) nextevt = basej; tevt.local = basem + (u64)(nextevt - basej) * TICK_NSEC; - tevt.global = KTIME_MAX; } - tevt.local = min_t(u64, tevt.local, tevt.global); - return cmp_next_hrtimer_event(basem, tevt.local); } @@ -2130,6 +2165,9 @@ void timer_clear_idle(void) */ __this_cpu_write(timer_bases[BASE_LOCAL].is_idle, false); __this_cpu_write(timer_bases[BASE_GLOBAL].is_idle, false); + + /* Activate without holding the timer_base->lock */ + tmigr_cpu_activate(); } #endif @@ -2186,6 +2224,15 @@ static void run_timer_base(int index) __run_timer_base(base); } +#ifdef CONFIG_SMP +void timer_expire_remote(unsigned int cpu) +{ + struct timer_base *base = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu); + + __run_timer_base(base); +} +#endif + /* * This function runs timers and the timer-tq in bottom half context. */ @@ -2195,6 +2242,9 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h) if (IS_ENABLED(CONFIG_NO_HZ_COMMON)) { run_timer_base(BASE_GLOBAL); run_timer_base(BASE_DEF); + + if (is_timers_nohz_active()) + tmigr_handle_remote(); } } @@ -2209,7 +2259,8 @@ static void run_local_timers(void) for (int i = 0; i < NR_BASES; i++, base++) { /* Raise the softirq only if required. */ - if (time_after_eq(jiffies, base->next_expiry)) { + if (time_after_eq(jiffies, base->next_expiry) || + (i == BASE_DEF && tmigr_requires_handle_remote())) { raise_softirq(TIMER_SOFTIRQ); return; } diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c new file mode 100644 index 000000000000..5a600de3623b --- /dev/null +++ b/kernel/time/timer_migration.c @@ -0,0 +1,1255 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Infrastructure for migrateable timers + * + * Copyright(C) 2022 linutronix GmbH + */ +#include +#include +#include +#include +#include + +#include "timer_migration.h" +#include "tick-internal.h" + +/* + * The timer migration mechanism is built on a hierarchy of groups. The + * lowest level group contains CPUs, the next level groups of CPU groups + * and so forth. The CPU groups are kept per node so for the normal case + * lock contention won't happen accross nodes. Depending on the number of + * CPUs per node even the next level might be kept as groups of CPU groups + * per node and only the levels above cross the node topology. + * + * Example topology for a two node system with 24 CPUs each. 
+ * + * LVL 2 [GRP2:0] + * GRP1:0 = GRP1:M + * + * LVL 1 [GRP1:0] [GRP1:1] + * GRP0:0 - GRP0:2 GRP0:3 - GRP0:5 + * + * LVL 0 [GRP0:0] [GRP0:1] [GRP0:2] [GRP0:3] [GRP0:4] [GRP0:5] + * CPUS 0-7 8-15 16-23 24-31 32-39 40-47 + * + * The groups hold a timer queue of events sorted by expiry time. These + * queues are updated when CPUs go in idle. When they come out of idle + * ignore bit of events is set. + * + * Each group has a designated migrator CPU/group as long as a CPU/group is + * active in the group. This designated role is necessary to avoid that all + * active CPUs in a group try to migrate expired timers from other cpus, + * which would result in massive lock bouncing. + * + * When a CPU is awake, it checks in it's own timer tick the group + * hierarchy up to the point where it is assigned the migrator role or if + * no CPU is active, it also checks the groups where no migrator is set + * (TMIGR_NONE). + * + * If it finds expired timers in one of the group queues it pulls them over + * from the idle CPU and runs the timer function. After that it updates the + * group and the parent groups if required. + * + * CPUs which go idle arm their CPU local timer hardware for the next local + * (pinned) timer event. If the next migrateable timer expires after the + * next local timer or the CPU has no migrateable timer pending then the + * CPU does not queue an event in the LVL0 group. If the next migrateable + * timer expires before the next local timer then the CPU queues that timer + * in the LVL0 group. In both cases the CPU marks itself idle in the LVL0 + * group. + * + * If the CPU is the migrator of the group then it delegates that role to + * the next active CPU in the group or sets migrator to TMIGR_NONE when + * there is no active CPU in the group. This delegation needs to be + * propagated up the hierarchy so hand over from other leaves can happen at + * all hierarchy levels w/o doing a search. + * + * When the last CPU in the system goes idle, then it drops all migrator + * duties up to the top level of the hierarchy (LVL2 in the example). It + * then has to make sure, that it arms it's own local hardware timer for + * the earliest event in the system. + * + * Lifetime rules: + * + * The groups are built up at init time or when CPUs come online. They are + * not destroyed when a group becomes empty due to offlining. The group + * just won't participate in the hierachary management anymore. Destroying + * groups would result in interesting race conditions which would just make + * the whole mechanism slow and complex. + * + * Locking rules: + * + * For setting up new groups and handling events it's required to lock both + * child and parent group. The lock odering is always bottom up. This also + * includes the per CPU locks in struct tmigr_cpu. For updating migrator + * and active CPU/group information atomic_cmpxchg() is used instead and + * only per CPU tmigr_cpu->lock is held. + * + * During setup of groups tmigr_level_list is required. It is protected by + * tmigr_mutex. 
+ */ + +#ifdef DEBUG +# define DBG_BUG_ON(x) BUG_ON(x) +#else +# define DBG_BUG_ON(x) +#endif + +static DEFINE_MUTEX(tmigr_mutex); +static struct list_head *tmigr_level_list __read_mostly; + +static unsigned int tmigr_cores_per_group __read_mostly; +static unsigned int tmigr_hierarchy_levels __read_mostly; +static unsigned int tmigr_crossnode_level __read_mostly; + +static DEFINE_PER_CPU(struct tmigr_cpu, tmigr_cpu); + +#define TMIGR_NONE 0xFF +#define BIT_CNT 8 + +static DEFINE_STATIC_KEY_FALSE(tmigr_enabled); + +static inline bool is_tmigr_enabled(void) +{ + return static_branch_unlikely(&tmigr_enabled); +} + +/* + * Returns true, when @childmask corresponds to group migrator or when group + * is not active - so no migrator is set. + */ +static bool tmigr_check_migrator(struct tmigr_group *group, u32 childmask) +{ + union tmigr_state s; + + s.state = atomic_read(group->migr_state); + + if ((s.migrator != (u8)childmask) && (s.migrator != TMIGR_NONE)) + return false; + + return true; +} + +typedef bool (*up_f)(struct tmigr_group *, struct tmigr_group *, void *); + +static void __walk_groups(up_f up, void *data, + struct tmigr_cpu *tmc) +{ + struct tmigr_group *child = NULL, *group = tmc->tmgroup; + + do { + DBG_BUG_ON(group->level >= tmigr_hierarchy_levels); + + if (up(group, child, data)) + break; + + child = group; + group = group->parent; + } while (group); +} + +static void walk_groups(up_f up, void *data, struct tmigr_cpu *tmc) +{ + lockdep_assert_held(&tmc->lock); + + __walk_groups(up, data, tmc); +} + +/** + * struct tmigr_walk - data required for walking the hierarchy + * @evt: Pointer to tmigr_event which needs to be queued (of idle + * child group) + * @childmask: childmask of child group + * @nextexp: Next CPU event expiry information which is handed + * into tmigr code by timer code + * (get_next_timer_interrupt()); it is furthermore + * used for first event which is queued, if timer + * migration hierarchy is completely idle + * @childstate: tmigr_group->migr_state of child - will be only reread + * when cmpxchg in group fails (is required for deactive + * path and new timer path) + * @groupstate: tmigr_group->migr_state of group - will be only reread + * when cmpxchg in group fails (is required for active, + * deactive and new timer path) + * @remote: Is set, when new timer path is executed in + * tmigr_handle_remote_cpu() + */ +struct tmigr_walk { + struct tmigr_event *evt; + u32 childmask; + u64 nextexp; + union tmigr_state childstate; + union tmigr_state groupstate; + bool remote; +}; + +/** + * struct tmigr_remote_data - data required for (check) remote expiry + * hierarchy walk + * @basej: timer base in jiffies + * @now: timer base monotonic + * @childmask: childmask of child group + * @check: is set to 1 if there is the need to handle remote timers; + * required in tmigr_check_handle_remote() only + * @wakeup: returns expiry of first timer in idle timer migration hierarchy + * to make sure timer is handled in time; it is stored in per CPU + * tmigr_cpu struct of CPU which expires remote timers + */ +struct tmigr_remote_data { + unsigned long basej; + u64 now; + u32 childmask; + int check; + u64 wakeup; +}; + +/* + * Returns next event of timerqueue @group->events + * + * Removes timers with ignore bits and update next_expiry and event cpu + * value in group. Expiry value of group event is updated in + * tmigr_update_events() only. 
+ */ +static struct tmigr_event *tmigr_next_groupevt(struct tmigr_group *group) +{ + struct timerqueue_node *node = NULL; + struct tmigr_event *evt = NULL; + + lockdep_assert_held(&group->lock); + + group->next_expiry = KTIME_MAX; + + while ((node = timerqueue_getnext(&group->events))) { + evt = container_of(node, struct tmigr_event, nextevt); + + if (!evt->ignore) { + group->next_expiry = evt->nextevt.expires; + return evt; + } + + /* + * Remove next timers with ignore bits, because group lock + * is held anyway + */ + if (!timerqueue_del(&group->events, node)) + break; + } + + return NULL; +} + +/* + * Return next event which is already expired of group timerqueue + * + * Event is also removed from queue. + */ +static struct tmigr_event *tmigr_next_expired_groupevt(struct tmigr_group *group, + u64 now) +{ + struct tmigr_event *evt = tmigr_next_groupevt(group); + + if (!evt || now < evt->nextevt.expires) + return NULL; + + /* + * Event is already expired. Remove it. If it's not the last event, + * then update all group event related information. + */ + if (timerqueue_del(&group->events, &evt->nextevt)) + tmigr_next_groupevt(group); + else + group->next_expiry = KTIME_MAX; + + return evt; +} + +static u64 tmigr_next_groupevt_expires(struct tmigr_group *group) +{ + struct tmigr_event *evt; + + evt = tmigr_next_groupevt(group); + + if (!evt) + return KTIME_MAX; + else + return evt->nextevt.expires; +} + +static bool tmigr_active_up(struct tmigr_group *group, + struct tmigr_group *child, + void *ptr) +{ + union tmigr_state curstate, newstate; + struct tmigr_walk *data = ptr; + bool walk_done; + u32 childmask; + + childmask = data->childmask; + newstate = curstate = data->groupstate; + +retry: + walk_done = true; + + if (newstate.migrator == TMIGR_NONE) { + newstate.migrator = (u8)childmask; + + /* Changes need to be propagated */ + walk_done = false; + } + + newstate.active |= (u8)childmask; + + newstate.seq++; + + if (atomic_cmpxchg(group->migr_state, curstate.state, newstate.state) != curstate.state) { + newstate.state = curstate.state = atomic_read(group->migr_state); + goto retry; + } + + if (group->parent && (walk_done == false)) { + data->groupstate.state = atomic_read(group->parent->migr_state); + data->childmask = group->childmask; + } + + /* + * Group is active, event will be ignored - bit is updated without + * holding the lock. In case bit is set while another CPU already + * handles remote events, nothing happens, because it is clear that + * CPU became active just in this moment, or in worst case event is + * handled remote. Nothing to worry about. + */ + group->groupevt.ignore = 1; + + return walk_done; +} + +static void __tmigr_cpu_activate(struct tmigr_cpu *tmc) +{ + struct tmigr_walk data; + data.childmask = tmc->childmask; + data.groupstate.state = atomic_read(tmc->tmgroup->migr_state); + + tmc->cpuevt.ignore = 1; + + walk_groups(&tmigr_active_up, &data, tmc); +} + +void tmigr_cpu_activate(void) +{ + struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu); + + if (!is_tmigr_enabled() || !tmc->tmgroup || !tmc->online || !tmc->idle) + return; + + raw_spin_lock(&tmc->lock); + tmc->idle = 0; + tmc->wakeup = KTIME_MAX; + __tmigr_cpu_activate(tmc); + raw_spin_unlock(&tmc->lock); +} + +/* + * Returns true, if there is nothing to be propagated to the next level + * + * @data->nextexp is reset to KTIME_MAX; it is reused for first global + * event which needs to be handled by migrator (in toplevel group) + * + * This is the only place where group event expiry value is set. 
+ */ +static bool tmigr_update_events(struct tmigr_group *group, + struct tmigr_group *child, + struct tmigr_walk *data) +{ + struct tmigr_event *evt, *first_childevt; + bool walk_done, remote = data->remote; + u64 nextexp; + + if (child) { + if (data->childstate.active) + return true; + + raw_spin_lock(&child->lock); + raw_spin_lock_nested(&group->lock, SINGLE_DEPTH_NESTING); + + first_childevt = tmigr_next_groupevt(child); + nextexp = child->next_expiry; + evt = &child->groupevt; + } else { + nextexp = data->nextexp; + + /* + * Set @data->nextexp to KTIME_MAX; it is reused for first + * global event which needs to be handled by migrator (in + * toplevel group) + */ + data->nextexp = KTIME_MAX; + + first_childevt = evt = data->evt; + if (evt->ignore) + return true; + + raw_spin_lock(&group->lock); + } + + if (nextexp == KTIME_MAX) + evt->ignore = 1; + + /* + * When next event could be ignored (nextexp is KTIME MAX) and + * there was no remote timer handling before or the group is + * already active active, there is no need to walk the hierarchy + * even if there is a parent group. + * + * The other way round: even if the event could be ignored, but if + * a remote timer handling was executed before and the group is not + * active, walking the hierarchy is required to not miss a enqueued + * timer in the non active group. The enqueued timer needs to be + * propagated to a higher level to ensure it is handeld. + */ + if (nextexp == KTIME_MAX && (!remote || data->groupstate.active)) { + walk_done = true; + goto unlock; + } + + walk_done = !group->parent; + + /* + * Update of event cpu and ignore bit is required only when @child + * is set (child is equal or higher than lvl0), but it doesn't + * matter if it is written once more to per cpu event; make the + * update unconditional. + */ + evt->cpu = first_childevt->cpu; + evt->ignore = 0; + + /* + * If child event is already queued in group, remove it from queue + * when expiry time changed only + */ + if (timerqueue_node_queued(&evt->nextevt)) { + if (evt->nextevt.expires == nextexp) + goto check_toplvl; + else if (!timerqueue_del(&group->events, &evt->nextevt)) + group->next_expiry = KTIME_MAX; + } + + evt->nextevt.expires = nextexp; + + if (timerqueue_add(&group->events, &evt->nextevt)) + group->next_expiry = nextexp; + +check_toplvl: + if (walk_done && (data->groupstate.migrator == TMIGR_NONE)) { + /* + * Toplevel group is idle and it has to be ensured global + * timers are handled in time. (This could be optimized by + * keeping track of the last global scheduled event and + * only arming it on CPU if the new event is earlier. Not + * sure if its worth the complexity.) + */ + data->nextexp = tmigr_next_groupevt_expires(group); + } + +unlock: + raw_spin_unlock(&group->lock); + + if (child) + raw_spin_unlock(&child->lock); + + return walk_done; +} + +static bool tmigr_new_timer_up(struct tmigr_group *group, + struct tmigr_group *child, + void *ptr) +{ + struct tmigr_walk *data = ptr; + bool walk_done; + + walk_done = tmigr_update_events(group, child, data); + + if (!walk_done) { + /* Update state information for next iteration */ + data->childstate.state = atomic_read(group->migr_state); + if (group->parent) + data->groupstate.state = atomic_read(group->parent->migr_state); + } + + return walk_done; +} + +/* + * Returns expiry of next timer that needs to be handled. KTIME_MAX is + * returned, when an active CPU will handle all timer migration hierarchy + * timers. 
+ */ +static u64 tmigr_new_timer(struct tmigr_cpu *tmc, u64 nextexp) +{ + struct tmigr_walk data = { .evt = &tmc->cpuevt, + .nextexp = nextexp }; + + lockdep_assert_held(&tmc->lock); + + if (tmc->remote) + return KTIME_MAX; + + tmc->cpuevt.ignore = 0; + + data.groupstate.state = atomic_read(tmc->tmgroup->migr_state); + data.remote = false; + + walk_groups(&tmigr_new_timer_up, &data, tmc); + + /* If there is a new first global event, make sure it is handled */ + return data.nextexp; +} + +static bool tmigr_inactive_up(struct tmigr_group *group, + struct tmigr_group *child, + void *ptr) +{ + union tmigr_state curstate, newstate; + struct tmigr_walk *data = ptr; + bool walk_done; + u32 childmask; + + childmask = data->childmask; + newstate = curstate = data->groupstate; + +retry: + walk_done = true; + + /* Reset active bit when child is no longer active */ + if (!data->childstate.active) + newstate.active &= ~(u8)childmask; + + if (newstate.migrator == (u8)childmask) { + /* + * Find a new migrator for the group, because child group + * is idle! + */ + if (!data->childstate.active) { + unsigned long new_migr_bit, active = newstate.active; + + new_migr_bit = find_first_bit(&active, BIT_CNT); + + /* Changes need to be propagated */ + walk_done = false; + + if (new_migr_bit != BIT_CNT) + newstate.migrator = BIT(new_migr_bit); + else + newstate.migrator = TMIGR_NONE; + } + } + + newstate.seq++; + + DBG_BUG_ON((newstate.migrator != TMIGR_NONE) && !(newstate.active)); + + if (atomic_cmpxchg(group->migr_state, curstate.state, newstate.state) != curstate.state) { + /* + * Something changed in child/parent group in the meantime, + * reread the state of child and parent; Update of + * data->childstate is required for event handling; + */ + if (child) + data->childstate.state = atomic_read(child->migr_state); + newstate.state = curstate.state = atomic_read(group->migr_state); + + goto retry; + } + + data->groupstate = newstate; + data->remote = false; + + /* Event Handling */ + tmigr_update_events(group, child, data); + + if (group->parent && (walk_done == false)) { + data->childmask = group->childmask; + data->childstate = newstate; + data->groupstate.state = atomic_read(group->parent->migr_state); + } + + /* + * data->nextexp was set by tmigr_update_events() and contains the + * expiry of first global event which needs to be handled + */ + if (data->nextexp != KTIME_MAX) { + DBG_BUG_ON(group->parent); + /* + * Toplevel path - If this cpu is about going offline wake + * up some random other cpu so it will take over the + * migrator duty and program its timer properly. Ideally + * wake the cpu with the closest expiry time, but that's + * overkill to figure out. + */ + if (!(this_cpu_ptr(&tmigr_cpu)->online)) { + unsigned int cpu = smp_processor_id(); + + cpu = cpumask_any_but(cpu_online_mask, cpu); + smp_send_reschedule(cpu); + } + } + + return walk_done; +} + +static u64 __tmigr_cpu_deactivate(struct tmigr_cpu *tmc, u64 nextexp) +{ + struct tmigr_walk data = { .childmask = tmc->childmask, + .evt = &tmc->cpuevt, + .nextexp = nextexp, + .childstate.state = 0 }; + + data.groupstate.state = atomic_read(tmc->tmgroup->migr_state); + + /* + * If nextexp is KTIME_MAX, CPU event will be ignored because, + * local timer expires before global timer, no global timer is set + * or CPU goes offline. 
+ */ + if (nextexp != KTIME_MAX) + tmc->cpuevt.ignore = 0; + + walk_groups(&tmigr_inactive_up, &data, tmc); + return data.nextexp; +} + +/** + * tmigr_cpu_deactivate - Put current CPU into inactive state + * @nextexp: The next timer event expiry set in the current CPU + * + * Must be called with interrupts disabled. + * + * Return: next event of the current CPU or next event from the hierarchy + * if this CPU is the top level migrator or hierarchy is completely idle. + */ +u64 tmigr_cpu_deactivate(u64 nextexp) +{ + struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu); + u64 ret; + + if (!is_tmigr_enabled() || !tmc->tmgroup || !tmc->online) + return nextexp; + + raw_spin_lock(&tmc->lock); + + /* + * CPU is already deactivated in timer migration + * hierarchy. tick_nohz_get_sleep_length() calls + * tick_nohz_next_event() and thereby timer idle path is + * executed once more. tmc->wakeup holds the first timer, when + * timer migration hierarchy is completely idle and remote + * expiry was done. If there is no new next expiry value + * handed in which should be inserted into the timer migration + * hierarchy, wakeup value is returned. + */ + if (tmc->idle) { + ret = tmc->wakeup; + + tmc->wakeup = KTIME_MAX; + + if (nextexp != KTIME_MAX) { + if (nextexp != tmc->cpuevt.nextevt.expires || + tmc->cpuevt.ignore) + ret = tmigr_new_timer(tmc, nextexp); + } + + goto unlock; + } + + /* + * When tmigr_remote is active, set cpu inactive path and queuing of + * nextexp is done by handle remote path. + */ + ret = __tmigr_cpu_deactivate(tmc, nextexp); + + tmc->idle = 1; + +unlock: + raw_spin_unlock(&tmc->lock); + return ret; +} + +static u64 tmigr_handle_remote_cpu(unsigned int cpu, u64 now, + unsigned long jif) +{ + struct timer_events tevt; + struct tmigr_walk data; + struct tmigr_cpu *tmc; + u64 next = KTIME_MAX; + unsigned long flags; + + tmc = per_cpu_ptr(&tmigr_cpu, cpu); + + raw_spin_lock_irqsave(&tmc->lock, flags); + /* + * Remote CPU is offline or no longer idle or other cpu handles cpu + * timers already or next event was already expired - return! + */ + if (!tmc->online || tmc->remote || tmc->cpuevt.ignore || + now < tmc->cpuevt.nextevt.expires) { + raw_spin_unlock_irqrestore(&tmc->lock, flags); + return next; + } + + tmc->remote = 1; + + /* Drop the lock to allow the remote CPU to exit idle */ + raw_spin_unlock_irqrestore(&tmc->lock, flags); + + if (cpu != smp_processor_id()) + timer_expire_remote(cpu); + + /* next event of cpu */ + fetch_next_timer_interrupt_remote(jif, now, &tevt, cpu); + + raw_spin_lock_irqsave(&tmc->lock, flags); + /* + * Nothing more to do when CPU came out of idle in the meantime - needs + * to be checked when holding the base lock to prevent race. + */ + if (!tmc->idle) + goto unlock; + + data.evt = &tmc->cpuevt; + data.nextexp = tevt.global; + data.groupstate.state = atomic_read(tmc->tmgroup->migr_state); + data.remote = true; + tmc->cpuevt.ignore = 0; + + walk_groups(&tmigr_new_timer_up, &data, tmc); + + next = data.nextexp; + +unlock: + tmc->remote = 0; + raw_spin_unlock_irqrestore(&tmc->lock, flags); + + return next; +} + +static bool tmigr_handle_remote_up(struct tmigr_group *group, + struct tmigr_group *child, + void *ptr) +{ + struct tmigr_remote_data *data = ptr; + u64 now, next = KTIME_MAX; + unsigned long flags, jif; + struct tmigr_event *evt; + u32 childmask; + + jif = data->basej; + now = data->now; + + childmask = data->childmask; + +again: + /* + * Handle the group only if @childmask is the migrator or if the + * group has no migrator. 
Otherwise the group is active and is + * handled by its own migrator. + */ + if (!tmigr_check_migrator(group, childmask)) + return true; + + raw_spin_lock_irqsave(&group->lock, flags); + + evt = tmigr_next_expired_groupevt(group, now); + + if (evt) { + unsigned int remote_cpu = evt->cpu; + + raw_spin_unlock_irqrestore(&group->lock, flags); + + next = tmigr_handle_remote_cpu(remote_cpu, now, jif); + + /* check if there is another event, that needs to be handled */ + goto again; + } else { + raw_spin_unlock_irqrestore(&group->lock, flags); + } + + /* Update of childmask for next level */ + data->childmask = group->childmask; + data->wakeup = next; + + return false; +} + +/** + * tmigr_handle_remote - Handle migratable timers on remote idle CPUs + * + * Called from the timer soft interrupt with interrupts enabled. + */ +void tmigr_handle_remote(void) +{ + struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu); + struct tmigr_remote_data data; + unsigned long flags; + + if (!is_tmigr_enabled() || !tmc->tmgroup || !tmc->online) + return; + + /* + * NOTE: This is a doubled check because migrator test will be done + * in tmigr_handle_remote_up() anyway. Keep this check to fasten + * the return when nothing has to be done. + */ + if (!tmigr_check_migrator(tmc->tmgroup, tmc->childmask)) + return; + + data.now = get_jiffies_update(&data.basej); + data.childmask = tmc->childmask; + data.wakeup = KTIME_MAX; + + __walk_groups(&tmigr_handle_remote_up, &data, tmc); + + raw_spin_lock_irqsave(&tmc->lock, flags); + if (tmc->idle) + tmc->wakeup = data.wakeup; + + raw_spin_unlock_irqrestore(&tmc->lock, flags); + + return; +} + +static bool tmigr_requires_handle_remote_up(struct tmigr_group *group, + struct tmigr_group *child, + void *ptr) +{ + struct tmigr_remote_data *data = ptr; + u32 childmask; + + childmask = data->childmask; + + /* + * Handle the group only if child is the migrator or if the group + * has no migrator. Otherwise the group is active and is handled by + * its own migrator. + */ + if (!tmigr_check_migrator(group, childmask)) + return true; + + /* + * Racy lockless check for next_expiry + */ + if (data->now >= group->next_expiry) { + data->check = 1; + return true; + } + + /* Update of childmask for next level */ + data->childmask = group->childmask; + return false; +} + +int tmigr_requires_handle_remote(void) +{ + struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu); + struct tmigr_remote_data data; + + if (!is_tmigr_enabled() || !tmc->tmgroup || !tmc->online) + return 0; + + if (!tmigr_check_migrator(tmc->tmgroup, tmc->childmask)) + return 0; + + data.now = get_jiffies_update(&data.basej); + data.childmask = tmc->childmask; + + __walk_groups(&tmigr_requires_handle_remote_up, &data, tmc); + + return data.check; +} + +static void tmigr_init_group(struct tmigr_group *group, unsigned int lvl, + unsigned int node, atomic_t *migr_state) +{ + union tmigr_state s; + + raw_spin_lock_init(&group->lock); + + group->level = lvl; + group->numa_node = lvl < tmigr_crossnode_level ? 
node : NUMA_NO_NODE; + + group->num_childs = 0; + + /* + * num_cores is required for level=0 groups only during setup and + * when siblings exists but it doesn't matter if this value is set + * in other groups as well + */ + group->num_cores = 1; + + s.migrator = TMIGR_NONE; + s.active = 0; + s.seq = 0; + atomic_set(migr_state, s.state); + + group->migr_state = migr_state; + + timerqueue_init_head(&group->events); + timerqueue_init(&group->groupevt.nextevt); + group->groupevt.nextevt.expires = KTIME_MAX; + group->next_expiry = KTIME_MAX; + group->groupevt.ignore = 1; +} + +static bool sibling_in_group(int newcpu, struct tmigr_group *group) +{ + int i, cpu; + + /* Find a sibling of newcpu in group members */ + for (i = 0; i < group->num_childs; i++) { + cpu = group->cpus[i]; + + if (cpumask_test_cpu(newcpu, topology_sibling_cpumask(cpu))) + return true; + } + return false; +} + +static struct tmigr_group *tmigr_get_group(unsigned int cpu, unsigned int node, + unsigned int lvl) +{ + struct tmigr_group *tmp, *group = NULL; + bool first_loop = true; + atomic_t *migr_state; + +reloop: + /* Try to attach to an exisiting group first */ + list_for_each_entry(tmp, &tmigr_level_list[lvl], list) { + /* + * If @lvl is below the cross numa node level, check whether + * this group belongs to the same numa node. + */ + if (lvl < tmigr_crossnode_level && tmp->numa_node != node) + continue; + + /* Capacity left? */ + if (tmp->num_childs >= TMIGR_CHILDS_PER_GROUP) + continue; + + /* + * If this is the lowest level of the hierarchy, make sure + * that thread siblings share a group. It is only executed + * when siblings exist. ALL groups of lowest level needs to + * be checked for thread sibling, before thread cpu is + * added to a random group with capacity. When all groups + * are checked and no thread sibling was found, reloop of + * level zero groups is required to get a group with + * capacity. + */ + if (!lvl && (tmigr_cores_per_group != TMIGR_CHILDS_PER_GROUP)) { + if (first_loop == true && !sibling_in_group(cpu, tmp)) { + continue; + } else if (first_loop == false) { + if (tmp->num_cores >= tmigr_cores_per_group) + continue; + else + tmp->num_cores++; + } + } + + group = tmp; + break; + } + + if (group) { + return group; + } else if (first_loop == true) { + first_loop = false; + goto reloop; + } + + /* Allocate and set up a new group with corresponding migr_state */ + group = kzalloc_node(sizeof(*group), GFP_KERNEL, node); + if (!group) + return ERR_PTR(-ENOMEM); + + migr_state = kzalloc_node(sizeof(atomic_t), GFP_KERNEL, node); + if (!migr_state) { + kfree(group); + return ERR_PTR(-ENOMEM); + } + + tmigr_init_group(group, lvl, node, migr_state); + /* Setup successful. Add it to the hierarchy */ + list_add(&group->list, &tmigr_level_list[lvl]); + return group; +} + +static void tmigr_connect_child_parent(struct tmigr_group *child, + struct tmigr_group *parent) +{ + union tmigr_state childstate; + unsigned long flags; + + raw_spin_lock_irqsave(&child->lock, flags); + raw_spin_lock_nested(&parent->lock, SINGLE_DEPTH_NESTING); + + child->parent = parent; + child->childmask = BIT(parent->num_childs++); + + raw_spin_unlock(&parent->lock); + raw_spin_unlock_irqrestore(&child->lock, flags); + + /* + * To prevent inconsistent states, active childs needs to be active + * in new parent as well. Inactive childs are already marked + * inactive in parent group. 
+ */ + childstate.state = atomic_read(child->migr_state); + if (childstate.migrator != TMIGR_NONE) { + struct tmigr_walk data; + + data.childmask = child->childmask; + data.groupstate.state = atomic_read(parent->migr_state); + + /* + * There is only one new level per time. When connecting + * child and parent and set child active when parent is + * inactive, parent needs to be the upperst + * level. Otherwise there went something wrong! + */ + WARN_ON(!tmigr_active_up(parent, child, &data) && parent->parent); + } +} + +static int tmigr_setup_groups(unsigned int cpu, unsigned int node) +{ + struct tmigr_group *group, *child, **stack; + int top = 0, err = 0, i = 0; + struct list_head *lvllist; + size_t sz; + + sz = sizeof(struct tmigr_group *) * tmigr_hierarchy_levels; + stack = kzalloc(sz, GFP_KERNEL); + if (!stack) + return -ENOMEM; + + do { + group = tmigr_get_group(cpu, node, i); + if (IS_ERR(group)) { + err = IS_ERR(group); + break; + } + + top = i; + stack[i++] = group; + + /* + * When booting only less CPUs of a system than CPUs are + * available, not all calculated hierarchy levels are required. + * + * The loop is aborted as soon as the highest level, which might + * be different from tmigr_hierarchy_levels, contains only a + * single group. + */ + if (group->parent || i == tmigr_hierarchy_levels || + (list_empty(&tmigr_level_list[i]) && + list_is_singular(&tmigr_level_list[i - 1]))) + break; + + } while (i < tmigr_hierarchy_levels); + + do { + group = stack[--i]; + + if (err < 0) { + list_del(&group->list); + kfree(group); + continue; + } + + DBG_BUG_ON(i != group->level); + + /* + * Update tmc -> group / child -> group connection + */ + if (i == 0) { + struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu); + unsigned long flags; + + raw_spin_lock_irqsave(&group->lock, flags); + + tmc->tmgroup = group; + tmc->childmask = BIT(group->num_childs); + + group->cpus[group->num_childs++] = cpu; + + raw_spin_unlock_irqrestore(&group->lock, flags); + + /* There are no childs that needs to be connected */ + continue; + } else { + child = stack[i - 1]; + tmigr_connect_child_parent(child, group); + } + + /* check if upperst level was newly created */ + if (top != i) + continue; + + DBG_BUG_ON(top == 0); + + lvllist = &tmigr_level_list[top]; + if (group->num_childs == 1 && list_is_singular(lvllist)) { + lvllist = &tmigr_level_list[top - 1]; + list_for_each_entry(child, lvllist, list) { + if (child->parent) + continue; + + tmigr_connect_child_parent(child, group); + } + } + } while (i > 0); + + kfree(stack); + + return err; +} + +static int tmigr_add_cpu(unsigned int cpu) +{ + unsigned int node = cpu_to_node(cpu); + int ret; + mutex_lock(&tmigr_mutex); + ret = tmigr_setup_groups(cpu, node); + mutex_unlock(&tmigr_mutex); + + return ret; +} + +static int tmigr_cpu_online(unsigned int cpu) +{ + struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu); + unsigned long flags; + unsigned int ret; + + /* First online attempt? 
Initialize CPU data */ + if (!tmc->tmgroup) { + raw_spin_lock_init(&tmc->lock); + + ret = tmigr_add_cpu(cpu); + if (ret < 0) + return ret; + + if (tmc->childmask == 0) + return -EINVAL; + + timerqueue_init(&tmc->cpuevt.nextevt); + tmc->cpuevt.nextevt.expires = KTIME_MAX; + tmc->cpuevt.ignore = 1; + tmc->cpuevt.cpu = cpu; + + tmc->remote = 0; + tmc->idle = 0; + tmc->wakeup = KTIME_MAX; + } + raw_spin_lock_irqsave(&tmc->lock, flags); + __tmigr_cpu_activate(tmc); + tmc->online = 1; + raw_spin_unlock_irqrestore(&tmc->lock, flags); + return 0; +} + +static int tmigr_cpu_offline(unsigned int cpu) +{ + struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu); + + raw_spin_lock_irq(&tmc->lock); + tmc->online = 0; + __tmigr_cpu_deactivate(tmc, KTIME_MAX); + raw_spin_unlock_irq(&tmc->lock); + + return 0; +} + +static int __init tmigr_init(void) +{ + unsigned int cpulvl, nodelvl, cpus_per_node, i, ns; + unsigned int nnodes = num_possible_nodes(); + unsigned int ncpus = num_possible_cpus(); + int ret = -ENOMEM; + size_t sz; + + /* Nothing to do if running on UP */ + if (ncpus == 1) + return 0; + + /* + * Unfortunately there is no reliable way to determine the number of SMT + * siblings in a generic way. tmigr_init() is called after SMP bringup, + * so for the normal boot case it can be assumed that all siblings have + * been brought up and the number of siblings of the current cpu can be + * used. If someone booted with 'maxcpus=N/2' on the kernel command line + * and (at least x86) bring up the siblings later then the siblings will + * end up in different groups. Bad luck. + */ + ns = cpumask_weight(topology_sibling_cpumask(raw_smp_processor_id())); + tmigr_cores_per_group = TMIGR_CHILDS_PER_GROUP; + if (ns >= 2 && ns < TMIGR_CHILDS_PER_GROUP) + tmigr_cores_per_group /= ns; + + /* + * Calculate the required hierarchy levels. Unfortunately there is no + * reliable information available, unless all possible CPUs have been + * brought up and all numa nodes are populated. + * + * Estimate the number of levels with the number of possible nodes and + * the number of possible cpus. Assume CPUs are spread evenly accross + * nodes. We cannot rely on cpumask_of_node() because there only already + * online CPUs are considered. + */ + cpus_per_node = DIV_ROUND_UP(ncpus, nnodes); + + /* Calc the hierarchy levels required to hold the CPUs of a node */ + cpulvl = DIV_ROUND_UP(order_base_2(cpus_per_node), + ilog2(TMIGR_CHILDS_PER_GROUP)); + + /* Calculate the extra levels to connect all nodes */ + nodelvl = DIV_ROUND_UP(order_base_2(nnodes), + ilog2(TMIGR_CHILDS_PER_GROUP)); + + tmigr_hierarchy_levels = cpulvl + nodelvl; + + /* + * If a numa node spawns more than one CPU level group then the next + * level(s) of the hierarchy contains groups which handle all CPU groups + * of the same numa node. The level above goes accross numa nodes. Store + * this information for the setup code to decide when node matching is + * not longer required. 
+ */ + tmigr_crossnode_level = cpulvl; + + sz = sizeof(struct list_head) * tmigr_hierarchy_levels; + tmigr_level_list = kzalloc(sz, GFP_KERNEL); + if (!tmigr_level_list) + goto err; + + for (i = 0; i < tmigr_hierarchy_levels; i++) + INIT_LIST_HEAD(&tmigr_level_list[i]); + + pr_info("Timer migration: %d hierarchy levels; %d childs per group;" + " %d cores_per_group; %d crossnode level\n", + tmigr_hierarchy_levels, TMIGR_CHILDS_PER_GROUP, + tmigr_cores_per_group, tmigr_crossnode_level); + + ret = cpuhp_setup_state(CPUHP_AP_TMIGR_ONLINE, "tmigr:online", + tmigr_cpu_online, tmigr_cpu_offline); + if (ret) + goto err; + + static_branch_enable(&tmigr_enabled); + + return 0; + +err: + pr_err("Timer migration setup failed\n"); + return ret; +} +late_initcall(tmigr_init); diff --git a/kernel/time/timer_migration.h b/kernel/time/timer_migration.h new file mode 100644 index 000000000000..ceb336e705df --- /dev/null +++ b/kernel/time/timer_migration.h @@ -0,0 +1,123 @@ +#ifndef _KERNEL_TIME_MIGRATION_H +#define _KERNEL_TIME_MIGRATION_H + +/* Per group capacity. Must be a power of 2! */ +#define TMIGR_CHILDS_PER_GROUP 8 + +/** + * struct tmigr_event - a timer event associated to a CPU + * @nextevt: The node to enqueue an event in the parent group queue + * @cpu: The CPU to which this event belongs + * @ignore: Hint whether the event could be ignored; it is set when + * CPU or group is active; + */ +struct tmigr_event { + struct timerqueue_node nextevt; + unsigned int cpu; + int ignore; +}; + +/** + * struct tmigr_group - timer migration hierarchy group + * @lock: Lock protecting the event information + * @cpus: Array with CPUs which are member of group; required for + * sibling CPUs; used only when level == 0 + * @parent: Pointer to parent group + * @list: List head that is added to per level tmigr_level_list + * @level: Hierarchy level of group + * @numa_node: Is set to numa node when level < tmigr_crossnode_level; + * otherwise it is set to NUMA_NO_NODE; Required for setup + * only + * @num_childs: Counter of group childs; Required for setup only + * @num_cores: Counter of cores per group; Required for setup only when + * level == 0 and siblings exist + * @migr_state: State of group (see struct tmigr_state) + * @childmask: childmask of group in parent group; is set during setup + * never changed; could be read lockless + * @events: Timer queue for child events queued in the group + * @groupevt: Next event of group; it is only reliable when group is + * !active (ignore bit is set when group is active) + * @next_expiry: Base monotonic expiry time of next event of group; + * Used for racy lockless check whether remote expiry is + * required; it is always reliable + */ +struct tmigr_group { + raw_spinlock_t lock; + unsigned int cpus[TMIGR_CHILDS_PER_GROUP]; + struct tmigr_group *parent; + struct list_head list; + unsigned int level; + unsigned int numa_node; + unsigned int num_childs; + unsigned int num_cores; + atomic_t *migr_state; + u32 childmask; + struct timerqueue_head events; + struct tmigr_event groupevt; + u64 next_expiry; +}; + +/** + * struct tmigr_cpu - timer migration per CPU group + * @lock: Lock protecting tmigr_cpu group information + * @online: Indicates wheter CPU is online + * @idle: Indicates wheter CPU is idle in timer migration hierarchy + * @remote: Is set when timers of CPU are expired remote + * @tmgroup: Pointer to parent group + * @childmask: childmask of tmigr_cpu in parent group + * @cpuevt: CPU event which could be queued into parent group + * @wakeup: Stores the first timer 
when the timer migration hierarchy is + * completely idle and remote expiry was done; is returned to + * timer code when tmigr_cpu_deactive() is called and group is + * idle; afterwards a reset to KTIME_MAX is required; + */ +struct tmigr_cpu { + raw_spinlock_t lock; + int online; + int idle; + int remote; + struct tmigr_group *tmgroup; + u32 childmask; + struct tmigr_event cpuevt; + u64 wakeup; +}; + +/** + * union tmigr_state - state of tmigr_group + * @state: Combined version of the state - only used for atomic + * read/cmpxchg function + * @struct: Splitted version of the state - only use the struct members to + * update information to stay independant of endianess + */ +union tmigr_state { + u32 state; + /** + * struct - splitted state of tmigr_group + * @active: Contains each childmask bit of active childs + * @migrator: Contains childmask of child which is migrator + * @seq: Seqence number to prevent race when update in child + * group are propagated in wrong order (especially when + * migrator changes are involved) + */ + struct { + u8 active; + u8 migrator; + u16 seq; + } __packed; +}; + +#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON) +extern void tmigr_handle_remote(void); +extern int tmigr_requires_handle_remote(void); +extern void tmigr_cpu_activate(void); +extern u64 tmigr_cpu_deactivate(u64 nextevt); +extern void timer_expire_remote(unsigned int cpu); +#else +static inline void tmigr_handle_remote(void) { } +extern inline int tmigr_requires_handle_remote(void) { return 0; } +static inline void tmigr_cpu_activate(void) { } +static inline u64 tmigr_cpu_deactivate(u64 nextevt) { return KTIME_MAX; } +extern inline void timer_expire_remote(unsigned int cpu) { } +#endif + +#endif From patchwork Wed Mar 1 14:17:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 62922 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp3658965wrd; Wed, 1 Mar 2023 06:23:46 -0800 (PST) X-Google-Smtp-Source: AK7set+FZvT3Bp3gr1SjiJAsCo6YyqqX125cI9dWhz+u/G94HWTDIO8g/pXmojhJfIEvdbJD2hYb X-Received: by 2002:a17:906:fc4:b0:8c0:6422:e0c2 with SMTP id c4-20020a1709060fc400b008c06422e0c2mr5819871ejk.22.1677680626409; Wed, 01 Mar 2023 06:23:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677680626; cv=none; d=google.com; s=arc-20160816; b=ad9vJtRAKBQ/LgqI7xpe2En/di1HFHMsZR6XGsOu8i5Kcx0PIbIC4voP+h9D/lJHqN UbrhD3lXiRqkHyleLPfsnRAzHr8pUj5mgw15U07uuP7LDRO09JAoSCl6+D/4JOF1wyma F6Z1qDTmsZMgQ/rf8y5bM1Pu9Vn7VMQ+UDnCDTSxlPeKvBDaK9ccYINvUxqNLs29rD4i F23oA+j+TrfFrq/9htus+Az/vhcHlSsM0kJ1Qhe+gZmvtlYKBUD4Jht3uIlNEnZFhaMC TQF9+tgF0zRd5K6tAUNwp5v2O8cwtzt5Y0xwO+WXKtCENd5faL1iYnZ8AC8uJZErZzm9 U1Aw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=2MBIeOLlrHqRhX6QL4OE3pPIfaDDO8TjpNEVJjFfspc=; b=QwUUxPFpq/8IqmrJGurIJ2e8Pau6ShkngGaDDemVaslVwuJvljC6SDHRZmVzuwoSxz knPONMMkWn8dDPaJAmDpL1fCawwJShNJYcsyT4YfkDX3IMuNfVo9Wn6EWI2M1JRxRAGu kMSB/GHJuvuHKbRMSmKx+hMrurNWU8zxg3kYAQlvjtFi5KqeH4V5dYkf0b/aph+6YQrT hZXY3GNbdVZ5+D2pcOhFUij38vR/pLgCk5eoZhMOInQMV0ewXHShfgpHMf/hwuy5RXLQ tSS+7QUl0oTdUpEddnSmo9DMN91G+YHpCD2C7osOc7/uXr2B9yfPgdSJePU2wiCS7+66 twZA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=h7Lvvp03; dkim=neutral (no key) 
From patchwork Wed Mar 1 14:17:43 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62922
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra , John Stultz , Thomas Gleixner , Eric Dumazet ,
 "Rafael J . Wysocki" , Arjan van de Ven , "Paul E . McKenney" ,
 Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen
Subject: [PATCH v5 17/18] timer_migration: Add tracepoints
Date: Wed, 1 Mar 2023 15:17:43 +0100
Message-Id: <20230301141744.16063-18-anna-maria@linutronix.de>
In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de>
References: <20230301141744.16063-1-anna-maria@linutronix.de>

The timer pull logic needs proper debugging aids. Add tracepoints so the
hierarchical idle machinery can be diagnosed.

Signed-off-by: Anna-Maria Behnsen
---
 include/trace/events/timer_migration.h | 277 +++++++++++++++++++++++++
 kernel/time/timer_migration.c          |  24 +++
 2 files changed, 301 insertions(+)
 create mode 100644 include/trace/events/timer_migration.h

diff --git a/include/trace/events/timer_migration.h b/include/trace/events/timer_migration.h
new file mode 100644
index 000000000000..0c4824056930
--- /dev/null
+++ b/include/trace/events/timer_migration.h
@@ -0,0 +1,277 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM timer_migration
+
+#if !defined(_TRACE_TIMER_MIGRATION_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_TIMER_MIGRATION_H
+
+#include <linux/tracepoint.h>
+
+/* Group events */
+TRACE_EVENT(tmigr_group_set,
+
+	TP_PROTO(struct tmigr_group *group),
+
+	TP_ARGS(group),
+
+	TP_STRUCT__entry(
+		__field( void *,	group )
+		__field( unsigned int,	lvl )
+		__field( unsigned int,	numa_node )
+	),
+
+	TP_fast_assign(
+		__entry->group		= group;
+		__entry->lvl		= group->level;
+		__entry->numa_node	= group->numa_node;
+	),
+
+	TP_printk("group=%p lvl=%d numa=%d",
+		  __entry->group, __entry->lvl, __entry->numa_node)
+);
+
+TRACE_EVENT(tmigr_connect_child_parent,
+
+	TP_PROTO(struct tmigr_group *child),
+
+	TP_ARGS(child),
+
+	TP_STRUCT__entry(
+		__field( void *,	child )
+		__field( void *,	parent )
+		__field( unsigned int,	lvl )
+		__field( unsigned int,	numa_node )
+		__field( unsigned int,	num_childs )
+		__field( u32,		childmask )
+	),
+
+	TP_fast_assign(
+		__entry->child		= child;
+		__entry->parent		= child->parent;
+		__entry->lvl		= child->parent->level;
+		__entry->numa_node	= child->parent->numa_node;
+		__entry->num_childs	= child->parent->num_childs;
+		__entry->childmask	= child->childmask;
+	),
+
+	TP_printk("group=%p childmask=%0x parent=%p lvl=%d numa=%d num_childs=%d",
+		  __entry->child, __entry->childmask, __entry->parent,
+		  __entry->lvl, __entry->numa_node, __entry->num_childs)
+);
+
+TRACE_EVENT(tmigr_connect_cpu_parent,
+
+	TP_PROTO(struct tmigr_cpu *tmc),
+
+	TP_ARGS(tmc),
+
+	TP_STRUCT__entry(
+		__field( void *,	parent )
+		__field( unsigned int,	cpu )
+		__field( unsigned int,	lvl )
+		__field( unsigned int,	numa_node )
+		__field( unsigned int,	num_childs )
+		__field( u32,		childmask )
+	),
+
+	TP_fast_assign(
+		__entry->parent		= tmc->tmgroup;
+		__entry->cpu		= tmc->cpuevt.cpu;
+		__entry->lvl		= tmc->tmgroup->level;
+		__entry->numa_node	= tmc->tmgroup->numa_node;
+		__entry->num_childs	= tmc->tmgroup->num_childs;
+		__entry->childmask	= tmc->childmask;
+	),
+
+	TP_printk("cpu=%d childmask=%0x parent=%p lvl=%d numa=%d num_childs=%d",
+		  __entry->cpu, __entry->childmask, __entry->parent,
+		  __entry->lvl, __entry->numa_node, __entry->num_childs)
+);
+
+DECLARE_EVENT_CLASS(tmigr_group_and_cpu,
+
+	TP_PROTO(struct tmigr_group *group, union tmigr_state state, u32 childmask),
+
+	TP_ARGS(group, state, childmask),
+
+	TP_STRUCT__entry(
+		__field( void *,	group )
+		__field( void *,	parent )
+		__field( unsigned int,	lvl )
+		__field( unsigned int,	numa_node )
+		__field( u8,		active )
+		__field( u8,		migrator )
+		__field( u32,		childmask )
+	),
+
+	TP_fast_assign(
+		__entry->group		= group;
+		__entry->parent		= group->parent;
+		__entry->lvl		= group->level;
+		__entry->numa_node	= group->numa_node;
+		__entry->active		= state.active;
+		__entry->migrator	= state.migrator;
+		__entry->childmask	= childmask;
+	),
+
+	TP_printk("group=%p lvl=%d numa=%d active=%0x migrator=%0x "
+		  "parent=%p childmask=%0x",
+		  __entry->group, __entry->lvl, __entry->numa_node,
+		  __entry->active, __entry->migrator,
+		  __entry->parent, __entry->childmask)
+);
+
+DEFINE_EVENT(tmigr_group_and_cpu, tmigr_group_set_cpu_inactive,
+
+	TP_PROTO(struct tmigr_group *group, union tmigr_state state, u32 childmask),
+
+	TP_ARGS(group, state, childmask)
+);
+
+DEFINE_EVENT(tmigr_group_and_cpu, tmigr_group_set_cpu_active,
+
+	TP_PROTO(struct tmigr_group *group, union tmigr_state state, u32 childmask),
+
+	TP_ARGS(group, state, childmask)
+);
+
+/* CPU events */
+DECLARE_EVENT_CLASS(tmigr_cpugroup,
+
+	TP_PROTO(struct tmigr_cpu *tmc),
+
+	TP_ARGS(tmc),
+
+	TP_STRUCT__entry(
+		__field( void *,	parent )
+		__field( unsigned int,	cpu )
+	),
+
+	TP_fast_assign(
+		__entry->cpu	= tmc->cpuevt.cpu;
+		__entry->parent	= tmc->tmgroup;
+	),
+
+	TP_printk("cpu=%d parent=%p", __entry->cpu, __entry->parent)
+);
+
+DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_new_timer,
+
+	TP_PROTO(struct tmigr_cpu *tmc),
+
+	TP_ARGS(tmc)
+);
+
+DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_active,
+
+	TP_PROTO(struct tmigr_cpu *tmc),
+
+	TP_ARGS(tmc)
+);
+
+DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_online,
+
+	TP_PROTO(struct tmigr_cpu *tmc),
+
+	TP_ARGS(tmc)
+);
+
+DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_offline,
+
+	TP_PROTO(struct tmigr_cpu *tmc),
+
+	TP_ARGS(tmc)
+);
+
+DEFINE_EVENT(tmigr_cpugroup, tmigr_handle_remote_cpu,
+
+	TP_PROTO(struct tmigr_cpu *tmc),
+
+	TP_ARGS(tmc)
+);
+
+TRACE_EVENT(tmigr_cpu_idle,
+
+	TP_PROTO(struct tmigr_cpu *tmc, u64 nextevt),
+
+	TP_ARGS(tmc, nextevt),
+
+	TP_STRUCT__entry(
+		__field( void *,	parent )
+		__field( unsigned int,	cpu )
+		__field( u64,		nextevt )
+	),
+
+	TP_fast_assign(
+		__entry->cpu		= tmc->cpuevt.cpu;
+		__entry->parent		= tmc->tmgroup;
+		__entry->nextevt	= nextevt;
+	),
+
+	TP_printk("cpu=%d parent=%p nextevt=%llu",
+		  __entry->cpu, __entry->parent, __entry->nextevt)
+);
+
+TRACE_EVENT(tmigr_update_events,
+
+	TP_PROTO(struct tmigr_group *child, struct tmigr_group *group,
+		 union tmigr_state childstate, union tmigr_state groupstate,
+		 u64 nextevt),
+
+	TP_ARGS(child, group, childstate, groupstate, nextevt),
+
+	TP_STRUCT__entry(
+		__field( void *,	child )
+		__field( void *,	group )
+		__field( u64,		nextevt )
+		__field( u64,		group_next_expiry )
+		__field( unsigned int,	group_lvl )
+		__field( u8,		child_active )
+		__field( u8,		group_active )
+		__field( unsigned int,	child_evtcpu )
+		__field( u64,		child_evt_expiry )
+	),
+
+	TP_fast_assign(
+		__entry->child			= child;
+		__entry->group			= group;
+		__entry->nextevt		= nextevt;
+		__entry->group_next_expiry	= group->next_expiry;
+		__entry->group_lvl		= group->level;
+		__entry->child_active		= childstate.active;
+		__entry->group_active		= groupstate.active;
+		__entry->child_evtcpu		= child ? child->groupevt.cpu : 0;
+		__entry->child_evt_expiry	= child ? child->groupevt.nextevt.expires : 0;
+	),
+
+	TP_printk("child=%p group=%p group_lvl=%d child_active=%0x group_active=%0x "
+		  "nextevt=%llu next_expiry=%llu child_evt_expiry=%llu child_evtcpu=%d",
+		  __entry->child, __entry->group, __entry->group_lvl, __entry->child_active,
+		  __entry->group_active,
+		  __entry->nextevt, __entry->group_next_expiry, __entry->child_evt_expiry,
+		  __entry->child_evtcpu)
+);
+
+TRACE_EVENT(tmigr_handle_remote,
+
+	TP_PROTO(struct tmigr_group *group),
+
+	TP_ARGS(group),
+
+	TP_STRUCT__entry(
+		__field( void *,	group )
+		__field( unsigned int,	lvl )
+	),
+
+	TP_fast_assign(
+		__entry->group	= group;
+		__entry->lvl	= group->level;
+	),
+
+	TP_printk("group=%p lvl=%d",
+		  __entry->group, __entry->lvl)
+);
+
+#endif /* _TRACE_TIMER_MIGRATION_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index 5a600de3623b..5a371bc252d4 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -13,6 +13,9 @@
 #include "timer_migration.h"
 #include "tick-internal.h"
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/timer_migration.h>
+
 /*
  * The timer migration mechanism is built on a hierarchy of groups. The
  * lowest level group contains CPUs, the next level groups of CPU groups
@@ -320,6 +323,8 @@ static bool tmigr_active_up(struct tmigr_group *group,
 	 */
 	group->groupevt.ignore = 1;
 
+	trace_tmigr_group_set_cpu_active(group, newstate, childmask);
+
 	return walk_done;
 }
 
@@ -344,6 +349,7 @@ void tmigr_cpu_activate(void)
 	raw_spin_lock(&tmc->lock);
 	tmc->idle = 0;
 	tmc->wakeup = KTIME_MAX;
+	trace_tmigr_cpu_active(tmc);
 	__tmigr_cpu_activate(tmc);
 	raw_spin_unlock(&tmc->lock);
 }
@@ -450,6 +456,9 @@ static bool tmigr_update_events(struct tmigr_group *group,
 		data->nextexp = tmigr_next_groupevt_expires(group);
 	}
 
+	trace_tmigr_update_events(child, group, data->childstate,
+				  data->groupstate, nextexp);
+
 unlock:
 	raw_spin_unlock(&group->lock);
 
@@ -493,6 +502,8 @@ static u64 tmigr_new_timer(struct tmigr_cpu *tmc, u64 nextexp)
 	if (tmc->remote)
 		return KTIME_MAX;
 
+	trace_tmigr_cpu_new_timer(tmc);
+
 	tmc->cpuevt.ignore = 0;
 
 	data.groupstate.state = atomic_read(tmc->tmgroup->migr_state);
@@ -593,6 +604,8 @@ static bool tmigr_inactive_up(struct tmigr_group *group,
 		}
 	}
 
+	trace_tmigr_group_set_cpu_inactive(group, newstate, childmask);
+
 	return walk_done;
 }
 
@@ -669,6 +682,7 @@ u64 tmigr_cpu_deactivate(u64 nextexp)
 	tmc->idle = 1;
 
 unlock:
+	trace_tmigr_cpu_idle(tmc, ret);
 	raw_spin_unlock(&tmc->lock);
 	return ret;
 }
@@ -695,6 +709,8 @@ static u64 tmigr_handle_remote_cpu(unsigned int cpu, u64 now,
 		return next;
 	}
 
+	trace_tmigr_handle_remote_cpu(tmc);
+
 	tmc->remote = 1;
 
 	/* Drop the lock to allow the remote CPU to exit idle */
@@ -746,6 +762,7 @@ static bool tmigr_handle_remote_up(struct tmigr_group *group,
 
 	childmask = data->childmask;
 
+	trace_tmigr_handle_remote(group);
again:
 	/*
 	 * Handle the group only if @childmask is the migrator or if the
@@ -979,6 +996,7 @@ static struct tmigr_group *tmigr_get_group(unsigned int cpu, unsigned int node,
 	tmigr_init_group(group, lvl, node, migr_state);
 	/* Setup successful. Add it to the hierarchy */
 	list_add(&group->list, &tmigr_level_list[lvl]);
+	trace_tmigr_group_set(group);
 
 	return group;
 }
@@ -997,6 +1015,8 @@ static void tmigr_connect_child_parent(struct tmigr_group *child,
 	raw_spin_unlock(&parent->lock);
 	raw_spin_unlock_irqrestore(&child->lock, flags);
 
+	trace_tmigr_connect_child_parent(child);
+
 	/*
 	 * To prevent inconsistent states, active childs needs to be active
 	 * in new parent as well. Inactive childs are already marked
@@ -1083,6 +1103,8 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node)
 
 			raw_spin_unlock_irqrestore(&group->lock, flags);
 
+			trace_tmigr_connect_cpu_parent(tmc);
+
 			/* There are no childs that needs to be connected */
 			continue;
 		} else {
@@ -1151,6 +1173,7 @@ static int tmigr_cpu_online(unsigned int cpu)
 		tmc->wakeup = KTIME_MAX;
 	}
 	raw_spin_lock_irqsave(&tmc->lock, flags);
+	trace_tmigr_cpu_online(tmc);
 	__tmigr_cpu_activate(tmc);
 	tmc->online = 1;
 	raw_spin_unlock_irqrestore(&tmc->lock, flags);
@@ -1164,6 +1187,7 @@ static int tmigr_cpu_offline(unsigned int cpu)
 	raw_spin_lock_irq(&tmc->lock);
 	tmc->online = 0;
 	__tmigr_cpu_deactivate(tmc, KTIME_MAX);
+	trace_tmigr_cpu_offline(tmc);
 	raw_spin_unlock_irq(&tmc->lock);
 
 	return 0;
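For completeness, a quick way to exercise the new tracepoints once this patch is applied (an editor's sketch, not part of the series): all events land in the tracefs group named after TRACE_SYSTEM, i.e. events/timer_migration/. The path below assumes tracefs is mounted at /sys/kernel/tracing; older setups commonly use /sys/kernel/debug/tracing instead.

	#include <stdio.h>

	int main(void)
	{
		/* "enable" switches the whole timer_migration event group on. */
		FILE *f = fopen("/sys/kernel/tracing/events/timer_migration/enable", "w");

		if (!f) {
			perror("timer_migration events");
			return 1;
		}
		fputs("1", f);	/* write "0" to switch the group off again */
		fclose(f);

		/* Records then show up in /sys/kernel/tracing/trace. */
		return 0;
	}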
From patchwork Wed Mar 1 14:17:44 2023
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 62923
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra , John Stultz , Thomas Gleixner , Eric Dumazet ,
 "Rafael J . Wysocki" , Arjan van de Ven , "Paul E . McKenney" ,
 Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen ,
 Richard Cochran
Subject: [PATCH v5 18/18] timer: Always queue timers on the local CPU
Date: Wed, 1 Mar 2023 15:17:44 +0100
Message-Id: <20230301141744.16063-19-anna-maria@linutronix.de>
In-Reply-To: <20230301141744.16063-1-anna-maria@linutronix.de>
References: <20230301141744.16063-1-anna-maria@linutronix.de>

The timer pull model is in place so we can remove the heuristics which
try to guess the best target CPU at enqueue/modification time.

All non-pinned timers are queued on the local CPU in the separate
storage and eventually pulled to a remote CPU at expiry time.

Originally-by: Richard Cochran (linutronix GmbH)
Signed-off-by: Anna-Maria Behnsen
---
v5:
  - Move WARN_ONCE() in add_timer_on() into a previous patch
  - Fold "crystallball magic" related hunks into this patch
---
 include/linux/timer.h |  5 ++---
 kernel/time/timer.c   | 42 ++++++++++++++++++++----------------------
 2 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 9162f275819a..aaedacac0b56 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -50,9 +50,8 @@ struct timer_list {
  * workqueue locking issues. It's not meant for executing random crap
  * with interrupts disabled. Abuse is monitored!
  *
- * @TIMER_PINNED: A pinned timer will not be affected by any timer
- * placement heuristics (like, NOHZ) and will always expire on the CPU
- * on which the timer was enqueued.
+ * @TIMER_PINNED: A pinned timer will always expire on the CPU on which
+ * the timer was enqueued.
  *
  * Note: Because enqueuing of timers can migrate the timer from one
  * CPU to another, pinned timers are not guaranteed to stay on the
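To make the behavioural change concrete, here is a minimal sketch of a caller that needs local expiry (an editor's illustration, not part of the patch; the module and all "demo_" names are invented). With the pull model, only TIMER_PINNED still guarantees expiry on the enqueueing CPU; a plain timer is queued locally but may be expired remotely while this CPU is idle.

	#include <linux/module.h>
	#include <linux/timer.h>
	#include <linux/jiffies.h>

	static struct timer_list demo_pinned_timer;

	static void demo_timer_fn(struct timer_list *t)
	{
		/* Runs on the CPU that armed the timer, because it is pinned. */
		pr_info("demo pinned timer fired on CPU %d\n", smp_processor_id());
	}

	static int __init demo_init(void)
	{
		timer_setup(&demo_pinned_timer, demo_timer_fn, TIMER_PINNED);
		/* Expire roughly one second from now, on this CPU. */
		mod_timer(&demo_pinned_timer, jiffies + HZ);
		return 0;
	}

	static void __exit demo_exit(void)
	{
		del_timer_sync(&demo_pinned_timer);
	}

	module_init(demo_init);
	module_exit(demo_exit);
	MODULE_LICENSE("GPL");

Dropping TIMER_PINNED from the timer_setup() call above would still queue the timer on this CPU, but its expiry could then be handled by whichever CPU acts as migrator in the hierarchy.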
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 01e97342ad0d..a441ec9dae39 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -593,10 +593,13 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
 
 	/*
 	 * We might have to IPI the remote CPU if the base is idle and the
-	 * timer is not deferrable. If the other CPU is on the way to idle
-	 * then it can't set base->is_idle as we hold the base lock:
+	 * timer is pinned. A non-pinned timer is only queued on the
+	 * remote CPU when the timer was running during enqueueing; then
+	 * everything is handled by the remote CPU anyway. If the other
+	 * CPU is on the way to idle then it can't set base->is_idle as
+	 * we hold the base lock:
 	 */
-	if (base->is_idle)
+	if (base->is_idle && timer->flags & TIMER_PINNED)
 		wake_up_nohz_cpu(base->cpu);
 }
 
@@ -944,17 +947,6 @@ static inline struct timer_base *get_timer_base(u32 tflags)
 	return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
 }
 
-static inline struct timer_base *
-get_target_base(struct timer_base *base, unsigned tflags)
-{
-#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
-	if (static_branch_likely(&timers_migration_enabled) &&
-	    !(tflags & TIMER_PINNED))
-		return get_timer_cpu_base(tflags, get_nohz_timer_target());
-#endif
-	return get_timer_this_cpu_base(tflags);
-}
-
 static inline void forward_timer_base(struct timer_base *base)
 {
 	unsigned long jnow = READ_ONCE(jiffies);
@@ -1106,7 +1098,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int option
 	if (!ret && (options & MOD_TIMER_PENDING_ONLY))
 		goto out_unlock;
 
-	new_base = get_target_base(base, timer->flags);
+	new_base = get_timer_this_cpu_base(timer->flags);
 
 	if (base != new_base) {
 		/*
@@ -2127,8 +2119,14 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 		 */
 	}
 
-	/* We need to mark both bases in sync */
-	base_local->is_idle = base_global->is_idle = is_idle;
+	/*
+	 * base->is_idle information is required to wake up an idle CPU
+	 * when a new timer was enqueued. Only pinned timers can be
+	 * enqueued remotely into an idle base. Therefore maintain only
+	 * base_local->is_idle information and ignore
+	 * base_global->is_idle information.
+	 */
+	base_local->is_idle = is_idle;
 
 	raw_spin_unlock(&base_global->lock);
 	raw_spin_unlock(&base_local->lock);
@@ -2158,13 +2156,13 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 void timer_clear_idle(void)
 {
 	/*
-	 * We do this unlocked. The worst outcome is a remote enqueue sending
-	 * a pointless IPI, but taking the lock would just make the window for
-	 * sending the IPI a few instructions smaller for the cost of taking
-	 * the lock in the exit from idle path.
+	 * We do this unlocked. The worst outcome is a remote pinned timer
+	 * enqueue sending a pointless IPI, but taking the lock would just
+	 * make the window for sending the IPI a few instructions smaller
+	 * for the cost of taking the lock in the exit from idle
+	 * path. Required for BASE_LOCAL only.
 	 */
 	__this_cpu_write(timer_bases[BASE_LOCAL].is_idle, false);
-	__this_cpu_write(timer_bases[BASE_GLOBAL].is_idle, false);
 
 	/* Activate without holding the timer_base->lock */
 	tmigr_cpu_activate();