Message ID | 8d492ee4a391bd089a01c218b0b4e05cf8ea593c.1674729407.git.geert+renesas@glider.be |
---|---|
State | New |
Headers |
From: Geert Uytterhoeven <geert+renesas@glider.be>
To: Stephen Boyd <sboyd@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>, Tomasz Figa <tomasz.figa@gmail.com>, Sylwester Nawrocki <s.nawrocki@samsung.com>, Will Deacon <will@kernel.org>, Arnd Bergmann <arnd@arndb.de>, Wolfram Sang <wsa+renesas@sang-engineering.com>, Dejin Zheng <zhengdejin5@gmail.com>, Kai-Heng Feng <kai.heng.feng@canonical.com>, Nicholas Piggin <npiggin@gmail.com>, Heiko Carstens <hca@linux.ibm.com>, Peter Zijlstra <peterz@infradead.org>, Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org, linux-renesas-soc@vger.kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, Geert Uytterhoeven <geert+renesas@glider.be>
Subject: [PATCH resend] iopoll: Call cpu_relax() in busy loops
Date: Thu, 26 Jan 2023 11:45:37 +0100
Message-Id: <8d492ee4a391bd089a01c218b0b4e05cf8ea593c.1674729407.git.geert+renesas@glider.be>
X-Mailer: git-send-email 2.34.1
List-ID: <linux-kernel.vger.kernel.org> |
Series |
[resend] iopoll: Call cpu_relax() in busy loops
|
|
Commit Message
Geert Uytterhoeven
Jan. 26, 2023, 10:45 a.m. UTC
It is considered good practice to call cpu_relax() in busy loops, see
Documentation/process/volatile-considered-harmful.rst. This can not
only lower CPU power consumption or yield to a hyperthreaded twin
processor, but also allows an architecture to mitigate hardware issues
(e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
architecture-specific cpu_relax() implementation.

As the iopoll helpers lack calls to cpu_relax(), people are sometimes
reluctant to use them, and may fall back to open-coded polling loops
(including cpu_relax() calls) instead.

Fix this by adding calls to cpu_relax() to the iopoll helpers:
  - For the non-atomic case, it is sufficient to call cpu_relax() in
    case of a zero sleep-between-reads value, as a call to
    usleep_range() is a safe barrier otherwise.
  - For the atomic case, cpu_relax() must be called regardless of the
    sleep-between-reads value, as there is no guarantee all
    architecture-specific implementations of udelay() handle this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Resent with a larger audience due to lack of comments.
This has been discussed before, but I am not aware of any patches moving
forward:
- "Re: [PATCH 6/7] clk: renesas: rcar-gen3: Add custom clock for PLLs"
https://lore.kernel.org/all/CAMuHMdWUEhs=nwP+a0vO2jOzkq-7FEOqcJ+SsxAGNXX1PQ2KMA@mail.gmail.com/
- "Re: [PATCH v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops"
https://lore.kernel.org/all/20200811164628.GA7958@kozik-lap
---
include/linux/iopoll.h | 3 +++
1 file changed, 3 insertions(+)
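
The two cases described in the commit message can be sketched in userspace as follows. This is only an illustration of where the cpu_relax() calls end up, not the kernel macros themselves: cpu_relax(), usleep_range() and udelay() are replaced by counting stubs, fake_read() is a hypothetical stand-in for the register read, and a max_tries counter stands in for the ktime-based timeout used in include/linux/iopoll.h.

```c
/* Call counters so the behaviour of each loop shape can be observed. */
static int relax_calls, sleep_calls, delay_calls;

/* Userspace stubs (assumptions): the real kernel versions are
 * architecture- and timer-specific. */
static void cpu_relax(void) { relax_calls++; }
static void usleep_range(unsigned long min_us, unsigned long max_us)
{
	(void)min_us; (void)max_us;
	sleep_calls++;
}
static void udelay(unsigned long us) { (void)us; delay_calls++; }

/* Hypothetical register read: reports "ready" once *countdown is zero. */
static int fake_read(int *countdown)
{
	if (*countdown > 0) {
		(*countdown)--;
		return 0;
	}
	return 1;
}

/* Non-atomic shape: relax only when there is no sleep between reads. */
static int poll_sleeping(int *countdown, unsigned long sleep_us, int max_tries)
{
	int val;

	for (;;) {
		val = fake_read(countdown);
		if (val)
			break;
		if (--max_tries <= 0) {		/* stand-in for the timeout check */
			val = fake_read(countdown);
			break;
		}
		if (sleep_us)
			usleep_range((sleep_us >> 2) + 1, sleep_us);
		else
			cpu_relax();
	}
	return val ? 0 : -1;
}

/* Atomic shape: relax on every iteration, whether or not we delayed. */
static int poll_atomic(int *countdown, unsigned long delay_us, int max_tries)
{
	int val;

	for (;;) {
		val = fake_read(countdown);
		if (val)
			break;
		if (--max_tries <= 0) {		/* stand-in for the timeout check */
			val = fake_read(countdown);
			break;
		}
		if (delay_us)
			udelay(delay_us);
		cpu_relax();
	}
	return val ? 0 : -1;
}
```

With a zero sleep value, poll_sleeping() calls cpu_relax() once per failed read and never sleeps; with a non-zero value it only sleeps. poll_atomic() calls cpu_relax() after every failed read in both cases.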
Comments
On Thu, Jan 26, 2023 at 11:45:37AM +0100, Geert Uytterhoeven wrote:
> It is considered good practice to call cpu_relax() in busy loops, see
> Documentation/process/volatile-considered-harmful.rst. This can not
> only lower CPU power consumption or yield to a hyperthreaded twin
> processor, but also allows an architecture to mitigate hardware issues
> (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> architecture-specific cpu_relax() implementation.
>
> As the iopoll helpers lack calls to cpu_relax(), people are sometimes
> reluctant to use them, and may fall back to open-coded polling loops
> (including cpu_relax() calls) instead.
>
> Fix this by adding calls to cpu_relax() to the iopoll helpers:
>   - For the non-atomic case, it is sufficient to call cpu_relax() in
>     case of a zero sleep-between-reads value, as a call to
>     usleep_range() is a safe barrier otherwise.
>   - For the atomic case, cpu_relax() must be called regardless of the
>     sleep-between-reads value, as there is no guarantee all
>     architecture-specific implementations of udelay() handle this.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

In addition to these dodgy architecture fails, cpu_relax() is also a
compiler barrier; it is not immediately obvious that the @op argument
'function' will result in an actual function call (inlining ftw).

Where a function call is a C sequence point, this is lost on inlining.
Therefore, with aggressive enough optimization it might be possible for
the compiler to hoist the:

	(val) = op(args);

'load' out of the loop because it doesn't see the value changing. The
addition of cpu_relax() will inhibit this.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
> Resent with a larger audience due to lack of comments.
>
> This has been discussed before, but I am not aware of any patches moving
> forward:
>   - "Re: [PATCH 6/7] clk: renesas: rcar-gen3: Add custom clock for PLLs"
>     https://lore.kernel.org/all/CAMuHMdWUEhs=nwP+a0vO2jOzkq-7FEOqcJ+SsxAGNXX1PQ2KMA@mail.gmail.com/
>   - "Re: [PATCH v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops"
>     https://lore.kernel.org/all/20200811164628.GA7958@kozik-lap
> ---
>  include/linux/iopoll.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
> index 2c8860e406bd8cae..73132721d1891a2e 100644
> --- a/include/linux/iopoll.h
> +++ b/include/linux/iopoll.h
> @@ -53,6 +53,8 @@
>  		} \
>  		if (__sleep_us) \
>  			usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> +		else \
> +			cpu_relax(); \

There's a simplicity argument to be had for making it unconditional
here too I suppose. usleep() is 'slow' anyway.

>  	} \
>  	(cond) ? 0 : -ETIMEDOUT; \
>  })
> @@ -95,6 +97,7 @@
>  		} \
>  		if (__delay_us) \
>  			udelay(__delay_us); \
> +		cpu_relax(); \
>  	} \
>  	(cond) ? 0 : -ETIMEDOUT; \
>  })
> --
> 2.34.1
>
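
The compiler-barrier point in the reply above can be illustrated with a small userspace sketch. It assumes GCC/Clang inline-asm syntax, and compiler_barrier() and spin_until_set() are hypothetical names used only here; the kernel's cpu_relax() is architecture-specific and may do more than this. The flag is deliberately read through a plain (non-volatile) pointer, which is the situation where a compiler could otherwise legally reuse an earlier load.

```c
/* Stand-in for the barrier effect of cpu_relax() (assumption: GCC/Clang
 * syntax). The "memory" clobber tells the compiler that memory may have
 * changed, so values cached in registers must be reloaded afterwards. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Spin until *flag becomes non-zero or max_spins iterations have passed.
 * Returns the number of completed spins on success, -1 on timeout.
 */
static int spin_until_set(const int *flag, int max_spins)
{
	int spins;

	for (spins = 0; spins < max_spins; spins++) {
		if (*flag)		/* plain load, not volatile */
			return spins;
		compiler_barrier();	/* forces *flag to be re-read next time */
	}
	return *flag ? spins : -1;
}
```

In a real polling loop the flag would be changed by hardware or another CPU; the single-threaded form here only shows where the barrier sits relative to the load.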
On Thu, Jan 26, 2023, at 11:45, Geert Uytterhoeven wrote:
> It is considered good practice to call cpu_relax() in busy loops, see
> Documentation/process/volatile-considered-harmful.rst. This can not
> only lower CPU power consumption or yield to a hyperthreaded twin
> processor, but also allows an architecture to mitigate hardware issues
> (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> architecture-specific cpu_relax() implementation.
>
> As the iopoll helpers lack calls to cpu_relax(), people are sometimes
> reluctant to use them, and may fall back to open-coded polling loops
> (including cpu_relax() calls) instead.
>
> Fix this by adding calls to cpu_relax() to the iopoll helpers:
>   - For the non-atomic case, it is sufficient to call cpu_relax() in
>     case of a zero sleep-between-reads value, as a call to
>     usleep_range() is a safe barrier otherwise.
>   - For the atomic case, cpu_relax() must be called regardless of the
>     sleep-between-reads value, as there is no guarantee all
>     architecture-specific implementations of udelay() handle this.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

Acked-by: Arnd Bergmann <arnd@arndb.de>
diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
index 2c8860e406bd8cae..73132721d1891a2e 100644
--- a/include/linux/iopoll.h
+++ b/include/linux/iopoll.h
@@ -53,6 +53,8 @@
 		} \
 		if (__sleep_us) \
 			usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+		else \
+			cpu_relax(); \
 	} \
 	(cond) ? 0 : -ETIMEDOUT; \
 })
@@ -95,6 +97,7 @@
 		} \
 		if (__delay_us) \
 			udelay(__delay_us); \
+		cpu_relax(); \
 	} \
 	(cond) ? 0 : -ETIMEDOUT; \
 })