Message ID | 20230922111247.497-3-ansuelsmth@gmail.com |
---|---|
State | New |
Headers |
From: Christian Marangi <ansuelsmth@gmail.com> To: Vincent Whitchurch <vincent.whitchurch@axis.com>, Raju Rangoju <rajur@chelsio.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Alexandre Torgue <alexandre.torgue@foss.st.com>, Jose Abreu <joabreu@synopsys.com>, Maxime Coquelin <mcoquelin.stm32@gmail.com>, Ping-Ke Shih <pkshih@realtek.com>, Kalle Valo <kvalo@kernel.org>, Simon Horman <horms@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Jiri Pirko <jiri@resnulli.us>, Hangbin Liu <liuhangbin@gmail.com>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-stm32@st-md-mailman.stormreply.com, linux-arm-kernel@lists.infradead.org, linux-wireless@vger.kernel.org Cc: Christian Marangi <ansuelsmth@gmail.com> Subject: [net-next PATCH 3/3] net: stmmac: increase TX coalesce timer to 5ms Date: Fri, 22 Sep 2023 13:12:47 +0200 Message-Id: <20230922111247.497-3-ansuelsmth@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230922111247.497-1-ansuelsmth@gmail.com> References: <20230922111247.497-1-ansuelsmth@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org |
Series |
[net-next,1/3] net: introduce napi_is_scheduled helper
|
|
Commit Message
Christian Marangi
Sept. 22, 2023, 11:12 a.m. UTC
Commit 8fce33317023 ("net: stmmac: Rework coalesce timer and fix
multi-queue races") decreased the TX coalesce timer from 40ms to 1ms.

This caused a performance regression on some targets (reported at least
on ipq806x) on the order of 600mbps, dropping from gigabit handling to
only 200mbps.

The problem was identified as the TX timer getting armed too many times.
While this was fixed and improved in another commit, performance can be
improved even further by increasing the timer delay a bit, moving from
1ms to 5ms.

The value is a good balance between saving battery by preventing too
many interrupts from being generated and permitting good performance for
internet-oriented devices.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
drivers/net/ethernet/stmicro/stmmac/common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Fri, Sep 22, 2023 at 01:12:47PM +0200, Christian Marangi wrote:
> Commit 8fce33317023 ("net: stmmac: Rework coalesce timer and fix
> multi-queue races") decreased the TX coalesce timer from 40ms to 1ms.
>
> This caused some performance regression on some target (regression was
> reported at least on ipq806x) in the order of 600mbps dropping from
> gigabit handling to only 200mbps.
>
> The problem was identified in the TX timer getting armed too much time.
> While this was fixed and improved in another commit, performance can be
> improved even further by increasing the timer delay a bit moving from
> 1ms to 5ms.
>
> The value is a good balance between battery saving by prevending too
> much interrupt to be generated and permitting good performance for
> internet oriented devices.

ethtool has a setting you can use for this:

ethtool -C|--coalesce devname [adaptive-rx on|off] [adaptive-tx on|off]
        [rx-usecs N] [rx-frames N] [rx-usecs-irq N] [rx-frames-irq N]
        [tx-usecs N] [tx-frames N] [tx-usecs-irq N] [tx-frames-irq N]
        [stats-block-usecs N] [pkt-rate-low N] [rx-usecs-low N]
        [rx-frames-low N] [tx-usecs-low N] [tx-frames-low N]
        [pkt-rate-high N] [rx-usecs-high N] [rx-frames-high N]
        [tx-usecs-high N] [tx-frames-high N] [sample-interval N]
        [cqe-mode-rx on|off] [cqe-mode-tx on|off] [tx-aggr-max-bytes N]
        [tx-aggr-max-frames N] [tx-aggr-time-usecs N]

If this is not implemented, I suggest you add support for it.

Changing the default might cause regressions. Say there is a VoIP
application which wants this low latency? It would be safer to allow
user space to configure it as wanted.

Andrew

On Fri, Sep 22, 2023 at 02:28:06PM +0200, Andrew Lunn wrote:
> On Fri, Sep 22, 2023 at 01:12:47PM +0200, Christian Marangi wrote:
> > Commit 8fce33317023 ("net: stmmac: Rework coalesce timer and fix
> > multi-queue races") decreased the TX coalesce timer from 40ms to 1ms.
> >
> > This caused some performance regression on some target (regression was
> > reported at least on ipq806x) in the order of 600mbps dropping from
> > gigabit handling to only 200mbps.
> >
> > The problem was identified in the TX timer getting armed too much time.
> > While this was fixed and improved in another commit, performance can be
> > improved even further by increasing the timer delay a bit moving from
> > 1ms to 5ms.
> >
> > The value is a good balance between battery saving by prevending too
> > much interrupt to be generated and permitting good performance for
> > internet oriented devices.
>
> ethtool has a settings you can use for this:
>
> ethtool -C|--coalesce devname [adaptive-rx on|off] [adaptive-tx on|off]
>         [rx-usecs N] [rx-frames N] [rx-usecs-irq N] [rx-frames-irq N]
>         [tx-usecs N] [tx-frames N] [tx-usecs-irq N] [tx-frames-irq N]
>         [stats-block-usecs N] [pkt-rate-low N] [rx-usecs-low N]
>         [rx-frames-low N] [tx-usecs-low N] [tx-frames-low N]
>         [pkt-rate-high N] [rx-usecs-high N] [rx-frames-high N]
>         [tx-usecs-high N] [tx-frames-high N] [sample-interval N]
>         [cqe-mode-rx on|off] [cqe-mode-tx on|off] [tx-aggr-max-bytes N]
>         [tx-aggr-max-frames N] [tx-aggr-time-usecs N]
>
> If this is not implemented, i suggest you add support for it.
>
> Changing the default might cause regressions. Say there is a VoIP
> application which wants this low latency? It would be safer to allow
> user space to configure it as wanted.
>

Yep, stmmac already supports it. The idea here was to find a good
default value rather than fall back to ethtool.

Just for reference, before that commit the value was set to 40ms and
nobody ever reported a regression with VoIP applications. With some
testing I found that 5ms is a small increase that restores the original
performance and should not cause any regression.

(For reference, keeping this at 1ms causes a loss of about 100-200mbps.)

(Also, the TX timer implementation was created before any NAPI poll logic
and before DMA interrupt handling was a thing. With the later changes I
expect this timer to be very little used in VoIP or similar scenarios
with continuous traffic, as NAPI will take care of handling packets.)

Aside from these reasons, I totally get the concern and am totally OK
with this not getting applied; it was just an idea to push for a common
value.

I just preferred to handle this here instead of script+userspace :(
(The important part is the previous patch.)

On Fri, Sep 22, 2023 at 5:39 AM Christian Marangi <ansuelsmth@gmail.com> wrote:
>
> On Fri, Sep 22, 2023 at 02:28:06PM +0200, Andrew Lunn wrote:
> > On Fri, Sep 22, 2023 at 01:12:47PM +0200, Christian Marangi wrote:
> > > Commit 8fce33317023 ("net: stmmac: Rework coalesce timer and fix
> > > multi-queue races") decreased the TX coalesce timer from 40ms to 1ms.
> > >
> > > This caused some performance regression on some target (regression was
> > > reported at least on ipq806x) in the order of 600mbps dropping from
> > > gigabit handling to only 200mbps.
> > >
> > > The problem was identified in the TX timer getting armed too much time.
> > > While this was fixed and improved in another commit, performance can be
> > > improved even further by increasing the timer delay a bit moving from
> > > 1ms to 5ms.

I am always looking for ways to improve interrupt service time, rather
than paper over the problem by increasing batchi-ness.

http://www.taht.net/~d/broadcom_aug9_2018.pdf

But I am also looking for hard data, particularly on observed power
savings. How much power does upping this number save?

I have tried to question other assumptions more modern kernels are
making. In particular, I wish more folk would experiment with decreasing
the overlarge (IMHO) NAPI default of 64 packets to, say, 8 in the mq
case, benefiting from multiple arm cores still equipped with limited
cache, as well as looking at the impact of TLB flushes.

Other deferred multi-core processing that looks good on a modern xeon
but might not be so good on a more limited arm worries me. Over here
there was an enormous test series recently run against a bunch of older
arm64s, which appears to indicate that memory bandwidth is a source of
problems:

https://docs.google.com/document/d/1HxIU_TEBI6xG9jRHlr8rzyyxFEN43zMcJXUFlRuhiUI/edit

We are looking to add more devices to that testbed.

> > >
> > > The value is a good balance between battery saving by prevending too
> > > much interrupt to be generated and permitting good performance for
> > > internet oriented devices.
> >
> > ethtool has a settings you can use for this:
> >
> > ethtool -C|--coalesce devname [adaptive-rx on|off] [adaptive-tx on|off]
> >         [rx-usecs N] [rx-frames N] [rx-usecs-irq N] [rx-frames-irq N]
> >         [tx-usecs N] [tx-frames N] [tx-usecs-irq N] [tx-frames-irq N]
> >         [stats-block-usecs N] [pkt-rate-low N] [rx-usecs-low N]
> >         [rx-frames-low N] [tx-usecs-low N] [tx-frames-low N]
> >         [pkt-rate-high N] [rx-usecs-high N] [rx-frames-high N]
> >         [tx-usecs-high N] [tx-frames-high N] [sample-interval N]
> >         [cqe-mode-rx on|off] [cqe-mode-tx on|off] [tx-aggr-max-bytes N]
> >         [tx-aggr-max-frames N] [tx-aggr-time-usecs N]
> >
> > If this is not implemented, i suggest you add support for it.
> >
> > Changing the default might cause regressions. Say there is a VoIP
> > application which wants this low latency? It would be safer to allow
> > user space to configure it as wanted.
> >
>
> Yep stmmac already support it. Idea here was to not fallback to use
> ethtool and find a good value.
>
> Just for reference before one commit, the value was set to 40ms and
> nobody ever pointed out regression about VoIP application. Wtih some
> testing I found 5ms a small increase that restore original perf and
> should not cause any regression.

Does this driver have BQL?

> (for reference keeping this to 1ms cause a lost of about 100-200mbps)
> (also the tx timer implementation was created before any napi poll logic
> and before dma interrupt handling was a thing, with the later change I
> expect this timer to be very little used in VoIP scenario or similar
> with continuous traffic as napi will take care of handling packet)

I would be pretty interested in a kernel flame graph of the before vs
the after.

> Aside from these reason I totally get the concern and totally ok with
> this not getting applied, was just an idea to push for a common value.

I try to get people to run much longer and more complicated tests, such
as the flent rrul test, to see what kind of damage bigger buffers did to
latency, as well as how other problems might show up. Really notable in
the above test series was how badly various devices behaved over time on
that workload.

Extremely notable in that test series was how badly the jetson
performed:

https://github.com/randomizedcoder/cake/blob/2023_09_02/pfifo_fast/jetson.png

And the nanopi was weird.

https://github.com/randomizedcoder/cake/blob/2023_09_02/pfifo_fast/nanopi-neo3.png

> Just preferred to handle this here instead of script+userspace :(
> (the important part is the previous patch)
>
> --
> Ansuel
>

--
Oct 30: https://netdevconf.info/0x17/news/the-maestro-and-the-music-bof.html
Dave Täht CSO, LibreQos

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 403cb397d4d3..2d9f895c2193 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -290,7 +290,7 @@ struct stmmac_safety_stats {
 #define MIN_DMA_RIWT		0x10
 #define DEF_DMA_RIWT		0xa0
 /* Tx coalesce parameters */
-#define STMMAC_COAL_TX_TIMER	1000
+#define STMMAC_COAL_TX_TIMER	5000
 #define STMMAC_MAX_COAL_TX_TICK	100000
 #define STMMAC_TX_MAX_FRAMES	256
 #define STMMAC_TX_FRAMES	25
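
To put the two timer values in perspective, here is a small worked calculation. It assumes STMMAC_COAL_TX_TIMER is expressed in microseconds, which the commit message implies by mapping 1000 to 1ms and 5000 to 5ms. The helper names and the COAL_TX_TIMER_*_US macros are hypothetical and exist only for this illustration. The figures are upper bounds for a timer that is re-armed back to back, not measurements from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the diff, assumed to be in microseconds. */
#define COAL_TX_TIMER_OLD_US 1000u
#define COAL_TX_TIMER_NEW_US 5000u

/* Upper bound on timer expirations per second if the timer is
 * re-armed immediately after every expiry. */
unsigned int max_timer_fires_per_sec(unsigned int timer_us)
{
	assert(timer_us > 0);
	return 1000000u / timer_us;
}

/* Bytes a link running at link_mbps can carry during one timer period.
 * Mbit/s multiplied by microseconds gives bits; divide by 8 for bytes. */
uint64_t bytes_per_period(unsigned int link_mbps, unsigned int timer_us)
{
	return (uint64_t)link_mbps * timer_us / 8;
}
```

Under these assumptions, the change lowers the worst-case timer rate from 1000 to 200 expirations per second, while the amount of data a saturated gigabit link can send within one period grows from 125000 to 625000 bytes. That growth is the latency side of the trade-off raised in the comments above.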