[net-next,0/2] net: Use SMP threads for backlog NAPI (or optional).

Message ID

20230929162121.1822900-1-bigeasy@linutronix.de

Headers

Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org
 designates 2620:137:e000::3:2 as permitted sender)
 client-ip=2620:137:e000::3:2;
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>,
        Eric Dumazet <edumazet@google.com>,
        Jakub Kicinski <kuba@kernel.org>,
        Paolo Abeni <pabeni@redhat.com>,
        Peter Zijlstra <peterz@infradead.org>,
        Thomas Gleixner <tglx@linutronix.de>,
        Wander Lairson Costa <hawk@kernel.org>
Subject: [PATCH net-next 0/2] net: Use SMP threads for backlog NAPI (or
 optional).
Date: Fri, 29 Sep 2023 18:20:18 +0200
Message-ID: <20230929162121.1822900-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Precedence: bulk

Series

net: Use SMP threads for backlog NAPI (or optional).
  [net-next,0/2] net: Use SMP threads for backlog NAPI (or optional).
  [net-next,1/2] net: Use SMP threads for backlog NAPI.
  [net-next,2/2] net: Allow to use SMP threads for backlog NAPI.

Message

Sebastian Andrzej Siewior Sept. 29, 2023, 4:20 p.m. UTC

The RPS code and "deferred skb free" both send an IPI/function call
to a remote CPU, on which a softirq is then raised. This leads to a warning
on PREEMPT_RT because raising softirqs from a function call led to undesired
behaviour in the past. I had duct tape in RT for the "deferred skb free"
and Wander Lairson Costa reported the RPS case.

Changes:
- RFC…v1 https://lore.kernel.org/all/20230818092111.5d86e351@kernel.org

   - Patch #2 has been removed. Removing the warning is still an option.

   - There are two patches in the series:
     - Patch #1 always creates backlog threads
     - Patch #2 creates the backlog threads if requested at boot time,
       mandatory on PREEMPT_RT.
     So it is either/or, and I wanted to show what both look like.

   - The kernel test robot reported a performance regression with
     loopback (stress-ng --udp X --udp-ops Y) against the RFC version.
     The regression is now avoided by using local-NAPI if backlog
     processing is requested on the local CPU.
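
The dispatch decision described in the items above could be modelled in user
space roughly as follows. This is an illustrative sketch only, not code from
the patches: `backlog_dispatch()`, `use_backlog_threads`, `NR_CPUS` and the
enum values are hypothetical names, and a plain boolean stands in for the
boot-time switch (patch #2 uses a static key; with patch #1 the thread path
would always be taken).

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical CPU count for this model */

enum backlog_action {
	BACKLOG_LOCAL_NAPI,  /* process on the current CPU, no remote wake-up */
	BACKLOG_WAKE_THREAD, /* wake the per-CPU backlog thread on the target */
	BACKLOG_SEND_IPI,    /* existing path: IPI that raises a softirq remotely */
};

/* Stand-in for the boot-time switch (assumption: a plain flag, not a static key). */
static bool use_backlog_threads;

/* Decide how backlog work queued from this_cpu for target_cpu gets kicked off. */
static enum backlog_action backlog_dispatch(int this_cpu, int target_cpu)
{
	assert(this_cpu >= 0 && this_cpu < NR_CPUS);
	assert(target_cpu >= 0 && target_cpu < NR_CPUS);

	/* Local requests stay on local NAPI, which avoids the loopback regression. */
	if (target_cpu == this_cpu)
		return BACKLOG_LOCAL_NAPI;

	/* Remote requests either wake a thread or fall back to the IPI. */
	return use_backlog_threads ? BACKLOG_WAKE_THREAD : BACKLOG_SEND_IPI;
}
```

In this model only the remote case changes behaviour when threads are enabled;
the local case is the same either way, which is the property the loopback fix
relies on.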

Sebastian

Comments

Jakub Kicinski Oct. 4, 2023, 10:46 p.m. UTC | #1

On Fri, 29 Sep 2023 18:20:18 +0200 Sebastian Andrzej Siewior wrote:
>    - Patch #2 has been removed. Removing the warning is still an option.
> 
>    - There are two patches in the series:
>      - Patch #1 always creates backlog threads
>      - Patch #2 creates the backlog threads if requested at boot time,
>        mandatory on PREEMPT_RT.
>      So it is either/or, and I wanted to show what both look like.
> 
>    - The kernel test robot reported a performance regression with
>      loopback (stress-ng --udp X --udp-ops Y) against the RFC version.
>      The regression is now avoided by using local-NAPI if backlog
>      processing is requested on the local CPU.

Not what we asked for, and it doesn't apply.

Sebastian Andrzej Siewior Oct. 7, 2023, 3:59 p.m. UTC | #2

On 2023-10-04 15:46:09 [-0700], Jakub Kicinski wrote:
> On Fri, 29 Sep 2023 18:20:18 +0200 Sebastian Andrzej Siewior wrote:
> >    - Patch #2 has been removed. Removing the warning is still an option.
> > 
> >    - There are two patches in the series:
> >      - Patch #1 always creates backlog threads
> >      - Patch #2 creates the backlog threads if requested at boot time,
> >        mandatory on PREEMPT_RT.
> >      So it is either/or, and I wanted to show what both look like.
> > 
> >    - The kernel test robot reported a performance regression with
> >      loopback (stress-ng --udp X --udp-ops Y) against the RFC version.
> >      The regression is now avoided by using local-NAPI if backlog
> >      processing is requested on the local CPU.
> 
> Not what we asked for, and it doesn't apply.

Apologies if I misunderstood. You said to make it optional which I did
with the static key in the second patch of this series. The first patch
is indeed not what we talked about; I just wanted to show what it would look
like now that there is no "delay" for backlog-NAPI on the local CPU.

If the optional part is okay then I can repost only that patch against
current net-next.

Sebastian

Jakub Kicinski Oct. 10, 2023, 1:09 a.m. UTC | #3

On Sat, 7 Oct 2023 17:59:57 +0200 Sebastian Andrzej Siewior wrote:
> Apologies if I misunderstood. You said to make it optional which I did
> with the static key in the second patch of this series. The first patch
> is indeed not what we talked about; I just wanted to show what it would look
> like now that there is no "delay" for backlog-NAPI on the local CPU.
> 
> If the optional part is okay then I can repost only that patch against
> current net-next.

Do we have reason to believe nobody uses RPS?

Sebastian Andrzej Siewior Oct. 16, 2023, 9:53 a.m. UTC | #4

Sorry for getting back so late, I was traveling the last two weeks…

On 2023-10-09 18:09:37 [-0700], Jakub Kicinski wrote:
> On Sat, 7 Oct 2023 17:59:57 +0200 Sebastian Andrzej Siewior wrote:
> > Apologies if I misunderstood. You said to make it optional which I did
> > with the static key in the second patch of this series. The first patch
> > is indeed not what we talked about; I just wanted to show what it would look
> > like now that there is no "delay" for backlog-NAPI on the local CPU.
> > 
> > If the optional part is okay then I can repost only that patch against
> > current net-next.
> 
> Do we have reason to believe nobody uses RPS?

Not sure what you are referring to. I would assume that RPS is used in general
on actual devices and not on loopback where backlog is used. But it is
just an assumption.
The performance drop, which I observed with RPS and stress-ng --udp, is
within the same range with threads and IPIs (based on memory). I can
re-run the test and provide actual numbers if you want.

Sebastian

Jakub Kicinski Oct. 16, 2023, 2:17 p.m. UTC | #5

On Mon, 16 Oct 2023 11:53:21 +0200 Sebastian Andrzej Siewior wrote:
> > Do we have reason to believe nobody uses RPS?  
> 
> Not sure what you are referring to. I would assume that RPS is used in general
> on actual devices and not on loopback where backlog is used. But it is
> just an assumption.
> The performance drop, which I observed with RPS and stress-ng --udp, is
> within the same range with threads and IPIs (based on memory). I can
> re-run the test and provide actual numbers if you want.

I was asking about RPS because with your current series RPS processing
is forced into threads. IDK how well you can simulate the kind of
workload which requires RPS. I've seen it used mostly on proxies
and gateways. For proxies Meta's experiments with threaded NAPI show
regressions across the board. So "force-threading" RPS will most likely
also cause regressions.

Sebastian Andrzej Siewior Oct. 16, 2023, 2:53 p.m. UTC | #6

On 2023-10-16 07:17:56 [-0700], Jakub Kicinski wrote:
> On Mon, 16 Oct 2023 11:53:21 +0200 Sebastian Andrzej Siewior wrote:
> > > Do we have reason to believe nobody uses RPS?  
> > 
> > Not sure what you are referring to. I would assume that RPS is used in general
> > on actual devices and not on loopback where backlog is used. But it is
> > just an assumption.
> > The performance drop, which I observed with RPS and stress-ng --udp, is
> > within the same range with threads and IPIs (based on memory). I can
> > re-run the test and provide actual numbers if you want.
> 
> I was asking about RPS because with your current series RPS processing
> is forced into threads. IDK how well you can simulate the kind of
> workload which requires RPS. I've seen it used mostly on proxies
> and gateways. For proxies Meta's experiments with threaded NAPI show
> regressions across the board. So "force-threading" RPS will most likely
> also cause regressions.

Understood.

Wander/Juri: Do you have any benchmark/workload where you would see
whether RPS with IPI (now) vs RPS (this patch) shows any regression?

Sebastian

Sebastian Andrzej Siewior Oct. 31, 2023, 10:14 a.m. UTC | #7

On 2023-10-16 16:53:39 [+0200], To Jakub Kicinski wrote:
> On 2023-10-16 07:17:56 [-0700], Jakub Kicinski wrote:
> > On Mon, 16 Oct 2023 11:53:21 +0200 Sebastian Andrzej Siewior wrote:
> > > > Do we have reason to believe nobody uses RPS?  
> > > 
> > > Not sure what you are referring to. I would assume that RPS is used in general
> > > on actual devices and not on loopback where backlog is used. But it is
> > > just an assumption.
> > > The performance drop, which I observed with RPS and stress-ng --udp, is
> > > within the same range with threads and IPIs (based on memory). I can
> > > re-run the test and provide actual numbers if you want.
> > 
> > I was asking about RPS because with your current series RPS processing
> > is forced into threads. IDK how well you can simulate the kind of
> > workload which requires RPS. I've seen it used mostly on proxies
> > and gateways. For proxies Meta's experiments with threaded NAPI show
> > regressions across the board. So "force-threading" RPS will most likely
> > also cause regressions.
> 
> Understood.
> 
> Wander/Juri: Do you have any benchmark/workload where you would see
> whether RPS with IPI (now) vs RPS (this patch) shows any regression?

So I poked other RH people off-list and I've been told that they hardly
ever test RPS since the NICs these days have RSS in hardware.

Sebastian

Wander Lairson Costa Oct. 31, 2023, 11:36 a.m. UTC | #8

On Tue, Oct 31, 2023 at 7:14 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2023-10-16 16:53:39 [+0200], To Jakub Kicinski wrote:
> > On 2023-10-16 07:17:56 [-0700], Jakub Kicinski wrote:
> > > On Mon, 16 Oct 2023 11:53:21 +0200 Sebastian Andrzej Siewior wrote:
> > > > > Do we have reason to believe nobody uses RPS?
> > > >
> > > > Not sure what you are referring to. I would assume that RPS is used in general
> > > > on actual devices and not on loopback where backlog is used. But it is
> > > > just an assumption.
> > > > The performance drop, which I observed with RPS and stress-ng --udp, is
> > > > within the same range with threads and IPIs (based on memory). I can
> > > > re-run the test and provide actual numbers if you want.
> > >
> > > I was asking about RPS because with your current series RPS processing
> > > is forced into threads. IDK how well you can simulate the kind of
> > > workload which requires RPS. I've seen it used mostly on proxies
> > > and gateways. For proxies Meta's experiments with threaded NAPI show
> > > regressions across the board. So "force-threading" RPS will most likely
> > > also cause regressions.
> >
> > Understood.
> >
> > Wander/Juri: Do you have any benchmark/workload where you would see
> > whether RPS with IPI (now) vs RPS (this patch) shows any regression?
>
> So I poked other RH people off-list and I've been told that they hardly
> ever test RPS since the NICs these days have RSS in hardware.

Sorry, Juri is on PTO and I am just back from sick leave and still
catching up. I've been contacting some QE people, but so far it is
like you said: no stress test for RPS. If I have some news, I will let
you know.


>
> Sebastian
>