[net-next,00/12] Add tc-mqprio and tc-taprio support for preemptible traffic classes

Message ID 20230216232126.3402975-1-vladimir.oltean@nxp.com

Message

Vladimir Oltean Feb. 16, 2023, 11:21 p.m. UTC
The last RFC in August 2022 contained a proposal for the UAPI of both
TSN standards which together form Frame Preemption (802.1Q and 802.3):
https://patchwork.kernel.org/project/netdevbpf/cover/20220816222920.1952936-1-vladimir.oltean@nxp.com/

It wasn't clear at the time whether the 802.1Q portion of Frame Preemption
should be exposed via the tc qdisc (mqprio, taprio) or via some other
layer (perhaps also ethtool like the 802.3 portion).

So the 802.3 portion was submitted separately and was finally accepted:
https://patchwork.kernel.org/project/netdevbpf/cover/20230119122705.73054-1-vladimir.oltean@nxp.com/

leaving the only remaining question: how do we expose the 802.1Q bits?

This series proposes that we use the Qdisc layer, through separate
(albeit very similar) UAPI in mqprio and taprio, and that both these
Qdiscs pass the information down to the offloading device driver through
the common mqprio offload structure (which taprio also passes).

Implementations are provided for the NXP LS1028A on-board Ethernet
(enetc, felix).

Some patches should perhaps have gone into a separate series, leaving
only patches 09/12 - 12/12 here, for ease of review. That may be true;
however, due to a perceived lack of time to wait for the prerequisite
cleanup to be merged, here they are all together.

Vladimir Oltean (12):
  net: enetc: rename "mqprio" to "qopt"
  net: mscc: ocelot: add support for mqprio offload
  net: dsa: felix: act upon the mqprio qopt in taprio offload
  net: ethtool: fix __ethtool_dev_mm_supported() implementation
  net: ethtool: create and export ethtool_dev_mm_supported()
  net/sched: mqprio: simplify handling of nlattr portion of TCA_OPTIONS
  net/sched: mqprio: add extack to mqprio_parse_nlattr()
  net/sched: mqprio: add an extack message to mqprio_parse_opt()
  net/sched: mqprio: allow per-TC user input of FP adminStatus
  net/sched: taprio: allow per-TC user input of FP adminStatus
  net: mscc: ocelot: add support for preemptible traffic classes
  net: enetc: add support for preemptible traffic classes

 drivers/net/dsa/ocelot/felix_vsc9959.c        |  44 ++++-
 drivers/net/ethernet/freescale/enetc/enetc.c  |  31 ++-
 drivers/net/ethernet/freescale/enetc/enetc.h  |   1 +
 .../net/ethernet/freescale/enetc/enetc_hw.h   |   4 +
 drivers/net/ethernet/mscc/ocelot.c            |  51 +++++
 drivers/net/ethernet/mscc/ocelot.h            |   2 +
 drivers/net/ethernet/mscc/ocelot_mm.c         |  56 ++++++
 include/linux/ethtool_netlink.h               |   6 +
 include/net/pkt_sched.h                       |   1 +
 include/soc/mscc/ocelot.h                     |   6 +
 include/uapi/linux/pkt_sched.h                |  17 ++
 net/ethtool/mm.c                              |  24 ++-
 net/sched/sch_mqprio.c                        | 182 +++++++++++++++---
 net/sched/sch_mqprio_lib.c                    |  14 ++
 net/sched/sch_mqprio_lib.h                    |   2 +
 net/sched/sch_taprio.c                        |  65 +++++--
 16 files changed, 459 insertions(+), 47 deletions(-)
  

Comments

Vladimir Oltean Feb. 18, 2023, 3:20 p.m. UTC | #1
Seeing that there is no feedback on the proposed UAPI, I'd be tempted
to resend this, with just the modular build fixed (export the
ethtool_dev_mm_supported() symbol).

Would anyone hate me for doing this, considering that the merge window
is close? Does anyone need some time to take a closer look at this, or
think about a better alternative?
  
Ferenc Fejes Feb. 19, 2023, 9:47 a.m. UTC | #2
Hi Vladimir!

On Sat, 2023-02-18 at 17:20 +0200, Vladimir Oltean wrote:
> Seeing that there is no feedback on the proposed UAPI, I'd be tempted
> to resend this, with just the modular build fixed (export the
> ethtool_dev_mm_supported() symbol).
> 
> Would anyone hate me for doing this, considering that the merge
> window
> is close? Does anyone need some time to take a closer look at this,
> or
> think about a better alternative?

Do you have the iproute2 part? Sorry if I missed it, but it would be
nice to see how that UAPI is exposed to the config tools. Is there any
new parameter for mqprio/taprio?

Best,
Ferenc
  
Vladimir Oltean Feb. 19, 2023, 12:58 p.m. UTC | #3
Hi Ferenc,

On Sun, Feb 19, 2023 at 10:47:31AM +0100, Ferenc Fejes wrote:
> Do you have the iproute2 part? Sorry if I missed it, but it would be
> nice to see how is that UAPI exposed for the config tools. Is there any
> new parameter for mqprio/taprio?

I haven't posted the iproute2 part (yet). For those familiar with my
recent development, FP is a per-traffic-class netlink attribute just
like queueMaxSDU from tc-taprio. That was exposed in iproute2 as an
array of values, one per tc.

What I have in my tree would allow something like this:

# the "fp" argument is new (one entry per tc)
tc qdisc replace dev $swp1 root stab overhead 20 taprio \
	num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 \
	sched-entry S 0x7e 900000 \
	sched-entry S 0x82 100000 \
	max-sdu 0 0 0 0 0 0 0 200 \
	fp P E E E E E E E \
	flags 0x2

# the "fp" argument is new (one entry per tc)
tc qdisc replace dev $swp1 root mqprio \
	num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	fp P E E E E E E E \
	hw 1

Of course, the exact syntax is a potential matter of debate on its own,
and does not really matter for the purpose of defining the kernel UAPI,
which is why I wanted to keep the discussions separate.
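
To make the "array of values, one per tc" form concrete, here is a
minimal user-space sketch in C of turning the "fp" tokens into a bitmask
with one bit per traffic class. It is not taken from the iproute2
patches (which are not posted yet). It assumes that "P" means
preemptible and "E" means express; the name sketch_parse_fp() and the
16-class limit are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_TC 16 /* hypothetical limit, for this sketch only */

/*
 * Parse per-tc tokens such as { "P", "E", ... } into a bitmask in which
 * bit i is set when traffic class i is preemptible.
 * Returns 0 on success, -1 on an unknown token or too many entries.
 * On failure, *preemptible_tcs is left untouched.
 */
static int sketch_parse_fp(const char *const *tokens, size_t num_tc,
			   uint32_t *preemptible_tcs)
{
	uint32_t mask = 0;
	size_t tc;

	if (num_tc > SKETCH_MAX_TC)
		return -1;

	for (tc = 0; tc < num_tc; tc++) {
		if (strcmp(tokens[tc], "P") == 0)
			mask |= UINT32_C(1) << tc;	/* preemptible */
		else if (strcmp(tokens[tc], "E") != 0)
			return -1;			/* neither P nor E */
	}

	*preemptible_tcs = mask;
	return 0;
}
```

With the eight tokens from the commands above (P E E E E E E E), the
resulting mask is 0x01, i.e. only tc 0 is preemptible.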

For hardware which understands preemptible queues rather than traffic
classes, how many queues are preemptible, and what their offsets are,
will be deduced by translating the "queues" argument.
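
A rough sketch of that translation, in plain user-space C rather than
driver code: given a count@offset pair per traffic class and a bitmask
of preemptible traffic classes, compute the bitmask of preemptible
queues. The struct and function names are hypothetical and only serve
this illustration; the actual drivers may structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One "count@offset" pair from the "queues" argument (hypothetical type) */
struct sketch_tc_queues {
	unsigned int count;
	unsigned int offset;
};

/*
 * Return a bitmask with bit q set when queue q belongs to a traffic
 * class whose bit is set in preemptible_tcs.
 */
static uint32_t sketch_preemptible_queues(const struct sketch_tc_queues *tc_queues,
					  size_t num_tc,
					  uint32_t preemptible_tcs)
{
	uint32_t queue_mask = 0;
	unsigned int i;
	size_t tc;

	assert(num_tc <= 32);

	for (tc = 0; tc < num_tc; tc++) {
		if (!(preemptible_tcs & (UINT32_C(1) << tc)))
			continue;

		/* The sketch only handles queues that fit in a 32-bit mask */
		assert(tc_queues[tc].offset + tc_queues[tc].count <= 32);

		for (i = 0; i < tc_queues[tc].count; i++)
			queue_mask |= UINT32_C(1) << (tc_queues[tc].offset + i);
	}

	return queue_mask;
}
```

For "queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7" with only tc 0 preemptible,
this yields 0x01 (queue 0 only).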

For hardware which understands preemptible priorities rather than
traffic classes, which priorities are preemptible will be deduced by
translating the "map" argument.
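
The priority-based case can be sketched the same way, again in plain
user-space C and under the assumption that the "map" argument is
available as an array indexed by priority, holding the traffic class of
each priority. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return a bitmask with bit p set when priority p maps to a traffic
 * class whose bit is set in preemptible_tcs.
 */
static uint32_t sketch_preemptible_prios(const uint8_t *prio_tc_map,
					 size_t num_prio,
					 uint32_t preemptible_tcs)
{
	uint32_t prio_mask = 0;
	size_t prio;

	assert(num_prio <= 32);

	for (prio = 0; prio < num_prio; prio++) {
		/* The sketch only handles traffic classes 0..31 */
		assert(prio_tc_map[prio] < 32);

		if (preemptible_tcs & (UINT32_C(1) << prio_tc_map[prio]))
			prio_mask |= UINT32_C(1) << prio;
	}

	return prio_mask;
}
```

For "map 0 1 2 3 4 5 6 7" with only tc 0 preemptible, this yields 0x01
(priority 0 only). If priorities 0 and 1 were both mapped to tc 0, the
result would be 0x03.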

In my proposed UAPI, the traffic class is the kernel entity which
carries the preemptible quality, because my analysis of the standard
concluded that this is what that quality is fundamentally attached to.

Considering that the UAPI for FP is a topic that has been discussed to
death at least since August without any really new input since then, I'm
going to submit v2 later today, and the iproute2 patch set afterwards
(still need to write man page entries for that).
  
Ferenc Fejes Feb. 20, 2023, 8:11 a.m. UTC | #4
Hi Vladimir!

Thank you for the update!

On Sun, 2023-02-19 at 14:58 +0200, Vladimir Oltean wrote:
> Hi Ferenc,
> 
> On Sun, Feb 19, 2023 at 10:47:31AM +0100, Ferenc Fejes wrote:
> > Do you have the iproute2 part? Sorry if I missed it, but it would
> > be
> > nice to see how is that UAPI exposed for the config tools. Is there
> > any
> > new parameter for mqprio/taprio?
> 
> I haven't posted the iproute2 part (yet). For those familiar with my
> recent development, FP is a per-traffic-class netlink attribute just
> like queueMaxSDU from tc-taprio. That was exposed in iproute2 as an
> array of values, one per tc.
> 
> What I have in my tree would allow something like this:
> 
> tc qdisc replace dev $swp1 root stab overhead 20 taprio \
>         num_tc 8 \
>         map 0 1 2 3 4 5 6 7 \
>         queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
>         base-time 0 \
>         sched-entry S 0x7e 900000 \
>         sched-entry S 0x82 100000 \
>         max-sdu 0 0 0 0 0 0 0 200 \
>         fp P E E E E E E E \   # this is new (one entry per tc)
>         flags 0x2
> 
> tc qdisc replace dev $swp1 root mqprio \
>         num_tc 8 \
>         map 0 1 2 3 4 5 6 7 \
>         queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
>         fp P E E E E E E E \   # this is new (one entry per tc)
>         hw 1
> 
> of course the exact syntax is a potential matter of debate on its
> own,
> and does not really matter for the purpose of defining the kernel
> UAPI,
> which is why I wanted to keep discussions separate.

Fair enough. What you have right here is pretty straightforward IMO, I
would definitely support something like this.

> 
> For hardware which understands preemptible queues rather than traffic
> classes, how many queues are preemptible, and what are their offsets,
> will be deduced by translating the "queues" argument.
> 
> For hardware which understands preemptible priorities rather than
> traffic classes, which priorities are preemptible will be deduced by
> translating the "map" argument.

Great, that covers both cases with the same UAPI. I love the fact that
this even leaves open the possibility of using priorities (map) instead
of queues for FP.

> 
> The traffic class is the kernel entity which has the preemptible
> priority in my proposed UAPI because this is what my analysis of the
> standard has deduced that the preemptible quality is fundamentally
> attached to.
> 
> Considering that the UAPI for FP is a topic that has been discussed
> to
> death at least since August without any really new input since then,
> I'm
> going to submit v2 later today, and the iproute2 patch set afterwards
> (still need to write man page entries for that).

Best,
Ferenc