[v3,net-next,1/3] ipv6: remove dst_allfrag test on ipv6 output

Message ID e721c615e22fc4d3d53bfa230d5d71462ae9c9a8.1697779681.git.yan@cloudflare.com
State New
Headers
Series ipv6: avoid atomic fragment on GSO output |

Commit Message

Yan Zhai Oct. 20, 2023, 5:32 a.m. UTC
  dst_allfrag was added before the first git commit:

https://www.mail-archive.com/bk-commits-head@vger.kernel.org/msg03399.html

The feature would send packets to the fragmentation path if a box
receives a PMTU value with less than 1280 byte. However, since commit
9d289715eb5c ("ipv6: stop sending PTB packets for MTU < 1280"), such
message would be simply discarded. The feature flag is neither supported
in iproute2 utility. In theory one can still manipulate it with direct
netlink message, but it is not ideal because it was based on obsoleted
guidance of RFC-2460 (replaced by RFC-8200).

The feature test would always return false at the moment, so remove it
from the output path.

Signed-off-by: Yan Zhai <yan@cloudflare.com>
---
 net/ipv6/ip6_output.c | 1 -
 1 file changed, 1 deletion(-)
  

Comments

Eric Dumazet Oct. 20, 2023, 6:06 a.m. UTC | #1
On Fri, Oct 20, 2023 at 7:32 AM Yan Zhai <yan@cloudflare.com> wrote:
>
> dst_allfrag was added before the first git commit:
>
> https://www.mail-archive.com/bk-commits-head@vger.kernel.org/msg03399.html
>
> The feature would send packets to the fragmentation path if a box
> receives a PMTU value with less than 1280 byte. However, since commit
> 9d289715eb5c ("ipv6: stop sending PTB packets for MTU < 1280"), such
> message would be simply discarded. The feature flag is neither supported
> in iproute2 utility. In theory one can still manipulate it with direct
> netlink message, but it is not ideal because it was based on obsoleted
> guidance of RFC-2460 (replaced by RFC-8200).
>
> The feature test would always return false at the moment, so remove it
> from the output path.

What about other callers of dst_allfrag() ?

This feature seems broken atm.
  
Yan Zhai Oct. 20, 2023, 6:39 a.m. UTC | #2
On Fri, Oct 20, 2023 at 1:06 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Oct 20, 2023 at 7:32 AM Yan Zhai <yan@cloudflare.com> wrote:
> >
> > dst_allfrag was added before the first git commit:
> >
> > https://www.mail-archive.com/bk-commits-head@vger.kernel.org/msg03399.html
> >
> > The feature would send packets to the fragmentation path if a box
> > receives a PMTU value with less than 1280 byte. However, since commit
> > 9d289715eb5c ("ipv6: stop sending PTB packets for MTU < 1280"), such
> > message would be simply discarded. The feature flag is neither supported
> > in iproute2 utility. In theory one can still manipulate it with direct
> > netlink message, but it is not ideal because it was based on obsoleted
> > guidance of RFC-2460 (replaced by RFC-8200).
> >
> > The feature test would always return false at the moment, so remove it
> > from the output path.
>
> What about other callers of dst_allfrag() ?
>
> This feature seems broken atm.

It is broken as far as I can tell. The reason I removed just one
caller here is to keep the code simpler and consistent. If I don't do
so, I ought to test it for both GSO fast path and slow path to be
logically consistent. Seems an overkill to me. For the removal of the
rest, I'd hope it could come in as a standalone patch(set) because it
is not just callers but also those unnecessary flags and tests on IP
corks and sockets, not quite aligned with this patch's intention. I
noted you have drafted something like this in the past:

https://lkml.kernel.org/netdev/1335348157.3274.30.camel@edumazet-glaptop/

I guess it might be a good base point to work on as a new patch(set)?
What's your call on this?

Yan
  
Eric Dumazet Oct. 20, 2023, 9:23 a.m. UTC | #3
On Fri, Oct 20, 2023 at 8:39 AM Yan Zhai <yan@cloudflare.com> wrote:
>
> On Fri, Oct 20, 2023 at 1:06 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Fri, Oct 20, 2023 at 7:32 AM Yan Zhai <yan@cloudflare.com> wrote:
> > >
> > > dst_allfrag was added before the first git commit:
> > >
> > > https://www.mail-archive.com/bk-commits-head@vger.kernel.org/msg03399.html
> > >
> > > The feature would send packets to the fragmentation path if a box
> > > receives a PMTU value with less than 1280 byte. However, since commit
> > > 9d289715eb5c ("ipv6: stop sending PTB packets for MTU < 1280"), such
> > > message would be simply discarded. The feature flag is neither supported
> > > in iproute2 utility. In theory one can still manipulate it with direct
> > > netlink message, but it is not ideal because it was based on obsoleted
> > > guidance of RFC-2460 (replaced by RFC-8200).
> > >
> > > The feature test would always return false at the moment, so remove it
> > > from the output path.
> >
> > What about other callers of dst_allfrag() ?
> >
> > This feature seems broken atm.
>
> It is broken as far as I can tell. The reason I removed just one
> caller here is to keep the code simpler and consistent. If I don't do
> so, I ought to test it for both GSO fast path and slow path to be
> logically consistent. Seems an overkill to me. For the removal of the
> rest, I'd hope it could come in as a standalone patch(set) because it
> is not just callers but also those unnecessary flags and tests on IP
> corks and sockets, not quite aligned with this patch's intention. I
> noted you have drafted something like this in the past:
>
> https://lkml.kernel.org/netdev/1335348157.3274.30.camel@edumazet-glaptop/
>
> I guess it might be a good base point to work on as a new patch(set)?
> What's your call on this?
>

I am about to send a TCP patch series to enable usec TSval (instead of ms),
and was planning to use a new RTAX_FEATURE_TCP_USEC_TS.

I also noticed that iproute2 was not supporting RTAX_FEATURE_ALLFRAG,
so we might kill it completely ?

commit b258c87639f77d772c077a4e45dad602c62c9f1c
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Oct 18 09:33:38 2023 +0000

    tcp: add RTAX_FEATURE_TCP_USEC_TS

    This new dst feature flag will be used to allow TCP to use usec
    based timestamps instead of msec ones.

    ip route .... feature tcp_usec_ts

    Also document that RTAX_FEATURE_SACK and RTAX_FEATURE_TIMESTAMP
    are unused.

    Note that iproute2 does yet support RTAX_FEATURE_ALLFRAG ?

    Signed-off-by: Eric Dumazet <edumazet@google.com>

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 51c13cf9c5aee4a2d1ab33c1a89043383d67b9cf..aa2482a0614aa685590fcc73819cbe1baac63d66
100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -502,13 +502,17 @@ enum {

 #define RTAX_MAX (__RTAX_MAX - 1)

-#define RTAX_FEATURE_ECN       (1 << 0)
-#define RTAX_FEATURE_SACK      (1 << 1)
-#define RTAX_FEATURE_TIMESTAMP (1 << 2)
-#define RTAX_FEATURE_ALLFRAG   (1 << 3)
-
-#define RTAX_FEATURE_MASK      (RTAX_FEATURE_ECN | RTAX_FEATURE_SACK | \
-                                RTAX_FEATURE_TIMESTAMP | RTAX_FEATURE_ALLFRAG)
+#define RTAX_FEATURE_ECN               (1 << 0)
+#define RTAX_FEATURE_SACK              (1 << 1) /* unused */
+#define RTAX_FEATURE_TIMESTAMP         (1 << 2) /* unused */
+#define RTAX_FEATURE_ALLFRAG           (1 << 3)
+#define RTAX_FEATURE_TCP_USEC_TS       (1 << 4)
+
+#define RTAX_FEATURE_MASK      (RTAX_FEATURE_ECN |             \
+                                RTAX_FEATURE_SACK |            \
+                                RTAX_FEATURE_TIMESTAMP |       \
+                                RTAX_FEATURE_ALLFRAG |         \
+                                RTAX_FEATURE_TCP_USEC_TS)

 struct rta_session {
        __u8    proto;
  
Florian Westphal Oct. 20, 2023, 10 a.m. UTC | #4
Eric Dumazet <edumazet@google.com> wrote:
> I also noticed that iproute2 was not supporting RTAX_FEATURE_ALLFRAG,
> so we might kill it completely ?

Yes, I intentionally did not expose it in iproute2.

Lets remove ALLFRAG.
  
Yan Zhai Oct. 20, 2023, 1:57 p.m. UTC | #5
On Fri, Oct 20, 2023 at 5:01 AM Florian Westphal <fw@strlen.de> wrote:
>
> Eric Dumazet <edumazet@google.com> wrote:
> > I also noticed that iproute2 was not supporting RTAX_FEATURE_ALLFRAG,
> > so we might kill it completely ?
>
> Yes, I intentionally did not expose it in iproute2.
>
> Lets remove ALLFRAG.

I will do that in V4 later today to completely clean it up then.
Always cheerful to delete code!

Yan
  

Patch

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index a471c7e91761..ae87a3817d4a 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -189,7 +189,6 @@  static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff
 		return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
 
 	if ((skb->len > mtu && !skb_is_gso(skb)) ||
-	    dst_allfrag(skb_dst(skb)) ||
 	    (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size))
 		return ip6_fragment(net, sk, skb, ip6_finish_output2);
 	else