[4.19,092/229] once: add DO_ONCE_SLOW() for sleepable contexts

Message ID 20221024113002.025977656@linuxfoundation.org
State New
Headers
Series None |

Commit Message

Greg KH Oct. 24, 2022, 11:30 a.m. UTC
  From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 62c07983bef9d3e78e71189441e1a470f0d1e653 ]

Christophe Leroy reported a ~80ms latency spike
happening at first TCP connect() time.

This is because __inet_hash_connect() uses get_random_once()
to populate a perturbation table which became quite big
after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")

get_random_once() uses DO_ONCE(), which block hard irqs for the duration
of the operation.

This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
for operations where we prefer to stay in process context.

Then __inet_hash_connect() can use get_random_slow_once()
to populate its perturbation table.

Fixes: 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
Fixes: 190cc82489f4 ("tcp: change source port randomizarion at connect() time")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#t
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/once.h       | 28 ++++++++++++++++++++++++++++
 lib/once.c                 | 30 ++++++++++++++++++++++++++++++
 net/ipv4/inet_hashtables.c |  4 ++--
 3 files changed, 60 insertions(+), 2 deletions(-)
  

Comments

Yongqin Liu Nov. 1, 2022, 6:07 a.m. UTC | #1
Hello,

As mentioned in the thread for the 5.4 version here[1], it causes a
crash for the 4.19 kernel too.
Just paste the log here for reference:

[ 1375.219830] wlan0: authenticate with 98:da:c4:4f:77:11
[ 1375.230257] wlan0: send auth to 98:da:c4:4f:77:11 (try 1/3)
[ 1375.262598] wlan0: authenticated
[ 1375.267766] wlan0: associate with 98:da:c4:4f:77:11 (try 1/3)
[ 1375.285716] wlan0: RX AssocResp from 98:da:c4:4f:77:11
(capab=0x1411 status=0 aid=1)
[ 1375.310066] wlan0: associated
[ 1375.332056] wlcore: Association completed.
[ 1376.105149] Unable to handle kernel write to read-only memory at
virtual address ffffff80092a6f18
[ 1376.114129] Mem abort info:
[ 1376.117072]   ESR = 0x9600004f
[ 1376.120138]   Exception class = DABT (current EL), IL = 32 bits
[ 1376.126091]   SET = 0, FnV = 0
[ 1376.129147]   EA = 0, S1PTW = 0
[ 1376.132301] Data abort info:
[ 1376.135176]   ISV = 0, ISS = 0x0000004f
[ 1376.139012]   CM = 0, WnR = 1
[ 1376.141997] swapper pgtable: 4k pages, 39-bit VAs, pgdp = 00000000a9880db2
[ 1376.148923] [ffffff80092a6f18] pgd=000000021fffe003,
pud=000000021fffe003, pmd=000000021fffb003, pte=00600000012a6793
[ 1376.159550] Internal error: Oops: 9600004f [#1] PREEMPT SMP
[ 1376.165119] Modules linked in:
[ 1376.168168] Process TlsVerify_100 (pid: 5454, stack limit =
0x0000000025f9c863)
[ 1376.175472] CPU: 5 PID: 5454 Comm: TlsVerify_100 Not tainted
4.19.262-g8479d939a3b0 #1
[ 1376.183381] Hardware name: HiKey960 (DT)
[ 1376.187295] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 1376.192085] pc : __do_once_slow_done+0x14/0x98
[ 1376.196535] lr : __inet_hash_connect+0x480/0x484
[ 1376.201146] sp : ffffff800e203b80
[ 1376.204460] x29: ffffff800e203b80 x28: 0000000000000000
[ 1376.209768] x27: 0000000000006e48 x26: 0000000000000000
[ 1376.215071] x25: ffffff80098e6000 x24: 9b29771fbf0b881c
[ 1376.220375] x23: ffffff8009766600 x22: ffffff8009766bc0
[ 1376.225678] x21: ffffff8008c34198 x20: ffffff80098e65c0
[ 1376.230982] x19: ffffffc04cabd1c0 x18: 0000000005c65eec
[ 1376.236285] x17: 00000000175fece9 x16: 0000000071679066
[ 1376.241588] x15: 00000000467bf177 x14: 000000005f331e5c
[ 1376.246891] x13: 0000000000000014 x12: 00000000b82286ab
[ 1376.252194] x11: 0000000084c874ea x10: 00000000fdb36642
[ 1376.257497] x9 : 68962c3381a87000 x8 : 0000000000000001
[ 1376.262800] x7 : 0000000000000000 x6 : ffffffc2197c0000
[ 1376.268103] x5 : 000000008113af5d x4 : 0000000000000008
[ 1376.273406] x3 : 0000000000000030 x2 : 0000000000000000
[ 1376.278709] x1 : ffffff800976e568 x0 : ffffff80092a6f18
[ 1376.284012] Call trace:
[ 1376.286456]  __do_once_slow_done+0x14/0x98
[ 1376.290544]  __inet_hash_connect+0x480/0x484
[ 1376.294805]  inet_hash_connect+0x48/0x54
[ 1376.298720]  tcp_v4_connect+0x26c/0x3e4
[ 1376.302549]  __inet_stream_connect+0x2ac/0x308
[ 1376.306984]  inet_stream_connect+0x44/0x68
[ 1376.311072]  __sys_connect+0xb4/0x100
[ 1376.314725]  __arm64_sys_connect+0x1c/0x28
[ 1376.318816]  el0_svc_common+0xa4/0x188
[ 1376.322556]  el0_svc_handler+0x60/0x68
[ 1376.326298]  el0_svc+0x8/0x300
[ 1376.329345] Code: f9000bf5 a9024ff4 910003fd 52800028 (39000008)
[ 1376.335431] ---[ end trace 28410d3ffccfb491 ]---
[ 1376.351416] Kernel panic - not syncing: Fatal exception
[ 1376.356640] SMP: stopping secondary CPUs
[ 1376.360775] Kernel Offset: disabled
[ 1376.364257] CPU features: 0x10,20082004
[ 1376.368083] Memory Limit: none
[ 1376.382434] Rebooting in 5 seconds..

[1]: https://lore.kernel.org/lkml/20221029011211.4049810-1-ovt@google.com/

Thanks,
Yongqin Liu
  
Greg KH Nov. 1, 2022, 6:25 a.m. UTC | #2
On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote:
> Hello,
> 
> As mentioned in the thread for the 5.4 version here[1], it causes a
> crash for the 4.19 kernel too.
> Just paste the log here for reference:

Can you try this patch please:


diff --git a/include/linux/once.h b/include/linux/once.h
index bb58e1c3aa03..3a6671d961b9 100644
--- a/include/linux/once.h
+++ b/include/linux/once.h
@@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key,
 #define DO_ONCE_SLOW(func, ...)						     \
 	({								     \
 		bool ___ret = false;					     \
-		static bool __section(".data.once") ___done = false;	     \
+		static bool __section(.data.once) ___done = false;	     \
 		static DEFINE_STATIC_KEY_TRUE(___once_key);		     \
 		if (static_branch_unlikely(&___once_key)) {		     \
 			___ret = __do_once_slow_start(&___done);	     \
  
Yongqin Liu Nov. 1, 2022, noon UTC | #3
Hi, Greg

On Tue, 1 Nov 2022 at 14:26, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote:
> > Hello,
> >
> > As mentioned in the thread for the 5.4 version here[1], it causes a
> > crash for the 4.19 kernel too.
> > Just paste the log here for reference:
>
> Can you try this patch please:
>
>
> diff --git a/include/linux/once.h b/include/linux/once.h
> index bb58e1c3aa03..3a6671d961b9 100644
> --- a/include/linux/once.h
> +++ b/include/linux/once.h
> @@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key,
>  #define DO_ONCE_SLOW(func, ...)                                                     \
>         ({                                                                   \
>                 bool ___ret = false;                                         \
> -               static bool __section(".data.once") ___done = false;         \
> +               static bool __section(.data.once) ___done = false;           \
>                 static DEFINE_STATIC_KEY_TRUE(___once_key);                  \
>                 if (static_branch_unlikely(&___once_key)) {                  \
>                         ___ret = __do_once_slow_start(&___done);             \


This change works, it does not cause kernel panic again after this
change is applied.
  
Greg KH Nov. 1, 2022, 1:35 p.m. UTC | #4
On Tue, Nov 01, 2022 at 08:00:03PM +0800, Yongqin Liu wrote:
> Hi, Greg
> 
> On Tue, 1 Nov 2022 at 14:26, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote:
> > > Hello,
> > >
> > > As mentioned in the thread for the 5.4 version here[1], it causes a
> > > crash for the 4.19 kernel too.
> > > Just paste the log here for reference:
> >
> > Can you try this patch please:
> >
> >
> > diff --git a/include/linux/once.h b/include/linux/once.h
> > index bb58e1c3aa03..3a6671d961b9 100644
> > --- a/include/linux/once.h
> > +++ b/include/linux/once.h
> > @@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key,
> >  #define DO_ONCE_SLOW(func, ...)                                                     \
> >         ({                                                                   \
> >                 bool ___ret = false;                                         \
> > -               static bool __section(".data.once") ___done = false;         \
> > +               static bool __section(.data.once) ___done = false;           \
> >                 static DEFINE_STATIC_KEY_TRUE(___once_key);                  \
> >                 if (static_branch_unlikely(&___once_key)) {                  \
> >                         ___ret = __do_once_slow_start(&___done);             \
> 
> 
> This change works, it does not cause kernel panic again after this
> change is applied.

Great, thanks!  Can I get a Tested-by: line for the changelog?

I'll queue this up in a bit and get it fixed in the next release.

thanks,

greg k-h
  
Yongqin Liu Nov. 1, 2022, 1:44 p.m. UTC | #5
On Tue, 1 Nov 2022 at 21:34, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Tue, Nov 01, 2022 at 08:00:03PM +0800, Yongqin Liu wrote:
> > Hi, Greg
> >
> > On Tue, 1 Nov 2022 at 14:26, Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote:
> > > > Hello,
> > > >
> > > > As mentioned in the thread for the 5.4 version here[1], it causes a
> > > > crash for the 4.19 kernel too.
> > > > Just paste the log here for reference:
> > >
> > > Can you try this patch please:
> > >
> > >
> > > diff --git a/include/linux/once.h b/include/linux/once.h
> > > index bb58e1c3aa03..3a6671d961b9 100644
> > > --- a/include/linux/once.h
> > > +++ b/include/linux/once.h
> > > @@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key,
> > >  #define DO_ONCE_SLOW(func, ...)                                                     \
> > >         ({                                                                   \
> > >                 bool ___ret = false;                                         \
> > > -               static bool __section(".data.once") ___done = false;         \
> > > +               static bool __section(.data.once) ___done = false;           \
> > >                 static DEFINE_STATIC_KEY_TRUE(___once_key);                  \
> > >                 if (static_branch_unlikely(&___once_key)) {                  \
> > >                         ___ret = __do_once_slow_start(&___done);             \
> >
> >
> > This change works, it does not cause kernel panic again after this
> > change is applied.
>
> Great, thanks!  Can I get a Tested-by: line for the changelog?

Sure, Yongqin Liu <yongqin.liu@linaro.org>

> I'll queue this up in a bit and get it fixed in the next release.

BTW, to be clear, I tested it with one 4.19-q tree based on the latest
ACK android-4.19-q branch[1],
If there is something else I could help test, please let me know.

[1]: https://android.googlesource.com/kernel/common/+/refs/heads/android-4.19-q
  

Patch

diff --git a/include/linux/once.h b/include/linux/once.h
index ae6f4eb41cbe..bb58e1c3aa03 100644
--- a/include/linux/once.h
+++ b/include/linux/once.h
@@ -5,10 +5,18 @@ 
 #include <linux/types.h>
 #include <linux/jump_label.h>
 
+/* Helpers used from arbitrary contexts.
+ * Hard irqs are blocked, be cautious.
+ */
 bool __do_once_start(bool *done, unsigned long *flags);
 void __do_once_done(bool *done, struct static_key_true *once_key,
 		    unsigned long *flags, struct module *mod);
 
+/* Variant for process contexts only. */
+bool __do_once_slow_start(bool *done);
+void __do_once_slow_done(bool *done, struct static_key_true *once_key,
+			 struct module *mod);
+
 /* Call a function exactly once. The idea of DO_ONCE() is to perform
  * a function call such as initialization of random seeds, etc, only
  * once, where DO_ONCE() can live in the fast-path. After @func has
@@ -52,9 +60,29 @@  void __do_once_done(bool *done, struct static_key_true *once_key,
 		___ret;							     \
 	})
 
+/* Variant of DO_ONCE() for process/sleepable contexts. */
+#define DO_ONCE_SLOW(func, ...)						     \
+	({								     \
+		bool ___ret = false;					     \
+		static bool __section(".data.once") ___done = false;	     \
+		static DEFINE_STATIC_KEY_TRUE(___once_key);		     \
+		if (static_branch_unlikely(&___once_key)) {		     \
+			___ret = __do_once_slow_start(&___done);	     \
+			if (unlikely(___ret)) {				     \
+				func(__VA_ARGS__);			     \
+				__do_once_slow_done(&___done, &___once_key,  \
+						    THIS_MODULE);	     \
+			}						     \
+		}							     \
+		___ret;							     \
+	})
+
 #define get_random_once(buf, nbytes)					     \
 	DO_ONCE(get_random_bytes, (buf), (nbytes))
 #define get_random_once_wait(buf, nbytes)                                    \
 	DO_ONCE(get_random_bytes_wait, (buf), (nbytes))                      \
 
+#define get_random_slow_once(buf, nbytes)				     \
+	DO_ONCE_SLOW(get_random_bytes, (buf), (nbytes))
+
 #endif /* _LINUX_ONCE_H */
diff --git a/lib/once.c b/lib/once.c
index 59149bf3bfb4..351f66aad310 100644
--- a/lib/once.c
+++ b/lib/once.c
@@ -66,3 +66,33 @@  void __do_once_done(bool *done, struct static_key_true *once_key,
 	once_disable_jump(once_key, mod);
 }
 EXPORT_SYMBOL(__do_once_done);
+
+static DEFINE_MUTEX(once_mutex);
+
+bool __do_once_slow_start(bool *done)
+	__acquires(once_mutex)
+{
+	mutex_lock(&once_mutex);
+	if (*done) {
+		mutex_unlock(&once_mutex);
+		/* Keep sparse happy by restoring an even lock count on
+		 * this mutex. In case we return here, we don't call into
+		 * __do_once_done but return early in the DO_ONCE_SLOW() macro.
+		 */
+		__acquire(once_mutex);
+		return false;
+	}
+
+	return true;
+}
+EXPORT_SYMBOL(__do_once_slow_start);
+
+void __do_once_slow_done(bool *done, struct static_key_true *once_key,
+			 struct module *mod)
+	__releases(once_mutex)
+{
+	*done = true;
+	mutex_unlock(&once_mutex);
+	once_disable_jump(once_key, mod);
+}
+EXPORT_SYMBOL(__do_once_slow_done);
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 5295a579ec82..70070f1003a0 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -765,8 +765,8 @@  int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 	if (likely(remaining > 1))
 		remaining &= ~1U;
 
-	net_get_random_once(table_perturb,
-			    INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb));
+	get_random_slow_once(table_perturb,
+			     INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb));
 	index = port_offset & (INET_TABLE_PERTURB_SIZE - 1);
 
 	offset = READ_ONCE(table_perturb[index]) + (port_offset >> 32);