Commit Message
Greg KH
Oct. 24, 2022, 11:30 a.m. UTC
From: Eric Dumazet <edumazet@google.com> [ Upstream commit 62c07983bef9d3e78e71189441e1a470f0d1e653 ] Christophe Leroy reported a ~80ms latency spike happening at first TCP connect() time. This is because __inet_hash_connect() uses get_random_once() to populate a perturbation table which became quite big after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16") get_random_once() uses DO_ONCE(), which block hard irqs for the duration of the operation. This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock for operations where we prefer to stay in process context. Then __inet_hash_connect() can use get_random_slow_once() to populate its perturbation table. Fixes: 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16") Fixes: 190cc82489f4 ("tcp: change source port randomizarion at connect() time") Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#t Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> --- include/linux/once.h | 28 ++++++++++++++++++++++++++++ lib/once.c | 30 ++++++++++++++++++++++++++++++ net/ipv4/inet_hashtables.c | 4 ++-- 3 files changed, 60 insertions(+), 2 deletions(-)
Comments
Hello, As mentioned in the thread for the 5.4 version here[1], it causes a crash for the 4.19 kernel too. Just paste the log here for reference: [ 1375.219830] wlan0: authenticate with 98:da:c4:4f:77:11 [ 1375.230257] wlan0: send auth to 98:da:c4:4f:77:11 (try 1/3) [ 1375.262598] wlan0: authenticated [ 1375.267766] wlan0: associate with 98:da:c4:4f:77:11 (try 1/3) [ 1375.285716] wlan0: RX AssocResp from 98:da:c4:4f:77:11 (capab=0x1411 status=0 aid=1) [ 1375.310066] wlan0: associated [ 1375.332056] wlcore: Association completed. [ 1376.105149] Unable to handle kernel write to read-only memory at virtual address ffffff80092a6f18 [ 1376.114129] Mem abort info: [ 1376.117072] ESR = 0x9600004f [ 1376.120138] Exception class = DABT (current EL), IL = 32 bits [ 1376.126091] SET = 0, FnV = 0 [ 1376.129147] EA = 0, S1PTW = 0 [ 1376.132301] Data abort info: [ 1376.135176] ISV = 0, ISS = 0x0000004f [ 1376.139012] CM = 0, WnR = 1 [ 1376.141997] swapper pgtable: 4k pages, 39-bit VAs, pgdp = 00000000a9880db2 [ 1376.148923] [ffffff80092a6f18] pgd=000000021fffe003, pud=000000021fffe003, pmd=000000021fffb003, pte=00600000012a6793 [ 1376.159550] Internal error: Oops: 9600004f [#1] PREEMPT SMP [ 1376.165119] Modules linked in: [ 1376.168168] Process TlsVerify_100 (pid: 5454, stack limit = 0x0000000025f9c863) [ 1376.175472] CPU: 5 PID: 5454 Comm: TlsVerify_100 Not tainted 4.19.262-g8479d939a3b0 #1 [ 1376.183381] Hardware name: HiKey960 (DT) [ 1376.187295] pstate: 60400005 (nZCv daif +PAN -UAO) [ 1376.192085] pc : __do_once_slow_done+0x14/0x98 [ 1376.196535] lr : __inet_hash_connect+0x480/0x484 [ 1376.201146] sp : ffffff800e203b80 [ 1376.204460] x29: ffffff800e203b80 x28: 0000000000000000 [ 1376.209768] x27: 0000000000006e48 x26: 0000000000000000 [ 1376.215071] x25: ffffff80098e6000 x24: 9b29771fbf0b881c [ 1376.220375] x23: ffffff8009766600 x22: ffffff8009766bc0 [ 1376.225678] x21: ffffff8008c34198 x20: ffffff80098e65c0 [ 1376.230982] x19: ffffffc04cabd1c0 x18: 0000000005c65eec [ 1376.236285] x17: 00000000175fece9 x16: 0000000071679066 [ 1376.241588] x15: 00000000467bf177 x14: 000000005f331e5c [ 1376.246891] x13: 0000000000000014 x12: 00000000b82286ab [ 1376.252194] x11: 0000000084c874ea x10: 00000000fdb36642 [ 1376.257497] x9 : 68962c3381a87000 x8 : 0000000000000001 [ 1376.262800] x7 : 0000000000000000 x6 : ffffffc2197c0000 [ 1376.268103] x5 : 000000008113af5d x4 : 0000000000000008 [ 1376.273406] x3 : 0000000000000030 x2 : 0000000000000000 [ 1376.278709] x1 : ffffff800976e568 x0 : ffffff80092a6f18 [ 1376.284012] Call trace: [ 1376.286456] __do_once_slow_done+0x14/0x98 [ 1376.290544] __inet_hash_connect+0x480/0x484 [ 1376.294805] inet_hash_connect+0x48/0x54 [ 1376.298720] tcp_v4_connect+0x26c/0x3e4 [ 1376.302549] __inet_stream_connect+0x2ac/0x308 [ 1376.306984] inet_stream_connect+0x44/0x68 [ 1376.311072] __sys_connect+0xb4/0x100 [ 1376.314725] __arm64_sys_connect+0x1c/0x28 [ 1376.318816] el0_svc_common+0xa4/0x188 [ 1376.322556] el0_svc_handler+0x60/0x68 [ 1376.326298] el0_svc+0x8/0x300 [ 1376.329345] Code: f9000bf5 a9024ff4 910003fd 52800028 (39000008) [ 1376.335431] ---[ end trace 28410d3ffccfb491 ]--- [ 1376.351416] Kernel panic - not syncing: Fatal exception [ 1376.356640] SMP: stopping secondary CPUs [ 1376.360775] Kernel Offset: disabled [ 1376.364257] CPU features: 0x10,20082004 [ 1376.368083] Memory Limit: none [ 1376.382434] Rebooting in 5 seconds.. [1]: https://lore.kernel.org/lkml/20221029011211.4049810-1-ovt@google.com/ Thanks, Yongqin Liu
On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote: > Hello, > > As mentioned in the thread for the 5.4 version here[1], it causes a > crash for the 4.19 kernel too. > Just paste the log here for reference: Can you try this patch please: diff --git a/include/linux/once.h b/include/linux/once.h index bb58e1c3aa03..3a6671d961b9 100644 --- a/include/linux/once.h +++ b/include/linux/once.h @@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key, #define DO_ONCE_SLOW(func, ...) \ ({ \ bool ___ret = false; \ - static bool __section(".data.once") ___done = false; \ + static bool __section(.data.once) ___done = false; \ static DEFINE_STATIC_KEY_TRUE(___once_key); \ if (static_branch_unlikely(&___once_key)) { \ ___ret = __do_once_slow_start(&___done); \
Hi, Greg On Tue, 1 Nov 2022 at 14:26, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote: > > Hello, > > > > As mentioned in the thread for the 5.4 version here[1], it causes a > > crash for the 4.19 kernel too. > > Just paste the log here for reference: > > Can you try this patch please: > > > diff --git a/include/linux/once.h b/include/linux/once.h > index bb58e1c3aa03..3a6671d961b9 100644 > --- a/include/linux/once.h > +++ b/include/linux/once.h > @@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key, > #define DO_ONCE_SLOW(func, ...) \ > ({ \ > bool ___ret = false; \ > - static bool __section(".data.once") ___done = false; \ > + static bool __section(.data.once) ___done = false; \ > static DEFINE_STATIC_KEY_TRUE(___once_key); \ > if (static_branch_unlikely(&___once_key)) { \ > ___ret = __do_once_slow_start(&___done); \ This change works, it does not cause kernel panic again after this change is applied.
On Tue, Nov 01, 2022 at 08:00:03PM +0800, Yongqin Liu wrote: > Hi, Greg > > On Tue, 1 Nov 2022 at 14:26, Greg Kroah-Hartman > <gregkh@linuxfoundation.org> wrote: > > > > On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote: > > > Hello, > > > > > > As mentioned in the thread for the 5.4 version here[1], it causes a > > > crash for the 4.19 kernel too. > > > Just paste the log here for reference: > > > > Can you try this patch please: > > > > > > diff --git a/include/linux/once.h b/include/linux/once.h > > index bb58e1c3aa03..3a6671d961b9 100644 > > --- a/include/linux/once.h > > +++ b/include/linux/once.h > > @@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key, > > #define DO_ONCE_SLOW(func, ...) \ > > ({ \ > > bool ___ret = false; \ > > - static bool __section(".data.once") ___done = false; \ > > + static bool __section(.data.once) ___done = false; \ > > static DEFINE_STATIC_KEY_TRUE(___once_key); \ > > if (static_branch_unlikely(&___once_key)) { \ > > ___ret = __do_once_slow_start(&___done); \ > > > This change works, it does not cause kernel panic again after this > change is applied. Great, thanks! Can I get a Tested-by: line for the changelog? I'll queue this up in a bit and get it fixed in the next release. thanks, greg k-h
On Tue, 1 Nov 2022 at 21:34, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > On Tue, Nov 01, 2022 at 08:00:03PM +0800, Yongqin Liu wrote: > > Hi, Greg > > > > On Tue, 1 Nov 2022 at 14:26, Greg Kroah-Hartman > > <gregkh@linuxfoundation.org> wrote: > > > > > > On Tue, Nov 01, 2022 at 02:07:35PM +0800, Yongqin Liu wrote: > > > > Hello, > > > > > > > > As mentioned in the thread for the 5.4 version here[1], it causes a > > > > crash for the 4.19 kernel too. > > > > Just paste the log here for reference: > > > > > > Can you try this patch please: > > > > > > > > > diff --git a/include/linux/once.h b/include/linux/once.h > > > index bb58e1c3aa03..3a6671d961b9 100644 > > > --- a/include/linux/once.h > > > +++ b/include/linux/once.h > > > @@ -64,7 +64,7 @@ void __do_once_slow_done(bool *done, struct static_key_true *once_key, > > > #define DO_ONCE_SLOW(func, ...) \ > > > ({ \ > > > bool ___ret = false; \ > > > - static bool __section(".data.once") ___done = false; \ > > > + static bool __section(.data.once) ___done = false; \ > > > static DEFINE_STATIC_KEY_TRUE(___once_key); \ > > > if (static_branch_unlikely(&___once_key)) { \ > > > ___ret = __do_once_slow_start(&___done); \ > > > > > > This change works, it does not cause kernel panic again after this > > change is applied. > > Great, thanks! Can I get a Tested-by: line for the changelog? Sure, Yongqin Liu <yongqin.liu@linaro.org> > I'll queue this up in a bit and get it fixed in the next release. BTW, to be clear, I tested it with one 4.19-q tree based on the latest ACK android-4.19-q branch[1], If there is something else I could help test, please let me know. [1]: https://android.googlesource.com/kernel/common/+/refs/heads/android-4.19-q
diff --git a/include/linux/once.h b/include/linux/once.h index ae6f4eb41cbe..bb58e1c3aa03 100644 --- a/include/linux/once.h +++ b/include/linux/once.h @@ -5,10 +5,18 @@ #include <linux/types.h> #include <linux/jump_label.h> +/* Helpers used from arbitrary contexts. + * Hard irqs are blocked, be cautious. + */ bool __do_once_start(bool *done, unsigned long *flags); void __do_once_done(bool *done, struct static_key_true *once_key, unsigned long *flags, struct module *mod); +/* Variant for process contexts only. */ +bool __do_once_slow_start(bool *done); +void __do_once_slow_done(bool *done, struct static_key_true *once_key, + struct module *mod); + /* Call a function exactly once. The idea of DO_ONCE() is to perform * a function call such as initialization of random seeds, etc, only * once, where DO_ONCE() can live in the fast-path. After @func has @@ -52,9 +60,29 @@ void __do_once_done(bool *done, struct static_key_true *once_key, ___ret; \ }) +/* Variant of DO_ONCE() for process/sleepable contexts. */ +#define DO_ONCE_SLOW(func, ...) \ + ({ \ + bool ___ret = false; \ + static bool __section(".data.once") ___done = false; \ + static DEFINE_STATIC_KEY_TRUE(___once_key); \ + if (static_branch_unlikely(&___once_key)) { \ + ___ret = __do_once_slow_start(&___done); \ + if (unlikely(___ret)) { \ + func(__VA_ARGS__); \ + __do_once_slow_done(&___done, &___once_key, \ + THIS_MODULE); \ + } \ + } \ + ___ret; \ + }) + #define get_random_once(buf, nbytes) \ DO_ONCE(get_random_bytes, (buf), (nbytes)) #define get_random_once_wait(buf, nbytes) \ DO_ONCE(get_random_bytes_wait, (buf), (nbytes)) \ +#define get_random_slow_once(buf, nbytes) \ + DO_ONCE_SLOW(get_random_bytes, (buf), (nbytes)) + #endif /* _LINUX_ONCE_H */ diff --git a/lib/once.c b/lib/once.c index 59149bf3bfb4..351f66aad310 100644 --- a/lib/once.c +++ b/lib/once.c @@ -66,3 +66,33 @@ void __do_once_done(bool *done, struct static_key_true *once_key, once_disable_jump(once_key, mod); } EXPORT_SYMBOL(__do_once_done); + +static DEFINE_MUTEX(once_mutex); + +bool __do_once_slow_start(bool *done) + __acquires(once_mutex) +{ + mutex_lock(&once_mutex); + if (*done) { + mutex_unlock(&once_mutex); + /* Keep sparse happy by restoring an even lock count on + * this mutex. In case we return here, we don't call into + * __do_once_done but return early in the DO_ONCE_SLOW() macro. + */ + __acquire(once_mutex); + return false; + } + + return true; +} +EXPORT_SYMBOL(__do_once_slow_start); + +void __do_once_slow_done(bool *done, struct static_key_true *once_key, + struct module *mod) + __releases(once_mutex) +{ + *done = true; + mutex_unlock(&once_mutex); + once_disable_jump(once_key, mod); +} +EXPORT_SYMBOL(__do_once_slow_done); diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 5295a579ec82..70070f1003a0 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -765,8 +765,8 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, if (likely(remaining > 1)) remaining &= ~1U; - net_get_random_once(table_perturb, - INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb)); + get_random_slow_once(table_perturb, + INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb)); index = port_offset & (INET_TABLE_PERTURB_SIZE - 1); offset = READ_ONCE(table_perturb[index]) + (port_offset >> 32);