Message ID | 20230616191744.202292-1-jlayton@kernel.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp1567138vqr; Fri, 16 Jun 2023 12:27:41 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6zj638ODomSeewxSkFQKYrSIIfcWyNYmzSbbnCXwezWm8vyLR4x2HzWiJPLucblzrObym3 X-Received: by 2002:a05:6a20:1455:b0:11f:1b6f:6658 with SMTP id a21-20020a056a20145500b0011f1b6f6658mr2117864pzi.11.1686943661479; Fri, 16 Jun 2023 12:27:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686943661; cv=none; d=google.com; s=arc-20160816; b=MlMjjHaD89uHw70wcDpt09Z7bFacILc0+PzP3ol/4pTcDUrbsDch2e6ieey5pU3gyv 5MYQIav3KQFS/6UYYEY75+86siCAvAeQTsDQQ47p1BVL+cgMPye+d+xvPXDYIhn9+Dne g/IORLZ9LPVlvKub4rGjyVAcsxujpU/LxLBS/6id8AYujtH1bhlCbVLp4FZLtReDemgU 19KJQ8xdOfJ08sHRa0L/fnvEUEfeywfvhgAl2RKJAytjAThEKx+qQ6kLIpZas+2g6IbS BIWYe43ofJiBs+qLpB+j57Gf4fIib3t/2fqSuUmHkzG83BRs9srC7jjN/+eZSAuySbo+ miaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=KAdPyDhhAxQcL3i9qsQ7EyKxqC4b7jcAbz9dFxEaZPQ=; b=UoVCM/BSSKd8oGPNBIY+okoiRO51tUDFQug/iMYz+nEfKtn2z2M0Zoc+i87TBWTFqW kAwZpwmj87MZvxXyWCu5OHMC+mwxyb7dFUhypPPIf7pTIyoxVYapf59m5vPnI3WwOTiV AmHMzsNQJmvAC1NOxenWJg17pzd1I3EuqHEi1LKxC876n174Q0nYiXBKNIycnmFgKg/h sG6JZEen+6R/TOUU1fCT2VzCaqCJHBAQLR2yekuB0V3jju7Mi0r1kfiatdky7Y9vCcmq zlIykVxm/VM7XsfXxyIPxHkiiKwNWTpDhFiJ5bNSKILAZdVDZQvmehKWxmTR7GMrv3I4 xtFg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ZETvxHnd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r16-20020aa79890000000b00666e788d909si821229pfl.147.2023.06.16.12.27.27; Fri, 16 Jun 2023 12:27:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ZETvxHnd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346114AbjFPTVk (ORCPT <rfc822;maxin.john@gmail.com> + 99 others); Fri, 16 Jun 2023 15:21:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49046 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231154AbjFPTVW (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 16 Jun 2023 15:21:22 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1CF32524A; Fri, 16 Jun 2023 12:19:04 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 99902634BF; Fri, 16 Jun 2023 19:17:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 382D2C433C8; Fri, 16 Jun 2023 19:17:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1686943067; bh=CAmP+Lx0wzqgHUs9vsC25lJNjq99NNhM1Mo48RueGyc=; h=From:To:Cc:Subject:Date:From; b=ZETvxHndWQKmZsWWEKvvXIR1NvhG0E+JQ9XMa5/v0HqoF1ZdzNxZ/GXDMO1LaUO4z dTo8UADnCGD6UZuvo5lDy70OAvdWz0SEcdnue4m1STiOYVAY/m8pN9g8X7LsAjmFCV zok7BR7RRfuq5+ikc05b+FO7Rmn7O0Pbe+upHOn3mYLkVA10AGC4zNbpMXNNioCN16 1PTOFhDX5zn5YaV8YcoVZgp0uT2FEmQOco1kT6lgBXkuDqdNO3Mnf7uVEjoZB0QGZs qujzgPiXMdTyhI6XZL1nnpWuTmGkiAdXp5cymmX88O+lp0TYmei3kIU55NqFC4tDn4 VeQ7NL4ofgUqw== From: Jeff Layton <jlayton@kernel.org> To: Chuck Lever <chuck.lever@oracle.com>, Neil Brown <neilb@suse.de>, Olga Kornievskaia <kolga@netapp.com>, Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com> Cc: stable@vger.kernel.org, Eirik Fuller <efuller@redhat.com>, linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net Date: Fri, 16 Jun 2023 15:17:43 -0400 Message-Id: <20230616191744.202292-1-jlayton@kernel.org> X-Mailer: git-send-email 2.40.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768888636654878397?= X-GMAIL-MSGID: =?utf-8?q?1768888636654878397?= |
Series |
nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
|
|
Commit Message
Jeff Layton
June 16, 2023, 7:17 p.m. UTC
f5f9d4a314da moved the initialization of the reply cache into the nfsd
startup, but it didn't account for the stats counters which can be
accessed before nfsd is ever started, causing a NULL pointer
dereference.
This is easy to trigger on some arches (like aarch64), but on x86_64,
calling this_cpu_ptr(NULL) evidently returns a pointer to the
fixed_percpu_data, which I guess this looks just enough like a newly
initialized percpu var to allow nfsd_reply_cache_stats_show to access it
without Oopsing.
Move the initialization of the per-net+per-cpu reply-cache counters back
into nfsd_init_net, while leaving the rest of the reply cache
allocations to be done at nfsd startup time.
Kudos to Eirik who did most of the legwork to track this down.
Cc: stable@vger.kernel.org # v6.3+
Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Reported-and-Tested-by: Eirik Fuller <efuller@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/cache.h | 2 ++
fs/nfsd/nfscache.c | 13 +++----------
fs/nfsd/nfsctl.c | 8 ++++++++
3 files changed, 13 insertions(+), 10 deletions(-)
Comments
Thanks Eirik and Jeff. At this point in the release cycle, I plan to apply this for the next merge window (6.5). A few cosmetic remarks below. I can take this as-is and make adjustments, or you can redrive. Just let me know. On Fri, Jun 16, 2023 at 03:17:43PM -0400, Jeff Layton wrote: > f5f9d4a314da moved the initialization of the reply cache into the nfsd > startup, but it didn't account for the stats counters which can be > accessed before nfsd is ever started, causing a NULL pointer > dereference. > > This is easy to trigger on some arches (like aarch64), but on x86_64, > calling this_cpu_ptr(NULL) evidently returns a pointer to the > fixed_percpu_data, which I guess this looks just enough like a newly > initialized percpu var to allow nfsd_reply_cache_stats_show to access it > without Oopsing. > > Move the initialization of the per-net+per-cpu reply-cache counters back > into nfsd_init_net, while leaving the rest of the reply cache > allocations to be done at nfsd startup time. > > Kudos to Eirik who did most of the legwork to track this down. > > Cc: stable@vger.kernel.org # v6.3+ > Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") Why both Fixes: and Cc: stable? > Reported-and-Tested-by: Eirik Fuller <efuller@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2215429 ? > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > fs/nfsd/cache.h | 2 ++ > fs/nfsd/nfscache.c | 13 +++---------- > fs/nfsd/nfsctl.c | 8 ++++++++ > 3 files changed, 13 insertions(+), 10 deletions(-) > > diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h > index f21259ead64b..a4b12d6c41d3 100644 > --- a/fs/nfsd/cache.h > +++ b/fs/nfsd/cache.h > @@ -80,6 +80,8 @@ enum { > > int nfsd_drc_slab_create(void); > void nfsd_drc_slab_free(void); > +int nfsd_reply_cache_stats_init(struct nfsd_net *nn); > +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn); > int nfsd_reply_cache_init(struct nfsd_net *); > void nfsd_reply_cache_shutdown(struct nfsd_net *); > int nfsd_cache_lookup(struct svc_rqst *); > diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c > index 041faa13b852..b696dc629c0b 100644 > --- a/fs/nfsd/nfscache.c > +++ b/fs/nfsd/nfscache.c > @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void) > kmem_cache_destroy(drc_slab); > } > > -static int nfsd_reply_cache_stats_init(struct nfsd_net *nn) > +int nfsd_reply_cache_stats_init(struct nfsd_net *nn) As part of making these two functions into a more public API, I would prefer to rename this nfsd_net_reply_cache_init(), and it should include a KDOC comment out front. I'm having some trouble easily distinguishing between the purpose of static __net_init int nfsd_init_net(struct net *net) and static int nfsd_startup_net(struct net *net, const struct cred *cred) The former is invoked when a net namespace is created. The latter is called by write_threads, therefore /proc/fs/nfsd/ must already be mounted. The function names are confusingly similar and there's no KDOC to be found. I might add a clean-up patch to this one. > { > return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); > } > > -static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) > +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) Ditto, rename this nfsd_net_reply_cache_destroy(), plus KDOC. > { > nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); > } > @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) > hashsize = nfsd_hashsize(nn->max_drc_entries); > nn->maskbits = ilog2(hashsize); > > - status = nfsd_reply_cache_stats_init(nn); > - if (status) > - goto out_nomem; > - > nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; > nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; > nn->nfsd_reply_cache_shrinker.seeks = 1; > status = register_shrinker(&nn->nfsd_reply_cache_shrinker, > "nfsd-reply:%s", nn->nfsd_name); > if (status) > - goto out_stats_destroy; > + return status; > > nn->drc_hashtbl = kvzalloc(array_size(hashsize, > sizeof(*nn->drc_hashtbl)), GFP_KERNEL); > @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) > return 0; > out_shrinker: > unregister_shrinker(&nn->nfsd_reply_cache_shrinker); > -out_stats_destroy: > - nfsd_reply_cache_stats_destroy(nn); > -out_nomem: > printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); > return -ENOMEM; > } > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c > index 1489e0b703b4..7c837afcf615 100644 > --- a/fs/nfsd/nfsctl.c > +++ b/fs/nfsd/nfsctl.c > @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net) > retval = nfsd_idmap_init(net); > if (retval) > goto out_idmap_error; > + retval = nfsd_reply_cache_stats_init(nn); > + if (retval) > + goto out_repcache_error; > nn->nfsd_versions = NULL; > nn->nfsd4_minorversions = NULL; > nfsd4_init_leases_net(nn); > @@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net) > > return 0; > > +out_repcache_error: > + nfsd_idmap_shutdown(net); > out_idmap_error: > nfsd_export_shutdown(net); > out_export_error: > @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net) > > static __net_exit void nfsd_exit_net(struct net *net) > { > + struct nfsd_net *nn = net_generic(net, nfsd_net_id); > + > + nfsd_reply_cache_stats_destroy(nn); > nfsd_idmap_shutdown(net); > nfsd_export_shutdown(net); > nfsd_netns_free_versions(net_generic(net, nfsd_net_id)); We should update this nfsd_netns_free_versions() call site to take the new @nn variable. > -- > 2.40.1 >
On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote: > Thanks Eirik and Jeff. > > At this point in the release cycle, I plan to apply this for the > next merge window (6.5). > I think we should take this in sooner. This is a regression and a user-triggerable oops in the right situation. If: - non-x86_64 arch - /proc/fs/nfsd is mounted in the namespace - nfsd is not started in the namespace - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats" > A few cosmetic remarks below. I can take this as-is and make > adjustments, or you can redrive. Just let me know. > > I'll plan to redrive it. > On Fri, Jun 16, 2023 at 03:17:43PM -0400, Jeff Layton wrote: > > f5f9d4a314da moved the initialization of the reply cache into the nfsd > > startup, but it didn't account for the stats counters which can be > > accessed before nfsd is ever started, causing a NULL pointer > > dereference. > > > > This is easy to trigger on some arches (like aarch64), but on x86_64, > > calling this_cpu_ptr(NULL) evidently returns a pointer to the > > fixed_percpu_data, which I guess this looks just enough like a newly > > initialized percpu var to allow nfsd_reply_cache_stats_show to access it > > without Oopsing. > > > > Move the initialization of the per-net+per-cpu reply-cache counters back > > into nfsd_init_net, while leaving the rest of the reply cache > > allocations to be done at nfsd startup time. > > > > Kudos to Eirik who did most of the legwork to track this down. > > > > Cc: stable@vger.kernel.org # v6.3+ > > Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") > > Why both Fixes: and Cc: stable? > > *shrug* : they mean different things. I can drop the Cc stable. > > Reported-and-Tested-by: Eirik Fuller <efuller@redhat.com> > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2215429 ? > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > fs/nfsd/cache.h | 2 ++ > > fs/nfsd/nfscache.c | 13 +++---------- > > fs/nfsd/nfsctl.c | 8 ++++++++ > > 3 files changed, 13 insertions(+), 10 deletions(-) > > > > diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h > > index f21259ead64b..a4b12d6c41d3 100644 > > --- a/fs/nfsd/cache.h > > +++ b/fs/nfsd/cache.h > > @@ -80,6 +80,8 @@ enum { > > > > int nfsd_drc_slab_create(void); > > void nfsd_drc_slab_free(void); > > +int nfsd_reply_cache_stats_init(struct nfsd_net *nn); > > +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn); > > int nfsd_reply_cache_init(struct nfsd_net *); > > void nfsd_reply_cache_shutdown(struct nfsd_net *); > > int nfsd_cache_lookup(struct svc_rqst *); > > diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c > > index 041faa13b852..b696dc629c0b 100644 > > --- a/fs/nfsd/nfscache.c > > +++ b/fs/nfsd/nfscache.c > > @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void) > > kmem_cache_destroy(drc_slab); > > } > > > > -static int nfsd_reply_cache_stats_init(struct nfsd_net *nn) > > +int nfsd_reply_cache_stats_init(struct nfsd_net *nn) > > As part of making these two functions into a more public API, I > would prefer to rename this nfsd_net_reply_cache_init(), and it > should include a KDOC comment out front. > > I'm having some trouble easily distinguishing between the purpose of > > static __net_init int nfsd_init_net(struct net *net) > > and > > static int nfsd_startup_net(struct net *net, const struct cred *cred) > > The former is invoked when a net namespace is created. The latter is > called by write_threads, therefore /proc/fs/nfsd/ must already be > mounted. > > The function names are confusingly similar and there's no KDOC to be > found. I might add a clean-up patch to this one. > I can add some kdoc patches in v2. > > { > > return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); > > } > > > > -static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) > > +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) > > Ditto, rename this nfsd_net_reply_cache_destroy(), plus KDOC. > > Ok. > > { > > nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); > > } > > @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) > > hashsize = nfsd_hashsize(nn->max_drc_entries); > > nn->maskbits = ilog2(hashsize); > > > > - status = nfsd_reply_cache_stats_init(nn); > > - if (status) > > - goto out_nomem; > > - > > nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; > > nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; > > nn->nfsd_reply_cache_shrinker.seeks = 1; > > status = register_shrinker(&nn->nfsd_reply_cache_shrinker, > > "nfsd-reply:%s", nn->nfsd_name); > > if (status) > > - goto out_stats_destroy; > > + return status; > > > > nn->drc_hashtbl = kvzalloc(array_size(hashsize, > > sizeof(*nn->drc_hashtbl)), GFP_KERNEL); > > @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) > > return 0; > > out_shrinker: > > unregister_shrinker(&nn->nfsd_reply_cache_shrinker); > > -out_stats_destroy: > > - nfsd_reply_cache_stats_destroy(nn); > > -out_nomem: > > printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); > > return -ENOMEM; > > } > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c > > index 1489e0b703b4..7c837afcf615 100644 > > --- a/fs/nfsd/nfsctl.c > > +++ b/fs/nfsd/nfsctl.c > > @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net) > > retval = nfsd_idmap_init(net); > > if (retval) > > goto out_idmap_error; > > + retval = nfsd_reply_cache_stats_init(nn); > > + if (retval) > > + goto out_repcache_error; > > nn->nfsd_versions = NULL; > > nn->nfsd4_minorversions = NULL; > > nfsd4_init_leases_net(nn); > > @@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net) > > > > return 0; > > > > +out_repcache_error: > > + nfsd_idmap_shutdown(net); > > out_idmap_error: > > nfsd_export_shutdown(net); > > out_export_error: > > @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net) > > > > static __net_exit void nfsd_exit_net(struct net *net) > > { > > + struct nfsd_net *nn = net_generic(net, nfsd_net_id); > > + > > + nfsd_reply_cache_stats_destroy(nn); > > nfsd_idmap_shutdown(net); > > nfsd_export_shutdown(net); > > nfsd_netns_free_versions(net_generic(net, nfsd_net_id)); > > We should update this nfsd_netns_free_versions() call site to take > the new @nn variable. > Ahh yes. Will fix. I'll plan to send a v2 with the changes you suggest. Thanks,
Linux regression tracking (Thorsten Leemhuis)
June 18, 2023, 10:40 a.m. UTC |
#3
Addressed
Unaddressed
On 16.06.23 22:54, Jeff Layton wrote: > On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote: >> Thanks Eirik and Jeff. >> >> At this point in the release cycle, I plan to apply this for the >> next merge window (6.5). > > I think we should take this in sooner. This is a regression and a > user-triggerable oops in the right situation. If: > > - non-x86_64 arch > - /proc/fs/nfsd is mounted in the namespace > - nfsd is not started in the namespace > - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats" FWIW, might be worth to simply tell Linus about it and let him decide, that's totally fine and even documented in the old and the new docs for handling regressions[1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2 >>> Cc: stable@vger.kernel.org # v6.3+ >>> Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") >> >> Why both Fixes: and Cc: stable? > > *shrug* : they mean different things. I can drop the Cc stable. Please leave it, only a stable tag ensures backporting; a fixes tag alone is not enough. See [1] above or these recent messages from Greg: https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/ Ciao, Thorsten
On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote: > On 16.06.23 22:54, Jeff Layton wrote: > > On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote: > > > Thanks Eirik and Jeff. > > > > > > At this point in the release cycle, I plan to apply this for the > > > next merge window (6.5). > > > > I think we should take this in sooner. This is a regression and a > > user-triggerable oops in the right situation. If: > > > > - non-x86_64 arch > > - /proc/fs/nfsd is mounted in the namespace > > - nfsd is not started in the namespace > > - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats" > > FWIW, might be worth to simply tell Linus about it and let him decide, > that's totally fine and even documented in the old and the new docs for > handling regressions[1]. > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2 > I'd rather Chuck make the final call here. The original patch description didn't point out how easy it is to trigger a panic with this, so I was hoping to convince him. To further that argument too: I have to wonder if this bug might cause (temporary?) memory corruption on x86_64. The code hits a spinlock in that struct, so there may be a window of time where it doesn't contain what's expected. > > > > Cc: stable@vger.kernel.org # v6.3+ > > > > Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") > > > > > > Why both Fixes: and Cc: stable? > > > > *shrug* : they mean different things. I can drop the Cc stable. > > Please leave it, only a stable tag ensures backporting; a fixes tag > alone is not enough. See [1] above or these recent messages from Greg: > > https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ > https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/ > Chuck and I also recently requested that the stable series not pick patches automatically for fs/nfsd. This does need to be backported though, so I cc'ed stable to make that clear.
Linux regression tracking (Thorsten Leemhuis)
June 18, 2023, 2:02 p.m. UTC |
#5
Addressed
Unaddressed
On 18.06.23 14:09, Jeff Layton wrote: > On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote: >> On 16.06.23 22:54, Jeff Layton wrote: >>> On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote: >> >> FWIW, might be worth to simply tell Linus about it and let him decide, >> that's totally fine and even documented in the old and the new docs for >> handling regressions[1]. >> >> [1] >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2 > > I'd rather Chuck make the final call here. Totally fine for me, I just wanted to remind folks that this option exist, as I got the impression people forget it or fear it might annoy Linux. :D >>>>> Cc: stable@vger.kernel.org # v6.3+ >>>>> Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") >>>> Why both Fixes: and Cc: stable? >>> *shrug* : they mean different things. I can drop the Cc stable. >> >> Please leave it, only a stable tag ensures backporting; a fixes tag >> alone is not enough. See [1] above or these recent messages from Greg: >> >> https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ >> https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/ > > Chuck and I also recently requested that the stable series not pick > patches automatically for fs/nfsd. This does need to be backported > though, so I cc'ed stable to make that clear. Great, many thx! Ciao, thorsten
> On Jun 18, 2023, at 8:09 AM, Jeff Layton <jlayton@kernel.org> wrote: > > On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote: >> On 16.06.23 22:54, Jeff Layton wrote: >>> On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote: >>>> Thanks Eirik and Jeff. >>>> >>>> At this point in the release cycle, I plan to apply this for the >>>> next merge window (6.5). >>> >>> I think we should take this in sooner. This is a regression and a >>> user-triggerable oops in the right situation. If: >>> >>> - non-x86_64 arch >>> - /proc/fs/nfsd is mounted in the namespace >>> - nfsd is not started in the namespace >>> - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats" >> >> FWIW, might be worth to simply tell Linus about it and let him decide, >> that's totally fine and even documented in the old and the new docs for >> handling regressions[1]. >> >> [1] >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2 >> > > I'd rather Chuck make the final call here. Thanks! I feel this one needs broader testing than we can manage in just a couple of days. If this were earlier in the -rc cycle I would pull the patch right into 6.4-rc without hesitation. It is obviously -rc material, but the timing is unfortunate. I'm planning the nfsd for-6.5 pull request early in the merge window, so practically speaking it shouldn't delay the finalized upstream version of this patch by more than a few days. > The original patch > description didn't point out how easy it is to trigger a panic with > this, I will add that information. > so I was hoping to convince him. Oh, I agree it's significant. I just don't want to compound the problem by sending a possibly-buggy patch at the last moment in the 6.4 cycle. When we have our shiny new CI infrastructure in place, we will be able to move faster and with more confidence on fixes this late in a cycle. > To further that argument too: > > I have to wonder if this bug might cause (temporary?) memory corruption > on x86_64. The code hits a spinlock in that struct, so there may be a > window of time where it doesn't contain what's expected. > >>>>> Cc: stable@vger.kernel.org # v6.3+ >>>>> Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") >>>> >>>> Why both Fixes: and Cc: stable? >>> >>> *shrug* : they mean different things. I can drop the Cc stable. >> >> Please leave it, only a stable tag ensures backporting; a fixes tag >> alone is not enough. See [1] above or these recent messages from Greg: >> >> https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/ >> https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/ > > Chuck and I also recently requested that the stable series not pick > patches automatically for fs/nfsd. To be clear, we requested that stable not pick up patches for fs/nfsd using AUTOSEL. Basically that means stable will pick up only fs/nfsd patches that are explicitly marked with Fixes: or Cc:stable. > This does need to be backported > though, so I cc'ed stable to make that clear. OK, I'll add the "cc: stable" back too. My question wasn't so much a demand to drop the tag, but rather a request for an explanation of why both were needed. I'll try to be less terse next time. Thorsten, if you've added this issue to the regbot database, please feel free to follow up with the correct tags to mark the issue closed as appropriate. -- Chuck Lever
On Sun, 2023-06-18 at 15:59 +0000, Chuck Lever III wrote: > > > On Jun 18, 2023, at 8:09 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote: > > > On 16.06.23 22:54, Jeff Layton wrote: > > > > On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote: > > > > > Thanks Eirik and Jeff. > > > > > > > > > > At this point in the release cycle, I plan to apply this for the > > > > > next merge window (6.5). > > > > > > > > I think we should take this in sooner. This is a regression and a > > > > user-triggerable oops in the right situation. If: > > > > > > > > - non-x86_64 arch > > > > - /proc/fs/nfsd is mounted in the namespace > > > > - nfsd is not started in the namespace > > > > - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats" > > > > > > FWIW, might be worth to simply tell Linus about it and let him decide, > > > that's totally fine and even documented in the old and the new docs for > > > handling regressions[1]. > > > > > > [1] > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2 > > > > > > > I'd rather Chuck make the final call here. > > Thanks! I feel this one needs broader testing than we can manage > in just a couple of days. If this were earlier in the -rc cycle > I would pull the patch right into 6.4-rc without hesitation. It > is obviously -rc material, but the timing is unfortunate. > > I'm planning the nfsd for-6.5 pull request early in the merge > window, so practically speaking it shouldn't delay the finalized > upstream version of this patch by more than a few days. > > Ok. I'll trust your judgment then and just cultivate my patience! Cheers,
diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h index f21259ead64b..a4b12d6c41d3 100644 --- a/fs/nfsd/cache.h +++ b/fs/nfsd/cache.h @@ -80,6 +80,8 @@ enum { int nfsd_drc_slab_create(void); void nfsd_drc_slab_free(void); +int nfsd_reply_cache_stats_init(struct nfsd_net *nn); +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn); int nfsd_reply_cache_init(struct nfsd_net *); void nfsd_reply_cache_shutdown(struct nfsd_net *); int nfsd_cache_lookup(struct svc_rqst *); diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 041faa13b852..b696dc629c0b 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void) kmem_cache_destroy(drc_slab); } -static int nfsd_reply_cache_stats_init(struct nfsd_net *nn) +int nfsd_reply_cache_stats_init(struct nfsd_net *nn) { return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM); } -static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn) { nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM); } @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) hashsize = nfsd_hashsize(nn->max_drc_entries); nn->maskbits = ilog2(hashsize); - status = nfsd_reply_cache_stats_init(nn); - if (status) - goto out_nomem; - nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; nn->nfsd_reply_cache_shrinker.seeks = 1; status = register_shrinker(&nn->nfsd_reply_cache_shrinker, "nfsd-reply:%s", nn->nfsd_name); if (status) - goto out_stats_destroy; + return status; nn->drc_hashtbl = kvzalloc(array_size(hashsize, sizeof(*nn->drc_hashtbl)), GFP_KERNEL); @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) return 0; out_shrinker: unregister_shrinker(&nn->nfsd_reply_cache_shrinker); -out_stats_destroy: - nfsd_reply_cache_stats_destroy(nn); -out_nomem: printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); return -ENOMEM; } diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 1489e0b703b4..7c837afcf615 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net) retval = nfsd_idmap_init(net); if (retval) goto out_idmap_error; + retval = nfsd_reply_cache_stats_init(nn); + if (retval) + goto out_repcache_error; nn->nfsd_versions = NULL; nn->nfsd4_minorversions = NULL; nfsd4_init_leases_net(nn); @@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net) return 0; +out_repcache_error: + nfsd_idmap_shutdown(net); out_idmap_error: nfsd_export_shutdown(net); out_export_error: @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net) static __net_exit void nfsd_exit_net(struct net *net) { + struct nfsd_net *nn = net_generic(net, nfsd_net_id); + + nfsd_reply_cache_stats_destroy(nn); nfsd_idmap_shutdown(net); nfsd_export_shutdown(net); nfsd_netns_free_versions(net_generic(net, nfsd_net_id));