Message ID | 20230626123252.73dbc139@endymion.delvare |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp7392201vqr; Mon, 26 Jun 2023 04:01:09 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ47YS8R4WFELbA/vP98hLLzY3K+5O0eCQeIPCMft7QylwHxPi5Av9ggpy9Ry1MvmcpSHXoo X-Received: by 2002:a05:6402:da:b0:518:7415:3b61 with SMTP id i26-20020a05640200da00b0051874153b61mr16065849edu.23.1687777268600; Mon, 26 Jun 2023 04:01:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687777268; cv=none; d=google.com; s=arc-20160816; b=ajlZXRu4mQPkvPVf4ckj307tiJgrwiPHgHi1KIJMVXMbNbD0A88f27FB3UwyBt3BXH f3MeIgzmwcuCmx4ZJ7YV6WoRgcMGYx6BeDjCe+wUgXgVrfJhyv1a6m2YJWy2DYB9hnjJ UURlzFRPqVUQ+EYZmxVWEywVI8gxZPktlxOEyW0z+DYnRJvGEEoWurY29wc9ZWKkYPeZ E/9UrW6bJ0NtfJtlnLeE3w7W2oO5/ISRnlyV3lAwG8BNwDh+xGcZusWbMFWVo82vmr/Y 5b4uUVmqVcCzv9Pd7w5G4ERxyQ6A4QAMrx3T8lg/VqYgLVGnfPSdmulo8YqBXADRAmZu Qgog== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :organization:message-id:subject:cc:to:from:date:dkim-signature :dkim-signature; bh=z+HSLECNibzVOo7kYE3yRqNP3uGJcAQpmYEuFSJm/OI=; fh=1jTypjpGUsKe4trrIP5JxHi8HYxRxKg5wQ8OVPRT3hg=; b=xAaRUhS1SGLpszAAZI9qqX5z1WqZBhjpYuyGyjW4Uzm8FkNHpX7Xk5X+a0Oo6qUBHb EQgmqhFgx1DkTtUb1eQ1JSXL6sWv4psUtTHQg0k5BqGEbvgz2doUzImeZj5iJzv99tG2 v+Naymota4DePjs7PmlLh4UmS8NEmPV5yAwsoRvq4KjUuY7bH2uqtUNSSDO28gBnFGqW wMZAb53VXITsXZ/Gp5ma8gHINkRSbhMjewBqhnklgnzHj+/QYcHA7fwycICXZmbH6QD0 +GMx2OdAAMTgQbbmixFHzA70y8z2vZHtklelFBv0y7kcq4dTbt24yCCMEAxSyQE+w5nP QDYQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=eWDLaoiI; dkim=neutral (no key) header.i=@suse.de header.s=susede2_ed25519 header.b=bMbbCan3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n20-20020a056402061400b00519523cfe33si2454296edv.465.2023.06.26.04.00.44; Mon, 26 Jun 2023 04:01:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b=eWDLaoiI; dkim=neutral (no key) header.i=@suse.de header.s=susede2_ed25519 header.b=bMbbCan3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229848AbjFZKdd (ORCPT <rfc822;filip.gregor98@gmail.com> + 99 others); Mon, 26 Jun 2023 06:33:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60346 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229524AbjFZKdb (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 26 Jun 2023 06:33:31 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97B48172A; Mon, 26 Jun 2023 03:32:55 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 237891F74B; Mon, 26 Jun 2023 10:32:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1687775574; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=z+HSLECNibzVOo7kYE3yRqNP3uGJcAQpmYEuFSJm/OI=; b=eWDLaoiIC8nY2Eyf9HoiKnryKS7ZVt9r0tcTunuPJbaiAa53ryeZM0uuc0tMiA3fYq8YR3 rCi4zO0qHieNqI9B1kbnY14hPP2PfRavqTV8yhnJ+jFqCF1RwwFlE4YsqXtFz5zLqEOq8g WLb25zUXSR3mxFJjc1AEWeAllF54qN8= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1687775574; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=z+HSLECNibzVOo7kYE3yRqNP3uGJcAQpmYEuFSJm/OI=; b=bMbbCan38BmV4BchYOSRSYuGzmBcYrHYluBk3qG41KG/GATFQDYN7GtQx5vV2VWd6v6W1c vg2XM2ROl8rR2RBQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id DE9F513905; Mon, 26 Jun 2023 10:32:53 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id VFO4NFVpmWTYDwAAMHmgww (envelope-from <jdelvare@suse.de>); Mon, 26 Jun 2023 10:32:53 +0000 Date: Mon, 26 Jun 2023 12:32:52 +0200 From: Jean Delvare <jdelvare@suse.de> To: Luis Chamberlain <mcgrof@kernel.org> Cc: Michal Hocko <mhocko@suse.com>, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] module: print module name on refcount error Message-ID: <20230626123252.73dbc139@endymion.delvare> Organization: SUSE Linux X-Mailer: Claws Mail 4.0.0 (GTK+ 3.24.34; x86_64-suse-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769762736911541122?= X-GMAIL-MSGID: =?utf-8?q?1769762736911541122?= |
Series |
module: print module name on refcount error
|
|
Commit Message
Jean Delvare
June 26, 2023, 10:32 a.m. UTC
If module_put() triggers a refcount error, include the culprit
module name in the warning message, to easy further investigation of
the issue.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Suggested-by: Michal Hocko <mhocko@suse.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
---
kernel/module/main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Comments
On Mon 26-06-23 12:32:52, Jean Delvare wrote: > If module_put() triggers a refcount error, include the culprit > module name in the warning message, to easy further investigation of > the issue. > > Signed-off-by: Jean Delvare <jdelvare@suse.de> > Suggested-by: Michal Hocko <mhocko@suse.com> > Cc: Luis Chamberlain <mcgrof@kernel.org> > --- > kernel/module/main.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > --- linux-6.3.orig/kernel/module/main.c > +++ linux-6.3/kernel/module/main.c > @@ -850,7 +850,9 @@ void module_put(struct module *module) > if (module) { > preempt_disable(); > ret = atomic_dec_if_positive(&module->refcnt); > - WARN_ON(ret < 0); /* Failed to put refcount */ > + WARN(ret < 0, > + KERN_WARNING "Failed to put refcount for module %s\n", > + module->name); Would it make sense to also print the refcnt here? In our internal bug report it has turned out that this was an overflow (put missing) rather than an underflow (too many put calls). Seeing the value could give a clue about that. We had to configure panic_on_warn to capture a dump to learn more which is rather impractical. Other than that the module information on its own is an improvement because one knows where to start looking or to reduce the tracing data collected. In any case Acked-by: Michal Hocko <mhocko@suse.com> Thanks! > trace_module_put(module, _RET_IP_); > preempt_enable(); > } > > > -- > Jean Delvare > SUSE L3 Support
On Mon, Jun 26, 2023 at 12:32:52PM +0200, Jean Delvare wrote: > If module_put() triggers a refcount error, include the culprit > module name in the warning message, to easy further investigation of > the issue. > > Signed-off-by: Jean Delvare <jdelvare@suse.de> > Suggested-by: Michal Hocko <mhocko@suse.com> > Cc: Luis Chamberlain <mcgrof@kernel.org> > --- > kernel/module/main.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > --- linux-6.3.orig/kernel/module/main.c > +++ linux-6.3/kernel/module/main.c > @@ -850,7 +850,9 @@ void module_put(struct module *module) > if (module) { > preempt_disable(); > ret = atomic_dec_if_positive(&module->refcnt); > - WARN_ON(ret < 0); /* Failed to put refcount */ > + WARN(ret < 0, > + KERN_WARNING "Failed to put refcount for module %s\n", > + module->name); > trace_module_put(module, _RET_IP_); > preempt_enable(); > } > The mod struct ends up actually being allocated, we first read the ELF passed by userspace and we end up allocating space for struct module when reading the ELF section ".gnu.linkonce.this_module". We cache the ELF section index in info->index.mod, we finally copy the module into the allocated space with move_module(). In linux-next code this is much more clear now. What prevents us from racing to free the module and thus invalidating the name? For instance the system call to delete_module() could hammer and so have tons of threads racing try_stop_module(), eventually one of them could win and free_module() would kick in gear. What prevents code from racing the free with a random module_put() called by some other piece of code? I realize this may implicate even the existing code seems racy. Luis
Hi Luis, On Fri, 30 Jun 2023 16:05:33 -0700, Luis Chamberlain wrote: > On Mon, Jun 26, 2023 at 12:32:52PM +0200, Jean Delvare wrote: > > If module_put() triggers a refcount error, include the culprit > > module name in the warning message, to easy further investigation of > > the issue. > > > > Signed-off-by: Jean Delvare <jdelvare@suse.de> > > Suggested-by: Michal Hocko <mhocko@suse.com> > > Cc: Luis Chamberlain <mcgrof@kernel.org> > > --- > > kernel/module/main.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > --- linux-6.3.orig/kernel/module/main.c > > +++ linux-6.3/kernel/module/main.c > > @@ -850,7 +850,9 @@ void module_put(struct module *module) > > if (module) { > > preempt_disable(); > > ret = atomic_dec_if_positive(&module->refcnt); > > - WARN_ON(ret < 0); /* Failed to put refcount */ > > + WARN(ret < 0, > > + KERN_WARNING "Failed to put refcount for module %s\n", > > + module->name); > > trace_module_put(module, _RET_IP_); > > preempt_enable(); > > } > > > > The mod struct ends up actually being allocated, we first read the ELF > passed by userspace and we end up allocating space for struct module > when reading the ELF section ".gnu.linkonce.this_module". We cache > the ELF section index in info->index.mod, we finally copy the module > into the allocated space with move_module(). > > In linux-next code this is much more clear now. > > What prevents us from racing to free the module and thus invalidating > the name? > > For instance the system call to delete_module() could hammer and > so have tons of threads racing try_stop_module(), eventually one of > them could win and free_module() would kick in gear. > > What prevents code from racing the free with a random module_put() > called by some other piece of code? > > I realize this may implicate even the existing code seems racy. You are the maintainer so I'll trust your expertise, but this is how I understand it: if we hit this WARN, this means reference counting is screwed. If this is an underflow, we still have a reference to the module while refcnt is zero, meaning the module could be removed at any time. This is inherent to the issue we are reporting, and not related to the proposed change. The name is just one field of struct module, refcnt is in the very same situation already. So the whole piece of code is best effort reporting and assumes (both before and after my proposed change) that nobody attempted to unload the module yet.
On Sat, 1 Jul 2023 17:57:27 +0200, Jean Delvare wrote: > On Fri, 30 Jun 2023 16:05:33 -0700, Luis Chamberlain wrote: > > On Mon, Jun 26, 2023 at 12:32:52PM +0200, Jean Delvare wrote: > > > If module_put() triggers a refcount error, include the culprit > > > module name in the warning message, to easy further investigation of > > > the issue. > > > > > > Signed-off-by: Jean Delvare <jdelvare@suse.de> > > > Suggested-by: Michal Hocko <mhocko@suse.com> > > > Cc: Luis Chamberlain <mcgrof@kernel.org> > > > --- > > > kernel/module/main.c | 4 +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > --- linux-6.3.orig/kernel/module/main.c > > > +++ linux-6.3/kernel/module/main.c > > > @@ -850,7 +850,9 @@ void module_put(struct module *module) > > > if (module) { > > > preempt_disable(); > > > ret = atomic_dec_if_positive(&module->refcnt); > > > - WARN_ON(ret < 0); /* Failed to put refcount */ > > > + WARN(ret < 0, > > > + KERN_WARNING "Failed to put refcount for module %s\n", > > > + module->name); > > > trace_module_put(module, _RET_IP_); > > > preempt_enable(); > > > } > > > > The mod struct ends up actually being allocated, we first read the ELF > > passed by userspace and we end up allocating space for struct module > > when reading the ELF section ".gnu.linkonce.this_module". We cache > > the ELF section index in info->index.mod, we finally copy the module > > into the allocated space with move_module(). > > > > In linux-next code this is much more clear now. > > > > What prevents us from racing to free the module and thus invalidating > > the name? > > > > For instance the system call to delete_module() could hammer and > > so have tons of threads racing try_stop_module(), eventually one of > > them could win and free_module() would kick in gear. > > > > What prevents code from racing the free with a random module_put() > > called by some other piece of code? > > > > I realize this may implicate even the existing code seems racy. > > You are the maintainer so I'll trust your expertise, but this is how I > understand it: if we hit this WARN, this means reference counting is > screwed. If this is an underflow, we still have a reference to the > module while refcnt is zero, meaning the module could be removed at any > time. This is inherent to the issue we are reporting, and not related > to the proposed change. The name is just one field of struct module, > refcnt is in the very same situation already. > > So the whole piece of code is best effort reporting and assumes (both > before and after my proposed change) that nobody attempted to unload > the module yet. I thought some more about it and one potential problem with my proposed change is if the module has indeed already been freed and the memory already reused for a different purpose. We are in trouble already (we just called atomic_dec_if_positive on a random memory location) but the WARN message could become very messy if the memory where module.name used to reside no longer contains any string terminator (binary zero). So we probably want to play it safe and add a length limitation when printing the module name. Something like: WARN(ret < 0, KERN_WARNING "Failed to put refcount for module %.*s\n", (int)MODULE_NAME_LEN, module->name);
On Fri 30-06-23 16:05:33, Luis Chamberlain wrote: [...] > What prevents code from racing the free with a random module_put() > called by some other piece of code? Wouldn't be ref count a garbage already? How can you race when freeing if module_put fail?
Hi Michal, On Wed, 28 Jun 2023 12:30:35 +0200, Michal Hocko wrote: > On Mon 26-06-23 12:32:52, Jean Delvare wrote: > > If module_put() triggers a refcount error, include the culprit > > module name in the warning message, to easy further investigation of > > the issue. > > > > Signed-off-by: Jean Delvare <jdelvare@suse.de> > > Suggested-by: Michal Hocko <mhocko@suse.com> > > Cc: Luis Chamberlain <mcgrof@kernel.org> > > --- > > kernel/module/main.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > --- linux-6.3.orig/kernel/module/main.c > > +++ linux-6.3/kernel/module/main.c > > @@ -850,7 +850,9 @@ void module_put(struct module *module) > > if (module) { > > preempt_disable(); > > ret = atomic_dec_if_positive(&module->refcnt); > > - WARN_ON(ret < 0); /* Failed to put refcount */ > > + WARN(ret < 0, > > + KERN_WARNING "Failed to put refcount for module %s\n", > > + module->name); > > Would it make sense to also print the refcnt here? In our internal bug > report it has turned out that this was an overflow (put missing) rather > than an underflow (too many put calls). Seeing the value could give a > clue about that. We had to configure panic_on_warn to capture a dump to > learn more which is rather impractical. Well, other calls to module_put() or try_module_get() could happen in parallel, so at the time we print refcnt, its value could be different from the one which triggered the WARN. Additionally, catching an overflow in module_put() is counterintuitive, it only works by accident because the counter gets to negative values. If we really want to reliably report overflows as such then we should add a dedicated WARN to try_module_get(). Doesn't look trivial though. With my proposed implementation, I don't think it's necessary to turn on panic_on_warn to debug further. Once you know which module is culprit, enabling tracing for this specific module should give you all the details you need to figure out what's going on.
On Tue 04-07-23 14:43:12, Jean Delvare wrote: > Hi Michal, > > On Wed, 28 Jun 2023 12:30:35 +0200, Michal Hocko wrote: > > On Mon 26-06-23 12:32:52, Jean Delvare wrote: > > > If module_put() triggers a refcount error, include the culprit > > > module name in the warning message, to easy further investigation of > > > the issue. > > > > > > Signed-off-by: Jean Delvare <jdelvare@suse.de> > > > Suggested-by: Michal Hocko <mhocko@suse.com> > > > Cc: Luis Chamberlain <mcgrof@kernel.org> > > > --- > > > kernel/module/main.c | 4 +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > --- linux-6.3.orig/kernel/module/main.c > > > +++ linux-6.3/kernel/module/main.c > > > @@ -850,7 +850,9 @@ void module_put(struct module *module) > > > if (module) { > > > preempt_disable(); > > > ret = atomic_dec_if_positive(&module->refcnt); > > > - WARN_ON(ret < 0); /* Failed to put refcount */ > > > + WARN(ret < 0, > > > + KERN_WARNING "Failed to put refcount for module %s\n", > > > + module->name); > > > > Would it make sense to also print the refcnt here? In our internal bug > > report it has turned out that this was an overflow (put missing) rather > > than an underflow (too many put calls). Seeing the value could give a > > clue about that. We had to configure panic_on_warn to capture a dump to > > learn more which is rather impractical. > > Well, other calls to module_put() or try_module_get() could happen in > parallel, so at the time we print refcnt, its value could be different > from the one which triggered the WARN. Racess with module_put should be impossible because all of them should fail, right? Races with put are possible but we do not need an exact value to tell the difference between over and underflow, no?
On Tue, 4 Jul 2023 15:05:33 +0200, Michal Hocko wrote: > On Tue 04-07-23 14:43:12, Jean Delvare wrote: > > On Wed, 28 Jun 2023 12:30:35 +0200, Michal Hocko wrote: > > > Would it make sense to also print the refcnt here? In our internal bug > > > report it has turned out that this was an overflow (put missing) rather > > > than an underflow (too many put calls). Seeing the value could give a > > > clue about that. We had to configure panic_on_warn to capture a dump to > > > learn more which is rather impractical. > > > > Well, other calls to module_put() or try_module_get() could happen in > > parallel, so at the time we print refcnt, its value could be different > > from the one which triggered the WARN. > > Racess with module_put should be impossible because all of them should > fail, right? Most probably yes, but after taking a deeper look at the code, I wouldn't swear. For example delete_module() will decrement refcnt and increment it again if the module can't actually be removed. This could get refcnt to positive again briefly, at which point another module_put() could succeed. > Races with put are possible but we do not need an exact > value to tell the difference between over and underflow, no? Indeed not. But my other points still stand. Plus, if you really want to know the refcnt value, it's already visible in /sys/module/*/refcnt and lsmod.
On Mon, Jul 03, 2023 at 03:47:22PM +0200, Michal Hocko wrote: > On Fri 30-06-23 16:05:33, Luis Chamberlain wrote: > [...] > > What prevents code from racing the free with a random module_put() > > called by some other piece of code? > > Wouldn't be ref count a garbage already? How can you race when freeing > if module_put fail? It could yes, ie, so this risks at all being junk. So best IMHO is to tidy up all the get / puts and add respective tests to fix all this mess with proper messages as needed. My cursory review of the refcnt stuf is I see some races possible. While I'd be happy to help debugging aids, adding accesses to random memory for a string seems more risk prone. Luis
On Fri 07-07-23 11:56:49, Luis Chamberlain wrote: > On Mon, Jul 03, 2023 at 03:47:22PM +0200, Michal Hocko wrote: > > On Fri 30-06-23 16:05:33, Luis Chamberlain wrote: > > [...] > > > What prevents code from racing the free with a random module_put() > > > called by some other piece of code? > > > > Wouldn't be ref count a garbage already? How can you race when freeing > > if module_put fail? > > It could yes, ie, so this risks at all being junk. Could you be more specific please? I still do not see a scenario where module string name would be junk while refcount itself would be a valid memory. > So best IMHO is > to tidy up all the get / puts and add respective tests to fix all > this mess with proper messages as needed. My cursory review of the > refcnt stuf is I see some races possible. It would likely be better to use refcount_t instead of atomic_t. > While I'd be happy to help debugging aids, adding accesses to random > memory for a string seems more risk prone. If there is really a scenario when module could be unloaded leaving dangling struct module behind then we have a real problem as this is exported to userspace IIRC. Not to mention module_get/put calls modifying memory (UAF).
On Mon, Jul 10, 2023 at 07:43:01AM +0200, Michal Hocko wrote: > On Fri 07-07-23 11:56:49, Luis Chamberlain wrote: > > On Mon, Jul 03, 2023 at 03:47:22PM +0200, Michal Hocko wrote: > > > On Fri 30-06-23 16:05:33, Luis Chamberlain wrote: > > > [...] > > > > What prevents code from racing the free with a random module_put() > > > > called by some other piece of code? > > > > > > Wouldn't be ref count a garbage already? How can you race when freeing > > > if module_put fail? > > > > It could yes, ie, so this risks at all being junk. > > Could you be more specific please? I still do not see a scenario where > module string name would be junk while refcount itself would be a valid > memory. That is true, but if refcount is invalid so will the memory for the string. > > So best IMHO is > > to tidy up all the get / puts and add respective tests to fix all > > this mess with proper messages as needed. My cursory review of the > > refcnt stuf is I see some races possible. > > It would likely be better to use refcount_t instead of atomic_t. Patches welcomed. > > While I'd be happy to help debugging aids, adding accesses to random > > memory for a string seems more risk prone. > > If there is really a scenario when module could be unloaded leaving > dangling struct module behind then we have a real problem as this is > exported to userspace IIRC. Not to mention module_get/put calls > modifying memory (UAF). That doesn't mean issues could not exist, given its all protected under privileged execution. All I'm suggesting is I look at this code and don't trust it, and think it could use some love. The selftests for kmod could be used to stress test but also stress-ng now also has module load and unloading so if there are races we can likely exploit them with either the kmod selftest or stress-ng module loading. Luis
On Mon 28-08-23 14:18:30, Jean Delvare wrote: [...] > > > It would likely be better to use refcount_t instead of atomic_t. > > > > Patches welcomed. > > Michal, do I understand correctly that this would prevent the case our > customer had (too many gets), but won't make a difference for actual > too-many-puts situations? yes, refcount_t cannot protect from too-many-puts because there is not real way to protect from those AFAICS. At a certain moment you just drop to 0 and lose your object and all following that is a UAF. But I do not think this is actually the interesting case at all.
--- linux-6.3.orig/kernel/module/main.c +++ linux-6.3/kernel/module/main.c @@ -850,7 +850,9 @@ void module_put(struct module *module) if (module) { preempt_disable(); ret = atomic_dec_if_positive(&module->refcnt); - WARN_ON(ret < 0); /* Failed to put refcount */ + WARN(ret < 0, + KERN_WARNING "Failed to put refcount for module %s\n", + module->name); trace_module_put(module, _RET_IP_); preempt_enable(); }