Message ID | 20230526205204.861311518@infradead.org |
---|---|
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp756258vqr; Fri, 26 May 2023 14:17:28 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4nBoeRrOlG38lb5l1tiVl7WZWXd31OfX2xZ6CH+rHoxR/1lf6JkmRYQqln3Sj2JBm8iMWP X-Received: by 2002:a17:90a:9bcc:b0:250:132a:5d93 with SMTP id b12-20020a17090a9bcc00b00250132a5d93mr3029944pjw.49.1685135847979; Fri, 26 May 2023 14:17:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685135847; cv=none; d=google.com; s=arc-20160816; b=zUoplSGx93rTFdNjVbI0a/38rC46qIlsVvZcra9vN0s5EHvj21skbqrroEbOSuIyvr 5g+skeU6DiW0EcN3puAosrikLHUaKyW7r7UaiNmco8cmwsLokVolD8iynaweLq1em5zF +BsyN2BQUT5XwVzFQxDuc/LR4U+2lNobLD82ErFudjmJd61hQlBYhOr50To9RD5QNVZV E1JFBL4uyznsDVMYIbo8wjgcXlTEbeYfag6P2KMNbk/3jDA9rL3yjco50GmnAaPslYCi 52eZioqdC4i5QdSAKyhTlfihAckSSFD2qurWGmzXq0C/GJxSv5bVMkGT8NG3w7Ofe9iN w2cg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:cc:to:from:date:user-agent:message-id :dkim-signature; bh=urvfHsJBOS8zyeLyl+UdJT8Nl0X+FyS97M0e5Nc2t34=; b=MewV4PRmmv7AH7hG1G9+1gZgOLSTgXc9Mg8g4rB8EzFgaxG7LcHm9w/5T/lJMVx8td PMdk5Sovo9ogFFREb7ii0uWN0yHDdWzel1Ma4gHs/00DEOaH2rFpSCk89ufC2YTl1xcy vQ/8FwcFrtAxWWEipJTLGdQImuhvA4EBJpQpnfemyL7FQspxLKE4AvmZZXAMaRGoNvyI SvT382E4ATsN2ml1MFNWQC/7vfffRx0oRRJ2M86boRAGhsFsrXZ/hbnjfFeSdREnHQBk B9YQcpu3PUom+oQSo7Ow/PJioDTckiktgHaipxS8BGWo6BUHOGEpOgnZgxueEvP7hMJi qG8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=nMUfxulQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h187-20020a636cc4000000b0051aeaf666dfsi4428060pgc.668.2023.05.26.14.17.13; Fri, 26 May 2023 14:17:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=desiato.20200630 header.b=nMUfxulQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243376AbjEZVBP (ORCPT <rfc822;zhanglyra.2023@gmail.com> + 99 others); Fri, 26 May 2023 17:01:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56866 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230467AbjEZVBM (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 26 May 2023 17:01:12 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5907719C; Fri, 26 May 2023 14:01:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Subject:Cc:To:From:Date:Message-ID: Sender:Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To:References; bh=urvfHsJBOS8zyeLyl+UdJT8Nl0X+FyS97M0e5Nc2t34=; b=nMUfxulQfyDRFhMgPquBzy1JMU 7OITUm6XBSHDal5ty4OzC+zkpOof2p0Exb/t6YKVC/OylO9uWEcykEdgS5R4KjvddxsOLJSSk3/sI 20YmYkOhdP3+uKSAjKK509Nxl+SAUjL8AB/HtnWV6yD4o4/9dXoINaa15zo9O4IQZHdKXdX/s8hga kTH/FrvlpQrK1ePCQNDgRykI/bsR+q0NLWc/PYlUtf1TG97P4phZb0r6O8trCLHehDEr5KTgvd2kO y7q5ldrzn68zsAGwag8ee8jftSV19rdL5hj596WQuecBK6ZJoMojx0WM3fDtWz5YmYr3I6Kmbk08A L495JIMw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1q2eYG-007hpR-2z; Fri, 26 May 2023 21:00:45 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 06CB23002F0; Fri, 26 May 2023 23:00:41 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id AF53F205BBDB2; Fri, 26 May 2023 23:00:41 +0200 (CEST) Message-ID: <20230526205204.861311518@infradead.org> User-Agent: quilt/0.66 Date: Fri, 26 May 2023 22:52:04 +0200 From: Peter Zijlstra <peterz@infradead.org> To: torvalds@linux-foundation.org, keescook@chromium.org, gregkh@linuxfoundation.org, pbonzini@redhat.com Cc: linux-kernel@vger.kernel.org, ojeda@kernel.org, ndesaulniers@google.com, peterz@infradead.org, mingo@redhat.com, will@kernel.org, longman@redhat.com, boqun.feng@gmail.com, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, paulmck@kernel.org, frederic@kernel.org, quic_neeraju@quicinc.com, joel@joelfernandes.org, josh@joshtriplett.org, mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com, rcu@vger.kernel.org, tj@kernel.org, tglx@linutronix.de Subject: [PATCH v2 0/2] Lock and Pointer guards X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766970991644610276?= X-GMAIL-MSGID: =?utf-8?q?1766993007400704771?= |
Series |
Lock and Pointer guards
|
|
Message
Peter Zijlstra
May 26, 2023, 8:52 p.m. UTC
By popular demand, a new and improved version :-) New since -v1 ( https://lkml.kernel.org/r/20230526150549.250372621@infradead.org ) - much improved interface for lock guards: guard() and scoped () { } as suggested by Linus. - moved the ';' for the 'last' additional guard members into the definition as suggested by Kees. - put __cleanup in the right place with the suggested comment as suggested by Kees and Miguel - renamed the definition macros DEFINE_LOCK_GUARD_[012] to indicate the number of arguments the lock function takes - added comments to guards.h - renamed pretty much all guard types Again compile and boot tested x86_64-defconfig
Comments
On 5/26/23 16:52, Peter Zijlstra wrote: > By popular demand, a new and improved version :-) > > New since -v1 ( https://lkml.kernel.org/r/20230526150549.250372621@infradead.org ) > > - much improved interface for lock guards: guard() and scoped () { } > as suggested by Linus. <name bikeshedding> I know I'm the one who hinted at C++ "std::scoped_lock" as a similar preexisting API, but I find that "scoped()" is weird in the newly proposed form. "scoped_lock" is fine considering that "scoped" is an adjective applying to "lock", but in the case of e.g. scoped(rcu) { }, then we are really declaring a "scope" of type "rcu". I suspect that in this case: scope(rcu) { } would be less unexpected than the adjective form: scoped(rcu) { } Especially if we go for the name "guard()", rather than the adjective guarded(), for its counterpart. Thanks, Mathieu
On Fri, May 26, 2023 at 2:01 PM Peter Zijlstra <peterz@infradead.org> wrote: > > By popular demand, a new and improved version :-) Well, I certainly find the improved version more legible, despite my further comment about the type macros being too complicated. But now I've slept on it, and I think the problem goes beyond "macros being too complicated". I actually think that the whole "create a new scope" is just wrong. I understand why you did it: the whole "its' going out of scope" is part of the *idea*. BUT. Adding a new scope results in (a) pointless indentation (b) syntax problems with fighting the limited syntax of for loops (c) extra dummy variables - both for the for loop hack, but also for the scope guard thing itself and none of the above is an *advantage*. So how about we take a step back, and say "what if we don't create a new scope at all"? I think it actually improves on everything. The macros become *trivial*. The code becomes more legible. And you can have multiple different scopes very naturally, or you can just share a scope. Let me build up my argument here. Let's start with this *trivial* helper: /* Trivial "generic" auto-release macro */ #define auto_release_name(type, name, init, exit) \ type name __attribute__((__cleanup__(exit))) = (init) it's truly stupid: it creates a new variable of type 'type', with name 'name', and with the initial value 'init', and with the cleanup function 'exit'. It looks stupid, but it's the *good* stupid. It's really really obvious, and there are no games here. Let me then introduce *one* other helper, because it turns out that much of the time, you don't really want to pick a name. So we'll use the above macro, but make a version of it that just automatically picks a unique name for you: #define auto_release(type, init, exit) \ auto_release_name(type, \ __UNIQUE_ID(auto) __maybe_unused, \ init, exit) again, there are absolutely no games here, it's all just very very straightforward. It's so straightforward that it all feels stupid. But again, it's the 'S' in "KISS". Keep It Simple, Stupid. And it turns out that the above two trivial macros are actually quite useful in themselves. You want to do an auto-cleanup version of 'struct fd'? It's trivial: /* Trivial "getfd()" wrapper */ static inline void release_fd(struct fd *fd) { fdput(*fd); } #define auto_getfd(name, n) \ auto_release_name(struct fd, name, fdget(n), release_fd) and now you can write code like this: int testfd(int fd) { auto_getfd(f, fd); /* Do something with f.file */ return f.file ? 0 : -EBADF; } Which is really trivial, and actually pretty legible. Note that there is no explicit "scope" here. The scope is simply implicit in the code. The compiler will verify that it's at the beginning of the block, since we do that -Wdeclaration-after-statement, so for the kernel this is always at the beginning of a scope anyway. And that's actually a big advantage. You want to deal with *two* file descriptors? Sure. Just do this: /* Do two file descriptors point to the same file? */ int testfd2(int fd1, int fd2) { auto_getfd(f1, fd1); auto_getfd(f2, fd2); return f1.file && f1.file == f2.file; } and it JustWorks(tm). Notice how getting rid of the explicit scoping is actually a *feature*. Ok, so why did I want that helper that used that auto-genmerated name? The above didn't use it, but look here: /* Trivial "mutex" auto-release helpers */ static inline struct mutex *get_mutex(struct mutex *a) { mutex_lock(a); return a; } static inline void put_mutex(struct mutex **a) { mutex_unlock(*a); } #define auto_release_mutex_lock(lock) \ auto_release(struct mutex *, get_mutex(lock), put_mutex) That's all you need for a nice locked region. Or how about RCU, which doesn't have a lock type? Just do this: struct rcu_scope; static inline struct rcu_scope *get_rcu(void) { rcu_read_lock(); return NULL; } static void put_rcu(struct rcu_scope **a) { rcu_read_unlock(); } #define auto_release_rcu_lock() \ auto_release(struct rcu_scope *, get_rcu(), put_rcu) and you're done. And how you can *mix* all these: int testfd2(int fd1, int fd2) { auto_release_mutex_lock(&cgroup_mutex); auto_release_rcu_lock(); auto_getfd(f1, fd1); auto_getfd(f2, fd2); return f1.file && f1.file == f2.file; } Ok. The above is insane. But it's instructive. And it's all SO VERY SIMPLE. It's also an example of something people need to be aware of: the auto-releasing is not ordered. That may or may not be a problem. It's not a problem for two identical locks, but it very much *can* be a problem if you mix different locks. So in the crazy above, the RCU lock may be released *after* the cgroup_mutex is released. Or before. I'm not convinced it's ordered even if you end up using explicit scopes, which you can obviously still do: int testfd2(int fd1, int fd2) { auto_release_mutex_lock(&cgroup_mutex); do { auto_release_rcu_lock(); auto_getfd(f1, fd1); auto_getfd(f2, fd2); return f1.file && f1.file == f2.file; } while (0); } so I don't think the nasty "scoped()" version is any better in this regard. And before you say "unlock order doesn't matter" - that's actually not true. Unlock order does matter when you have things like "spin_lock_irq()" where the irq-off region then protects other locks too. If you do the "unlock_irq" before you've released the other spinlocks that needed irq protection, you're now screwed. Final notes: - I think the above is simpler and objectively better in every way from the explicitly scoped thing - I do think that the auto-release can be very dangerous for locking, and people need to verify me about the ordering. Maybe the freeing order is well-defined. - I also suspect that to get maximum advantage of this all, we would have to get rid of -Wdeclaration-after-statement That last point is something that some people will celebrate, but I do think it has helped keep our code base somewhat controlled, and made people think more about the variables they declare. But if you start doing consecutive cleanup operations, you really end up wanting to do thigns like this: int testfd2(int fd1, int fd2) { auto_getfd(f1, fd1); if (!f1.file) return -EBADF; auto_getfd(f2, fd2); if (!f2.file) return -EBADF; return do_something (f1, f2); } and you basically *have* to allow those declarations in the middle - because you don't want to clean up fd2 in that "f1 failed" case, since f2 doesn't exist yet. Linus
On Sat, May 27, 2023 at 9:18 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > It's also an example of something people need to be aware of: the > auto-releasing is not ordered. That may or may not be a problem. It's > not a problem for two identical locks, but it very much *can* be a > problem if you mix different locks. It is guaranteed. It would be nice to have it documented, but if you look at the intermediate representation of this simple example: #include <stdio.h> int print_on_exit(void **f) { if (*f) puts(*f); *f = NULL; } int main(int argc) { // prints "second" then "first" void *__attribute__((__cleanup__(print_on_exit))) cleanup1 = "first"; void *__attribute__((__cleanup__(print_on_exit))) cleanup2 = "second"; } ... the implementation is introducing a scope and reusing the C++ try/finally implementation, and the "optimizations" that are allowed when functions are not "throwing" anything. This is always the case for C, so this (from f.c.005t.original, obtained with -c -fdump-tree-all): { // the duplicated initializers are just an artifact of the AST void * cleanup1 = (void *) "first"; void * cleanup2 = (void *) "second"; void * cleanup1 = (void *) "first"; try { void * cleanup2 = (void *) "second"; try { } finally { print_on_exit (&cleanup2); } } finally { print_on_exit (&cleanup1); } } return 0; becomes this as soon as exceptions are lowered (from f.c.013t.eh), which is before any kind of optimization: int main (int argc) { void * cleanup2; void * cleanup1; int D.3187; cleanup1 = "first"; cleanup2 = "second"; print_on_exit (&cleanup2); print_on_exit (&cleanup1); cleanup1 = {CLOBBER(eol)}; cleanup2 = {CLOBBER(eol)}; D.3187 = 0; goto <D.3188>; <D.3188>: return D.3187; } > So in the crazy above, the RCU lock may be released *after* the > cgroup_mutex is released. Or before. > > I'm not convinced it's ordered even if you end up using explicit > scopes, which you can obviously still do: It is, see void *__attribute__((__cleanup__(print_on_exit))) cleanup1 = "first"; { void *__attribute__((__cleanup__(print_on_exit))) cleanup2 = "second"; puts("begin nested scope"); } puts("back to outer scope"); which prints begin nested scope second back to outer scope first This is practically required by glibc's nasty pthread_cleanup_push/pthread_cleanup_pop macros (the macros are nasty even if you ignore glibc's implementation, but still): # define pthread_cleanup_push(routine, arg) \ do { \ struct __pthread_cleanup_frame __clframe \ __attribute__ ((__cleanup__ (__pthread_cleanup_routine))) \ = { .__cancel_routine = (routine), .__cancel_arg = (arg), \ .__do_it = 1 }; /* Remove a cleanup handler installed by the matching pthread_cleanup_push. If EXECUTE is non-zero, the handler function is called. */ # define pthread_cleanup_pop(execute) \ __clframe.__do_it = (execute); \ } while (0) If the scope wasn't respected, pthread_cleanup_pop(1) would be broken because pthread_cleanup_pop() must immediately execute the function if its argument is nonzero. > - I think the above is simpler and objectively better in every way > from the explicitly scoped thing It is simpler indeed, but a scope-based variant is useful too based on my experience with QEMU. Maybe things like Python's with statement have spoiled me, but I can't quite get used to the lone braces in { ... { auto_release(mutex, &mutex); ... } ... } So in QEMU we have both of them. In Linux that would be auto_release(mutex, &mutex) and scoped(mutex, &mutex) {} > - I do think that the auto-release can be very dangerous for locking, > and people need to verify me about the ordering. Maybe the freeing > order is well-defined. It is but having it documented would be better. > - I also suspect that to get maximum advantage of this all, we would > have to get rid of -Wdeclaration-after-statement Having both a declaration-based and a scope-based variant helps with that too. You could add _Pragma("GCC diagnostic push/ignore/pop") to the declaration-based macros but ouch... even without thinking of whether clang supports it, which I didn't check. Paolo > That last point is something that some people will celebrate, but I do > think it has helped keep our code base somewhat controlled, and made > people think more about the variables they declare.
On Mon, May 29, 2023 at 8:09 AM Paolo Bonzini <pbonzini@redhat.com> wrote: > > On Sat, May 27, 2023 at 9:18 PM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > It's also an example of something people need to be aware of: the > > auto-releasing is not ordered. That may or may not be a problem. It's > > not a problem for two identical locks, but it very much *can* be a > > problem if you mix different locks. > > It is guaranteed. It would be nice to have it documented, but if you > look at the intermediate representation of this simple example: Well, I can see that it might be doing that reverse ordering in practice, but for the life of me, I can't actually find anything that says it is guaranteed. Google did find me one blog post by Ian Lance Taylor from 2008 that said that yes, each __cleanup__ attribute basically creates its own little scope, and that the cleanup in reverse declaration order is thus guaranteed. Let's add Ian to the cc, partly to confirm it wasn't just a random implementation detail, but also partly to perhaps ask him to get somebody to document it. Because if it's not documented, how do we know that the clang implementation of that attribute then ends up also guaranteeing the reverse order cleanup, even if gcc might guarantee it? I *suspect* - but cannot find any guarantees - that it's going to match C++ destructors, and you probably end up pretty much always having to deal with these cleanup functions in reverse order, so it all sounds very likely to me. And maybe it's even documented somewhere that neither of us could find. Anyway, I do like the option to use cleanup functions, but I think we'd better make sure, since we really may require nesting for locks (even if in many cases it won't matter). Ian? Any chance you can either point us at documentation, or maybe just confirm it, and hopefully make said documentation happen? Linus
On Mon, May 29, 2023 at 12:04 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Mon, May 29, 2023 at 8:09 AM Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > On Sat, May 27, 2023 at 9:18 PM Linus Torvalds > > <torvalds@linux-foundation.org> wrote: > > > It's also an example of something people need to be aware of: the > > > auto-releasing is not ordered. That may or may not be a problem. It's > > > not a problem for two identical locks, but it very much *can* be a > > > problem if you mix different locks. > > > > It is guaranteed. It would be nice to have it documented, but if you > > look at the intermediate representation of this simple example: > > Well, I can see that it might be doing that reverse ordering in > practice, but for the life of me, I can't actually find anything that > says it is guaranteed. > > Google did find me one blog post by Ian Lance Taylor from 2008 that > said that yes, each __cleanup__ attribute basically creates its own > little scope, and that the cleanup in reverse declaration order is > thus guaranteed. > > Let's add Ian to the cc, partly to confirm it wasn't just a random > implementation detail, but also partly to perhaps ask him to get > somebody to document it. > > Because if it's not documented, how do we know that the clang > implementation of that attribute then ends up also guaranteeing the > reverse order cleanup, even if gcc might guarantee it? > > I *suspect* - but cannot find any guarantees - that it's going to > match C++ destructors, and you probably end up pretty much always > having to deal with these cleanup functions in reverse order, so it > all sounds very likely to me. > > And maybe it's even documented somewhere that neither of us could find. > > Anyway, I do like the option to use cleanup functions, but I think > we'd better make sure, since we really may require nesting for locks > (even if in many cases it won't matter). > > Ian? Any chance you can either point us at documentation, or maybe > just confirm it, and hopefully make said documentation happen? It was a while ago, but I expect that I was just thinking of the implementation. I agree that the documentation could be clearer. I filed https://gcc.gnu.org/PR110029. Ian
On Mon, May 29, 2023 at 5:27 PM Ian Lance Taylor <iant@google.com> wrote: > > It was a while ago, but I expect that I was just thinking of the > implementation. I agree that the documentation could be clearer. I > filed https://gcc.gnu.org/PR110029. Thanks. I think we can proceed with the assumption that it's all clearly ordered. Even when the scopes are explicit (ie actual separate block scopes for the variables and no shared scope), doing a 'return' (or break out of a loop) will obviously exit multiple scopes at once, so it would be good to have it documented that the cleanup is always in that reverse order by scope and declaration order within a scope. I guess I should check with the clang people too. Linus
On Sat, May 27, 2023 at 12:18:04PM -0700, Linus Torvalds wrote: > So how about we take a step back, and say "what if we don't create a > new scope at all"? Note that the lock_guard() and ptr_guard() options I have don't require the new scope thing. The scope thing is optional. > I think it actually improves on everything. The macros become > *trivial*. The code becomes more legible. > > And you can have multiple different scopes very naturally, or you can > just share a scope. > > Let me build up my argument here. Let's start with this *trivial* helper: > > /* Trivial "generic" auto-release macro */ > #define auto_release_name(type, name, init, exit) \ > type name __attribute__((__cleanup__(exit))) = (init) > > it's truly stupid: it creates a new variable of type 'type', with name > 'name', and with the initial value 'init', and with the cleanup > function 'exit'. > > It looks stupid, but it's the *good* stupid. It's really really > obvious, and there are no games here. I really don't like the auto naming since C++/C23 use auto for type inference. > Let me then introduce *one* other helper, because it turns out that > much of the time, you don't really want to pick a name. So we'll use > the above macro, but make a version of it that just automatically > picks a unique name for you: > > #define auto_release(type, init, exit) \ > auto_release_name(type, \ > __UNIQUE_ID(auto) __maybe_unused, \ > init, exit) I like that option. > And it turns out that the above two trivial macros are actually quite > useful in themselves. You want to do an auto-cleanup version of > 'struct fd'? It's trivial: > > /* Trivial "getfd()" wrapper */ > static inline void release_fd(struct fd *fd) > { fdput(*fd); } > > #define auto_getfd(name, n) \ > auto_release_name(struct fd, name, fdget(n), release_fd) > > - I think the above is simpler and objectively better in every way > from the explicitly scoped thing Well, I think having that as a option would still be very nice. > - I also suspect that to get maximum advantage of this all, we would > have to get rid of -Wdeclaration-after-statement > > That last point is something that some people will celebrate, but I do > think it has helped keep our code base somewhat controlled, and made > people think more about the variables they declare. > > But if you start doing consecutive cleanup operations, you really end > up wanting to do thigns like this: > > int testfd2(int fd1, int fd2) > { > auto_getfd(f1, fd1); > if (!f1.file) > return -EBADF; > auto_getfd(f2, fd2); > if (!f2.file) > return -EBADF; > return do_something (f1, f2); > } If you extend the ptr_guard() idea you don't need to get rid of -Wdeclaration-after-statement and we could write it like: int testfd2(int fd1, int fd2) { ptr_guard(fdput, f1) = fdget(fd1); ptr_guard(fdput, f2) = null_ptr(fdput); if (!f1.file) return -EBADF; f2 = fdget(f2); if (!f2.file) return -EBADF; return do_something(f1, f2); } Yes, the macros would be a little more involved, but not horribly so I think. typedef struct fd guard_fdput_t; static const struct fd guard_fdput_null = __to_fd(0); static inline void guard_fdput_cleanup(struct fd *fd) { fdput(*fd); } #define ptr_guard(_guard, _name) \ guard##_guard##_t _name __cleanup(guard##_guard##_cleanup) #define null_ptr(_guard) guard##_guard##_null; And for actual pointer types (as opposed to fat pointer wrappers like struct fd) we can have a regular helper macro like earlier: #define DEFINE_PTR_GUARD(_guard, _type, _put) \ typdef _type *guard##_guard##_t; \ static const _type *guard##_guard##_null = NULL; \ static inline void guard##_guard##_cleanup(_type **ptr) \ { if (*ptr) _put(*ptr); } [NOTE: value propagation gets rid of the above conditional where appropriate] eg.: DEFINE_PTR_GUARD(put_task, struct task_struct, put_task_struct); Possibly with a helper: #define null_guard(_guard, _name) \ ptr_guard(_guard, _name) = null_ptr(_guard) Now, ptr_guard() per the above, is asymmetric in that it only cares about release, let guard() be the symmetric thing that also cares about init like so: #define DEFINE_GUARD(_guard, _type, _put, _get) \ DEFINE_PTR_GUARD(_guard, _type, _put) \ static inline void guard##_guard##_init(guard##_guard##_t arg) \ { _get(arg); return arg; } #define guard(_guard, _name, _var...) \ ptr_guard(_guard, _name) = guard##_guard@##_init(_var) #define anon_guard(_guard, _var..) \ guard(_guard, __UNIQUE_ID(guard), _var) for eg.: DEFINE_GUARD(mutex, struct mutex, mutex_unlock, mutex_lock); which then lets one write: int testfd2(int fd1, int fd2) { anon_guard(mutex, &cgroup_mutex); ptr_guard(fdput, f1) = fdget(fd1); null_guard(fdput, f2); if (!f1.file) return -EBADF; f2 = fdget(fd2); if (!f2.file) return -EBADf; return do_something(f1,f2); } The RCU thing can then either be manually done like: struct rcu_guard; typedef struct rcu_guard *guard_rcu_t; static const guard_rcu_null = NULL; static inline guard_rcu_cleanup(struct rcu_guard **ptr) { if (*ptr) rcu_read_unlock(); } static inline struct rcu_guard *guard_rcu_init(void) { rcu_read_lock(); return (void*)1; } (or because we'll need this pattern a few more times, with yet another DEFINE_foo_GUARD helper) and: anon_guard(rcu); works. And at this point the previous scope() things are just one helper macro away: #define scope(_guard, _var..) \ for (guard##_guard##_t *done = NULL, scope = guard##_guard##_init(var); \ !done; done++) to be used where appropriate etc.. Yes, it's a wee bit more involved, but I'm thinking it gives a fair amount of flexibility and we don't need to ret rid of -Wdeclaration-after-statement. Hmm?
On Sat, May 27, 2023 at 12:18:04PM -0700, Linus Torvalds wrote: > And before you say "unlock order doesn't matter" - that's actually not > true. Unlock order does matter when you have things like > "spin_lock_irq()" where the irq-off region then protects other locks > too. So I had already verified that GCC keeps them ordered, I see the subthread with Ian and the documentation update request -- so all good there. Also, lockdep would scream bloody murder if you get that wrong, so I can certainly add something to the selftests that would detect this.
On Tue, May 30, 2023 at 11:23:42AM +0200, Peter Zijlstra wrote: > Yes, it's a wee bit more involved, but I'm thinking it gives a fair > amount of flexibility and we don't need to ret rid of > -Wdeclaration-after-statement. One other thing I forgot to point put; it allows things like: int store_fd(int fd) { ptr_guard(fdput, f) = fdget(fd); void *ret; if (!f.file) return -EBADF; ret = xa_store(&xarray, f.file->private, f); if (xa_is_err(ret)) return xa_err(ret); f = null_ptr(fdput); // xarray now owns f return 0; } Where we can assign null_ptr() to clear the guard and inhibit the cleanup function to pass ownership around.
On 30/05/23 11:23, Peter Zijlstra wrote: > On Sat, May 27, 2023 at 12:18:04PM -0700, Linus Torvalds wrote: > > >> And it turns out that the above two trivial macros are actually quite >> useful in themselves. You want to do an auto-cleanup version of >> 'struct fd'? It's trivial: >> >> /* Trivial "getfd()" wrapper */ >> static inline void release_fd(struct fd *fd) >> { fdput(*fd); } >> >> #define auto_getfd(name, n) \ >> auto_release_name(struct fd, name, fdget(n), release_fd) >> > >> - I think the above is simpler and objectively better in every way >> from the explicitly scoped thing > > Well, I think having that as a option would still be very nice. > IMO the explicit scoping can help with readability. It gives a clear visual indication of where critical sections are, and you can break it up with a scope + guards as in migrate_swap_stop() to stay at sane indentation levels (with Python context managers, this would all be one scope). I'd argue that for these, the scope/indentation is beneficial and not just a byproduct. Even for longer functions like try_to_wake_up(), this works out alright. This obviously falls apart when dealing with too many guards (e.g. copy_process()) or if the resulting indentation is nuts, but I concur that keeping the explicit scope as an option would be nice.
On Tue, May 30, 2023 at 11:23:42AM +0200, Peter Zijlstra wrote: > Yes, it's a wee bit more involved, but I'm thinking it gives a fair > amount of flexibility and we don't need to ret rid of > -Wdeclaration-after-statement. So I made all that work and .. Yes, you're absolutely right. Busting -Wdeclaration-after-statement is the right thing to do for guards. So then I came up with: #define __ptr_guard(_guard, _name) \ guard_##_guard##_t _name __cleanup(guard_##_guard##_cleanup) #define ptr_guard(_guard, _name) \ __diag(push) \ __diag(ignored "-Wdeclaration-after-statement") \ __ptr_guard(_guard, _name) \ __diag(pop) #define guard_init(_guard, _var...) \ guard_##_guard##_init(_var) #define named_guard(_guard, _name, _var...) \ ptr_guard(_guard, _name) = guard_init(_guard, _var) #define guard(_guard, _var...) \ named_guard(_guard, __UNIQUE_ID(guard), _var) #define scoped_guard(_guard, _var...) \ for (__ptr_guard(_guard, scope) = guard_init(_guard, _var), \ *done = NULL; !done; done = (void *)1) And that all (mostly) works on clang, but not GCC :-( GCC refuses to accept _Pragma() inside an expression. So I now have that ptr_guard() with push/pop for clang but without for GCC, which means that only clang has a fighting chance to report -Wdeclaration-after-statement warns until such a time as that we can get GCC 'fixed'. https://godbolt.org/z/5MPeq5W6K FWIW: the work-in-progress patches I have are here: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=core/guards
On Tue, Jun 6, 2023 at 2:43 AM Peter Zijlstra <peterz@infradead.org> wrote: > > ( GCC refuses to accept _Pragma() inside an expression. If we really want this all, I think we'd just stop using -Wdeclaration-after-statement entirely. There are other uses for it, and people have asked for mixing declarations and code before. I think that particular straightjacket has been a good thing, but I also think that it's ok to just let it go as a hard rule, and just try to make it a coding style issue for the common case, but allow mixed declarations and code when it makes sense. For the whole "automatic release case it definitely makes sense, but it's not like it isn't possible useful elsewhere. I just don't want for it to become some global pattern for everything. That said, I still don't understand why you lke the name "guard" for this. I understand not liking "auto", but "guard" doesn't seem any better. In fact, much worse. Guarded expressions means something completely different both in real life and in computer science. I'm assuming there's some history there, but it makes no sense to me as a name here. Linus
On Tue, Jun 06, 2023 at 06:17:33AM -0700, Linus Torvalds wrote: > That said, I still don't understand why you lke the name "guard" for > this. I understand not liking "auto", but "guard" doesn't seem any > better. In fact, much worse. Guarded expressions means something > completely different both in real life and in computer science. > > I'm assuming there's some history there, but it makes no sense to me > as a name here. I know the name from C++ where it is std::lock_guard<> (well, back when I still did C++ it wasn't std, but whatever), and Rust seems to have std::sync::MutexGuard<>. But if that's the sole objection left, lets have a bike-shed party and pick a colour :-) 'shield' 'sentry' 'sentinel' 'keeper' 'custodian' 'warden' ?
On Tue, Jun 6, 2023 at 6:40 AM Peter Zijlstra <peterz@infradead.org> wrote: > > I know the name from C++ where it is std::lock_guard<> (well, back when > I still did C++ it wasn't std, but whatever), and Rust seems to have > std::sync::MutexGuard<>. > > But if that's the sole objection left, lets have a bike-shed party and > pick a colour :-) > > 'shield' 'sentry' 'sentinel' 'keeper' 'custodian' 'warden' ? I feel like you seem entirely too fixated on locking. That's not even _remotely_ the most interesting case, I think. For the common locking patterns , maybe "guard" makes sense as part of the name, but it damn well shouldn't be about any "ptr_guard" or whatever. I think it should be some trivial mutex_scope(mymytex) { .. } and yes, in that case scoping makes tons of sense and might even be required, because you do want to visually see the scope of the locking. I'm not seeing the point of "guard" in that name either, but I don't think the above is the complex case for naming, for use, or for syntax. In contrast, I think reference counting and allocations are the much more interesting one, and the case where you are also more likely to have multiple consecutive ones, and where it's likely less about "this region is protected" and much more about the RAII patterns and resources that you just want cleanup for. So that's why I had that "auto_release" name - to make it clear that it's about the *cleanup*, not about some "guarded region". See my argument? The locking cases can probably be handled with just a couple of macros (mutex, rcu, maybe spinlock?) and I would violently argue that the locking cases will have to have unconditional cleanup (ie if you *ever* have the situation where some path is supposed to be able to return with the lock held and no cleanup, then you should simply not use the "mutex_scope()" kind of helper AT ALL). So locking is, I feel, almost uninteresting. There are only a few cases, and only trivial patterns for it, because anything non-trivial had better be done very explicitly, and very much visibly by hand. But look at the cases where we currently use "goto exit" kind of code, or have duplicated cleanups because we do "cleanup(x); return -ERRNO" kinds of patterns. Some of them are locking, yes. But a lot of them are more generic "do this before returning" kinds of things: freeing (possibly complex) data structures, uncharging statistcs, or even things like writing results back to user space. And, unlike locking, those often (but not always) end up having the "don't do this in the single success case" situation, so that you need to have some way to show "ok, don't release it after all". Yes, that ends up in practice likely being setting that special pointer to NULL, but I think we need to have a much better syntax for it to show that "now we're *not* doing the auto-release". So it's that more complex case that I (a) don't think "guard" makes any sense for at all (b) think it's where the bigger payoffs are (c) think that generally are not "nested" (you release the resources you've allocated, but you don't "nest" the allocations - they are just serial. (d) think that the syntax could be pretty nasty, because the cleanup is not just a trivial fixed "unlock" function Just as an example of something that has a bit of everything, look at kernel/cpu.c and the _cpu_down() function in particular, which has that "goto out" to do all the cleanup. That looks like a perfect case for having some nice "associate the cleanup with the initialization" RAII patterns, but while it does have one lock (cpus_write_lock/unlock()), even that one is abstracted away so that it may be a no-op, and might be a percpu rwlock, and so having to create a special macro for that would be more work than is worth it. Because what I *don't* want to happen is that people create some one-time-use "helper" macro infrastructure to use this all. Again, that cpus_write_lock/unlock is a good example of this. End result: to avoid having crazy indentation, and crazy one-time helper inline functions or whatever, I think you'd really want some fairly generic model for "I am doing X, and I want to to Y as a cleanup when you exit the function" where both X and Y can be *fairly* complicated things. They might be as simple (and reasonably common) as "fd = fdget(f)" and "fdput(fd)", but what about the one-offs like that cpus_write_lock/unlock pattern? That one only happens in two places (_cpu_down() and _cpu_up()), and maybe the answer is "we can't reasonably do it for that, because the complexity is not worth it". But maybe we *can*. For example, this is likely the only realistic *good* case for the horrible "nested function" thing that gcc supports. In my "look, we could clean up a 'struct fd' example from last week, I made it do something like this: /* Trivial "getfd()" wrapper */ static inline void release_fd(struct fd *fd) { fdput(*fd); } #define auto_getfd(name, n) \ auto_release_name(struct fd, name, fdget(n), release_fd) but it would possibly be much more generic, and much more useful, if that "release_fd()" function was generated by the macro as a local nested inline function. End result: you could use any arbitrary local cleanup code. So you could have something like #define RAII(type, var, init, exit) \ __RAII(type, var, init, exit, __UNIQUE_ID(fn) #define __RAII(type, var, init, exit, exitname) \ void exitname(type *p) { exit } \ type var __attribute__((__cleanup__(exitname))) = (init) and do all of the above with RAII(struct fd, fd, fdget(f), fdput(fd)); because that macro would literally expand to create a (uniquely named) nested function that then contains that "fdput(fd)". I dunno. The syntax looks pretty bad, and I'm not even convinced clang supports nested functions, so the above may not be an option. But wouldn't it be nice to just be able to declare that arbitrary cleanup in-place? Then the lock cases would really just be trivial helper wrappers. And you can use "guard" there if you want, but I feel it makes absolutely *no* sense in the generic case. There is absolutely nothng that is being "guarded" by having an automatic cleanup of some random variable. Linus
On Tue, Jun 06, 2023 at 11:42:51AM +0200, Peter Zijlstra wrote: > [...] > #define scoped_guard(_guard, _var...) \ > for (__ptr_guard(_guard, scope) = guard_init(_guard, _var), \ > *done = NULL; !done; done = (void *)1) nit: Linus's example was "(void *)8" (instead of 1) because we've had issues in the past with alignment warnings on archs that are sensitive to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL comparisons.)
On Tue, Jun 6, 2023 at 8:31 AM Kees Cook <keescook@chromium.org> wrote: > > nit: Linus's example was "(void *)8" (instead of 1) because we've had > issues in the past with alignment warnings on archs that are sensitive > to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL > comparisons.) Note that I don't think we ever saw such a warning, it was just a theoretical observation that depending on type, the compiler might warn about known mis-aligned pointer bits. So I'm not sure the 1-vs-8 actually matters. We do other things that assume that low bits in a pointer are retained and valid, even if in theory the C type system might have issues with it. But maybe I mis-remember - if you did get an actual warning, maybe we should document that warning just to keep the memory alive. Linus
On Tue, Jun 06, 2023 at 07:50:47AM -0700, Linus Torvalds wrote: > So you could have something like > > #define RAII(type, var, init, exit) \ > __RAII(type, var, init, exit, __UNIQUE_ID(fn) > > #define __RAII(type, var, init, exit, exitname) \ > void exitname(type *p) { exit } \ > type var __attribute__((__cleanup__(exitname))) = (init) > > and do all of the above with > > RAII(struct fd, fd, fdget(f), fdput(fd)); "fdput(fd)" needs to be "fdput(*p)", since otherwise "fdput(fd)" is referencing "fd" before it has been declared. But regardless, yes, Clang is angry about the nested function. Also, while my toy[1] example doesn't show it, GCC may also generate code that requires an executable stack for some instances (or at least it did historically) that need trampolines. [1] https://godbolt.org/z/WTjx6Gs7x Also, more nits on naming: isn't this more accurately called Scope-based Resource Management (SBRM) not RAII? (RAII is technically object lifetime, and SBRM is scope entry/exit.)
On Tue, Jun 06, 2023 at 08:45:49AM -0700, Linus Torvalds wrote: > On Tue, Jun 6, 2023 at 8:31 AM Kees Cook <keescook@chromium.org> wrote: > > > > nit: Linus's example was "(void *)8" (instead of 1) because we've had > > issues in the past with alignment warnings on archs that are sensitive > > to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL > > comparisons.) > > Note that I don't think we ever saw such a warning, it was just a > theoretical observation that depending on type, the compiler might > warn about known mis-aligned pointer bits. > > So I'm not sure the 1-vs-8 actually matters. We do other things that > assume that low bits in a pointer are retained and valid, even if in > theory the C type system might have issues with it. > > But maybe I mis-remember - if you did get an actual warning, maybe we > should document that warning just to keep the memory alive. I've never seen a warning, but since this came up in the dissection of the __is_constexpr() behavior, it's been burned into my mind. ;)
On Tue, Jun 06, 2023 at 07:50:47AM -0700, Linus Torvalds wrote: > I feel like you seem entirely too fixated on locking. It's what I started out with... but it's certainly not everything. The thing I have removes pretty much all the error gotos from sched/core.c and events/core.c and while locking is ofcourse a fair amount of that there's significant non-locking usage. > (c) think that generally are not "nested" (you release the resources > you've allocated, but you don't "nest" the allocations - they are just > serial. > > (d) think that the syntax could be pretty nasty, because the cleanup > is not just a trivial fixed "unlock" function So I'm not sure you've seen the actual convertions I've done, but yes, there's lots of those. I've just used the guard naming because locks is what I started out with. + ptr_guard(kfree, alloc) = kzalloc(event->read_size, GFP_KERNEL); + if (!alloc) return -ENOMEM; event = perf_event_alloc(&attr, cpu, task, group_leader, NULL, NULL, NULL, cgroup_fd); + if (IS_ERR(event)) + return PTR_ERR(event); + + ptr_guard(free_event, event_guard) = event; + ptr_guard(put_task, p) = find_get_task(pid); + if (!p) + return -ESRCH; So it does all that... you just hate the naming -- surely we can fix that. Would it all be less offensive if I did: s/guard/cleanup/ on the whole thing? Then we'd have things like: DEFINE_PTR_CLEANUP(put_task, struct task_struct *, if (_C) put_task_struct(_C)) ptr_cleanup(put_task, p) = find_get_task(pid); if (!p) return -ESRCH; DEFINE_PTR_CLEANUP(free_event, struct perf_event *, if (!IS_ERR_OR_NULL(_C)) free_event(_C)) ptr_cleanup(free_event, event) = perf_event_alloc(...); if (IS_ERR(event)) return PTR_ERR(event); DEFINE_PTR_CLEANUP(kfree, void *, kfree(_C)) ptr_cleanup(kfree, foo) = kzalloc(...); if (!foo) return -ENOMEM; But do we then continue with: DEFINE_CLEANUP(mutex, struct mutex *, mutex_lock(_C), mutex_unlock(_C)) cleanup(mutex, &my_mutex); scoped_cleanup (mutex, &my_mutex) { ... } etc..? or do we keep guard() there, but based internally on ptr_cleanup() with the cleanup_## naming.
On Tue, Jun 6, 2023 at 11:08 AM Peter Zijlstra <peterz@infradead.org> wrote:> > Would it all be less offensive if I did: s/guard/cleanup/ on the whole > thing? It's more than "guard" for me. What is "ptr"? Why? We already know of at least one case where it's not a pointer at all, ie 'struct fd'. So I *really* hate the naming. Absolutely none of it makes sense to me. One part is a nonsensical name apparently based on a special-case operation, and the other part is a nonsensical type from just one random - if common - implementation issue. What you want to do is to have a way to define and name a "constructor/desctructor" pair for an arbitrary type - *not* necessarily a pointer - and then optionally a way to say "Oh, don't do the destructor, because I'm actually going to use it long-term". I said "cleanup", but that's not right either, since we always have to have that initializer too. Maybe just bite the bullet, and call the damn thing a "class", and have some syntax like DEFINE_CLASS(name, type, exit, init, initarg...); to create the infrastructure for some named 'class'. So you'd have DEFINE_CLASS(mutex, struct mutex *, mutex_unlock(*_P), ({mutex_lock(mutex); mutex;}), struct mutex *mutex) to define the mutex "class", and do DEFINE_CLASS(fd, struct fd, fdput(*_P), fdget(f), int f) for the 'struct fd' thing. The above basically would just create the wrapper functions (and typedefs) for the constructor and destructor. So the 'DEFINE_CLASS(mutex ..)' thing would basically just expand to typedef struct mutex *class_mutex_type; static inline void class_mutex_destructor(class_mutex_type *_C) { mutex_unlock(*_C); } static inline class_mutex_type class_mutex_constructor(struct mutex *mutex) { return ({mutex_lock(mutex); mutex;}); } Then to _instantiate_ one of those, you'd do INSTANTIATE_CLASS(name, var) which would expand to class_name_type var __attribute__((__cleanup__(class_name_destructor))) = class_name_constructor and the magic of that syntax is that you'd actually use that "INSTANTIATE_CLASS()" with the argument to the init function afterwards, so you'd actually do INSTANTIATE_CLASS(mutex, n)(&sched_domains_mutex); to create a variable 'n' of class 'mutex', where the class_mutex_constructor gets the pointer to 'sched_domain_mutex' as the argument. And at THAT point, you can do this: #define mutex_guard(m) \ INSTANTIATE_CLASS(mutex, __UNIQUE_ID(mutex))(m) and now you can do mutex_guard(&sched_domains_mutex); to basically start a guarded section where you hold 'sched_domains_mutex'. And in that *very* least situation, 'guard' makes sense in the name. But no earlier. And there is absolutely no 'ptr' anywhere here. The above should work also for the 'struct fd' case, so you can do INSTANTIATE(fd, f)(fd); to create a 'struct fd' named 'f' that is initialized with 'fdget(fd)', and will DTRT when going out of scope. Modulo any stupid errors I did. And ok, I didn't write out the exact macro magic to *get* those expansions above (I just tried to write out the approximate end result), but it should be mostly fairly straightforward. So it would be the usual token pasting tricks to get the 'class type', the 'class destructor' and the 'class constructor' functions. Finally, note that the "initarg" part is a macro vararg thing. The initargs can be empty (for things like RCU), but it could also possibly be multiple arguments (like a "size and gfp_flags" for an allocation?). I'm sure there's something horribly wrong in the above, but my point is that I'd really like this to make naming and conceptual sense. And if somebody goes "The above is basically exactly what the original C++ compiler did back when it was really just a pre-processor for C", then you'd be 100% right. The above is basically (a part of) what Bjarne did, except he did it as a separate pass. And to answer the next question: why not just give up and do C++ - it's because of all the *other* things Bjarne did. Linus
On Tue, Jun 06, 2023 at 04:22:26PM -0700, Linus Torvalds wrote: > On Tue, Jun 6, 2023 at 11:08 AM Peter Zijlstra <peterz@infradead.org> wrote:> > > Would it all be less offensive if I did: s/guard/cleanup/ on the whole > > thing? > > It's more than "guard" for me. > > What is "ptr"? Why? We already know of at least one case where it's > not a pointer at all, ie 'struct fd'. (so in my view struct fd is nothing more than a fat pointer) > So I *really* hate the naming. Absolutely none of it makes sense to > me. One part is a nonsensical name apparently based on a special-case > operation, and the other part is a nonsensical type from just one > random - if common - implementation issue. > > What you want to do is to have a way to define and name a > "constructor/desctructor" pair for an arbitrary type - *not* > necessarily a pointer - and then optionally a way to say "Oh, don't do > the destructor, because I'm actually going to use it long-term". Yes, so when it's a 'pointer', that part becomes assigning it NULL (or fdnull in the struct fd case). For example: DEFINE_PTR_CLEANUP(kfree, void *, kfree(_C)) ptr_cleanup(kfree, mem) = kzalloc(....); if (!mem) return -ENOMEM; object = mem; // build object with more failure cases mem = NULL; // object is a success, we keep it. return object; > I said "cleanup", but that's not right either, since we always have to > have that initializer too. I've found that for most things the initializer part isn't actually that important. Consider that struct fd thing again; perf has a helper: static inline struct fd perf_fget_light(int fd) { struct fd f = fdget(fd); if (!f.file) return fdnull; if (f.file->f_op != &perf_fops) { fdput(f); return fdnull; } return f; } So now we have both fdget() and perf_fget_light() to obtain a struct fd, both need fdput(). The pointer with destructor semantics works for both: DEFINE_PTR_CLEANUP(fdput, struct fd, fdput(_C)) ptr_cleanup(fdput, f) = perf_fget_light(fd); or, somewhere else: ptr_cleanup(fdput, f) = fdget(fd); The same is true for kfree(), we have a giant pile of allocation functions that all are freed with kfree(): kmalloc(), kzalloc(), kmalloc_node(), kzalloc_node(), krealloc(), kmalloc_array(), krealloc_array(), kcalloc(), etc.. > Maybe just bite the bullet, and call the damn thing a "class", and > have some syntax like > > DEFINE_CLASS(name, type, exit, init, initarg...); > > to create the infrastructure for some named 'class'. So you'd have > > DEFINE_CLASS(mutex, struct mutex *, > mutex_unlock(*_P), > ({mutex_lock(mutex); mutex;}), struct mutex *mutex) > > to define the mutex "class", and do > > DEFINE_CLASS(fd, struct fd, > fdput(*_P), > fdget(f), int f) > > for the 'struct fd' thing. Right; that is very close to what I have. And certainly useful -- although as per the above, perhaps not so for the struct fd case. > Then to _instantiate_ one of those, you'd do > > INSTANTIATE_CLASS(name, var) > > which would expand to > > class_name_type var > __attribute__((__cleanup__(class_name_destructor))) = > class_name_constructor > > and the magic of that syntax is that you'd actually use that > "INSTANTIATE_CLASS()" with the argument to the init function > afterwards, so you'd actually do > > INSTANTIATE_CLASS(mutex, n)(&sched_domains_mutex); > > to create a variable 'n' of class 'mutex', where the > class_mutex_constructor gets the pointer to 'sched_domain_mutex' as > the argument. Yes, I had actually considered this syntax, and I really like it. The only reason I hadn't done that is because the for-loop thing, there I couldn't make it work. > I'm sure there's something horribly wrong in the above, but my point > is that I'd really like this to make naming and conceptual sense. Right, I hear ya. So the asymmetric case (iow destructor only) could be seen as using the copy-constructor. #define DEFINE_CLASS(name, type, exit, init, init_args...) \ typedef type class_##name##_t; \ static inline void class_##name##_destructor(type *this) \ { type THIS = *this; exit; } \ static inline type class_##name##_constructor(init_args) \ { type THIS = init; return THIS; } #define __INSTANTIATE_VAR(name, var) \ class_##name##_t var __cleanup(class_##name##_destructor) #define INSTANTIATE_CLASS(name, var) \ __INSTANTIATE_VAR(name, var) = class_##name##_constructor DEFINE_CLASS(fd, struct fd, fdput(THIS), f, struct fd f) INSTANTIATE_CLASS(fd, f)(perf_fget_light(fd)); Alternatively, you be OK with exposing INSTANTIATE_VAR() to easily circumvent the default constructor? And/or how about EXTEND_CLASS(), something like so? #define EXTEND_CLASS(name, ext, init, init_args...) \ typedef class_##name##_t class_##name##ext##_t; \ static inline void class_##name##ext##_destructor(class_##name##_t *this) \ { class_##name##_destructor(this); } \ static inline type class_##name##ext##_constructor(init_args) \ { type THIS = init; return THIS; } DEFINE_CLASS(fd, struct fd, fdput(THIS), fdget(fd), int fd) EXTEND_CLASS(fd, _perf, perf_fget_light(fd), int fd) INSTANTIATE_CLASS(fd_perf, f)(fd); > And at THAT point, you can do this: > > #define mutex_guard(m) \ > INSTANTIATE_CLASS(mutex, __UNIQUE_ID(mutex))(m) > > and now you can do > > mutex_guard(&sched_domains_mutex); So the 'problem' is the amount of different guards I ended up having and you can't have macro's define more macros :/ Which is how I ended up with the: guard(mutex, &sched_domains_mutex); syntax. This can ofcourse be achieved using the above CLASS thing like: DEFINE_CLASS(mutex, struct mutex *, mutex_unlock(THIS), ({ mutex_lock(m); m; }), struct mutex *m) #define named_guard(name, var, args...) \ INSTANTIATE_CLASS(name, var)(args) #define guard(name, args...) \ named_guard(name, __UNIQUE_ID(guard), args) #define scoped_guard(name, args...) \ for (named_guard(name, scope, args), \ *done = NULL; !done; done = (void *)1) With the understanding they're only to be used for locks. Also, I'm already tired of writing INSTANTIATE.. would: CLASS(fd, f)(fd); VAR(kfree, mem) = kzalloc_node(...); be acceptable shorthand?
On Wed, Jun 07, 2023 at 11:41:01AM +0200, Peter Zijlstra wrote: > > I'm sure there's something horribly wrong in the above, but my point > > is that I'd really like this to make naming and conceptual sense. > > Right, I hear ya. So the asymmetric case (iow destructor only) could be > seen as using the copy-constructor. > > #define DEFINE_CLASS(name, type, exit, init, init_args...) \ > typedef type class_##name##_t; \ > static inline void class_##name##_destructor(type *this) \ > { type THIS = *this; exit; } \ > static inline type class_##name##_constructor(init_args) \ > { type THIS = init; return THIS; } > > #define __INSTANTIATE_VAR(name, var) \ > class_##name##_t var __cleanup(class_##name##_destructor) > > #define INSTANTIATE_CLASS(name, var) \ > __INSTANTIATE_VAR(name, var) = class_##name##_constructor > > > DEFINE_CLASS(fd, struct fd, fdput(THIS), f, struct fd f) > > INSTANTIATE_CLASS(fd, f)(perf_fget_light(fd)); > > > Alternatively, you be OK with exposing INSTANTIATE_VAR() to easily > circumvent the default constructor? Or perhaps use the smart-pointer concept applied to our classes like: #define smart_ptr(name, var) \ __INSTANTIATE_VAR(name, var) To mean a pointer that calls the destructor for class 'name'. I think the nearest thing C++ has is std::unique_ptr<>. Then we can write: DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p) smart_ptr(kfree, mem) = kzalloc_node(...); if (!mem) return -ENOMEM; object = mem; // further initiatlize object with error cases etc.. mem = NULL; // success, we keep it. return object;
On Thu, Jun 08, 2023 at 10:52:48AM +0200, Peter Zijlstra wrote: > On Wed, Jun 07, 2023 at 11:41:01AM +0200, Peter Zijlstra wrote: > > > > > I'm sure there's something horribly wrong in the above, but my point > > > is that I'd really like this to make naming and conceptual sense. > > > > Right, I hear ya. So the asymmetric case (iow destructor only) could be > > seen as using the copy-constructor. > > > > #define DEFINE_CLASS(name, type, exit, init, init_args...) \ > > typedef type class_##name##_t; \ > > static inline void class_##name##_destructor(type *this) \ > > { type THIS = *this; exit; } \ > > static inline type class_##name##_constructor(init_args) \ > > { type THIS = init; return THIS; } > > > > #define __INSTANTIATE_VAR(name, var) \ > > class_##name##_t var __cleanup(class_##name##_destructor) > > > > #define INSTANTIATE_CLASS(name, var) \ > > __INSTANTIATE_VAR(name, var) = class_##name##_constructor > > > > > > DEFINE_CLASS(fd, struct fd, fdput(THIS), f, struct fd f) > > > > INSTANTIATE_CLASS(fd, f)(perf_fget_light(fd)); > > > > > > Alternatively, you be OK with exposing INSTANTIATE_VAR() to easily > > circumvent the default constructor? > > Or perhaps use the smart-pointer concept applied to our classes like: > > #define smart_ptr(name, var) \ > __INSTANTIATE_VAR(name, var) > > To mean a pointer that calls the destructor for class 'name'. I think > the nearest thing C++ has is std::unique_ptr<>. > > > Then we can write: > > > DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p) > > > smart_ptr(kfree, mem) = kzalloc_node(...); > if (!mem) > return -ENOMEM; > > object = mem; > > // further initiatlize object with error cases etc.. > > mem = NULL; // success, we keep it. > return object; I like the idea, as we need a way to say "don't clean this up, it was passed to somewhere else" for these types of allocations, but have it "automatically" cleaned up on the error paths. I have no say in the naming, though I always disliked the idea of a pointer being "smart" as they are just a dumb memory register :) thanks, greg k-h
On Thu, Jun 8, 2023 at 1:53 AM Peter Zijlstra <peterz@infradead.org> wrote: > > Or perhaps use the smart-pointer concept applied to our classes like: > > #define smart_ptr(name, var) \ > __INSTANTIATE_VAR(name, var) So this is the only situation where I think "ptr" makes sense (your "fat pointer" argument is nonsensical - sure, you can treat anything as a pointer if you're brave enough, but that just means that "pointer" becomes a meaningless word). However, I think that for "smart pointers", we don't need any of this complexity at all, and we don't need that ugly syntax. > Then we can write: > > DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p) > > smart_ptr(kfree, mem) = kzalloc_node(...); > if (!mem) > return -ENOMEM; No, the above is broken, and would result in us using "void *" in situations where we really *really* don't want that. For automatic freeing, you want something that can handle different types properly, and without having to constantly declare the types somewhere else before use. And no, you do *not* want that "kfree(THIS)" kind of interface, because you want the compiler to inline the freeing function wrapper, and notice _statically_ when somebody zeroed the variable and not even call "kfree()", because otherwise you'd have a pointless call to kfree(NULL) in the success path too. So for convenient automatic pointer freeing, you want an interface much more akin to struct whatever *ptr __automatic_kfree = kmalloc(...); which is much more legible, doesn't have any type mis-use issues, and is also just trivially dealt with by a static inline void automatic_kfree_wrapper(void *pp) { void *p = *(void **)pp; if (p) kfree(p); } #define __automatic_kfree \ __attribute__((__cleanup__(automatic_kfree_wrapper))) #define no_free_ptr(p) \ ({ __auto_type __ptr = (p); (p) = NULL; __ptr; }) which I just tested generates the sane code even for the "set the ptr to NULL and return success" case. The above allows you to trivially do things like struct whatever *p __automatic_kfree = kmalloc(..); if (!do_something(p)) return -ENOENT; return no_free_ptr(p); and it JustWorks(tm). And yes, it needs a couple of different versions of that "__automatic_kfree()" wrapper for the different freeing cases (kvfree, rcu_free, whatever), but those are all trivial one-liners. And no, I didn't think too much about those names. "__automatic_kfree" is too damn long to type, but you hated "auto". And "no_free_ptr()" is not wonderful name either. But I tried to make the naming at least be obvious, if not wonderful. Linus
From: Linus Torvalds > Sent: 06 June 2023 16:46 > > On Tue, Jun 6, 2023 at 8:31 AM Kees Cook <keescook@chromium.org> wrote: > > > > nit: Linus's example was "(void *)8" (instead of 1) because we've had > > issues in the past with alignment warnings on archs that are sensitive > > to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL > > comparisons.) __is_constexpr() is playing entirely different games. Basically the type of (x ? (void *)y : (int *)z) depends on whether 'y' is a compile-time 0 (int *) or not (void *). ... > So I'm not sure the 1-vs-8 actually matters. We do other things that > assume that low bits in a pointer are retained and valid, even if in > theory the C type system might have issues with it. Yes, given that gcc will assume a pointer is aligned if you try to memcpy() from it, I'm surprised it doesn't always assume that. In which case (long)ptr_to_int & 3 can be validly assumed to be zero. I've found some 'day job' code that passed the address of a member of a 'packed' structure to a function which then used host ordered unsigned char[] accesses. The compiler is certainly allowed to convert that back to a word write - which would then fault. (I've not looked to see if any modern compilers do.) Of course the simple ptr->member would have be fine. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Thu, Jun 08, 2023 at 08:45:53AM -0700, Linus Torvalds wrote: > So for convenient automatic pointer freeing, you want an interface > much more akin to > > struct whatever *ptr __automatic_kfree = kmalloc(...); > > which is much more legible, doesn't have any type mis-use issues, and > is also just trivially dealt with by a > > static inline void automatic_kfree_wrapper(void *pp) > { void *p = *(void **)pp; if (p) kfree(p); } > #define __automatic_kfree \ > __attribute__((__cleanup__(automatic_kfree_wrapper))) > #define no_free_ptr(p) \ > ({ __auto_type __ptr = (p); (p) = NULL; __ptr; }) > > which I just tested generates the sane code even for the "set the ptr > to NULL and return success" case. > > The above allows you to trivially do things like > > struct whatever *p __automatic_kfree = kmalloc(..); > > if (!do_something(p)) > return -ENOENT; > > return no_free_ptr(p); I am a little worried about how (any version so far of) this API could go wrong, e.g. if someone uses this and does "return p" instead of "return no_free_ptr(p)", it'll return a freed pointer. I was hoping we could do something like this to the end of automatic_kfree_wrapper(): *(void **)pp = NULL; i.e. if no_free_ptr() goes missing, "return p" will return NULL, which is much easier to track down that dealing with later use-after-free bugs, etc. Unfortunately, the __cleanup ordering is _after_ the compiler stores the return value... static inline void cleanup_info(struct info **p) { free(*p); *p = NULL; /* this is effectively ignored */ } struct info *do_something(int f) { struct info *var __attribute__((__cleanup__(cleanup_info))) = malloc(1024); process(var); return var; /* oops, forgot to disable cleanup */ } compile down to: do_something: pushq %rbx movl $1024, %edi call malloc movq %rax, %rbx movq %rax, %rdi call process movq %rbx, %rdi call free movq %rbx, %rax ; uses saved copy of malloc return popq %rbx ret The point being, if we can proactively make this hard to shoot ourselves in the foot, that would be nice. :)
On Thu, Jun 8, 2023 at 9:47 AM Kees Cook <keescook@chromium.org> wrote: > > I am a little worried about how (any version so far of) this API could go > wrong, e.g. if someone uses this and does "return p" instead of "return > no_free_ptr(p)", Absolutely. I think the whole "don't always cleanup" is quite dangerous, and maybe not worth it, but it _is_ a fairly common pattern. > I was hoping we could do > something like this to the end of automatic_kfree_wrapper(): > > *(void **)pp = NULL; That would be a lovely thing, but as you found out, it fundamentally cannot work. The whole point of "cleanup" is that it is done when the variable - not trh *value* - goes out of scope. So when you have that return var; /* oops, forgot to disable cleanup */ situation, by definition "var" hasn't gone out of scope until _after_ you read the value and return that value, so pretty much by definition it cannot make a difference to assign something to 'pp' in the cleanup code. > The point being, if we can proactively make this hard to shoot ourselves in > the foot, that would be nice. :) So the good thing is that things like KASAN would immediately notice, and since this all happens (presumably) for the success case, it's not about some unlikely error case. I also think that -fanalyzer might just catch it (as part of -Wanalyzer-use-after-free) at compile-time, but I didn't check. So if/when people start using -fanalyze on the kernel, those things would be caught early. So while this "forget to avoid cleanup" is a worry, I think it ends up likely being one that is fairly easy to avoid with other checks in place. Famous last words. Linus
On Thu, Jun 8, 2023 at 9:47 AM Kees Cook <keescook@chromium.org> wrote: > > On Thu, Jun 08, 2023 at 08:45:53AM -0700, Linus Torvalds wrote: > > So for convenient automatic pointer freeing, you want an interface > > much more akin to > > > > struct whatever *ptr __automatic_kfree = kmalloc(...); > > > > which is much more legible, doesn't have any type mis-use issues, and > > is also just trivially dealt with by a > > > > static inline void automatic_kfree_wrapper(void *pp) > > { void *p = *(void **)pp; if (p) kfree(p); } > > #define __automatic_kfree \ > > __attribute__((__cleanup__(automatic_kfree_wrapper))) > > #define no_free_ptr(p) \ > > ({ __auto_type __ptr = (p); (p) = NULL; __ptr; }) > > > > which I just tested generates the sane code even for the "set the ptr > > to NULL and return success" case. > > > > The above allows you to trivially do things like > > > > struct whatever *p __automatic_kfree = kmalloc(..); > > > > if (!do_something(p)) > > return -ENOENT; > > > > return no_free_ptr(p); > > I am a little worried about how (any version so far of) this API could go > wrong, e.g. if someone uses this and does "return p" instead of "return > no_free_ptr(p)", it'll return a freed pointer. Presumably, one could simply just not use RAII(/SBRM someone else corrected me about this recently coincidentally; I taught them my fun C++ acronym IDGAF) when working with a value that conditionally "escapes" the local scope.
On Thu, Jun 08, 2023 at 10:20:19AM -0700, Nick Desaulniers wrote: > Presumably, one could simply just not use RAII(/SBRM someone else > corrected me about this recently coincidentally; I taught them my fun > C++ acronym IDGAF) when working with a value that conditionally > "escapes" the local scope. But then you're back to the error goto :/
On Thu, Jun 08, 2023 at 08:45:53AM -0700, Linus Torvalds wrote: > > DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p) > > > > smart_ptr(kfree, mem) = kzalloc_node(...); > > if (!mem) > > return -ENOMEM; > > No, the above is broken, and would result in us using "void *" in > situations where we really *really* don't want that. > > For automatic freeing, you want something that can handle different > types properly, and without having to constantly declare the types > somewhere else before use. Ah, I see what you did with the no_free_ptr(), that avoids having to have two pointers around, nice! > So for convenient automatic pointer freeing, you want an interface > much more akin to > > struct whatever *ptr __automatic_kfree = kmalloc(...); > > which is much more legible, doesn't have any type mis-use issues, and > is also just trivially dealt with by a > > static inline void automatic_kfree_wrapper(void *pp) > { void *p = *(void **)pp; if (p) kfree(p); } > #define __automatic_kfree \ > __attribute__((__cleanup__(automatic_kfree_wrapper))) > #define no_free_ptr(p) \ > ({ __auto_type __ptr = (p); (p) = NULL; __ptr; }) > > which I just tested generates the sane code even for the "set the ptr > to NULL and return success" case. > > The above allows you to trivially do things like > > struct whatever *p __automatic_kfree = kmalloc(..); > > if (!do_something(p)) > return -ENOENT; > > return no_free_ptr(p); > > and it JustWorks(tm). OK, something like so then? #define DEFINE_FREE(name, type, free) \ static inline __free_##name(type *p) { type _P = *p; free; } #define __free(name) __cleanup(__free_##name) #define no_free_ptr(p) \ ({ __auto_type __ptr = (p); (p) = NULL; __ptr; }) DEFINE_FREE(kfree, void *, if (_P) kfree(_P)); struct obj *p __free(kfree) = kmalloc(...); if (!do_something(p)) return -ENOENT; return no_free_ptr(p); DEFINE_CLASS(find_get_context, struct perf_event_context *, if (!IS_ERR_OR_NULL(_C)) put_ctx(_C), find_get_context(task, event), struct task_struct *task, struct perf_event *event) DEFINE_FREE(free_event, struct perf_event *, if (!IS_ERR_OR_NULL(_P)) free_event(_P)); struct perf_event *event __free(free_event) = perf_event_alloc(...) if (IS_ERR(event)) return event; class(find_get_context, ctx)(task, event); if (IS_ERR(ctx)) return (void*)ctx; if (!task && !container_of(ctx, struct perf_cpu_context, ctx)->online) return -ENODEV; ... event->ctx = get_ctx(ctx); return no_free_ptr(event); works for me, I'll go make it happen.
On Thu, Jun 8, 2023 at 11:51 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Jun 08, 2023 at 10:20:19AM -0700, Nick Desaulniers wrote: > > Presumably, one could simply just not use RAII when working with a value that conditionally > > "escapes" the local scope. > > But then you're back to the error goto :/ Thinking more about the expected ergonomics here over lunch...no meaningful insights, just thoughts... For something like a mutex/lock; I'd expect those to be acquired then released within the same function, yeah? In which case __attribute__((cleanup())) has fewer hazards since the resource doesn't escape. For a pointer to a dynamically allocated region that may get returned to the caller... I mean, people do this in C++. It is safe and canonical to return a std::unique_ptr. When created locally the destructor does the expected thing regardless of control flow. IIUC, std::unique_ptr's move constructor basically does what Kees suggested earlier (trigger warning: C++): https://github.com/llvm/llvm-project/blob/7a52f79126a59717012d8039ef875f68e3c637fd/libcxx/include/__memory/unique_ptr.h#L429-L430. example: https://godbolt.org/z/51s49G9f1 A recent commit to clang https://reviews.llvm.org/rG301eb6b68f30 raised an interesting point (deficiency is perhaps too strong a word) about GNU-style attributes; they generally have no meaning on an ABI. Consider a function that returns a locally constructed std::unique_ptr. If the function returns a type where the caller knows what destructor functions to run. This is part of the ABI. Here, we're talking about using __attribute__((cleanup())) to DTR locally, but then we return a "raw" pointer to a caller. What cleanup function should the caller run, implicitly, if at all? If we use __attribute__((cleanup())) that saves us a few gotos locally, but the caller perhaps now needs the same treatment.
On Thu, Jun 8, 2023 at 1:06 PM Peter Zijlstra <peterz@infradead.org> wrote: > > struct obj *p __free(kfree) = kmalloc(...); Yeah, the above actually looks really good to me - I like the naming here, and the use looks very logical to me. Of course, maybe once I see the patches that use this I go "uhh", but at least for now I think you've hit on a rather legible syntax. I'm still a bit unsure of the "no_free_ptr(p)" naming, but at least it's pretty clear about what it does. So my only worry is that it's not pretty and to the point like your "__free(kfree)" syntax. But it feels workable and not misleading, so unless somebody can come up with a better name, I think it's ok. Linus
On Thu, Jun 08, 2023 at 07:25:27PM -0700, Linus Torvalds wrote: > On Thu, Jun 8, 2023 at 1:06 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > struct obj *p __free(kfree) = kmalloc(...); > > Yeah, the above actually looks really good to me - I like the naming > here, and the use looks very logical to me. > > Of course, maybe once I see the patches that use this I go "uhh", but > at least for now I think you've hit on a rather legible syntax. > > I'm still a bit unsure of the "no_free_ptr(p)" naming, but at least > it's pretty clear about what it does. > > So my only worry is that it's not pretty and to the point like your > "__free(kfree)" syntax. > > But it feels workable and not misleading, so unless somebody can come > up with a better name, I think it's ok. #define return_ptr(p) \ return no_free_ptr(p) struct obj *p __free(kfree) = kmalloc(...); if (!p) return ERR_PTR(-ENOMEM); ... return_ptr(p); ?
On 08/06/2023 17.45, Linus Torvalds wrote: > And no, I didn't think too much about those names. "__automatic_kfree" > is too damn long to type, but you hated "auto". And "no_free_ptr()" is > not wonderful name either. But I tried to make the naming at least be > obvious, if not wonderful. No opinion either way, just throwing a data point out there: So the systemd codebase makes quite heavy use of this automatic cleanup stuff. Their name for no_free_ptr is TAKE_PTR(), and the verb "take" apparently comes from Rust: //// src/fundamental/macro-fundamental.h /* Takes inspiration from Rust's Option::take() method: reads and returns a pointer, but at the same time * resets it to NULL. See: https://doc.rust-lang.org/std/option/enum.Option.html#method.take */ #define TAKE_GENERIC(var, type, nullvalue) \ ({ \ type *_pvar_ = &(var); \ type _var_ = *_pvar_; \ type _nullvalue_ = nullvalue; \ *_pvar_ = _nullvalue_; \ _var_; \ }) #define TAKE_PTR_TYPE(ptr, type) TAKE_GENERIC(ptr, type, NULL) #define TAKE_PTR(ptr) TAKE_PTR_TYPE(ptr, typeof(ptr)) #define TAKE_STRUCT_TYPE(s, type) TAKE_GENERIC(s, type, {}) #define TAKE_STRUCT(s) TAKE_STRUCT_TYPE(s, typeof(s)) and then they also have a /* Like TAKE_PTR() but for file descriptors, resetting them to -EBADF */ #define TAKE_FD(fd) TAKE_GENERIC(fd, int, -EBADF) with their "auto-close fd" helper of course knowing to ignore any negative value. Rasmus
On 6/8/23 22:14, Nick Desaulniers wrote: > Here, we're talking about using __attribute__((cleanup())) to DTR > locally, but then we return a "raw" pointer to a caller. What cleanup > function should the caller run, implicitly, if at all? If we use > __attribute__((cleanup())) that saves us a few gotos locally, but the > caller perhaps now needs the same treatment. But this is only a problem when you return a void*; and in general in C you will return a struct more often than a raw pointer (and in C++ you also have the issue of delete vs. delete[], that does not exist in C). Returning a struct doesn't protect against use-after-free bugs in the way std::unique_ptr<> or Rust lifetimes do, but it at least tries to protect against calling the wrong cleanup function if you provide a typed "destructor" function that does the right thing---for example by handling reference counting or by freeing sub-structs before calling kfree/vfree. Of course it's not a silver bullet, but then that's why people are looking into Rust for Linux. Paolo
On June 8, 2023 7:25:27 PM PDT, Linus Torvalds <torvalds@linux-foundation.org> wrote: >On Thu, Jun 8, 2023 at 1:06 PM Peter Zijlstra <peterz@infradead.org> wrote: >> >> struct obj *p __free(kfree) = kmalloc(...); > >Yeah, the above actually looks really good to me - I like the naming >here, and the use looks very logical to me. > >Of course, maybe once I see the patches that use this I go "uhh", but >at least for now I think you've hit on a rather legible syntax. > >I'm still a bit unsure of the "no_free_ptr(p)" naming, but at least >it's pretty clear about what it does. > >So my only worry is that it's not pretty and to the point like your >"__free(kfree)" syntax. > >But it feels workable and not misleading, so unless somebody can come >up with a better name, I think it's ok. I like the proposed "take" naming, and before reading that reply I was going to suggest "keep". *shrug*