[v2,0/2] Lock and Pointer guards

Message ID 20230526205204.861311518@infradead.org
Headers
Series Lock and Pointer guards |

Message

Peter Zijlstra May 26, 2023, 8:52 p.m. UTC
  By popular demand, a new and improved version :-)

New since -v1 ( https://lkml.kernel.org/r/20230526150549.250372621@infradead.org )

 - much improved interface for lock guards: guard() and scoped () { }
   as suggested by Linus.
 - moved the ';' for the 'last' additional guard members into the definition
   as suggested by Kees.
 - put __cleanup in the right place with the suggested comment
   as suggested by Kees and Miguel
 - renamed the definition macros DEFINE_LOCK_GUARD_[012] to indicate
   the number of arguments the lock function takes
 - added comments to guards.h
 - renamed pretty much all guard types

Again compile and boot tested x86_64-defconfig
  

Comments

Mathieu Desnoyers May 27, 2023, 5:21 p.m. UTC | #1
On 5/26/23 16:52, Peter Zijlstra wrote:
> By popular demand, a new and improved version :-)
> 
> New since -v1 ( https://lkml.kernel.org/r/20230526150549.250372621@infradead.org )
> 
>   - much improved interface for lock guards: guard() and scoped () { }
>     as suggested by Linus.

<name bikeshedding>

I know I'm the one who hinted at C++ "std::scoped_lock" as a similar 
preexisting API, but I find that "scoped()" is weird in the newly 
proposed form. "scoped_lock" is fine considering that "scoped" is an 
adjective applying to "lock", but in the case of e.g. scoped(rcu) { }, 
then we are really declaring a "scope" of type "rcu". I suspect that in 
this case:

scope(rcu) { }

would be less unexpected than the adjective form:

scoped(rcu) { }

Especially if we go for the name "guard()", rather than the adjective 
guarded(), for its counterpart.

Thanks,

Mathieu
  
Linus Torvalds May 27, 2023, 7:18 p.m. UTC | #2
On Fri, May 26, 2023 at 2:01 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> By popular demand, a new and improved version :-)

Well, I certainly find the improved version more legible, despite my
further comment about the type macros being too complicated.

But now I've slept on it, and I think the problem goes beyond "macros
being too complicated".

I actually think that the whole "create a new scope" is just wrong. I
understand why you did it: the whole "its' going out of scope" is part
of the *idea*.

BUT.

Adding a new scope results in

 (a) pointless indentation

 (b) syntax problems with fighting the limited syntax of for loops

 (c) extra dummy variables - both for the for loop hack, but also for
the scope guard thing itself

and none of the above is an *advantage*.

So how about we take a step back, and say "what if we don't create a
new scope at all"?

I think it actually improves on everything. The macros become
*trivial*. The code becomes more legible.

And you can have multiple different scopes very naturally, or you can
just share a scope.

Let me build up my argument here. Let's start with this *trivial* helper:

    /* Trivial "generic" auto-release macro */
    #define auto_release_name(type, name, init, exit) \
        type name __attribute__((__cleanup__(exit))) = (init)

it's truly stupid: it creates a new variable of type 'type', with name
'name', and with the initial value 'init', and with the cleanup
function 'exit'.

It looks stupid, but it's the *good* stupid. It's really really
obvious, and there are no games here.

Let me then introduce *one* other helper, because it turns out that
much of the time, you don't really want to  pick a name. So we'll use
the above macro, but make a version of it that just automatically
picks a unique name for you:

    #define auto_release(type, init, exit) \
        auto_release_name(type, \
                __UNIQUE_ID(auto) __maybe_unused, \
                init, exit)

again, there are absolutely no games here, it's all just very very
straightforward. It's so straightforward that it all feels stupid.

But again, it's the 'S' in "KISS". Keep It Simple, Stupid.

And it turns out that the above two trivial macros are actually quite
useful in themselves. You want to do an auto-cleanup version of
'struct fd'? It's trivial:

    /* Trivial "getfd()" wrapper */
    static inline void release_fd(struct fd *fd)
    { fdput(*fd); }

    #define auto_getfd(name, n) \
        auto_release_name(struct fd, name, fdget(n), release_fd)

and now you can write code like this:

    int testfd(int fd)
    {
        auto_getfd(f, fd);
        /* Do something with f.file */
        return f.file ? 0 : -EBADF;
    }

Which is really trivial, and actually pretty legible.

Note that there is no explicit "scope" here. The scope is simply
implicit in the code. The compiler will verify that it's at the
beginning of the block, since we do that
-Wdeclaration-after-statement, so for the kernel this is always at the
beginning of a scope anyway.

And that's actually a big advantage. You want to deal with *two* file
descriptors? Sure. Just do this:

    /* Do two file descriptors point to the same file? */
    int testfd2(int fd1, int fd2)
    {
        auto_getfd(f1, fd1);
        auto_getfd(f2, fd2);
        return f1.file && f1.file == f2.file;
    }

and it JustWorks(tm). Notice how getting rid of the explicit scoping
is actually a *feature*.

Ok, so why did I want that helper that used that auto-genmerated name?
The above didn't use it, but look here:

    /* Trivial "mutex" auto-release helpers */
    static inline struct mutex *get_mutex(struct mutex *a)
    { mutex_lock(a); return a; }

    static inline void put_mutex(struct mutex **a)
    { mutex_unlock(*a); }

    #define auto_release_mutex_lock(lock) \
        auto_release(struct mutex *, get_mutex(lock), put_mutex)

That's all you need for a nice locked region.

Or how about RCU, which doesn't have a lock type? Just do this:

    struct rcu_scope;

    static inline struct rcu_scope *get_rcu(void)
    { rcu_read_lock(); return NULL; }

    static void put_rcu(struct rcu_scope **a)
    { rcu_read_unlock(); }

    #define auto_release_rcu_lock() \
        auto_release(struct rcu_scope *, get_rcu(), put_rcu)

and you're done. And how you can *mix* all these:

    int testfd2(int fd1, int fd2)
    {
        auto_release_mutex_lock(&cgroup_mutex);
        auto_release_rcu_lock();
        auto_getfd(f1, fd1);
        auto_getfd(f2, fd2);
        return f1.file && f1.file == f2.file;
    }

Ok. The above is insane. But it's instructive. And it's all SO VERY SIMPLE.

It's also an example of something people need to be aware of: the
auto-releasing is not ordered. That may or may not be a problem. It's
not a problem for two identical locks, but it very much *can* be a
problem if you mix different locks.

So in the crazy above, the RCU lock may be released *after* the
cgroup_mutex is released. Or before.

I'm not convinced it's ordered even if you end up using explicit
scopes, which you can obviously still do:

    int testfd2(int fd1, int fd2)
    {
        auto_release_mutex_lock(&cgroup_mutex);
        do {
                auto_release_rcu_lock();
                auto_getfd(f1, fd1);
                auto_getfd(f2, fd2);
                return f1.file && f1.file == f2.file;
        } while (0);
    }

so I don't think the nasty "scoped()" version is any better in this regard.

And before you say "unlock order doesn't matter" - that's actually not
true. Unlock order does matter when you have things like
"spin_lock_irq()" where the irq-off region then protects other locks
too.

If you do the "unlock_irq" before you've released the other spinlocks
that needed irq protection, you're now screwed.

Final notes:

 - I think the above is simpler and objectively better in every way
from the explicitly scoped thing

 - I do think that the auto-release can be very dangerous for locking,
and people need to verify me about the ordering. Maybe the freeing
order is well-defined.

 - I also suspect that to get maximum advantage of this all, we would
have to get rid of -Wdeclaration-after-statement

That last point is something that some people will celebrate, but I do
think it has helped keep our code base somewhat controlled, and made
people think more about the variables they declare.

But if you start doing consecutive cleanup operations, you really end
up wanting to do thigns like this:

    int testfd2(int fd1, int fd2)
    {
        auto_getfd(f1, fd1);
        if (!f1.file)
                return -EBADF;
        auto_getfd(f2, fd2);
        if (!f2.file)
                return -EBADF;
        return do_something (f1, f2);
    }

and you basically *have* to allow those declarations in the middle -
because you don't want to clean up fd2 in that "f1 failed" case, since
f2 doesn't exist yet.

                  Linus
  
Paolo Bonzini May 29, 2023, 12:09 p.m. UTC | #3
On Sat, May 27, 2023 at 9:18 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> It's also an example of something people need to be aware of: the
> auto-releasing is not ordered. That may or may not be a problem. It's
> not a problem for two identical locks, but it very much *can* be a
> problem if you mix different locks.

It is guaranteed. It would be nice to have it documented, but if you
look at the intermediate representation of this simple example:

#include <stdio.h>

int print_on_exit(void **f) {
    if (*f) puts(*f);
    *f = NULL;
}

int main(int argc) {
    // prints "second" then "first"
    void *__attribute__((__cleanup__(print_on_exit))) cleanup1 = "first";
    void *__attribute__((__cleanup__(print_on_exit))) cleanup2 = "second";
}

... the implementation is introducing a scope and reusing the C++
try/finally implementation, and the "optimizations" that are allowed
when functions are not "throwing" anything. This is always the case
for C, so this (from f.c.005t.original, obtained with -c
-fdump-tree-all):

{
  // the duplicated initializers are just an artifact of the AST
  void * cleanup1 = (void *) "first";
  void * cleanup2 = (void *) "second";

    void * cleanup1 = (void *) "first";
  try
    {
            void * cleanup2 = (void *) "second";
      try
        {

        }
      finally
        {
          print_on_exit (&cleanup2);
        }
    }
  finally
    {
      print_on_exit (&cleanup1);
    }
}
return 0;

becomes this as soon as exceptions are lowered (from f.c.013t.eh),
which is before any kind of optimization:

int main (int argc)
{
  void * cleanup2;
  void * cleanup1;
  int D.3187;

  cleanup1 = "first";
  cleanup2 = "second";
  print_on_exit (&cleanup2);
  print_on_exit (&cleanup1);
  cleanup1 = {CLOBBER(eol)};
  cleanup2 = {CLOBBER(eol)};
  D.3187 = 0;
  goto <D.3188>;
  <D.3188>:
  return D.3187;
}

> So in the crazy above, the RCU lock may be released *after* the
> cgroup_mutex is released. Or before.
>
> I'm not convinced it's ordered even if you end up using explicit
> scopes, which you can obviously still do:

It is, see

        void *__attribute__((__cleanup__(print_on_exit))) cleanup1 = "first";
        {
                void *__attribute__((__cleanup__(print_on_exit)))
cleanup2 = "second";
                puts("begin nested scope");
        }
        puts("back to outer scope");

which prints

begin nested scope
second
back to outer scope
first

This is practically required by glibc's nasty
pthread_cleanup_push/pthread_cleanup_pop macros (the macros are nasty
even if you ignore glibc's implementation, but still):

#  define pthread_cleanup_push(routine, arg) \
  do {                                                                        \
    struct __pthread_cleanup_frame __clframe                                  \
      __attribute__ ((__cleanup__ (__pthread_cleanup_routine)))               \
      = { .__cancel_routine = (routine), .__cancel_arg = (arg),               \
          .__do_it = 1 };

/* Remove a cleanup handler installed by the matching pthread_cleanup_push.
   If EXECUTE is non-zero, the handler function is called. */
#  define pthread_cleanup_pop(execute) \
    __clframe.__do_it = (execute);                                            \
  } while (0)

If the scope wasn't respected, pthread_cleanup_pop(1) would be broken
because pthread_cleanup_pop() must immediately execute the function if
its argument is nonzero.

>  - I think the above is simpler and objectively better in every way
> from the explicitly scoped thing

It is simpler indeed, but a scope-based variant is useful too based on
my experience with QEMU. Maybe things like Python's with statement
have spoiled me, but I can't quite get used to the lone braces in

{
    ...
    {
        auto_release(mutex, &mutex);
        ...
    }
    ...
}

So in QEMU we have both of them. In Linux that would be

   auto_release(mutex, &mutex)

and

   scoped(mutex, &mutex) {}

>  - I do think that the auto-release can be very dangerous for locking,
> and people need to verify me about the ordering. Maybe the freeing
> order is well-defined.

It is but having it documented would be better.

>  - I also suspect that to get maximum advantage of this all, we would
> have to get rid of -Wdeclaration-after-statement

Having both a declaration-based and a scope-based variant helps with
that too. You could add _Pragma("GCC diagnostic push/ignore/pop") to
the declaration-based macros but ouch... even without thinking of
whether clang supports it, which I didn't check.

Paolo

> That last point is something that some people will celebrate, but I do
> think it has helped keep our code base somewhat controlled, and made
> people think more about the variables they declare.
  
Linus Torvalds May 29, 2023, 7:04 p.m. UTC | #4
On Mon, May 29, 2023 at 8:09 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On Sat, May 27, 2023 at 9:18 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > It's also an example of something people need to be aware of: the
> > auto-releasing is not ordered. That may or may not be a problem. It's
> > not a problem for two identical locks, but it very much *can* be a
> > problem if you mix different locks.
>
> It is guaranteed. It would be nice to have it documented, but if you
> look at the intermediate representation of this simple example:

Well, I can see that it might be doing that reverse ordering in
practice, but for the life of me, I can't actually find anything that
says it is guaranteed.

Google did find me one blog post by Ian Lance Taylor from 2008 that
said that yes, each __cleanup__ attribute basically creates its own
little scope, and that the cleanup in reverse declaration order is
thus guaranteed.

Let's add Ian to the cc, partly to confirm it wasn't just a random
implementation detail, but also partly to perhaps ask him to get
somebody to document it.

Because if it's not documented, how do we know that the clang
implementation of that attribute then ends up also guaranteeing the
reverse order cleanup, even if gcc might guarantee it?

I *suspect* - but cannot find any guarantees - that it's going to
match C++ destructors, and you probably end up pretty much always
having to deal with these cleanup functions in reverse order, so it
all sounds very likely to me.

And maybe it's even documented somewhere that neither of us could find.

Anyway, I do like the option to use cleanup functions, but I think
we'd better make sure, since we really may require nesting for locks
(even if in many cases it won't matter).

Ian? Any chance you can either point us at documentation, or maybe
just confirm it, and hopefully make said documentation happen?

            Linus
  
Ian Lance Taylor May 29, 2023, 9:27 p.m. UTC | #5
On Mon, May 29, 2023 at 12:04 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, May 29, 2023 at 8:09 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On Sat, May 27, 2023 at 9:18 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > > It's also an example of something people need to be aware of: the
> > > auto-releasing is not ordered. That may or may not be a problem. It's
> > > not a problem for two identical locks, but it very much *can* be a
> > > problem if you mix different locks.
> >
> > It is guaranteed. It would be nice to have it documented, but if you
> > look at the intermediate representation of this simple example:
>
> Well, I can see that it might be doing that reverse ordering in
> practice, but for the life of me, I can't actually find anything that
> says it is guaranteed.
>
> Google did find me one blog post by Ian Lance Taylor from 2008 that
> said that yes, each __cleanup__ attribute basically creates its own
> little scope, and that the cleanup in reverse declaration order is
> thus guaranteed.
>
> Let's add Ian to the cc, partly to confirm it wasn't just a random
> implementation detail, but also partly to perhaps ask him to get
> somebody to document it.
>
> Because if it's not documented, how do we know that the clang
> implementation of that attribute then ends up also guaranteeing the
> reverse order cleanup, even if gcc might guarantee it?
>
> I *suspect* - but cannot find any guarantees - that it's going to
> match C++ destructors, and you probably end up pretty much always
> having to deal with these cleanup functions in reverse order, so it
> all sounds very likely to me.
>
> And maybe it's even documented somewhere that neither of us could find.
>
> Anyway, I do like the option to use cleanup functions, but I think
> we'd better make sure, since we really may require nesting for locks
> (even if in many cases it won't matter).
>
> Ian? Any chance you can either point us at documentation, or maybe
> just confirm it, and hopefully make said documentation happen?


It was a while ago, but I expect that I was just thinking of the
implementation.  I agree that the documentation could be clearer.  I
filed https://gcc.gnu.org/PR110029.

Ian
  
Linus Torvalds May 30, 2023, 12:06 a.m. UTC | #6
On Mon, May 29, 2023 at 5:27 PM Ian Lance Taylor <iant@google.com> wrote:
>
> It was a while ago, but I expect that I was just thinking of the
> implementation.  I agree that the documentation could be clearer.  I
> filed https://gcc.gnu.org/PR110029.

Thanks. I think we can proceed with the assumption that it's all
clearly ordered. Even when the scopes are explicit (ie actual separate
block scopes for the variables and no shared scope), doing a 'return'
(or break out of a loop) will obviously exit multiple scopes at once,
so it would be good to have it documented that the cleanup is always
in that reverse order by scope and declaration order within a scope.

I guess I should check with the clang people too.

              Linus
  
Peter Zijlstra May 30, 2023, 9:23 a.m. UTC | #7
On Sat, May 27, 2023 at 12:18:04PM -0700, Linus Torvalds wrote:

> So how about we take a step back, and say "what if we don't create a
> new scope at all"?

Note that the lock_guard() and ptr_guard() options I have don't require
the new scope thing. The scope thing is optional.

> I think it actually improves on everything. The macros become
> *trivial*. The code becomes more legible.
> 
> And you can have multiple different scopes very naturally, or you can
> just share a scope.
> 
> Let me build up my argument here. Let's start with this *trivial* helper:
> 
>     /* Trivial "generic" auto-release macro */
>     #define auto_release_name(type, name, init, exit) \
>         type name __attribute__((__cleanup__(exit))) = (init)
> 
> it's truly stupid: it creates a new variable of type 'type', with name
> 'name', and with the initial value 'init', and with the cleanup
> function 'exit'.
> 
> It looks stupid, but it's the *good* stupid. It's really really
> obvious, and there are no games here.

I really don't like the auto naming since C++/C23 use auto for type
inference.

> Let me then introduce *one* other helper, because it turns out that
> much of the time, you don't really want to  pick a name. So we'll use
> the above macro, but make a version of it that just automatically
> picks a unique name for you:
> 
>     #define auto_release(type, init, exit) \
>         auto_release_name(type, \
>                 __UNIQUE_ID(auto) __maybe_unused, \
>                 init, exit)

I like that option.

> And it turns out that the above two trivial macros are actually quite
> useful in themselves. You want to do an auto-cleanup version of
> 'struct fd'? It's trivial:
> 
>     /* Trivial "getfd()" wrapper */
>     static inline void release_fd(struct fd *fd)
>     { fdput(*fd); }
> 
>     #define auto_getfd(name, n) \
>         auto_release_name(struct fd, name, fdget(n), release_fd)
> 

>  - I think the above is simpler and objectively better in every way
> from the explicitly scoped thing

Well, I think having that as a option would still be very nice.

>  - I also suspect that to get maximum advantage of this all, we would
> have to get rid of -Wdeclaration-after-statement
> 
> That last point is something that some people will celebrate, but I do
> think it has helped keep our code base somewhat controlled, and made
> people think more about the variables they declare.
> 
> But if you start doing consecutive cleanup operations, you really end
> up wanting to do thigns like this:
> 
>     int testfd2(int fd1, int fd2)
>     {
>         auto_getfd(f1, fd1);
>         if (!f1.file)
>                 return -EBADF;
>         auto_getfd(f2, fd2);
>         if (!f2.file)
>                 return -EBADF;
>         return do_something (f1, f2);
>     }

If you extend the ptr_guard() idea you don't need to get rid of
-Wdeclaration-after-statement and we could write it like:

	int testfd2(int fd1, int fd2)
	{
		ptr_guard(fdput, f1) = fdget(fd1);
		ptr_guard(fdput, f2) = null_ptr(fdput);
		if (!f1.file)
			return -EBADF;
		f2 = fdget(f2);
		if (!f2.file)
			return -EBADF;
		return do_something(f1, f2);
	}


Yes, the macros would be a little more involved, but not horribly so I
think.

typedef struct fd guard_fdput_t;

static const struct fd guard_fdput_null = __to_fd(0);

static inline void guard_fdput_cleanup(struct fd *fd)
{
	fdput(*fd);
}

#define ptr_guard(_guard, _name) \
	guard##_guard##_t _name __cleanup(guard##_guard##_cleanup)

#define null_ptr(_guard)		guard##_guard##_null;

And for actual pointer types (as opposed to fat pointer wrappers like
struct fd) we can have a regular helper macro like earlier:

#define DEFINE_PTR_GUARD(_guard, _type, _put)		\
typdef _type *guard##_guard##_t;			\
static const _type *guard##_guard##_null = NULL;	\
static inline void guard##_guard##_cleanup(_type **ptr)	\
{ if (*ptr) _put(*ptr); }

[NOTE: value propagation gets rid of the above conditional where
appropriate]

eg.:

  DEFINE_PTR_GUARD(put_task, struct task_struct, put_task_struct);


Possibly with a helper:

#define null_guard(_guard, _name) \
	ptr_guard(_guard, _name) = null_ptr(_guard)



Now, ptr_guard() per the above, is asymmetric in that it only cares
about release, let guard() be the symmetric thing that also cares about
init like so:

#define DEFINE_GUARD(_guard, _type, _put, _get)			\
DEFINE_PTR_GUARD(_guard, _type, _put)				\
static inline void guard##_guard##_init(guard##_guard##_t arg)	\
{ _get(arg); return arg; }

#define guard(_guard, _name, _var...)	\
	ptr_guard(_guard, _name) = guard##_guard@##_init(_var)

#define anon_guard(_guard, _var..) \
	guard(_guard, __UNIQUE_ID(guard), _var)

for eg.:

  DEFINE_GUARD(mutex, struct mutex, mutex_unlock, mutex_lock);

which then lets one write:

	int testfd2(int fd1, int fd2)
	{
		anon_guard(mutex, &cgroup_mutex);
		ptr_guard(fdput, f1) = fdget(fd1);
		null_guard(fdput, f2);
		if (!f1.file)
			return -EBADF;
		f2 = fdget(fd2);
		if (!f2.file)
			return -EBADf;
		return do_something(f1,f2);
	}

The RCU thing can then either be manually done like:

struct rcu_guard;

typedef struct rcu_guard *guard_rcu_t;
static const guard_rcu_null = NULL;
static inline guard_rcu_cleanup(struct rcu_guard **ptr)
{ if (*ptr) rcu_read_unlock(); }
static inline struct rcu_guard *guard_rcu_init(void)
{ rcu_read_lock(); return (void*)1; }

(or because we'll need this pattern a few more times, with yet another
DEFINE_foo_GUARD helper)

and:

	anon_guard(rcu);

works.

And at this point the previous scope() things are just one helper macro
away:

#define scope(_guard, _var..) \
	for (guard##_guard##_t *done = NULL, scope = guard##_guard##_init(var); \
	     !done; done++)

to be used where appropriate etc..


Yes, it's a wee bit more involved, but I'm thinking it gives a fair
amount of flexibility and we don't need to ret rid of
-Wdeclaration-after-statement.

Hmm?
  
Peter Zijlstra May 30, 2023, 9:26 a.m. UTC | #8
On Sat, May 27, 2023 at 12:18:04PM -0700, Linus Torvalds wrote:

> And before you say "unlock order doesn't matter" - that's actually not
> true. Unlock order does matter when you have things like
> "spin_lock_irq()" where the irq-off region then protects other locks
> too.

So I had already verified that GCC keeps them ordered, I see the
subthread with Ian and the documentation update request -- so all good
there.

Also, lockdep would scream bloody murder if you get that wrong, so I
can certainly add something to the selftests that would detect this.
  
Peter Zijlstra May 30, 2023, 9:34 a.m. UTC | #9
On Tue, May 30, 2023 at 11:23:42AM +0200, Peter Zijlstra wrote:

> Yes, it's a wee bit more involved, but I'm thinking it gives a fair
> amount of flexibility and we don't need to ret rid of
> -Wdeclaration-after-statement.

One other thing I forgot to point put; it allows things like:

	int store_fd(int fd)
	{
		ptr_guard(fdput, f) = fdget(fd);
		void *ret;
		if (!f.file)
			return -EBADF;
		ret = xa_store(&xarray, f.file->private, f);
		if (xa_is_err(ret))
			return xa_err(ret);
		f = null_ptr(fdput); // xarray now owns f
		return 0;
	}

Where we can assign null_ptr() to clear the guard and inhibit the
cleanup function to pass ownership around.
  
Valentin Schneider May 30, 2023, 1:58 p.m. UTC | #10
On 30/05/23 11:23, Peter Zijlstra wrote:
> On Sat, May 27, 2023 at 12:18:04PM -0700, Linus Torvalds wrote:
>
>
>> And it turns out that the above two trivial macros are actually quite
>> useful in themselves. You want to do an auto-cleanup version of
>> 'struct fd'? It's trivial:
>>
>>     /* Trivial "getfd()" wrapper */
>>     static inline void release_fd(struct fd *fd)
>>     { fdput(*fd); }
>>
>>     #define auto_getfd(name, n) \
>>         auto_release_name(struct fd, name, fdget(n), release_fd)
>>
>
>>  - I think the above is simpler and objectively better in every way
>> from the explicitly scoped thing
>
> Well, I think having that as a option would still be very nice.
>

IMO the explicit scoping can help with readability. It gives a clear visual
indication of where critical sections are, and you can break it up with a
scope + guards as in migrate_swap_stop() to stay at sane indentation
levels (with Python context managers, this would all be one scope).

I'd argue that for these, the scope/indentation is beneficial and not just
a byproduct. Even for longer functions like try_to_wake_up(), this works
out alright.

This obviously falls apart when dealing with too many guards
(e.g. copy_process()) or if the resulting indentation is nuts, but I concur
that keeping the explicit scope as an option would be nice.
  
Peter Zijlstra June 6, 2023, 9:42 a.m. UTC | #11
On Tue, May 30, 2023 at 11:23:42AM +0200, Peter Zijlstra wrote:

> Yes, it's a wee bit more involved, but I'm thinking it gives a fair
> amount of flexibility and we don't need to ret rid of
> -Wdeclaration-after-statement.

So I made all that work and .. Yes, you're absolutely right.

Busting -Wdeclaration-after-statement is the right thing to do for
guards.

So then I came up with:

#define __ptr_guard(_guard, _name)                                     \
       guard_##_guard##_t _name __cleanup(guard_##_guard##_cleanup)

#define ptr_guard(_guard, _name)                                       \
       __diag(push)                                                    \
       __diag(ignored "-Wdeclaration-after-statement")                 \
       __ptr_guard(_guard, _name)                                      \
       __diag(pop)

#define guard_init(_guard, _var...)                                    \
       guard_##_guard##_init(_var)

#define named_guard(_guard, _name, _var...)                            \
       ptr_guard(_guard, _name) = guard_init(_guard, _var)

#define guard(_guard, _var...)                                         \
       named_guard(_guard, __UNIQUE_ID(guard), _var)

#define scoped_guard(_guard, _var...)                                  \
       for (__ptr_guard(_guard, scope) = guard_init(_guard, _var),     \
            *done = NULL; !done; done = (void *)1)


And that all (mostly) works on clang, but not GCC :-( GCC refuses to
accept _Pragma() inside an expression.

So I now have that ptr_guard() with push/pop for clang but without for
GCC, which means that only clang has a fighting chance to report
-Wdeclaration-after-statement warns until such a time as that we can get
GCC 'fixed'.

  https://godbolt.org/z/5MPeq5W6K


FWIW: the work-in-progress patches I have are here:

  https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=core/guards
  
Linus Torvalds June 6, 2023, 1:17 p.m. UTC | #12
On Tue, Jun 6, 2023 at 2:43 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> ( GCC refuses to accept _Pragma() inside an expression.

If we really want this all, I think we'd just stop using
-Wdeclaration-after-statement entirely.

There are other uses for it, and people have asked for mixing
declarations and code before.

I think that particular straightjacket has been a good thing, but I
also think that it's ok to just let it go as a hard rule, and just try
to make it a coding style issue for the common case, but allow mixed
declarations and code when it makes sense.

For the whole "automatic release case it definitely makes sense, but
it's not like it isn't possible useful elsewhere. I just don't want
for it to become some global pattern for everything.

That said, I still don't understand why you lke the name "guard" for
this.  I understand not liking "auto", but "guard" doesn't seem any
better. In fact, much worse. Guarded expressions means something
completely different both in real life and in computer science.

I'm assuming there's some history there, but it makes no sense to me
as a name here.

              Linus
  
Peter Zijlstra June 6, 2023, 1:40 p.m. UTC | #13
On Tue, Jun 06, 2023 at 06:17:33AM -0700, Linus Torvalds wrote:

> That said, I still don't understand why you lke the name "guard" for
> this.  I understand not liking "auto", but "guard" doesn't seem any
> better. In fact, much worse. Guarded expressions means something
> completely different both in real life and in computer science.
> 
> I'm assuming there's some history there, but it makes no sense to me
> as a name here.

I know the name from C++ where it is std::lock_guard<> (well, back when
I still did C++ it wasn't std, but whatever), and Rust seems to have
std::sync::MutexGuard<>.

But if that's the sole objection left, lets have a bike-shed party and
pick a colour :-)

'shield' 'sentry' 'sentinel' 'keeper' 'custodian' 'warden' ?
  
Linus Torvalds June 6, 2023, 2:50 p.m. UTC | #14
On Tue, Jun 6, 2023 at 6:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> I know the name from C++ where it is std::lock_guard<> (well, back when
> I still did C++ it wasn't std, but whatever), and Rust seems to have
> std::sync::MutexGuard<>.
>
> But if that's the sole objection left, lets have a bike-shed party and
> pick a colour :-)
>
> 'shield' 'sentry' 'sentinel' 'keeper' 'custodian' 'warden' ?

I feel like you seem entirely too fixated on locking.

That's not even _remotely_ the most interesting case, I think.

For the common locking patterns , maybe "guard" makes sense as part of
the name, but it damn well shouldn't be about any "ptr_guard" or
whatever.

I think it should be some trivial

        mutex_scope(mymytex) {
                ..
        }

and yes, in that case scoping makes tons of sense and might even be
required, because you do want to visually see the scope of the
locking.

I'm not seeing the point of "guard" in that name either, but I don't
think the above is the complex case for naming, for use, or for
syntax.

In contrast, I think reference counting and allocations are the much
more interesting one, and the case where you are also more likely to
have multiple consecutive ones, and where it's likely less about "this
region is protected" and much more about the RAII patterns and
resources that you just want cleanup for.

So that's why I had that "auto_release" name - to make it clear that
it's about the *cleanup*, not about some "guarded region". See my
argument?

The locking cases can probably be handled with just a couple of macros
(mutex, rcu, maybe spinlock?) and I would violently argue that the
locking cases will have to have unconditional cleanup (ie if you
*ever* have the situation where some path is supposed to be able to
return with the lock held and no cleanup, then you should simply not
use the "mutex_scope()" kind of helper AT ALL).

So locking is, I feel, almost uninteresting. There are only a few
cases, and only trivial patterns for it, because anything non-trivial
had better be done very explicitly, and very much visibly by hand.

But look at the cases where we currently use "goto exit" kind of code,
or have duplicated cleanups because we do "cleanup(x); return -ERRNO"
kinds of patterns.

Some of them are locking, yes. But a lot of them are more generic "do
this before returning" kinds of things: freeing (possibly complex)
data structures, uncharging statistcs, or even things like writing
results back to user space.

And, unlike locking, those often (but not always) end up having the
"don't do this in the single success case" situation, so that you need
to have some way to show "ok, don't release it after all". Yes, that
ends up in practice likely being setting that special pointer to NULL,
but I think we need to have a much better syntax for it to show that
"now we're *not* doing the auto-release".

So it's that more complex case that I

 (a) don't think "guard" makes any sense for at all

 (b) think it's where the bigger payoffs are

 (c) think that generally are not "nested" (you release the resources
you've allocated, but you don't "nest" the allocations - they are just
serial.

 (d) think that the syntax could be pretty nasty, because the cleanup
is not just a trivial fixed "unlock" function

Just as an example of something that has a bit of everything, look at
kernel/cpu.c and the _cpu_down() function in particular, which has
that "goto out" to do all the cleanup.

That looks like a perfect case for having some nice "associate the
cleanup with the initialization" RAII patterns, but while it does have
one lock (cpus_write_lock/unlock()), even that one is abstracted away
so that it may be a no-op, and might be a percpu rwlock, and so having
to create a special macro for that would be more work than is worth
it.

Because what I *don't* want to happen is that people create some
one-time-use "helper" macro infrastructure to use this all. Again,
that cpus_write_lock/unlock is a good example of this.

End result: to avoid having crazy indentation, and crazy one-time
helper inline functions or whatever, I think you'd really want some
fairly generic model for

  "I am doing X, and I want to to Y as a cleanup when you exit the function"

where both X and Y can be *fairly* complicated things. They might be
as simple (and reasonably common) as "fd = fdget(f)" and "fdput(fd)",
but what about the one-offs like that cpus_write_lock/unlock pattern?

That one only happens in two places (_cpu_down() and _cpu_up()), and
maybe the answer is "we can't reasonably do it for that, because the
complexity is not worth it".

But maybe we *can*.

For example, this is likely the only realistic *good* case for the
horrible "nested function" thing that gcc supports. In my "look, we
could clean up a 'struct fd' example from last week, I made it do
something like this:

    /* Trivial "getfd()" wrapper */
    static inline void release_fd(struct fd *fd)
    { fdput(*fd); }

    #define auto_getfd(name, n) \
        auto_release_name(struct fd, name, fdget(n), release_fd)

but it would possibly be much more generic, and much more useful, if
that "release_fd()" function was generated by the macro as a local
nested inline function.

End result: you could use any arbitrary local cleanup code.

So you could have something like

  #define RAII(type, var, init, exit) \
        __RAII(type, var, init, exit, __UNIQUE_ID(fn)

  #define __RAII(type, var, init, exit, exitname) \
        void exitname(type *p) { exit } \
        type var __attribute__((__cleanup__(exitname))) = (init)

and do all of the above with

    RAII(struct fd, fd, fdget(f), fdput(fd));

because that macro would literally expand to create a (uniquely named)
nested function that then contains that "fdput(fd)".

I dunno. The syntax looks pretty bad, and I'm not even convinced clang
supports nested functions, so the above may not be an option.

But wouldn't it be nice to just be able to declare that arbitrary
cleanup in-place?

Then the lock cases would really just be trivial helper wrappers. And
you can use "guard" there if you want, but I feel it makes absolutely
*no* sense in the generic case. There is absolutely nothng that is
being "guarded" by having an automatic cleanup of some random
variable.

              Linus
  
Kees Cook June 6, 2023, 3:31 p.m. UTC | #15
On Tue, Jun 06, 2023 at 11:42:51AM +0200, Peter Zijlstra wrote:
> [...]
> #define scoped_guard(_guard, _var...)                                  \
>        for (__ptr_guard(_guard, scope) = guard_init(_guard, _var),     \
>             *done = NULL; !done; done = (void *)1)

nit: Linus's example was "(void *)8" (instead of 1) because we've had
issues in the past with alignment warnings on archs that are sensitive
to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL
comparisons.)
  
Linus Torvalds June 6, 2023, 3:45 p.m. UTC | #16
On Tue, Jun 6, 2023 at 8:31 AM Kees Cook <keescook@chromium.org> wrote:
>
> nit: Linus's example was "(void *)8" (instead of 1) because we've had
> issues in the past with alignment warnings on archs that are sensitive
> to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL
> comparisons.)

Note that I don't think we ever saw such a warning, it was just a
theoretical observation that depending on type, the compiler might
warn about known mis-aligned pointer bits.

So I'm not sure the 1-vs-8 actually matters. We do other things that
assume that low bits in a pointer are retained and valid, even if in
theory the C type system might have issues with it.

But maybe I mis-remember - if you did get an actual warning, maybe we
should document that warning just to keep the memory alive.

            Linus
  
Kees Cook June 6, 2023, 4:06 p.m. UTC | #17
On Tue, Jun 06, 2023 at 07:50:47AM -0700, Linus Torvalds wrote:
> So you could have something like
> 
>   #define RAII(type, var, init, exit) \
>         __RAII(type, var, init, exit, __UNIQUE_ID(fn)
> 
>   #define __RAII(type, var, init, exit, exitname) \
>         void exitname(type *p) { exit } \
>         type var __attribute__((__cleanup__(exitname))) = (init)
> 
> and do all of the above with
> 
>     RAII(struct fd, fd, fdget(f), fdput(fd));

"fdput(fd)" needs to be "fdput(*p)", since otherwise "fdput(fd)" is
referencing "fd" before it has been declared.

But regardless, yes, Clang is angry about the nested function. Also,
while my toy[1] example doesn't show it, GCC may also generate code
that requires an executable stack for some instances (or at least it
did historically) that need trampolines.

[1] https://godbolt.org/z/WTjx6Gs7x

Also, more nits on naming: isn't this more accurately called Scope-based
Resource Management (SBRM) not RAII? (RAII is technically object lifetime,
and SBRM is scope entry/exit.)
  
Kees Cook June 6, 2023, 4:08 p.m. UTC | #18
On Tue, Jun 06, 2023 at 08:45:49AM -0700, Linus Torvalds wrote:
> On Tue, Jun 6, 2023 at 8:31 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > nit: Linus's example was "(void *)8" (instead of 1) because we've had
> > issues in the past with alignment warnings on archs that are sensitive
> > to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL
> > comparisons.)
> 
> Note that I don't think we ever saw such a warning, it was just a
> theoretical observation that depending on type, the compiler might
> warn about known mis-aligned pointer bits.
> 
> So I'm not sure the 1-vs-8 actually matters. We do other things that
> assume that low bits in a pointer are retained and valid, even if in
> theory the C type system might have issues with it.
> 
> But maybe I mis-remember - if you did get an actual warning, maybe we
> should document that warning just to keep the memory alive.

I've never seen a warning, but since this came up in the dissection of
the __is_constexpr() behavior, it's been burned into my mind. ;)
  
Peter Zijlstra June 6, 2023, 6:08 p.m. UTC | #19
On Tue, Jun 06, 2023 at 07:50:47AM -0700, Linus Torvalds wrote:
> I feel like you seem entirely too fixated on locking.

It's what I started out with... but it's certainly not everything.

The thing I have removes pretty much all the error gotos from
sched/core.c and events/core.c and while locking is ofcourse a fair
amount of that there's significant non-locking usage.

>  (c) think that generally are not "nested" (you release the resources
> you've allocated, but you don't "nest" the allocations - they are just
> serial.
> 
>  (d) think that the syntax could be pretty nasty, because the cleanup
> is not just a trivial fixed "unlock" function

So I'm not sure you've seen the actual convertions I've done, but yes,
there's lots of those.

I've just used the guard naming because locks is what I started out
with.

+	ptr_guard(kfree, alloc) = kzalloc(event->read_size, GFP_KERNEL);
+	if (!alloc)
 		return -ENOMEM;


 	event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
 				 NULL, NULL, cgroup_fd);
+	if (IS_ERR(event))
+		return PTR_ERR(event);
+
+	ptr_guard(free_event, event_guard) = event;


+	ptr_guard(put_task, p) = find_get_task(pid);
+	if (!p)
+		return -ESRCH;


So it does all that... you just hate the naming -- surely we can fix
that.


Would it all be less offensive if I did: s/guard/cleanup/ on the whole
thing? Then we'd have things like:


DEFINE_PTR_CLEANUP(put_task, struct task_struct *,
		   if (_C) put_task_struct(_C))

	ptr_cleanup(put_task, p) = find_get_task(pid);
	if (!p)
		return -ESRCH;


DEFINE_PTR_CLEANUP(free_event, struct perf_event *,
		   if (!IS_ERR_OR_NULL(_C)) free_event(_C))

	ptr_cleanup(free_event, event) = perf_event_alloc(...);
	if (IS_ERR(event))
		return PTR_ERR(event);


DEFINE_PTR_CLEANUP(kfree, void *, kfree(_C))

	ptr_cleanup(kfree, foo) = kzalloc(...);
	if (!foo)
		return -ENOMEM;


But do we then continue with:

DEFINE_CLEANUP(mutex, struct mutex *, mutex_lock(_C), mutex_unlock(_C))

	cleanup(mutex, &my_mutex);


	scoped_cleanup (mutex, &my_mutex) {
		...
	}

etc..? or do we keep guard() there, but based internally on
ptr_cleanup() with the cleanup_## naming.
  
Linus Torvalds June 6, 2023, 11:22 p.m. UTC | #20
On Tue, Jun 6, 2023 at 11:08 AM Peter Zijlstra <peterz@infradead.org> wrote:>
> Would it all be less offensive if I did: s/guard/cleanup/ on the whole
> thing?

It's more than "guard" for me.

What is "ptr"? Why? We already know of at least one case where it's
not a pointer at all, ie 'struct fd'.

So I *really* hate the naming. Absolutely none of it makes sense to
me. One part is a nonsensical name apparently based on a special-case
operation, and the other part is a nonsensical type from just one
random - if common - implementation issue.

What you want to do is to have a way to define and name a
"constructor/desctructor" pair for an arbitrary type - *not*
necessarily a pointer - and then optionally a way to say "Oh, don't do
the destructor, because I'm actually going to use it long-term".

I said "cleanup", but that's not right either, since we always have to
have that initializer too.

Maybe just bite the bullet, and call the damn thing a "class", and
have some syntax like

     DEFINE_CLASS(name, type, exit, init, initarg...);

to create the infrastructure for some named 'class'. So you'd have

    DEFINE_CLASS(mutex, struct mutex *,
        mutex_unlock(*_P),
        ({mutex_lock(mutex); mutex;}), struct mutex *mutex)

to define the mutex "class", and do

    DEFINE_CLASS(fd, struct fd,
        fdput(*_P),
        fdget(f), int f)

for the 'struct fd' thing.

The above basically would just create the wrapper functions (and
typedefs) for the constructor and destructor.

So the 'DEFINE_CLASS(mutex ..)' thing would basically just expand to

    typedef struct mutex *class_mutex_type;
    static inline void class_mutex_destructor(class_mutex_type *_C)
    { mutex_unlock(*_C); }
    static inline class_mutex_type class_mutex_constructor(struct mutex *mutex)
    { return ({mutex_lock(mutex); mutex;}); }

Then to _instantiate_ one of those, you'd do

    INSTANTIATE_CLASS(name, var)

which would expand to

    class_name_type var
        __attribute__((__cleanup__(class_name_destructor))) =
class_name_constructor

and the magic of that syntax is that you'd actually use that
"INSTANTIATE_CLASS()" with the argument to the init function
afterwards, so you'd actually do

    INSTANTIATE_CLASS(mutex, n)(&sched_domains_mutex);

to create a variable 'n' of class 'mutex', where the
class_mutex_constructor gets the pointer to 'sched_domain_mutex' as
the argument.

And at THAT point, you can do this:

    #define mutex_guard(m) \
        INSTANTIATE_CLASS(mutex, __UNIQUE_ID(mutex))(m)

and now you can do

       mutex_guard(&sched_domains_mutex);

to basically start a guarded section where you hold 'sched_domains_mutex'.

And in that *very* least situation, 'guard' makes sense in the name.
But no earlier. And there is absolutely no 'ptr' anywhere here.

The above should work also for the 'struct fd' case, so you can do

       INSTANTIATE(fd, f)(fd);

to create a 'struct fd' named 'f' that is initialized with
'fdget(fd)', and will DTRT when going out of scope. Modulo any stupid
errors I did.

And ok, I didn't write out the exact macro magic to *get* those
expansions above (I just tried to write out the approximate end
result), but it should be mostly fairly straightforward.

So it would be the usual token pasting tricks to get the 'class type',
the 'class destructor' and the 'class constructor' functions.

Finally, note that the "initarg" part is a macro vararg thing. The
initargs can be empty (for things like RCU), but it could also
possibly be multiple arguments (like a "size and gfp_flags" for an
allocation?).

I'm sure there's something horribly wrong in the above, but my point
is that I'd really like this to make naming and conceptual sense.

And if somebody goes "The above is basically exactly what the original
C++ compiler did back when it was really just a pre-processor for C",
then you'd be 100% right. The above is basically (a part of) what
Bjarne did, except he did it as a separate pass.

And to answer the next question: why not just give up and do C++ -
it's because of all the *other* things Bjarne did.

             Linus
  
Peter Zijlstra June 7, 2023, 9:41 a.m. UTC | #21
On Tue, Jun 06, 2023 at 04:22:26PM -0700, Linus Torvalds wrote:
> On Tue, Jun 6, 2023 at 11:08 AM Peter Zijlstra <peterz@infradead.org> wrote:>
> > Would it all be less offensive if I did: s/guard/cleanup/ on the whole
> > thing?
> 
> It's more than "guard" for me.
> 
> What is "ptr"? Why? We already know of at least one case where it's
> not a pointer at all, ie 'struct fd'.

(so in my view struct fd is nothing more than a fat pointer)

> So I *really* hate the naming. Absolutely none of it makes sense to
> me. One part is a nonsensical name apparently based on a special-case
> operation, and the other part is a nonsensical type from just one
> random - if common - implementation issue.
> 
> What you want to do is to have a way to define and name a
> "constructor/desctructor" pair for an arbitrary type - *not*
> necessarily a pointer - and then optionally a way to say "Oh, don't do
> the destructor, because I'm actually going to use it long-term".

Yes, so when it's a 'pointer', that part becomes assigning it NULL (or
fdnull in the struct fd case). For example:

DEFINE_PTR_CLEANUP(kfree, void *, kfree(_C))

	ptr_cleanup(kfree, mem) = kzalloc(....);
	if (!mem)
		return -ENOMEM;

	object = mem;

	// build object with more failure cases

	mem = NULL;          // object is a success, we keep it.
	return object;

> I said "cleanup", but that's not right either, since we always have to
> have that initializer too.

I've found that for most things the initializer part isn't actually that
important. Consider that struct fd thing again; perf has a helper:

static inline struct fd perf_fget_light(int fd)
{
	struct fd f = fdget(fd);
	if (!f.file)
		return fdnull;

	if (f.file->f_op != &perf_fops) {
		fdput(f);
		return fdnull;
	}
	return f;
}

So now we have both fdget() and perf_fget_light() to obtain a struct fd,
both need fdput().

The pointer with destructor semantics works for both:

DEFINE_PTR_CLEANUP(fdput, struct fd, fdput(_C))

	ptr_cleanup(fdput, f) = perf_fget_light(fd);

or, somewhere else:

	ptr_cleanup(fdput, f) = fdget(fd);


The same is true for kfree(), we have a giant pile of allocation
functions that all are freed with kfree(): kmalloc(), kzalloc(),
kmalloc_node(), kzalloc_node(), krealloc(), kmalloc_array(),
krealloc_array(), kcalloc(), etc..

> Maybe just bite the bullet, and call the damn thing a "class", and
> have some syntax like
> 
>      DEFINE_CLASS(name, type, exit, init, initarg...);
> 
> to create the infrastructure for some named 'class'. So you'd have
> 
>     DEFINE_CLASS(mutex, struct mutex *,
>         mutex_unlock(*_P),
>         ({mutex_lock(mutex); mutex;}), struct mutex *mutex)
> 
> to define the mutex "class", and do
> 
>     DEFINE_CLASS(fd, struct fd,
>         fdput(*_P),
>         fdget(f), int f)
> 
> for the 'struct fd' thing.

Right; that is very close to what I have. And certainly useful --
although as per the above, perhaps not so for the struct fd case.

> Then to _instantiate_ one of those, you'd do
> 
>     INSTANTIATE_CLASS(name, var)
> 
> which would expand to
> 
>     class_name_type var
>         __attribute__((__cleanup__(class_name_destructor))) =
> class_name_constructor
> 
> and the magic of that syntax is that you'd actually use that
> "INSTANTIATE_CLASS()" with the argument to the init function
> afterwards, so you'd actually do
> 
>     INSTANTIATE_CLASS(mutex, n)(&sched_domains_mutex);
> 
> to create a variable 'n' of class 'mutex', where the
> class_mutex_constructor gets the pointer to 'sched_domain_mutex' as
> the argument.

Yes, I had actually considered this syntax, and I really like it. The
only reason I hadn't done that is because the for-loop thing, there I
couldn't make it work.

> I'm sure there's something horribly wrong in the above, but my point
> is that I'd really like this to make naming and conceptual sense.

Right, I hear ya. So the asymmetric case (iow destructor only) could be
seen as using the copy-constructor.

#define DEFINE_CLASS(name, type, exit, init, init_args...)		\
typedef type class_##name##_t;						\
static inline void class_##name##_destructor(type *this)		\
{ type THIS = *this; exit; }						\
static inline type class_##name##_constructor(init_args)		\
{ type THIS = init; return THIS; }

#define __INSTANTIATE_VAR(name, var)					\
	class_##name##_t var __cleanup(class_##name##_destructor)

#define INSTANTIATE_CLASS(name, var)					\
	__INSTANTIATE_VAR(name, var) = class_##name##_constructor


DEFINE_CLASS(fd, struct fd, fdput(THIS), f, struct fd f)

	INSTANTIATE_CLASS(fd, f)(perf_fget_light(fd));


Alternatively, you be OK with exposing INSTANTIATE_VAR() to easily
circumvent the default constructor?

And/or how about EXTEND_CLASS(), something like so?

#define EXTEND_CLASS(name, ext, init, init_args...)			\
typedef class_##name##_t class_##name##ext##_t;				\
static inline void class_##name##ext##_destructor(class_##name##_t *this) \
{ class_##name##_destructor(this); }					\
static inline type class_##name##ext##_constructor(init_args)		\
{ type THIS = init; return THIS; }


DEFINE_CLASS(fd, struct fd, fdput(THIS), fdget(fd), int fd)
EXTEND_CLASS(fd, _perf, perf_fget_light(fd), int fd)

	INSTANTIATE_CLASS(fd_perf, f)(fd);


> And at THAT point, you can do this:
> 
>     #define mutex_guard(m) \
>         INSTANTIATE_CLASS(mutex, __UNIQUE_ID(mutex))(m)
> 
> and now you can do
> 
>        mutex_guard(&sched_domains_mutex);

So the 'problem' is the amount of different guards I ended up having and
you can't have macro's define more macros :/

Which is how I ended up with the:

	guard(mutex, &sched_domains_mutex);

syntax.

This can ofcourse be achieved using the above CLASS thing like:

DEFINE_CLASS(mutex, struct mutex *, mutex_unlock(THIS),
	     ({ mutex_lock(m); m; }), struct mutex *m)

#define named_guard(name, var, args...)					\
	INSTANTIATE_CLASS(name, var)(args)

#define guard(name, args...)						\
	named_guard(name, __UNIQUE_ID(guard), args)

#define scoped_guard(name, args...)					\
	for (named_guard(name, scope, args),				\
	     *done = NULL; !done; done = (void *)1)


With the understanding they're only to be used for locks.

Also, I'm already tired of writing INSTANTIATE.. would:

	CLASS(fd, f)(fd);

	VAR(kfree, mem) = kzalloc_node(...);

be acceptable shorthand?
  
Peter Zijlstra June 8, 2023, 8:52 a.m. UTC | #22
On Wed, Jun 07, 2023 at 11:41:01AM +0200, Peter Zijlstra wrote:


> > I'm sure there's something horribly wrong in the above, but my point
> > is that I'd really like this to make naming and conceptual sense.
> 
> Right, I hear ya. So the asymmetric case (iow destructor only) could be
> seen as using the copy-constructor.
> 
> #define DEFINE_CLASS(name, type, exit, init, init_args...)		\
> typedef type class_##name##_t;						\
> static inline void class_##name##_destructor(type *this)		\
> { type THIS = *this; exit; }						\
> static inline type class_##name##_constructor(init_args)		\
> { type THIS = init; return THIS; }
> 
> #define __INSTANTIATE_VAR(name, var)					\
> 	class_##name##_t var __cleanup(class_##name##_destructor)
> 
> #define INSTANTIATE_CLASS(name, var)					\
> 	__INSTANTIATE_VAR(name, var) = class_##name##_constructor
> 
> 
> DEFINE_CLASS(fd, struct fd, fdput(THIS), f, struct fd f)
> 
> 	INSTANTIATE_CLASS(fd, f)(perf_fget_light(fd));
> 
> 
> Alternatively, you be OK with exposing INSTANTIATE_VAR() to easily
> circumvent the default constructor?

Or perhaps use the smart-pointer concept applied to our classes like:

#define smart_ptr(name, var) \
	__INSTANTIATE_VAR(name, var)

To mean a pointer that calls the destructor for class 'name'. I think
the nearest thing C++ has is std::unique_ptr<>.


Then we can write:


DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p)


	smart_ptr(kfree, mem) = kzalloc_node(...);
	if (!mem)
		return -ENOMEM;

	object = mem;

	// further initiatlize object with error cases etc..

	mem = NULL; // success, we keep it.
	return object;
  
Greg KH June 8, 2023, 9:04 a.m. UTC | #23
On Thu, Jun 08, 2023 at 10:52:48AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 07, 2023 at 11:41:01AM +0200, Peter Zijlstra wrote:
> 
> 
> > > I'm sure there's something horribly wrong in the above, but my point
> > > is that I'd really like this to make naming and conceptual sense.
> > 
> > Right, I hear ya. So the asymmetric case (iow destructor only) could be
> > seen as using the copy-constructor.
> > 
> > #define DEFINE_CLASS(name, type, exit, init, init_args...)		\
> > typedef type class_##name##_t;						\
> > static inline void class_##name##_destructor(type *this)		\
> > { type THIS = *this; exit; }						\
> > static inline type class_##name##_constructor(init_args)		\
> > { type THIS = init; return THIS; }
> > 
> > #define __INSTANTIATE_VAR(name, var)					\
> > 	class_##name##_t var __cleanup(class_##name##_destructor)
> > 
> > #define INSTANTIATE_CLASS(name, var)					\
> > 	__INSTANTIATE_VAR(name, var) = class_##name##_constructor
> > 
> > 
> > DEFINE_CLASS(fd, struct fd, fdput(THIS), f, struct fd f)
> > 
> > 	INSTANTIATE_CLASS(fd, f)(perf_fget_light(fd));
> > 
> > 
> > Alternatively, you be OK with exposing INSTANTIATE_VAR() to easily
> > circumvent the default constructor?
> 
> Or perhaps use the smart-pointer concept applied to our classes like:
> 
> #define smart_ptr(name, var) \
> 	__INSTANTIATE_VAR(name, var)
> 
> To mean a pointer that calls the destructor for class 'name'. I think
> the nearest thing C++ has is std::unique_ptr<>.
> 
> 
> Then we can write:
> 
> 
> DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p)
> 
> 
> 	smart_ptr(kfree, mem) = kzalloc_node(...);
> 	if (!mem)
> 		return -ENOMEM;
> 
> 	object = mem;
> 
> 	// further initiatlize object with error cases etc..
> 
> 	mem = NULL; // success, we keep it.
> 	return object;

I like the idea, as we need a way to say "don't clean this up, it was
passed to somewhere else" for these types of allocations, but have it
"automatically" cleaned up on the error paths.

I have no say in the naming, though I always disliked the idea of a
pointer being "smart" as they are just a dumb memory register :)

thanks,

greg k-h
  
Linus Torvalds June 8, 2023, 3:45 p.m. UTC | #24
On Thu, Jun 8, 2023 at 1:53 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Or perhaps use the smart-pointer concept applied to our classes like:
>
> #define smart_ptr(name, var) \
>         __INSTANTIATE_VAR(name, var)

So this is the only situation where I think "ptr" makes sense (your
"fat pointer" argument is nonsensical - sure, you can treat anything
as a pointer if you're brave enough, but that just means that
"pointer" becomes a meaningless word).

However, I think that for "smart pointers", we don't need any of this
complexity at all, and we don't need that ugly syntax.

> Then we can write:
>
> DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p)
>
>         smart_ptr(kfree, mem) = kzalloc_node(...);
>         if (!mem)
>                 return -ENOMEM;

No, the above is broken, and would result in us using "void *" in
situations where we really *really* don't want that.

For automatic freeing, you want something that can handle different
types properly, and without having to constantly declare the types
somewhere else before use.

And no, you do *not* want that "kfree(THIS)" kind of interface,
because you want the compiler to inline the freeing function wrapper,
and notice _statically_ when somebody zeroed the variable and not even
call "kfree()", because otherwise you'd have a pointless call to
kfree(NULL) in the success path too.

So for convenient automatic pointer freeing, you want an interface
much more akin to

        struct whatever *ptr __automatic_kfree = kmalloc(...);

which is much more legible, doesn't have any type mis-use issues, and
is also just trivially dealt with by a

  static inline void automatic_kfree_wrapper(void *pp)
  { void *p = *(void **)pp; if (p) kfree(p); }
  #define __automatic_kfree \
        __attribute__((__cleanup__(automatic_kfree_wrapper)))
  #define no_free_ptr(p) \
        ({ __auto_type __ptr = (p); (p) = NULL; __ptr; })

which I just tested generates the sane code even for the "set the ptr
to NULL and return success" case.

The above allows you to trivially do things like

        struct whatever *p __automatic_kfree = kmalloc(..);

        if (!do_something(p))
                return -ENOENT;

        return no_free_ptr(p);

and it JustWorks(tm).

And yes, it needs a couple of different versions of that
"__automatic_kfree()" wrapper for the different freeing cases (kvfree,
rcu_free, whatever), but those are all trivial one-liners.

And no, I didn't think too much about those names. "__automatic_kfree"
is too damn long to type, but you hated "auto". And "no_free_ptr()" is
not wonderful name either. But I tried to make the naming at least be
obvious, if not wonderful.

               Linus
  
David Laight June 8, 2023, 4:25 p.m. UTC | #25
From: Linus Torvalds
> Sent: 06 June 2023 16:46
> 
> On Tue, Jun 6, 2023 at 8:31 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > nit: Linus's example was "(void *)8" (instead of 1) because we've had
> > issues in the past with alignment warnings on archs that are sensitive
> > to it. (e.g. see the __is_constexpr() macro which is doing NULL/!NULL
> > comparisons.)

__is_constexpr() is playing entirely different games.
Basically the type of (x ? (void *)y : (int *)z)
depends on whether 'y' is a compile-time 0 (int *) or not (void *).

...
> So I'm not sure the 1-vs-8 actually matters. We do other things that
> assume that low bits in a pointer are retained and valid, even if in
> theory the C type system might have issues with it.

Yes, given that gcc will assume a pointer is aligned if you try
to memcpy() from it, I'm surprised it doesn't always assume that.
In which case (long)ptr_to_int & 3 can be validly assumed to be zero.

I've found some 'day job' code that passed the address of a
member of a 'packed' structure to a function which then used
host ordered unsigned char[] accesses.
The compiler is certainly allowed to convert that back to
a word write - which would then fault.
(I've not looked to see if any modern compilers do.)
Of course the simple ptr->member would have be fine.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
  
Kees Cook June 8, 2023, 4:47 p.m. UTC | #26
On Thu, Jun 08, 2023 at 08:45:53AM -0700, Linus Torvalds wrote:
> So for convenient automatic pointer freeing, you want an interface
> much more akin to
> 
>         struct whatever *ptr __automatic_kfree = kmalloc(...);
> 
> which is much more legible, doesn't have any type mis-use issues, and
> is also just trivially dealt with by a
> 
>   static inline void automatic_kfree_wrapper(void *pp)
>   { void *p = *(void **)pp; if (p) kfree(p); }
>   #define __automatic_kfree \
>         __attribute__((__cleanup__(automatic_kfree_wrapper)))
>   #define no_free_ptr(p) \
>         ({ __auto_type __ptr = (p); (p) = NULL; __ptr; })
> 
> which I just tested generates the sane code even for the "set the ptr
> to NULL and return success" case.
> 
> The above allows you to trivially do things like
> 
>         struct whatever *p __automatic_kfree = kmalloc(..);
> 
>         if (!do_something(p))
>                 return -ENOENT;
> 
>         return no_free_ptr(p);

I am a little worried about how (any version so far of) this API could go
wrong, e.g. if someone uses this and does "return p" instead of "return
no_free_ptr(p)", it'll return a freed pointer. I was hoping we could do
something like this to the end of automatic_kfree_wrapper():

	*(void **)pp = NULL;

i.e. if no_free_ptr() goes missing, "return p" will return NULL, which
is much easier to track down that dealing with later use-after-free bugs,
etc. Unfortunately, the __cleanup ordering is _after_ the compiler stores
the return value...

static inline void cleanup_info(struct info **p)
{
	free(*p);
	*p = NULL; /* this is effectively ignored */
}

struct info *do_something(int f)
{
	struct info *var __attribute__((__cleanup__(cleanup_info))) =
		malloc(1024);

	process(var);

	return var; /* oops, forgot to disable cleanup */
}

compile down to:

do_something:
        pushq   %rbx
        movl    $1024, %edi
        call    malloc
        movq    %rax, %rbx
        movq    %rax, %rdi
        call    process
        movq    %rbx, %rdi
        call    free
        movq    %rbx, %rax	; uses saved copy of malloc return
        popq    %rbx
        ret

The point being, if we can proactively make this hard to shoot ourselves in
the foot, that would be nice. :)
  
Linus Torvalds June 8, 2023, 4:59 p.m. UTC | #27
On Thu, Jun 8, 2023 at 9:47 AM Kees Cook <keescook@chromium.org> wrote:
>
> I am a little worried about how (any version so far of) this API could go
> wrong, e.g. if someone uses this and does "return p" instead of "return
> no_free_ptr(p)",

Absolutely. I think the whole "don't always cleanup" is quite
dangerous, and maybe not worth it, but it _is_ a fairly common
pattern.

>     I was hoping we could do
> something like this to the end of automatic_kfree_wrapper():
>
>        *(void **)pp = NULL;

That would be a lovely thing, but as you found out, it fundamentally
cannot work.

The whole point of "cleanup" is that it is done when the variable -
not trh *value* - goes out of scope.

So when you have that

        return var; /* oops, forgot to disable cleanup */

situation, by definition "var" hasn't gone out of scope until _after_
you read the value and return that value, so pretty much by definition
it cannot make a difference to assign something to 'pp' in the cleanup
code.

> The point being, if we can proactively make this hard to shoot ourselves in
> the foot, that would be nice. :)

So the good thing is that things like KASAN would immediately notice,
and since this all happens (presumably) for the success case, it's not
about some unlikely error case.

I also think that -fanalyzer might just catch it (as part of
-Wanalyzer-use-after-free) at compile-time, but I didn't check. So
if/when people start using -fanalyze on the kernel, those things would
be caught early.

So while this "forget to avoid cleanup" is a worry, I think it ends up
likely being one that is fairly easy to avoid with other checks in
place.

Famous last words.

             Linus
  
Nick Desaulniers June 8, 2023, 5:20 p.m. UTC | #28
On Thu, Jun 8, 2023 at 9:47 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Jun 08, 2023 at 08:45:53AM -0700, Linus Torvalds wrote:
> > So for convenient automatic pointer freeing, you want an interface
> > much more akin to
> >
> >         struct whatever *ptr __automatic_kfree = kmalloc(...);
> >
> > which is much more legible, doesn't have any type mis-use issues, and
> > is also just trivially dealt with by a
> >
> >   static inline void automatic_kfree_wrapper(void *pp)
> >   { void *p = *(void **)pp; if (p) kfree(p); }
> >   #define __automatic_kfree \
> >         __attribute__((__cleanup__(automatic_kfree_wrapper)))
> >   #define no_free_ptr(p) \
> >         ({ __auto_type __ptr = (p); (p) = NULL; __ptr; })
> >
> > which I just tested generates the sane code even for the "set the ptr
> > to NULL and return success" case.
> >
> > The above allows you to trivially do things like
> >
> >         struct whatever *p __automatic_kfree = kmalloc(..);
> >
> >         if (!do_something(p))
> >                 return -ENOENT;
> >
> >         return no_free_ptr(p);
>
> I am a little worried about how (any version so far of) this API could go
> wrong, e.g. if someone uses this and does "return p" instead of "return
> no_free_ptr(p)", it'll return a freed pointer.

Presumably, one could simply just not use RAII(/SBRM someone else
corrected me about this recently coincidentally; I taught them my fun
C++ acronym IDGAF) when working with a value that conditionally
"escapes" the local scope.
  
Peter Zijlstra June 8, 2023, 6:51 p.m. UTC | #29
On Thu, Jun 08, 2023 at 10:20:19AM -0700, Nick Desaulniers wrote:
> Presumably, one could simply just not use RAII(/SBRM someone else
> corrected me about this recently coincidentally; I taught them my fun
> C++ acronym IDGAF) when working with a value that conditionally
> "escapes" the local scope.

But then you're back to the error goto :/
  
Peter Zijlstra June 8, 2023, 8:06 p.m. UTC | #30
On Thu, Jun 08, 2023 at 08:45:53AM -0700, Linus Torvalds wrote:

> > DEFINE_CLASS(kfree, void *, kfree(THIS), p, void *p)
> >
> >         smart_ptr(kfree, mem) = kzalloc_node(...);
> >         if (!mem)
> >                 return -ENOMEM;
> 
> No, the above is broken, and would result in us using "void *" in
> situations where we really *really* don't want that.
> 
> For automatic freeing, you want something that can handle different
> types properly, and without having to constantly declare the types
> somewhere else before use.

Ah, I see what you did with the no_free_ptr(), that avoids having to
have two pointers around, nice!

> So for convenient automatic pointer freeing, you want an interface
> much more akin to
> 
>         struct whatever *ptr __automatic_kfree = kmalloc(...);
> 
> which is much more legible, doesn't have any type mis-use issues, and
> is also just trivially dealt with by a
> 
>   static inline void automatic_kfree_wrapper(void *pp)
>   { void *p = *(void **)pp; if (p) kfree(p); }
>   #define __automatic_kfree \
>         __attribute__((__cleanup__(automatic_kfree_wrapper)))
>   #define no_free_ptr(p) \
>         ({ __auto_type __ptr = (p); (p) = NULL; __ptr; })
> 
> which I just tested generates the sane code even for the "set the ptr
> to NULL and return success" case.
> 
> The above allows you to trivially do things like
> 
>         struct whatever *p __automatic_kfree = kmalloc(..);
> 
>         if (!do_something(p))
>                 return -ENOENT;
> 
>         return no_free_ptr(p);
> 
> and it JustWorks(tm).

OK, something like so then?


#define DEFINE_FREE(name, type, free) \
	static inline __free_##name(type *p) { type _P = *p; free; }

#define __free(name)	__cleanup(__free_##name)

#define no_free_ptr(p) \
	({ __auto_type __ptr = (p); (p) = NULL; __ptr; })


DEFINE_FREE(kfree, void *, if (_P) kfree(_P));

	struct obj *p __free(kfree) = kmalloc(...);

	if (!do_something(p))
		return -ENOENT;

	return no_free_ptr(p);




DEFINE_CLASS(find_get_context, struct perf_event_context *,
	     if (!IS_ERR_OR_NULL(_C)) put_ctx(_C),
	     find_get_context(task, event), struct task_struct *task, struct perf_event *event)

DEFINE_FREE(free_event, struct perf_event *,
	    if (!IS_ERR_OR_NULL(_P)) free_event(_P));


	struct perf_event *event __free(free_event) = perf_event_alloc(...)
	if (IS_ERR(event))
		return event;

	class(find_get_context, ctx)(task, event);
	if (IS_ERR(ctx))
		return (void*)ctx;

	if (!task && !container_of(ctx, struct perf_cpu_context, ctx)->online)
		return -ENODEV;

	...

	event->ctx = get_ctx(ctx);

	return no_free_ptr(event);



works for me, I'll go make it happen.
  
Nick Desaulniers June 8, 2023, 8:14 p.m. UTC | #31
On Thu, Jun 8, 2023 at 11:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jun 08, 2023 at 10:20:19AM -0700, Nick Desaulniers wrote:
> > Presumably, one could simply just not use RAII when working with a value that conditionally
> > "escapes" the local scope.
>
> But then you're back to the error goto :/

Thinking more about the expected ergonomics here over lunch...no
meaningful insights, just thoughts...

For something like a mutex/lock; I'd expect those to be acquired then
released within the same function, yeah? In which case
__attribute__((cleanup())) has fewer hazards since the resource
doesn't escape.

For a pointer to a dynamically allocated region that may get returned
to the caller...

I mean, people do this in C++. It is safe and canonical to return a
std::unique_ptr.  When created locally the destructor does the
expected thing regardless of control flow.  IIUC, std::unique_ptr's
move constructor basically does what Kees suggested earlier (trigger
warning: C++): https://github.com/llvm/llvm-project/blob/7a52f79126a59717012d8039ef875f68e3c637fd/libcxx/include/__memory/unique_ptr.h#L429-L430.
example: https://godbolt.org/z/51s49G9f1

A recent commit to clang https://reviews.llvm.org/rG301eb6b68f30
raised an interesting point (deficiency is perhaps too strong a word)
about GNU-style attributes; they generally have no meaning on an ABI.

Consider a function that returns a locally constructed
std::unique_ptr.  If the function returns a type where the caller
knows what destructor functions to run.  This is part of the ABI.

Here, we're talking about using __attribute__((cleanup())) to DTR
locally, but then we return a "raw" pointer to a caller. What cleanup
function should the caller run, implicitly, if at all?  If we use
__attribute__((cleanup())) that saves us a few gotos locally, but the
caller perhaps now needs the same treatment.
  
Linus Torvalds June 9, 2023, 2:25 a.m. UTC | #32
On Thu, Jun 8, 2023 at 1:06 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
>         struct obj *p __free(kfree) = kmalloc(...);

Yeah, the above actually looks really good to me - I like the naming
here, and the use looks very logical to me.

Of course, maybe once I see the patches that use this I go "uhh", but
at least for now I think you've hit on a rather legible syntax.

I'm still a bit unsure of the "no_free_ptr(p)" naming, but at least
it's pretty clear about what it does.

So my only worry is that it's not pretty and to the point like your
"__free(kfree)" syntax.

But it feels workable and not misleading, so unless somebody can come
up with a better name, I think it's ok.

           Linus
  
Peter Zijlstra June 9, 2023, 8:14 a.m. UTC | #33
On Thu, Jun 08, 2023 at 07:25:27PM -0700, Linus Torvalds wrote:
> On Thu, Jun 8, 2023 at 1:06 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> >         struct obj *p __free(kfree) = kmalloc(...);
> 
> Yeah, the above actually looks really good to me - I like the naming
> here, and the use looks very logical to me.
> 
> Of course, maybe once I see the patches that use this I go "uhh", but
> at least for now I think you've hit on a rather legible syntax.
> 
> I'm still a bit unsure of the "no_free_ptr(p)" naming, but at least
> it's pretty clear about what it does.
> 
> So my only worry is that it's not pretty and to the point like your
> "__free(kfree)" syntax.
> 
> But it feels workable and not misleading, so unless somebody can come
> up with a better name, I think it's ok.

#define return_ptr(p) \
	return no_free_ptr(p)


	struct obj *p __free(kfree) = kmalloc(...);
	if (!p)
		return ERR_PTR(-ENOMEM);

	...

	return_ptr(p);

?
  
Rasmus Villemoes June 9, 2023, 8:27 a.m. UTC | #34
On 08/06/2023 17.45, Linus Torvalds wrote:

> And no, I didn't think too much about those names. "__automatic_kfree"
> is too damn long to type, but you hated "auto". And "no_free_ptr()" is
> not wonderful name either. But I tried to make the naming at least be
> obvious, if not wonderful.

No opinion either way, just throwing a data point out there:

So the systemd codebase makes quite heavy use of this automatic cleanup
stuff. Their name for no_free_ptr is TAKE_PTR(), and the verb "take"
apparently comes from Rust:

//// src/fundamental/macro-fundamental.h

/* Takes inspiration from Rust's Option::take() method: reads and
returns a pointer, but at the same time
 * resets it to NULL. See:
https://doc.rust-lang.org/std/option/enum.Option.html#method.take */
#define TAKE_GENERIC(var, type, nullvalue)                       \
        ({                                                       \
                type *_pvar_ = &(var);                           \
                type _var_ = *_pvar_;                            \
                type _nullvalue_ = nullvalue;                    \
                *_pvar_ = _nullvalue_;                           \
                _var_;                                           \
        })
#define TAKE_PTR_TYPE(ptr, type) TAKE_GENERIC(ptr, type, NULL)
#define TAKE_PTR(ptr) TAKE_PTR_TYPE(ptr, typeof(ptr))
#define TAKE_STRUCT_TYPE(s, type) TAKE_GENERIC(s, type, {})
#define TAKE_STRUCT(s) TAKE_STRUCT_TYPE(s, typeof(s))


and then they also have a

/* Like TAKE_PTR() but for file descriptors, resetting them to -EBADF */
#define TAKE_FD(fd) TAKE_GENERIC(fd, int, -EBADF)

with their "auto-close fd" helper of course knowing to ignore any
negative value.

Rasmus
  
Paolo Bonzini June 9, 2023, 10:20 a.m. UTC | #35
On 6/8/23 22:14, Nick Desaulniers wrote:
> Here, we're talking about using __attribute__((cleanup())) to DTR
> locally, but then we return a "raw" pointer to a caller. What cleanup
> function should the caller run, implicitly, if at all?  If we use
> __attribute__((cleanup())) that saves us a few gotos locally, but the
> caller perhaps now needs the same treatment.

But this is only a problem when you return a void*; and in general in C 
you will return a struct more often than a raw pointer (and in C++ you 
also have the issue of delete vs. delete[], that does not exist in C).

Returning a struct doesn't protect against use-after-free bugs in the 
way std::unique_ptr<> or Rust lifetimes do, but it at least tries to 
protect against calling the wrong cleanup function if you provide a 
typed "destructor" function that does the right thing---for example by 
handling reference counting or by freeing sub-structs before calling 
kfree/vfree.

Of course it's not a silver bullet, but then that's why people are 
looking into Rust for Linux.

Paolo
  
Kees Cook June 9, 2023, 9:18 p.m. UTC | #36
On June 8, 2023 7:25:27 PM PDT, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>On Thu, Jun 8, 2023 at 1:06 PM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>>         struct obj *p __free(kfree) = kmalloc(...);
>
>Yeah, the above actually looks really good to me - I like the naming
>here, and the use looks very logical to me.
>
>Of course, maybe once I see the patches that use this I go "uhh", but
>at least for now I think you've hit on a rather legible syntax.
>
>I'm still a bit unsure of the "no_free_ptr(p)" naming, but at least
>it's pretty clear about what it does.
>
>So my only worry is that it's not pretty and to the point like your
>"__free(kfree)" syntax.
>
>But it feels workable and not misleading, so unless somebody can come
>up with a better name, I think it's ok.

I like the proposed "take" naming, and before reading that reply I was going to suggest "keep". *shrug*