[0/3] get_maintainer: add patch-only keyword matching

Message ID 20230927-get_maintainer_add_d-v1-0-28c207229e72@google.com

Message

Justin Stitt Sept. 27, 2023, 3:19 a.m. UTC
This series aims to add "D:" which behaves exactly the same as "K:" but
works only on patch files.

The goal of this is to reduce noise when folks use get_maintainer on
tree files as opposed to patches. This use case should be steered away
from [1] but "D:" should help maintainers reduce noise in their inboxes
regardless, especially when matching omnipresent keywords like [2]. In
the event of [2] Kees would be to/cc'd from folks running get_maintainer
on _any_ file containing "__counted_by". The number of these files is
rising and I fear for his inbox as his goal, as I understand it, is to
simply monitor the introduction of new __counted_by annotations to
ensure accurate semantics.

See [3/3] for an illustrative example.

This series also includes a formatting pass over get_maintainer because
I personally found it difficult to parse with the human eye.

[1]: https://lore.kernel.org/all/20230726151515.1650519-1-kuba@kernel.org/
[2]: https://lore.kernel.org/all/20230925172037.work.853-kees@kernel.org/

Signed-off-by: Justin Stitt <justinstitt@google.com>
---
Justin Stitt (3):
      MAINTAINERS: add documentation for D:
      get_maintainer: run perltidy
      get_maintainer: add patch-only pattern matching type

 MAINTAINERS               |    3 +
 scripts/get_maintainer.pl | 3334 +++++++++++++++++++++++----------------------
 2 files changed, 1718 insertions(+), 1619 deletions(-)
---
base-commit: 6465e260f48790807eef06b583b38ca9789b6072
change-id: 20230926-get_maintainer_add_d-07424a814e72

Best regards,
--
Justin Stitt <justinstitt@google.com>
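
To make the proposed distinction concrete, here is a minimal Python sketch of the matching rule described in the cover letter. It is not the Perl in get_maintainer.pl; the `ENTRY` structure, the `matching_types` helper and the `is_patch` flag are hypothetical names chosen for illustration only.

```python
import re

# Hypothetical MAINTAINERS-style entry reduced to its content patterns.
# "K" is the existing keyword type; "D" is the patch-only type proposed
# in this series.
ENTRY = {
    "name": "EXAMPLE SECTION",
    "patterns": [("K", r"\bpr_info\b"), ("D", r"\b__counted_by\b")],
}


def matching_types(entry, text, is_patch):
    """Return the pattern types of `entry` that fire for `text`.

    K: patterns are tried on any input. D: patterns are tried only when
    the input is a patch (is_patch=True), as the cover letter proposes.
    """
    hits = []
    for kind, regex in entry["patterns"]:
        if kind == "D" and not is_patch:
            continue  # D: never matches the contents of a tree file
        if re.search(regex, text):
            hits.append(kind)
    return hits


TEXT = 'int data[] __counted_by(len); pr_info("x");'

print(matching_types(ENTRY, TEXT, is_patch=True))   # ['K', 'D']
print(matching_types(ENTRY, TEXT, is_patch=False))  # ['K']
```

With the same text, the D: pattern contributes a recipient only when the input is a patch, while the K: pattern fires in both cases.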
  

Comments

Joe Perches Sept. 27, 2023, 3:26 a.m. UTC | #1
On Wed, 2023-09-27 at 03:19 +0000, Justin Stitt wrote:
> I'm a first time contributor to get_maintainer.pl and the formatting is
> suspicious. I am not sure if there is a particular reason it is the way
> it is but I let my editor format it and submitted the diff here in this
> patch.

Capital NACK.  Completely unnecessary and adds no value.
  
Joe Perches Sept. 27, 2023, 3:27 a.m. UTC | #2
On Wed, 2023-09-27 at 03:19 +0000, Justin Stitt wrote:
> Document what "D:" does.
> 
> This is more or less the same as what "K:" does but only works for patch
> files.

Nack.  I'd rather just add a !$file test to K: patterns.
  
Greg KH Sept. 27, 2023, 6:14 a.m. UTC | #3
On Wed, Sep 27, 2023 at 03:19:16AM +0000, Justin Stitt wrote:
> Note that folks really shouldn't be using get_maintainer on tree files
> anyways [1].

That's not true, Linus and I use it on a daily basis this way, it's part
of our normal workflow, AND the workflow of the kernel security team.

So please don't take that valid use-case away from us.

thanks,

greg k-h
  
Justin Stitt Sept. 27, 2023, 6:46 a.m. UTC | #4
On Wed, Sep 27, 2023 at 3:14 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Sep 27, 2023 at 03:19:16AM +0000, Justin Stitt wrote:
> > Note that folks really shouldn't be using get_maintainer on tree files
> > anyways [1].
>
> That's not true, Linus and I use it on a daily basis this way, it's part
> of our normal workflow, AND the workflow of the kernel security team.
>
> So please don't take that valid use-case away from us.

Fair. I'm on the side of keeping the "K:" behavior the way it is and
that's why I'm proposing adding "D:" to provide a more granular
content matching type operating strictly on patches. It's purely
opt-in.

The patch I linked mentioned steering folks away from using
tree files but not necessarily removing the behavior.

>
> thanks,
>
> greg k-h

Thanks
Justin
  
Greg KH Sept. 27, 2023, 8:21 a.m. UTC | #5
On Wed, Sep 27, 2023 at 03:46:30PM +0900, Justin Stitt wrote:
> On Wed, Sep 27, 2023 at 3:14 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Sep 27, 2023 at 03:19:16AM +0000, Justin Stitt wrote:
> > > Note that folks really shouldn't be using get_maintainer on tree files
> > > anyways [1].
> >
> > That's not true, Linus and I use it on a daily basis this way, it's part
> > of our normal workflow, AND the workflow of the kernel security team.
> >
> > So please don't take that valid use-case away from us.
> 
> Fair. I'm on the side of keeping the "K:" behavior the way it is and
> that's why I'm proposing adding "D:" to provide a more granular
> content matching type operating strictly on patches. It's purely
> opt-in.
> 
> The patch I linked mentioned steering folks away from using
> tree files but not necessarily removing the behavior.

Please don't steer folks away from it, it is a valid use case of the
tool, and I would argue, one of the most important ones given how often
I use it that way.

Hence my objection to this verbiage in the changelog, it's not correct.

thanks,

greg k-h
  
Nick Desaulniers Sept. 27, 2023, 3:24 p.m. UTC | #6
On Tue, Sep 26, 2023 at 8:19 PM Justin Stitt <justinstitt@google.com> wrote:
>
> This series aims to add "D:" which behaves exactly the same as "K:" but
> works only on patch files.
>
> The goal of this is to reduce noise when folks use get_maintainer on
> tree files as opposed to patches. This use case should be steered away
> from [1] but "D:" should help maintainers reduce noise in their inboxes
> regardless, especially when matching omnipresent keywords like [2]. In
> the event of [2] Kees would be to/cc'd from folks running get_maintainer
> on _any_ file containing "__counted_by". The number of these files is
> rising and I fear for his inbox as his goal, as I understand it, is to
> simply monitor the introduction of new __counted_by annotations to
> ensure accurate semantics.

Something like this (whether this series or a different approach)
would be helpful to me as well; we use K: to get cc'ed on patches
mentioning clang or llvm, but our ML also then ends up getting cc'ed
on every follow up patch to most files.

This is causing excessive posts on our ML. As a result, it's a
struggle to get folks to cc themselves to the ML, which puts the code
review burden on fewer people.

Whether it's a new D: or refinement to the behavior of K:, I applaud
the effort.  Hopefully we can find an approach that works for
everyone.

And may God have mercy on your soul for having to touch that much perl. :-P

>
> See [3/3] for an illustrative example.
>
> This series also includes a formatting pass over get_maintainer because
> I personally found it difficult to parse with the human eye.
>
> [1]: https://lore.kernel.org/all/20230726151515.1650519-1-kuba@kernel.org/
> [2]: https://lore.kernel.org/all/20230925172037.work.853-kees@kernel.org/
>
> Signed-off-by: Justin Stitt <justinstitt@google.com>
> ---
> Justin Stitt (3):
>       MAINTAINERS: add documentation for D:
>       get_maintainer: run perltidy
>       get_maintainer: add patch-only pattern matching type
>
>  MAINTAINERS               |    3 +
>  scripts/get_maintainer.pl | 3334 +++++++++++++++++++++++----------------------
>  2 files changed, 1718 insertions(+), 1619 deletions(-)
> ---
> base-commit: 6465e260f48790807eef06b583b38ca9789b6072
> change-id: 20230926-get_maintainer_add_d-07424a814e72
>
> Best regards,
> --
> Justin Stitt <justinstitt@google.com>
>
  
Kees Cook Sept. 27, 2023, 4:01 p.m. UTC | #7
On Wed, Sep 27, 2023 at 08:24:58AM -0700, Nick Desaulniers wrote:
> On Tue, Sep 26, 2023 at 8:19 PM Justin Stitt <justinstitt@google.com> wrote:
> >
> > This series aims to add "D:" which behaves exactly the same as "K:" but
> > works only on patch files.
> >
> > The goal of this is to reduce noise when folks use get_maintainer on
> > tree files as opposed to patches. This use case should be steered away
> > from [1] but "D:" should help maintainers reduce noise in their inboxes
> > regardless, especially when matching omnipresent keywords like [2]. In
> > the event of [2] Kees would be to/cc'd from folks running get_maintainer
> > on _any_ file containing "__counted_by". The number of these files is
> > rising and I fear for his inbox as his goal, as I understand it, is to
> > simply monitor the introduction of new __counted_by annotations to
> > ensure accurate semantics.
> 
> Something like this (whether this series or a different approach)
> would be helpful to me as well; we use K: to get cc'ed on patches
> mentioning clang or llvm, but our ML also then ends up getting cc'ed
> on every follow up patch to most files.
> 
> This is causing excessive posts on our ML. As a result, it's a
> struggle to get folks to cc themselves to the ML, which puts the code
> review burden on fewer people.
> 
> Whether it's a new D: or refinement to the behavior of K:, I applaud
> the effort.  Hopefully we can find an approach that works for
> everyone.

Yes, please! I would use this immediately -- there are a bunch of places
where pstore, strings, hardening, etc all want review if certain
functions or structures are changed in a patch, but we're not
maintainers of the files they appear in.

> > Justin Stitt (3):
> >       MAINTAINERS: add documentation for D:
> >       get_maintainer: add patch-only pattern matching type

Can we squash these two changes together, and then likely add some
patches for moving things out of K: ?
  
Kees Cook Sept. 27, 2023, 4:06 p.m. UTC | #8
On Wed, Sep 27, 2023 at 03:19:14AM +0000, Justin Stitt wrote:
> Document what "D:" does.
> 
> This is more or less the same as what "K:" does but only works for patch
> files.
> 
> See [3/3] for more info and an illustrative example.
> ---
>  MAINTAINERS | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b19995690904..de68d2c0cf29 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -59,6 +59,9 @@ Descriptions of section entries and preferred order
>  	      matches patches or files that contain one or more of the words
>  	      printk, pr_info or pr_err
>  	   One regex pattern per line.  Multiple K: lines acceptable.
> +  D: *Content regex* (perl extended) pattern match patches only.
> +     Usage same as K:.
> +

The "emphasis" tags here are used when rendering:
https://docs.kernel.org/process/maintainers.html

In this case, I assume "D" is inspired by "Diff", so perhaps reword this
to get a proper emphasis hint, and add additional context:

  D: *Diff content regex* (perl extended) pattern match that applies
     only to patches and not entire files (e.g. when using the
     get_maintainer.pl script).
  
Joe Perches Sept. 27, 2023, 7 p.m. UTC | #9
On Wed, 2023-09-27 at 03:19 +0000, Justin Stitt wrote:
> Add the "D:" type which behaves the same as "K:" but will only match
> content present in a patch file.

Likely it'd be less aggravating just to document
that K: is only for patches and add a !$file test.
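
The alternative suggested here keeps a single K: type and skips the keyword lookup whenever the input is a tree file. A rough Python sketch of that idea follows, assuming that `$file` in the script is true when a file rather than a patch was given; `keyword_matches` and `from_file` are hypothetical names, not identifiers from get_maintainer.pl.

```python
import re


def keyword_matches(keyword_regexes, text, from_file):
    """Return the K: regexes that match `text`.

    When the input is a tree file (from_file=True) the keyword lookup is
    skipped entirely, mirroring the suggested "!$file" guard.
    """
    if from_file:
        return []
    return [rx for rx in keyword_regexes if re.search(rx, text)]


KEYWORDS = [r"\b(?i:clang|llvm)\b"]
SUBJECT = "Fix build with LLVM=1"

print(keyword_matches(KEYWORDS, SUBJECT, from_file=False))
print(keyword_matches(KEYWORDS, SUBJECT, from_file=True))
```

Under this approach no new entry type is needed, but every existing K: pattern changes behaviour at once instead of maintainers opting in one entry at a time.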
  
Joe Perches Sept. 27, 2023, 7:33 p.m. UTC | #10
On Wed, 2023-09-27 at 09:15 -0700, Kees Cook wrote:
> On Wed, Sep 27, 2023 at 03:19:16AM +0000, Justin Stitt wrote:
> > Add the "D:" type which behaves the same as "K:" but will only match
> > content present in a patch file.
> > 
> > To illustrate:
> > 
> > Imagine this entry in MAINTAINERS:
> > 
> > NEW REPUBLIC
> > M: Han Solo <hansolo@rebelalliance.co>
> > W: https://www.jointheresistance.org
> > D: \bstrncpy\b
> > 
> > Our maintainer, Han, will only be added to the recipients if a patch
> > file is passed to get_maintainer (like what b4 does):
> > $ ./scripts/get_maintainer.pl 0004-some-change.patch
> > 
> > If the above patch has a `strncpy` present in the subject, commit log or
> > diff then Han will be to/cc'd.
> > 
> > However, in the event of a file from the tree given like:
> > $ ./scripts/get_maintainer.pl ./lib/string.c
> > 
> > Han will not be noisily to/cc'd (like a K: type would in this
> > circumstance)
> > 
> > Note that folks really shouldn't be using get_maintainer on tree files
> > anyways [1].
> > 
> > [1]: https://lore.kernel.org/all/20230726151515.1650519-1-kuba@kernel.org/
> 
> As Greg suggested, please drop the above paragraph and link. Then this
> looks good to me.
> 
> I would immediately want to send this patch too, so please feel free to
> add this to your series (and I bet many other hints on "git grep 'K:.\\b'"
> would want to switch from K: to D: too):
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
[]
> @@ -5057,7 +5057,7 @@ F:	Documentation/kbuild/llvm.rst
>  F:	include/linux/compiler-clang.h
>  F:	scripts/Makefile.clang
>  F:	scripts/clang-tools/
> -K:	\b(?i:clang|llvm)\b
> +D:	\b(?i:clang|llvm)\b

etc...

My assumption is that the K: --file use is just unnecessary
and it'd be better to only use the K: lookup on patches.

(and I've somehow stuffed up the receiving side of my
 email configuration so please ignore any emails to me
 that bounce for a while)
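
The `git grep 'K:.\\b'` hint quoted above is a way to find K: entries built around word-boundary patterns, which are the likely candidates for patch-only matching. A small Python sketch of the same search over MAINTAINERS-style text is shown below; the sample entries and the `word_boundary_keywords` helper are made up for illustration, and the test for `\b` is only an approximation of the grep expression.

```python
import re

# Made-up sample in the MAINTAINERS layout: a section title followed by
# "X: value" lines, with sections separated by blank lines.
SAMPLE = "\n".join([
    "CLANG/LLVM BUILD SUPPORT",
    "F: scripts/Makefile.clang",
    r"K: \b(?i:clang|llvm)\b",
    "",
    "EXAMPLE DRIVER",
    "F: drivers/example/",
    "K: example_dev",
])


def word_boundary_keywords(maintainers_text):
    """Return (section, pattern) pairs for K: lines whose regex uses \\b."""
    found = []
    section = None
    for line in maintainers_text.splitlines():
        if not line.strip():
            section = None  # a blank line ends the current section
            continue
        entry = re.match(r"^([A-Z]):\s*(.*)$", line)
        if entry is None:
            section = line.strip()  # anything else is a section title
            continue
        if entry.group(1) == "K" and r"\b" in entry.group(2):
            found.append((section, entry.group(2)))
    return found


for section, pattern in word_boundary_keywords(SAMPLE):
    print(section, "->", pattern)
```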