Message ID | 20231004-get_maintainer_change_k-v1-1-ac7ced18306a@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:254a:b0:403:3b70:6f57 with SMTP id hf10csp409515vqb; Wed, 4 Oct 2023 14:21:42 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGAPZeTXMn7BLfiEKAEl4kH1muPnp/EjesoAlxVEp0GFX6mjdfNMCnQrzAAaBvYmH1aCOqa X-Received: by 2002:a17:903:32c1:b0:1c3:b0c7:38bf with SMTP id i1-20020a17090332c100b001c3b0c738bfmr985157plr.12.1696454502648; Wed, 04 Oct 2023 14:21:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696454502; cv=none; d=google.com; s=arc-20160816; b=LLaaMQmK4BWtsXhkkhHUunud1LIf/sI+RgCDRc5GGbTdpxvr7nmo9z0j+UhOxumn7t Q/erRaIJKL8Ida8YjEXQfFnVRGAQ6FbNlmT55TX7pxvUeBihw1p9xpVGU++KvikYRo3Y guAcPdeyxO1Qgl4qp0OIxZtl+zH81CLk0nWPi5/tzEuytwPUdlCK74WL8fjC3gfYAbA+ Epbk+bTJ2WeaZ4eeaYgepv6eke5+TUFTPUjcq83wAG0uOgzBl5SycwbsyEevOJZon6Pr kuwXX1NMhkkwNIMQH6c/dtVyEWKrFkVDF3liOnmuaRqFTByhjkfSfmNF/vhjcG0mK66i DXOg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:mime-version:date :dkim-signature; bh=uustZycB7kJeJ83TA5F/0nr7LHNmB7Mbe0hjj9D6IWQ=; fh=jBWag5Kq1ZAH8lWgncOuDan7vuMW7KyWk1NZRueaMRs=; b=1Itbz4eChKZlUp1lKT68q6t1rXJkIXXJItc3Kl5aP2ay99y9ONgs7LmNSxVwQJ5mjc kq42zM3hcVvhASygXDOekit28ISVRqtoR7yv0uIpXE/JXOyXHNIRP++Odk3OmJOPRZtP ZA9tj66u0FSZ/OYDEC/TayUFBtknAA5otw8xXGa5nYfP1CueKWtSlqMWW57E4oAEEaMv bB+W6+lx/D3jNY3aGqydD0LMFJBWyDcN6uiTsz+BQzSgs/pWRJM3r1SyBy78qeMZcGLR d4sdsRZpgJJqgDnRTEoTsO4Kt9eS59SoNiCQJ4c3QYmAOLB9Lz2bEP8mqlV2r3DGq777 NowQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=Ha54N8py; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from pete.vger.email (pete.vger.email. [2620:137:e000::3:6]) by mx.google.com with ESMTPS id ja1-20020a170902efc100b001c7615a8de5si25545plb.20.2023.10.04.14.21.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Oct 2023 14:21:42 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) client-ip=2620:137:e000::3:6; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=Ha54N8py; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 886B680401C7; Wed, 4 Oct 2023 14:21:39 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233594AbjJDVVX (ORCPT <rfc822;ezelljr.billy@gmail.com> + 19 others); Wed, 4 Oct 2023 17:21:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233800AbjJDVVU (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 4 Oct 2023 17:21:20 -0400 Received: from mail-oo1-xc49.google.com (mail-oo1-xc49.google.com [IPv6:2607:f8b0:4864:20::c49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D9B8E9E for <linux-kernel@vger.kernel.org>; Wed, 4 Oct 2023 14:21:16 -0700 (PDT) Received: by mail-oo1-xc49.google.com with SMTP id 006d021491bc7-57e43ab7b89so289561eaf.1 for <linux-kernel@vger.kernel.org>; Wed, 04 Oct 2023 14:21:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1696454476; x=1697059276; darn=vger.kernel.org; h=cc:to:from:subject:message-id:mime-version:date:from:to:cc:subject :date:message-id:reply-to; bh=uustZycB7kJeJ83TA5F/0nr7LHNmB7Mbe0hjj9D6IWQ=; b=Ha54N8pytjNC9tkGgZGjlSvCkiI4i6WxohaCDS3C1rDdFEdVSZSo4F6o9Fuj+ja/lQ 3GQRfasrw8cx2qpgfSO0BQTlmNgfAeL5veLd0vNOr4Z2x5wgCFROBjPDgk3ZMSx44LjB TxJ6dsJMbYv8Of6AbnNkFQ2Q+z5svXtXcv6q9m4AEKCNEHNfxZ0lgOtyi6d/Z+yhqYNQ smWKdgwDmBT0UUhb9sqseGgJlXQiUUBGXNdi1MJPHSBRABKIZ52f1kwp86OqscPzTWBB nBAmyQ6kp3HguWFnxP4uS5elCpA9ZqGpR2an3I1LEnUlaQfIWMLphbMXRvXxCRHOn6A8 SfqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696454476; x=1697059276; h=cc:to:from:subject:message-id:mime-version:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=uustZycB7kJeJ83TA5F/0nr7LHNmB7Mbe0hjj9D6IWQ=; b=sbnprJ2/K7RpqXoF9rzufTQejCe/2yhn3Ddvds58+RfiS1KcwSIyVp7+90dBwsC8AP EF1cucdEnOQANd9f37xvi8WY1LIUc89omMZ0aYk9ElP5/hDRZI7q7K79oHpMGyQB2nEe d/wwqPhCoipONOslTxI0LJ5Mu+6STnY59aU9wLeOYD3lPkA6JrG36T3fmIXfMBT9spqD /S1rQXSPr4xy+7c2XN2bskJUSXM5Qppb9DADd2ZvZDw2/6CGbWew1KD+YCJk4B0Af10D F6XDoC23dw9XFs4hmWCbar2fREij6az/I0GCn/ohM2lFehf3l20WNqadh0kCFfXKBTwf UD9Q== X-Gm-Message-State: AOJu0YwS/YxYf2r5c0OxlUuDkEoMQJ2NQcoqZfv5R8IqoyDwHl/BlevW IdrEd4q8EWrtnmPfXsoI5sTpqRuo+LwRXzvalw== X-Received: from jstitt-linux1.c.googlers.com ([fda3:e722:ac3:cc00:2b:ff92:c0a8:23b5]) (user=justinstitt job=sendgmr) by 2002:a4a:3786:0:b0:56c:86f2:ae14 with SMTP id r128-20020a4a3786000000b0056c86f2ae14mr1194962oor.0.1696454476216; Wed, 04 Oct 2023 14:21:16 -0700 (PDT) Date: Wed, 04 Oct 2023 21:21:15 +0000 Mime-Version: 1.0 X-B4-Tracking: v=1; b=H4sIAErXHWUC/6tWKk4tykwtVrJSqFYqSi3LLM7MzwNyDHUUlJIzE vPSU3UzU4B8JSMDI2NDAwMT3fTUkvjcxMy8EiBOLYqHKIrP1jUxSzQyMDVNNUo2tVQC6i4oSk3 LrACbHB1bWwsA8EjKJGkAAAA= X-Developer-Key: i=justinstitt@google.com; a=ed25519; pk=tC3hNkJQTpNX/gLKxTNQKDmiQl6QjBNCGKJINqAdJsE= X-Developer-Signature: v=1; a=ed25519-sha256; t=1696454475; l=3356; i=justinstitt@google.com; s=20230717; h=from:subject:message-id; bh=1ouqrRVKZGujiAVFhjDudh3UU/n1klQ/sSPG4zQOfLc=; b=UD+GQ6914b8AOXn9ZvlYU2isRdr9rJqkMSxpIASk0X2jcIMhDfhUbpSvTGdbMM4eAL9IJoqIY DCGiTzLeTBECDKA4a+u9c4v381L8zcnlapyBvdWHrh36B5ZJEAAHyZ0 X-Mailer: b4 0.12.3 Message-ID: <20231004-get_maintainer_change_k-v1-1-ac7ced18306a@google.com> Subject: [PATCH] get_maintainer/MAINTAINERS: confine K content matching to patches From: Justin Stitt <justinstitt@google.com> To: Joe Perches <joe@perches.com> Cc: linux-kernel@vger.kernel.org, Justin Stitt <justinstitt@google.com>, Kees Cook <keescook@chromium.org>, Nick Desaulniers <ndesaulniers@google.com> Content-Type: text/plain; charset="utf-8" X-Spam-Status: No, score=-4.8 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Wed, 04 Oct 2023 14:21:39 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1778861476325958306 X-GMAIL-MSGID: 1778861476325958306 |
Series |
get_maintainer/MAINTAINERS: confine K content matching to patches
|
|
Commit Message
Justin Stitt
Oct. 4, 2023, 9:21 p.m. UTC
The current behavior of K: is a tad bit noisy. It matches against the
entire contents of files instead of just against the contents of a
patch.
This means that a patch with a single character change (fixing a typo or
whitespace or something) would still to/cc maintainers and lists if the
affected file matched against the regex pattern given in K:. For
example, if a file has the word "clang" in it then every single patch
touching that file will to/cc Nick, Nathan and some lists.
Let's change this behavior to only content match against patches
(subjects, message, diff) as this is what most people expect the
behavior already is. Most users of "K:" would prefer patch-only content
matching. If this is not the case let's add a new matching type as
proposed in [1].
This patch involves 1) ripping out the file-based keyword matching from
get_maintainer.pl and 2) updating the MAINTAINERS documentation to
reflect patch-only semantics.
Signed-off-by: Justin Stitt <justinstitt@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/all/20230928-get_maintainer_add_d-v2-0-8acb3f394571@google.com/ [1]
---
MAINTAINERS | 8 ++++----
scripts/get_maintainer.pl | 13 -------------
2 files changed, 4 insertions(+), 17 deletions(-)
---
base-commit: cbf3a2cb156a2c911d8f38d8247814b4c07f49a2
change-id: 20231004-get_maintainer_change_k-46a2055e2c59
Best regards,
--
Justin Stitt <justinstitt@google.com>
Comments
On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > The current behavior of K: is a tad bit noisy. It matches against the > entire contents of files instead of just against the contents of a > patch. > > This means that a patch with a single character change (fixing a typo or > whitespace or something) would still to/cc maintainers and lists if the > affected file matched against the regex pattern given in K:. For > example, if a file has the word "clang" in it then every single patch > touching that file will to/cc Nick, Nathan and some lists. > > Let's change this behavior to only content match against patches > (subjects, message, diff) as this is what most people expect the > behavior already is. Most users of "K:" would prefer patch-only content > matching. If this is not the case let's add a new matching type as > proposed in [1]. I'm glad to know you are coming around to my suggestion. I believe the file-based keyword matching should _not_ be removed and the option should be added for it like I suggested. I also think it might be better to mark the "maintained" output differently as something like "keyword matched" instead. Something like: --- scripts/get_maintainer.pl | 38 ++++++++++++++++++++------------------ 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl index ab123b498fd9..befae75e61ab 100755 --- a/scripts/get_maintainer.pl +++ b/scripts/get_maintainer.pl @@ -57,6 +57,7 @@ my $subsystem = 0; my $status = 0; my $letters = ""; my $keywords = 1; +my $keywords_in_file = 0; my $sections = 0; my $email_file_emails = 0; my $from_filename = 0; @@ -272,6 +273,7 @@ if (!GetOptions( 'letters=s' => \$letters, 'pattern-depth=i' => \$pattern_depth, 'k|keywords!' => \$keywords, + 'kf|keywords-in-file!' => \$keywords_in_file, 'sections!' => \$sections, 'fe|file-emails!' => \$email_file_emails, 'f|file' => \$from_filename, @@ -318,6 +320,7 @@ if ($sections || $letters ne "") { $subsystem = 0; $web = 0; $keywords = 0; + $keywords_in_file = 0; $interactive = 0; } else { my $selections = $email + $scm + $status + $subsystem + $web; @@ -548,16 +551,14 @@ foreach my $file (@ARGV) { $file =~ s/^\Q${cur_path}\E//; #strip any absolute path $file =~ s/^\Q${lk_path}\E//; #or the path to the lk tree push(@files, $file); - if ($file ne "MAINTAINERS" && -f $file && $keywords) { + if ($file ne "MAINTAINERS" && -f $file && $keywords && $keywords_in_file) { open(my $f, '<', $file) or die "$P: Can't open $file: $!\n"; my $text = do { local($/) ; <$f> }; close($f); - if ($keywords) { - foreach my $line (keys %keyword_hash) { - if ($text =~ m/$keyword_hash{$line}/x) { - push(@keyword_tvi, $line); - } + foreach my $line (keys %keyword_hash) { + if ($text =~ m/$keyword_hash{$line}/x) { + push(@keyword_tvi, $line); } } } @@ -919,7 +920,7 @@ sub get_maintainers { } foreach my $line (sort {$hash{$b} <=> $hash{$a}} keys %hash) { - add_categories($line); + add_categories($line, ""); if ($sections) { my $i; my $start = find_starting_index($line); @@ -947,7 +948,7 @@ sub get_maintainers { if ($keywords) { @keyword_tvi = sort_and_uniq(@keyword_tvi); foreach my $line (@keyword_tvi) { - add_categories($line); + add_categories($line, ":Keyword"); } } @@ -1076,6 +1077,7 @@ Output type options: Other options: --pattern-depth => Number of pattern directory traversals (default: 0 (all)) --keywords => scan patch for keywords (default: $keywords) + --keywords-in-file => scan file for keywords (default: $keywords_in_file) --sections => print all of the subsystem sections with pattern matches --letters => print all matching 'letter' types from all matching sections --mailmap => use .mailmap file (default: $email_use_mailmap) @@ -1086,7 +1088,7 @@ Other options: Default options: [--email --tree --nogit --git-fallback --m --r --n --l --multiline - --pattern-depth=0 --remove-duplicates --rolestats] + --pattern-depth=0 --remove-duplicates --rolestats --keywords] Notes: Using "-f directory" may give unexpected results: @@ -1312,7 +1314,7 @@ sub get_list_role { } sub add_categories { - my ($index) = @_; + my ($index, $suffix) = @_; my $i; my $start = find_starting_index($index); @@ -1342,7 +1344,7 @@ sub add_categories { if (!$hash_list_to{lc($list_address)}) { $hash_list_to{lc($list_address)} = 1; push(@list_to, [$list_address, - "subscriber list${list_role}"]); + "subscriber list${list_role}" . $suffix]); } } } else { @@ -1352,12 +1354,12 @@ sub add_categories { if ($email_moderated_list) { $hash_list_to{lc($list_address)} = 1; push(@list_to, [$list_address, - "moderated list${list_role}"]); + "moderated list${list_role}" . $suffix]); } } else { $hash_list_to{lc($list_address)} = 1; push(@list_to, [$list_address, - "open list${list_role}"]); + "open list${list_role}" . $suffix]); } } } @@ -1365,19 +1367,19 @@ sub add_categories { } elsif ($ptype eq "M") { if ($email_maintainer) { my $role = get_maintainer_role($i); - push_email_addresses($pvalue, $role); + push_email_addresses($pvalue, $role . $suffix); } } elsif ($ptype eq "R") { if ($email_reviewer) { my $subsystem = get_subsystem_name($i); - push_email_addresses($pvalue, "reviewer:$subsystem"); + push_email_addresses($pvalue, "reviewer:$subsystem" . $suffix); } } elsif ($ptype eq "T") { - push(@scm, $pvalue); + push(@scm, $pvalue . $suffix); } elsif ($ptype eq "W") { - push(@web, $pvalue); + push(@web, $pvalue . $suffix); } elsif ($ptype eq "S") { - push(@status, $pvalue); + push(@status, $pvalue . $suffix); } } }
On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote: > > On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > > The current behavior of K: is a tad bit noisy. It matches against the > > entire contents of files instead of just against the contents of a > > patch. > > > > This means that a patch with a single character change (fixing a typo or > > whitespace or something) would still to/cc maintainers and lists if the > > affected file matched against the regex pattern given in K:. For > > example, if a file has the word "clang" in it then every single patch > > touching that file will to/cc Nick, Nathan and some lists. > > > > Let's change this behavior to only content match against patches > > (subjects, message, diff) as this is what most people expect the > > behavior already is. Most users of "K:" would prefer patch-only content > > matching. If this is not the case let's add a new matching type as > > proposed in [1]. > > I'm glad to know you are coming around to my suggestion. :) > > I believe the file-based keyword matching should _not_ be > removed and the option should be added for it like I suggested. Having a command line flag allowing get_maintainer.pl users to decide the behavior of K: is weird to me. If I'm a maintainer setting my K: in MAINTAINERS I want some sort of consistent behavior. Some patches will start hitting mailing list that DO have keywords in the patch and others, confusingly, not. I understand setting the default state of that flag is probably good enough as it is what 99.9% of users will use. What do you think about this issue in particular, though? To note, we get some speed-up here as pattern matching a patch that touches lots of files would result in searching all of them in their entirety. Just removing this behavior _might_ have a measurable speed-up for patch series touching dozens of files. > > I also think it might be better to mark the "maintained" output > differently as something like "keyword matched" instead. > > > Something like: > --- > scripts/get_maintainer.pl | 38 ++++++++++++++++++++------------------ > 1 file changed, 20 insertions(+), 18 deletions(-) > > diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl > index ab123b498fd9..befae75e61ab 100755 > --- a/scripts/get_maintainer.pl > +++ b/scripts/get_maintainer.pl > @@ -57,6 +57,7 @@ my $subsystem = 0; > my $status = 0; > my $letters = ""; > my $keywords = 1; > +my $keywords_in_file = 0; > my $sections = 0; > my $email_file_emails = 0; > my $from_filename = 0; > @@ -272,6 +273,7 @@ if (!GetOptions( > 'letters=s' => \$letters, > 'pattern-depth=i' => \$pattern_depth, > 'k|keywords!' => \$keywords, > + 'kf|keywords-in-file!' => \$keywords_in_file, > 'sections!' => \$sections, > 'fe|file-emails!' => \$email_file_emails, > 'f|file' => \$from_filename, > @@ -318,6 +320,7 @@ if ($sections || $letters ne "") { > $subsystem = 0; > $web = 0; > $keywords = 0; > + $keywords_in_file = 0; > $interactive = 0; > } else { > my $selections = $email + $scm + $status + $subsystem + $web; > @@ -548,16 +551,14 @@ foreach my $file (@ARGV) { > $file =~ s/^\Q${cur_path}\E//; #strip any absolute path > $file =~ s/^\Q${lk_path}\E//; #or the path to the lk tree > push(@files, $file); > - if ($file ne "MAINTAINERS" && -f $file && $keywords) { > + if ($file ne "MAINTAINERS" && -f $file && $keywords && $keywords_in_file) { > open(my $f, '<', $file) > or die "$P: Can't open $file: $!\n"; > my $text = do { local($/) ; <$f> }; > close($f); > - if ($keywords) { > - foreach my $line (keys %keyword_hash) { > - if ($text =~ m/$keyword_hash{$line}/x) { > - push(@keyword_tvi, $line); > - } > + foreach my $line (keys %keyword_hash) { > + if ($text =~ m/$keyword_hash{$line}/x) { > + push(@keyword_tvi, $line); > } > } > } > @@ -919,7 +920,7 @@ sub get_maintainers { > } > > foreach my $line (sort {$hash{$b} <=> $hash{$a}} keys %hash) { > - add_categories($line); > + add_categories($line, ""); > if ($sections) { > my $i; > my $start = find_starting_index($line); > @@ -947,7 +948,7 @@ sub get_maintainers { > if ($keywords) { > @keyword_tvi = sort_and_uniq(@keyword_tvi); > foreach my $line (@keyword_tvi) { > - add_categories($line); > + add_categories($line, ":Keyword"); > } > } > > @@ -1076,6 +1077,7 @@ Output type options: > Other options: > --pattern-depth => Number of pattern directory traversals (default: 0 (all)) > --keywords => scan patch for keywords (default: $keywords) > + --keywords-in-file => scan file for keywords (default: $keywords_in_file) > --sections => print all of the subsystem sections with pattern matches > --letters => print all matching 'letter' types from all matching sections > --mailmap => use .mailmap file (default: $email_use_mailmap) > @@ -1086,7 +1088,7 @@ Other options: > > Default options: > [--email --tree --nogit --git-fallback --m --r --n --l --multiline > - --pattern-depth=0 --remove-duplicates --rolestats] > + --pattern-depth=0 --remove-duplicates --rolestats --keywords] > > Notes: > Using "-f directory" may give unexpected results: > @@ -1312,7 +1314,7 @@ sub get_list_role { > } > > sub add_categories { > - my ($index) = @_; > + my ($index, $suffix) = @_; > > my $i; > my $start = find_starting_index($index); > @@ -1342,7 +1344,7 @@ sub add_categories { > if (!$hash_list_to{lc($list_address)}) { > $hash_list_to{lc($list_address)} = 1; > push(@list_to, [$list_address, > - "subscriber list${list_role}"]); > + "subscriber list${list_role}" . $suffix]); > } > } > } else { > @@ -1352,12 +1354,12 @@ sub add_categories { > if ($email_moderated_list) { > $hash_list_to{lc($list_address)} = 1; > push(@list_to, [$list_address, > - "moderated list${list_role}"]); > + "moderated list${list_role}" . $suffix]); > } > } else { > $hash_list_to{lc($list_address)} = 1; > push(@list_to, [$list_address, > - "open list${list_role}"]); > + "open list${list_role}" . $suffix]); > } > } > } > @@ -1365,19 +1367,19 @@ sub add_categories { > } elsif ($ptype eq "M") { > if ($email_maintainer) { > my $role = get_maintainer_role($i); > - push_email_addresses($pvalue, $role); > + push_email_addresses($pvalue, $role . $suffix); > } > } elsif ($ptype eq "R") { > if ($email_reviewer) { > my $subsystem = get_subsystem_name($i); > - push_email_addresses($pvalue, "reviewer:$subsystem"); > + push_email_addresses($pvalue, "reviewer:$subsystem" . $suffix); > } > } elsif ($ptype eq "T") { > - push(@scm, $pvalue); > + push(@scm, $pvalue . $suffix); > } elsif ($ptype eq "W") { > - push(@web, $pvalue); > + push(@web, $pvalue . $suffix); > } elsif ($ptype eq "S") { > - push(@status, $pvalue); > + push(@status, $pvalue . $suffix); > } > } > } >
On Thu, 2023-10-05 at 11:06 -0700, Justin Stitt wrote: > On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote: > > > > On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > > > The current behavior of K: is a tad bit noisy. It matches against the > > > entire contents of files instead of just against the contents of a > > > patch. > > > > > > This means that a patch with a single character change (fixing a typo or > > > whitespace or something) would still to/cc maintainers and lists if the > > > affected file matched against the regex pattern given in K:. For > > > example, if a file has the word "clang" in it then every single patch > > > touching that file will to/cc Nick, Nathan and some lists. > > > > > > Let's change this behavior to only content match against patches > > > (subjects, message, diff) as this is what most people expect the > > > behavior already is. Most users of "K:" would prefer patch-only content > > > matching. If this is not the case let's add a new matching type as > > > proposed in [1]. > > > > I'm glad to know you are coming around to my suggestion. > :) > > > > > I believe the file-based keyword matching should _not_ be > > removed and the option should be added for it like I suggested. > > Having a command line flag allowing get_maintainer.pl > users to decide the behavior of K: is weird to me. If I'm a maintainer setting > my K: in MAINTAINERS I want some sort of consistent behavior. Some > patches will start hitting mailing list that DO have keywords in the patch > and others, confusingly, not. Not true. If a patch contains a keyword match, get_maintainers will _always_ show the K: keyword maintainers unless --nokeywords is specified on the command line. If a file contains a keyword match, it'll only show the K: keyword if --keywords-in-file is set. > To note, we get some speed-up here as pattern matching a patch that > touches lots of files would result in searching all of them in their > entirety. Just removing this behavior _might_ have a measurable > speed-up for patch series touching dozens of files. Again, not true. Patches do _not_ scan the original modified files for keyword matches. Only the patch itself is scanned. That's the current behavior as well.
On Thu, Oct 5, 2023 at 11:15 AM Joe Perches <joe@perches.com> wrote: > > On Thu, 2023-10-05 at 11:06 -0700, Justin Stitt wrote: > > On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote: > > > > > > On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > > > > The current behavior of K: is a tad bit noisy. It matches against the > > > > entire contents of files instead of just against the contents of a > > > > patch. > > > > > > > > This means that a patch with a single character change (fixing a typo or > > > > whitespace or something) would still to/cc maintainers and lists if the > > > > affected file matched against the regex pattern given in K:. For > > > > example, if a file has the word "clang" in it then every single patch > > > > touching that file will to/cc Nick, Nathan and some lists. > > > > > > > > Let's change this behavior to only content match against patches > > > > (subjects, message, diff) as this is what most people expect the > > > > behavior already is. Most users of "K:" would prefer patch-only content > > > > matching. If this is not the case let's add a new matching type as > > > > proposed in [1]. > > > > > > I'm glad to know you are coming around to my suggestion. > > :) > > > > > > > > I believe the file-based keyword matching should _not_ be > > > removed and the option should be added for it like I suggested. > > > > Having a command line flag allowing get_maintainer.pl > > users to decide the behavior of K: is weird to me. If I'm a maintainer setting > > my K: in MAINTAINERS I want some sort of consistent behavior. Some > > patches will start hitting mailing list that DO have keywords in the patch > > and others, confusingly, not. > > Not true. > > If a patch contains a keyword match, get_maintainers will _always_ > show the K: keyword maintainers unless --nokeywords is specified > on the command line. ... > > If a file contains a keyword match, it'll only show the K: > keyword if --keywords-in-file is set. Right, what I'm saying is a patch can arrive in a maintainer's inbox wherein the patch itself has no mention of the keyword (if get_maintainer user opted for --keywords-in-file). Just trying to avoid some cases of the question: "Why is this in my inbox?" > > > To note, we get some speed-up here as pattern matching a patch that > > touches lots of files would result in searching all of them in their > > entirety. Just removing this behavior _might_ have a measurable > > speed-up for patch series touching dozens of files. > > Again, not true. > > Patches do _not_ scan the original modified files for keyword matches. > Only the patch itself is scanned. That's the current behavior as well. > Feel like I'm missing something here. How is K: matching keywords in files without reading them. If my patch touches 10 files then all 10 of those files are scanned for K: matches right?
On Thu, 2023-10-05 at 11:30 -0700, Justin Stitt wrote: > On Thu, Oct 5, 2023 at 11:15 AM Joe Perches <joe@perches.com> wrote: > > > > On Thu, 2023-10-05 at 11:06 -0700, Justin Stitt wrote: > > > On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote: > > > > > > > > On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > > > > > The current behavior of K: is a tad bit noisy. It matches against the > > > > > entire contents of files instead of just against the contents of a > > > > > patch. > > > > > > > > > > This means that a patch with a single character change (fixing a typo or > > > > > whitespace or something) would still to/cc maintainers and lists if the > > > > > affected file matched against the regex pattern given in K:. For > > > > > example, if a file has the word "clang" in it then every single patch > > > > > touching that file will to/cc Nick, Nathan and some lists. > > > > > > > > > > Let's change this behavior to only content match against patches > > > > > (subjects, message, diff) as this is what most people expect the > > > > > behavior already is. Most users of "K:" would prefer patch-only content > > > > > matching. If this is not the case let's add a new matching type as > > > > > proposed in [1]. > > > > > > > > I'm glad to know you are coming around to my suggestion. > > > :) > > > > > > > > > > > I believe the file-based keyword matching should _not_ be > > > > removed and the option should be added for it like I suggested. > > > > > > Having a command line flag allowing get_maintainer.pl > > > users to decide the behavior of K: is weird to me. If I'm a maintainer setting > > > my K: in MAINTAINERS I want some sort of consistent behavior. Some > > > patches will start hitting mailing list that DO have keywords in the patch > > > and others, confusingly, not. > > > > Not true. > > > > If a patch contains a keyword match, get_maintainers will _always_ > > show the K: keyword maintainers unless --nokeywords is specified > > on the command line. > > ... > > > > > If a file contains a keyword match, it'll only show the K: > > keyword if --keywords-in-file is set. > > Right, what I'm saying is a patch can arrive in a maintainer's inbox > wherein the patch itself has no mention of the keyword (if > get_maintainer user opted for --keywords-in-file). Just trying to > avoid some cases of the question: "Why is this in my inbox?" Because the script user specifically asked for it. > > > To note, we get some speed-up here as pattern matching a patch that > > > touches lots of files would result in searching all of them in their > > > entirety. Just removing this behavior _might_ have a measurable > > > speed-up for patch series touching dozens of files. > > > > Again, not true. > > > > Patches do _not_ scan the original modified files for keyword matches. > > Only the patch itself is scanned. That's the current behavior as well. > > > > Feel like I'm missing something here. How is K: matching keywords in > files without reading them. > > If my patch touches 10 files then all 10 of those files are scanned for > K: matches right? Nope. Understand the patches are the input to get_maintainer and not just files. If a patch is fed to get_maintainer then any files modified by the patch are _not_ scanned. Only the patch _content_ is used for keyword matches.
On Thu, Oct 5, 2023 at 11:42 AM Joe Perches <joe@perches.com> wrote: > > On Thu, 2023-10-05 at 11:30 -0700, Justin Stitt wrote: > > On Thu, Oct 5, 2023 at 11:15 AM Joe Perches <joe@perches.com> wrote: > > > > > > On Thu, 2023-10-05 at 11:06 -0700, Justin Stitt wrote: > > > > On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote: > > > > > > > > > > On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > > > > > > The current behavior of K: is a tad bit noisy. It matches against the > > > > > > entire contents of files instead of just against the contents of a > > > > > > patch. > > > > > > > > > > > > This means that a patch with a single character change (fixing a typo or > > > > > > whitespace or something) would still to/cc maintainers and lists if the > > > > > > affected file matched against the regex pattern given in K:. For > > > > > > example, if a file has the word "clang" in it then every single patch > > > > > > touching that file will to/cc Nick, Nathan and some lists. > > > > > > > > > > > > Let's change this behavior to only content match against patches > > > > > > (subjects, message, diff) as this is what most people expect the > > > > > > behavior already is. Most users of "K:" would prefer patch-only content > > > > > > matching. If this is not the case let's add a new matching type as > > > > > > proposed in [1]. > > > > > > > > > > I'm glad to know you are coming around to my suggestion. > > > > :) > > > > > > > > > > > > > > I believe the file-based keyword matching should _not_ be > > > > > removed and the option should be added for it like I suggested. > > > > > > > > Having a command line flag allowing get_maintainer.pl > > > > users to decide the behavior of K: is weird to me. If I'm a maintainer setting > > > > my K: in MAINTAINERS I want some sort of consistent behavior. Some > > > > patches will start hitting mailing list that DO have keywords in the patch > > > > and others, confusingly, not. > > > > > > Not true. > > > > > > If a patch contains a keyword match, get_maintainers will _always_ > > > show the K: keyword maintainers unless --nokeywords is specified > > > on the command line. > > > > ... > > > > > > > > If a file contains a keyword match, it'll only show the K: > > > keyword if --keywords-in-file is set. > > > > Right, what I'm saying is a patch can arrive in a maintainer's inbox > > wherein the patch itself has no mention of the keyword (if > > get_maintainer user opted for --keywords-in-file). Just trying to > > avoid some cases of the question: "Why is this in my inbox?" > > Because the script user specifically asked for it. > > > > > To note, we get some speed-up here as pattern matching a patch that > > > > touches lots of files would result in searching all of them in their > > > > entirety. Just removing this behavior _might_ have a measurable > > > > speed-up for patch series touching dozens of files. > > > > > > Again, not true. > > > > > > Patches do _not_ scan the original modified files for keyword matches. > > > Only the patch itself is scanned. That's the current behavior as well. > > > > > > > Feel like I'm missing something here. How is K: matching keywords in > > files without reading them. > > > > If my patch touches 10 files then all 10 of those files are scanned for > > K: matches right? > > Nope. > > Understand the patches are the input to get_maintainer and not > just files. > > If a patch is fed to get_maintainer then any files modified by > the patch are _not_ scanned. > > Only the patch _content_ is used for keyword matches. > Got it. I'll roll your patch into a v3. Thanks Justin
On Thu, 2023-10-05 at 12:52 -0700, Justin Stitt wrote: > On Thu, Oct 5, 2023 at 11:42 AM Joe Perches <joe@perches.com> wrote: > > > > On Thu, 2023-10-05 at 11:30 -0700, Justin Stitt wrote: > > > On Thu, Oct 5, 2023 at 11:15 AM Joe Perches <joe@perches.com> wrote: > > > > > > > > On Thu, 2023-10-05 at 11:06 -0700, Justin Stitt wrote: > > > > > On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote: > > > > > > > > > > > > On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > > > > > > > The current behavior of K: is a tad bit noisy. It matches against the > > > > > > > entire contents of files instead of just against the contents of a > > > > > > > patch. > > > > > > > > > > > > > > This means that a patch with a single character change (fixing a typo or > > > > > > > whitespace or something) would still to/cc maintainers and lists if the > > > > > > > affected file matched against the regex pattern given in K:. For > > > > > > > example, if a file has the word "clang" in it then every single patch > > > > > > > touching that file will to/cc Nick, Nathan and some lists. > > > > > > > > > > > > > > Let's change this behavior to only content match against patches > > > > > > > (subjects, message, diff) as this is what most people expect the > > > > > > > behavior already is. Most users of "K:" would prefer patch-only content > > > > > > > matching. If this is not the case let's add a new matching type as > > > > > > > proposed in [1]. > > > > > > > > > > > > I'm glad to know you are coming around to my suggestion. > > > > > :) > > > > > > > > > > > > > > > > > I believe the file-based keyword matching should _not_ be > > > > > > removed and the option should be added for it like I suggested. > > > > > > > > > > Having a command line flag allowing get_maintainer.pl > > > > > users to decide the behavior of K: is weird to me. If I'm a maintainer setting > > > > > my K: in MAINTAINERS I want some sort of consistent behavior. Some > > > > > patches will start hitting mailing list that DO have keywords in the patch > > > > > and others, confusingly, not. > > > > > > > > Not true. > > > > > > > > If a patch contains a keyword match, get_maintainers will _always_ > > > > show the K: keyword maintainers unless --nokeywords is specified > > > > on the command line. > > > > > > ... > > > > > > > > > > > If a file contains a keyword match, it'll only show the K: > > > > keyword if --keywords-in-file is set. > > > > > > Right, what I'm saying is a patch can arrive in a maintainer's inbox > > > wherein the patch itself has no mention of the keyword (if > > > get_maintainer user opted for --keywords-in-file). Just trying to > > > avoid some cases of the question: "Why is this in my inbox?" > > > > Because the script user specifically asked for it. > > > > > > > To note, we get some speed-up here as pattern matching a patch that > > > > > touches lots of files would result in searching all of them in their > > > > > entirety. Just removing this behavior _might_ have a measurable > > > > > speed-up for patch series touching dozens of files. > > > > > > > > Again, not true. > > > > > > > > Patches do _not_ scan the original modified files for keyword matches. > > > > Only the patch itself is scanned. That's the current behavior as well. > > > > > > > > > > Feel like I'm missing something here. How is K: matching keywords in > > > files without reading them. > > > > > > If my patch touches 10 files then all 10 of those files are scanned for > > > K: matches right? > > > > Nope. > > > > Understand the patches are the input to get_maintainer and not > > just files. > > > > If a patch is fed to get_maintainer then any files modified by > > the patch are _not_ scanned. > > > > Only the patch _content_ is used for keyword matches. > > > > Got it. I'll roll your patch into a v3. > Actually, I have a slightly improved patch as the actual keyword is shown too. I'll get it uploaded and make sure you are credited with the effort to make the change. cheers, Joe
On Thu, Oct 5, 2023 at 1:05 PM Joe Perches <joe@perches.com> wrote: > > On Thu, 2023-10-05 at 12:52 -0700, Justin Stitt wrote: > > On Thu, Oct 5, 2023 at 11:42 AM Joe Perches <joe@perches.com> wrote: > > > > > > On Thu, 2023-10-05 at 11:30 -0700, Justin Stitt wrote: > > > > On Thu, Oct 5, 2023 at 11:15 AM Joe Perches <joe@perches.com> wrote: > > > > > > > > > > On Thu, 2023-10-05 at 11:06 -0700, Justin Stitt wrote: > > > > > > On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote: > > > > > > > > > > > > > > On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote: > > > > > > > > The current behavior of K: is a tad bit noisy. It matches against the > > > > > > > > entire contents of files instead of just against the contents of a > > > > > > > > patch. > > > > > > > > > > > > > > > > This means that a patch with a single character change (fixing a typo or > > > > > > > > whitespace or something) would still to/cc maintainers and lists if the > > > > > > > > affected file matched against the regex pattern given in K:. For > > > > > > > > example, if a file has the word "clang" in it then every single patch > > > > > > > > touching that file will to/cc Nick, Nathan and some lists. > > > > > > > > > > > > > > > > Let's change this behavior to only content match against patches > > > > > > > > (subjects, message, diff) as this is what most people expect the > > > > > > > > behavior already is. Most users of "K:" would prefer patch-only content > > > > > > > > matching. If this is not the case let's add a new matching type as > > > > > > > > proposed in [1]. > > > > > > > > > > > > > > I'm glad to know you are coming around to my suggestion. > > > > > > :) > > > > > > > > > > > > > > > > > > > > I believe the file-based keyword matching should _not_ be > > > > > > > removed and the option should be added for it like I suggested. > > > > > > > > > > > > Having a command line flag allowing get_maintainer.pl > > > > > > users to decide the behavior of K: is weird to me. If I'm a maintainer setting > > > > > > my K: in MAINTAINERS I want some sort of consistent behavior. Some > > > > > > patches will start hitting mailing list that DO have keywords in the patch > > > > > > and others, confusingly, not. > > > > > > > > > > Not true. > > > > > > > > > > If a patch contains a keyword match, get_maintainers will _always_ > > > > > show the K: keyword maintainers unless --nokeywords is specified > > > > > on the command line. > > > > > > > > ... > > > > > > > > > > > > > > If a file contains a keyword match, it'll only show the K: > > > > > keyword if --keywords-in-file is set. > > > > > > > > Right, what I'm saying is a patch can arrive in a maintainer's inbox > > > > wherein the patch itself has no mention of the keyword (if > > > > get_maintainer user opted for --keywords-in-file). Just trying to > > > > avoid some cases of the question: "Why is this in my inbox?" > > > > > > Because the script user specifically asked for it. > > > > > > > > > To note, we get some speed-up here as pattern matching a patch that > > > > > > touches lots of files would result in searching all of them in their > > > > > > entirety. Just removing this behavior _might_ have a measurable > > > > > > speed-up for patch series touching dozens of files. > > > > > > > > > > Again, not true. > > > > > > > > > > Patches do _not_ scan the original modified files for keyword matches. > > > > > Only the patch itself is scanned. That's the current behavior as well. > > > > > > > > > > > > > Feel like I'm missing something here. How is K: matching keywords in > > > > files without reading them. > > > > > > > > If my patch touches 10 files then all 10 of those files are scanned for > > > > K: matches right? > > > > > > Nope. > > > > > > Understand the patches are the input to get_maintainer and not > > > just files. > > > > > > If a patch is fed to get_maintainer then any files modified by > > > the patch are _not_ scanned. > > > > > > Only the patch _content_ is used for keyword matches. > > > > > > > Got it. I'll roll your patch into a v3. > > > > Actually, I have a slightly improved patch as > the actual keyword is shown too. > > I'll get it uploaded and make sure you are credited > with the effort to make the change. > Dang, we just collided in mid-air. I just sent a new patch. Let's disregard my patch that was sent. Thanks for the efforts here. I appreciate it. > cheers, Joe Thanks Justin
diff --git a/MAINTAINERS b/MAINTAINERS index 35977b269d5e..13e7f40ea70b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -51,13 +51,13 @@ Descriptions of section entries and preferred order get_maintainer will not look at git log history when an F: pattern match occurs. When an N: match occurs, git log history is used to also notify the people that have git commit signatures. - K: *Content regex* (perl extended) pattern match in a patch or file. + K: *Content regex* (perl extended) pattern match patch content For instance: K: of_get_profile - matches patches or files that contain "of_get_profile" + matches patches that contain "of_get_profile" K: \b(printk|pr_(info|err))\b - matches patches or files that contain one or more of the words - printk, pr_info or pr_err + matches patches that contain one or more of the words printk, + pr_info or pr_err One regex pattern per line. Multiple K: lines acceptable. Maintainers List diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl index ab123b498fd9..ea58929287bf 100755 --- a/scripts/get_maintainer.pl +++ b/scripts/get_maintainer.pl @@ -548,19 +548,6 @@ foreach my $file (@ARGV) { $file =~ s/^\Q${cur_path}\E//; #strip any absolute path $file =~ s/^\Q${lk_path}\E//; #or the path to the lk tree push(@files, $file); - if ($file ne "MAINTAINERS" && -f $file && $keywords) { - open(my $f, '<', $file) - or die "$P: Can't open $file: $!\n"; - my $text = do { local($/) ; <$f> }; - close($f); - if ($keywords) { - foreach my $line (keys %keyword_hash) { - if ($text =~ m/$keyword_hash{$line}/x) { - push(@keyword_tvi, $line); - } - } - } - } } else { my $file_cnt = @files; my $lastfile;