[v4,1/3] checkpatch: warn when unknown tags are used for links

Message ID 3b036087d80b8c0e07a46a1dbaaf4ad0d018f8d5.1674217480.git.linux@leemhuis.info
State New
Series checkpatch.pl: warn about discouraged tags and missing Link: tags

Commit Message

Thorsten Leemhuis Jan. 20, 2023, 12:35 p.m. UTC
  From: Kai Wasserbäch <kai@dev.carbon-project.org>

Issue a warning when encountering URLs behind unknown tags, as Linus
recently stated ```please stop making up random tags that make no sense.
Just use "Link:"```[1]. That statement was triggered by a use of
'BugLink', but that's not the only tag people invented:

$ git log -100000 --no-merges --format=email -P \
   --grep='^\w+:[ \t]*http' | grep -Poh '^\w+:[ \t]*http' | \
  sort | uniq -c | sort -rn | head -n 20
 103958 Link: http
    418 BugLink: http
    372 Patchwork: http
    280 Closes: http
    224 Bug: http
    123 References: http
     84 Bugzilla: http
     61 URL: http
     42 v1: http
     38 Datasheet: http
     20 v2: http
      9 Ref: http
      9 Fixes: http
      9 Buglink: http
      8 v3: http
      8 Reference: http
      7 See: http
      6 1: http
      5 link: http
      3 Link:http

Some of these non-standard tags make things harder for external tools
that rely on the use of proper tags. One of those tools is the
regression tracking bot 'regzbot', which looks out for "Link:" tags
pointing to reports of tracked regressions.

The initial idea was to use a disallow list to raise an error when
encountering known unwanted tags like BugLink:; during review it was
requested to use a list of allowed tags instead[2].

Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/15f7df96d49082fb7799dda6e187b33c84f38831.camel@perches.com/ [2]
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Co-developed-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
---
 scripts/checkpatch.pl | 12 ++++++++++++
 1 file changed, 12 insertions(+)
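
For readers who want to try the decision logic outside of checkpatch.pl,
here is a minimal POSIX sh sketch of the same check. It is an
approximation under stated assumptions, not the code from the patch: the
function name classify_line is hypothetical, the "ok" result is invented
for illustration, the $in_commit_log condition is ignored, and \w, \s
and \d are approximated with POSIX character classes.

```shell
#!/bin/sh
# Hypothetical helper (not part of checkpatch.pl): classify a single
# commit-log line roughly the way the new check does.
classify_line() {
    # Capture the word before ':' when the line looks like "<tag>: http...".
    tag=$(printf '%s\n' "$1" |
        sed -n -E 's/^[[:space:]]*([[:alnum:]_]+):[[:space:]]*http.*/\1/p')

    # No tag in front of a URL, or exactly 'Link': nothing to warn about.
    # The comparison is case-sensitive, so 'link' is not accepted.
    if [ -z "$tag" ] || [ "$tag" = "Link" ]; then
        echo "ok"
        return
    fi

    # Tags like v1, V2 or version3 suggest patch version notes.
    if printf '%s\n' "$tag" | grep -Eiq '^v(ersion)?[0-9]+'; then
        echo "COMMIT_LOG_VERSIONING"
    else
        echo "COMMIT_LOG_USE_LINK"
    fi
}
```

Calling classify_line with a line such as 'BugLink: https://example.org/1'
prints the warning type that the patch would be expected to raise for it.
Because only 'Link' is accepted, this mirrors the allow-list approach
requested during review rather than a list of disallowed tags.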
  

Comments

Matthieu Baerts Feb. 27, 2023, 1:25 p.m. UTC | #1
Hello,

On 20/01/2023 13:35, Thorsten Leemhuis wrote:
> From: Kai Wasserbäch <kai@dev.carbon-project.org>
> 
> Issue a warning when encountering URLs behind unknown tags, as Linus
> recently stated ```please stop making up random tags that make no sense.
> Just use "Link:"```[1]. That statement was triggered by a use of
> 'BugLink', but that's not the only tag people invented:
> 
> $ git log -100000 --no-merges --format=email -P \
>    --grep='^\w+:[ \t]*http' | grep -Poh '^\w+:[ \t]*http' | \
>   sort | uniq -c | sort -rn | head -n 20
>  103958 Link: http
>     418 BugLink: http
>     372 Patchwork: http
>     280 Closes: http
>     224 Bug: http
>     123 References: http
>      84 Bugzilla: http
>      61 URL: http
>      42 v1: http
>      38 Datasheet: http
>      20 v2: http
>       9 Ref: http
>       9 Fixes: http
>       9 Buglink: http
>       8 v3: http
>       8 Reference: http
>       7 See: http
>       6 1: http
>       5 link: http
>       3 Link:http
> 
> Some of these non-standard tags make it harder for external tools that
> rely on use of proper tags. One of those tools is the regression
> tracking bot 'regzbot', which looks out for "Link:" tags pointing to
> reports of tracked regressions.

I'm sorry for the late feedback but would it be possible to add an
exception for the "Closes" tag followed by a URL?

This tag is useful -- at least for us when maintaining the MPTCP subtree
-- to have tickets being automatically closed when a patch is accepted.
I don't think this "Closes" tag is a "random one that makes no sense"
but I agree it is not an "official" one described in the documentation.

On our side, we are using GitHub to manage issues but this also works
with GitLab and probably others. Other keywords are also accepted [1][2]
but I guess it is best to stick with one, especially when it is already
used according to the list provided above.

Would it then be OK to allow this "Closes" tag in checkpatch.pl and
mention it in the documentation (Submitting patches)?

Or should we switch to the "Link" tag instead (and re-do the tracking
manually)?

Cheers,
Matt

[1]
https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests
[2]
https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#default-closing-pattern
  
Thorsten Leemhuis March 2, 2023, 5:36 a.m. UTC | #2
On 27.02.23 14:25, Matthieu Baerts wrote:
> On 20/01/2023 13:35, Thorsten Leemhuis wrote:
>> From: Kai Wasserbäch <kai@dev.carbon-project.org>
>>
>> Issue a warning when encountering URLs behind unknown tags, as Linus
>> recently stated ```please stop making up random tags that make no sense.
>> Just use "Link:"```[1]. That statement was triggered by a use of
>> 'BugLink', but that's not the only tag people invented:
>>
>> $ git log -100000 --no-merges --format=email -P \
>>    --grep='^\w+:[ \t]*http' | grep -Poh '^\w+:[ \t]*http' | \
>>   sort | uniq -c | sort -rn | head -n 20
>>  103958 Link: http
>>     418 BugLink: http
>>     372 Patchwork: http
>>     280 Closes: http
>>     224 Bug: http
>>     123 References: http
>> [...]
>>
>> Some of these non-standard tags make it harder for external tools that
>> rely on use of proper tags. One of those tools is the regression
>> tracking bot 'regzbot', which looks out for "Link:" tags pointing to
>> reports of tracked regressions.
> 
> I'm sorry for the late feedback but would it be possible to add an
> exception for the "Closes" tag followed by a URL?

As I just wrote in a reply to Jakub: Not sure. Every special case makes
things harder for humans and software that looks at commits downstream.

> This tag is useful -- at least for us when maintaining the MPTCP subtree
> -- to have tickets being automatically closed when a patch is accepted.
> I don't think this "Closes" tag is a "random one that makes no sense"
> but I agree it is not an "official" one described in the documentation.
>
> On our side, we are using GitHub to manage issues but this also works
> with GitLab and probably others. Other keywords are also accepted [1][2]
> but I guess it is best to stick with one, especially when it is already
> used according to the list provided above.
> 
> Would it then be OK to allow this "Closes" tag in checkpatch.pl and
> mention it in the documentation (Submitting patches)?
> 
> Or should we switch to the "Link" tag instead (and re-do the tracking
> manually)?
> 
> [1]
> https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests
> [2]
> https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#default-closing-pattern

For the record, let me repeat and elaborate on what I already said
on social media before you wrote your mail:

 * I'm mostly neutral here, but it was Linus who wrote "please stop
making up random tags that make no sense." in [1]. This was triggered by
a use of "BugLink:"; maybe there are tools out there that rely on that
tag, hence their users might ask for an exception as well. That's why I
think it's Linus's call to grant any exceptions.

[1]
https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/

 * if such an exception is made, it IMHO must be documented in our
documentation, so any software and humans that rely on these tags are
aware of it.

Ciao, Thorsten
  

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 78cc595b98ce..d739ce0909b1 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3250,6 +3250,18 @@  sub process {
 			$commit_log_possible_stack_dump = 0;
 		}
 
+# Check for odd tags before a URI/URL
+		if ($in_commit_log &&
+		    $line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link') {
+			if ($1 =~ /^v(?:ersion)?\d+/i) {
+				WARN("COMMIT_LOG_VERSIONING",
+				     "Patch version information should be after the --- line\n" . $herecurr);
+			} else {
+				WARN("COMMIT_LOG_USE_LINK",
+				     "Unknown link reference '$1:', use 'Link:' instead\n" . $herecurr);
+			}
+		}
+
 # Check for lines starting with a #
 		if ($in_commit_log && $line =~ /^#/) {
 			if (WARN("COMMIT_COMMENT_SYMBOL",