checkpatch.pl: Relax commit ID check to allow more than 12 chars

Message ID 20230129123431.1282427-1-j.neuschaefer@gmx.net
State New
Headers
Series checkpatch.pl: Relax commit ID check to allow more than 12 chars |

Commit Message

Jonathan Neuschäfer Jan. 29, 2023, 12:34 p.m. UTC
  By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
hashes with 13 digits, because of the number of objects.

Relax the rule in checkpatch.pl to allow a few more digits (up to 16).

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
---
 scripts/checkpatch.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--
2.39.0
  

Comments

Joe Perches Jan. 29, 2023, 5:52 p.m. UTC | #1
On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
> By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
> hashes with 13 digits, because of the number of objects.
> 
> Relax the rule in checkpatch.pl to allow a few more digits (up to 16).

NAK without updating the process docs first.

Documentation/process/submitting-patches.rst-If your patch fixes a bug in a specific commit, e.g. you found an issue using
Documentation/process/submitting-patches.rst:``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
Documentation/process/submitting-patches.rst-the SHA-1 ID, and the one line summary.  Do not split the tag across multiple
Documentation/process/submitting-patches.rst-lines, tags are exempt from the "wrap at 75 columns" rule in order to simplify
Documentation/process/submitting-patches.rst-parsing scripts.  For example::
Documentation/process/submitting-patches.rst-
Documentation/process/submitting-patches.rst-   Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually fr>
Documentation/process/submitting-patches.rst-
Documentation/process/submitting-patches.rst-The following ``git config`` settings can be used to add a pretty format for
Documentation/process/submitting-patches.rst-outputting the above style in the ``git log`` or ``git show`` commands::
Documentation/process/submitting-patches.rst-
Documentation/process/submitting-patches.rst-   [core]
Documentation/process/submitting-patches.rst:           abbrev = 12
Documentation/process/submitting-patches.rst-   [pretty]
Documentation/process/submitting-patches.rst-           fixes = Fixes: %h (\"%s\")

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -3177,7 +3177,7 @@ sub process {
>  				$tag_case = 0 if $tag eq "Fixes:";
>  				$tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);
> 
> -				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
> +				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,16}$/i);
>  				$id_case = 0 if ($orig_commit !~ /[A-F]/);
> 
>  				# Always strip leading/trailing parens then double quotes if existing
> @@ -3194,7 +3194,7 @@ sub process {
>  			if ($ctitle ne $title || $tag_case || $tag_space ||
>  			    $id_length || $id_case || !$title_has_quotes) {
>  				if (WARN("BAD_FIXES_TAG",
> -				     "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
> +				     "Please use correct Fixes: style 'Fixes: <12-16 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
>  				    $fix) {
>  					$fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")";
>  				}
> --
> 2.39.0
>
  
Jonathan Neuschäfer Jan. 31, 2023, 9 p.m. UTC | #2
On Sun, Jan 29, 2023 at 09:52:38AM -0800, Joe Perches wrote:
> On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
> > By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
> > hashes with 13 digits, because of the number of objects.
> > 
> > Relax the rule in checkpatch.pl to allow a few more digits (up to 16).
> 
> NAK without updating the process docs first.

Good point, I'll do that.

Thanks,
Jonathan

> 
> Documentation/process/submitting-patches.rst-If your patch fixes a bug in a specific commit, e.g. you found an issue using
> Documentation/process/submitting-patches.rst:``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
> Documentation/process/submitting-patches.rst-the SHA-1 ID, and the one line summary.  Do not split the tag across multiple
> Documentation/process/submitting-patches.rst-lines, tags are exempt from the "wrap at 75 columns" rule in order to simplify
> Documentation/process/submitting-patches.rst-parsing scripts.  For example::
> Documentation/process/submitting-patches.rst-
> Documentation/process/submitting-patches.rst-   Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually fr>
> Documentation/process/submitting-patches.rst-
> Documentation/process/submitting-patches.rst-The following ``git config`` settings can be used to add a pretty format for
> Documentation/process/submitting-patches.rst-outputting the above style in the ``git log`` or ``git show`` commands::
> Documentation/process/submitting-patches.rst-
> Documentation/process/submitting-patches.rst-   [core]
> Documentation/process/submitting-patches.rst:           abbrev = 12
> Documentation/process/submitting-patches.rst-   [pretty]
> Documentation/process/submitting-patches.rst-           fixes = Fixes: %h (\"%s\")
> 
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
> > @@ -3177,7 +3177,7 @@ sub process {
> >  				$tag_case = 0 if $tag eq "Fixes:";
> >  				$tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);
> > 
> > -				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
> > +				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,16}$/i);
> >  				$id_case = 0 if ($orig_commit !~ /[A-F]/);
> > 
> >  				# Always strip leading/trailing parens then double quotes if existing
> > @@ -3194,7 +3194,7 @@ sub process {
> >  			if ($ctitle ne $title || $tag_case || $tag_space ||
> >  			    $id_length || $id_case || !$title_has_quotes) {
> >  				if (WARN("BAD_FIXES_TAG",
> > -				     "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
> > +				     "Please use correct Fixes: style 'Fixes: <12-16 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
> >  				    $fix) {
> >  					$fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")";
> >  				}
> > --
> > 2.39.0
> > 
>
  
Joe Perches Feb. 4, 2023, 4:57 p.m. UTC | #3
On Sun, 2023-01-29 at 09:52 -0800, Joe Perches wrote:
> On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
> > By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
> > hashes with 13 digits, because of the number of objects.
> > 
> > Relax the rule in checkpatch.pl to allow a few more digits (up to 16).
> 
> NAK without updating the process docs first.

btw: it looks like 12 will still be sufficient for awhile yet

$ git count
total 1154908
$ git -c core.abbrev=5 log --pretty=format:%h | \
  perl -nE 'chomp;say length' | sort | uniq -c | sort -n -k2
    198 5
 664613 6
 450955 7
  36667 8
   2312 9
    155 10
      8 11
  
Jonathan Neuschäfer Feb. 5, 2023, 10:40 a.m. UTC | #4
On Sat, Feb 04, 2023 at 08:57:59AM -0800, Joe Perches wrote:
> On Sun, 2023-01-29 at 09:52 -0800, Joe Perches wrote:
> > On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
> > > By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
> > > hashes with 13 digits, because of the number of objects.
> > > 
> > > Relax the rule in checkpatch.pl to allow a few more digits (up to 16).
> > 
> > NAK without updating the process docs first.
> 
> btw: it looks like 12 will still be sufficient for awhile yet
> 
> $ git count
> total 1154908
> $ git -c core.abbrev=5 log --pretty=format:%h | \
>   perl -nE 'chomp;say length' | sort | uniq -c | sort -n -k2
>     198 5
>  664613 6
>  450955 7
>   36667 8
>    2312 9
>     155 10
>       8 11

Ok, I get similar stats on my tree (which includes linux-next and a few
other remotes).

However, git's default heuristic for %h length uses 13 digits here, so I
think other people might get 13 digits as well. I could force git to use
less digits than it naturally would, by setting core.abbrev=12 (and
document this idea in the documentation), but that doesn't seem nice.
Therefore, I still think allowing a few more digits is a good idea.


Thanks,
Jonathan
  
Randy Dunlap Feb. 5, 2023, 4:33 p.m. UTC | #5
On 2/5/23 02:40, Jonathan Neuschäfer wrote:
> On Sat, Feb 04, 2023 at 08:57:59AM -0800, Joe Perches wrote:
>> On Sun, 2023-01-29 at 09:52 -0800, Joe Perches wrote:
>>> On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
>>>> By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
>>>> hashes with 13 digits, because of the number of objects.
>>>>
>>>> Relax the rule in checkpatch.pl to allow a few more digits (up to 16).
>>>
>>> NAK without updating the process docs first.
>>
>> btw: it looks like 12 will still be sufficient for awhile yet
>>
>> $ git count
>> total 1154908
>> $ git -c core.abbrev=5 log --pretty=format:%h | \
>>   perl -nE 'chomp;say length' | sort | uniq -c | sort -n -k2
>>     198 5
>>  664613 6
>>  450955 7
>>   36667 8
>>    2312 9
>>     155 10
>>       8 11
> 
> Ok, I get similar stats on my tree (which includes linux-next and a few
> other remotes).
> 
> However, git's default heuristic for %h length uses 13 digits here, so I
> think other people might get 13 digits as well. I could force git to use
> less digits than it naturally would, by setting core.abbrev=12 (and
> document this idea in the documentation), but that doesn't seem nice.
> Therefore, I still think allowing a few more digits is a good idea.

I have core.abbrev=12 and I still get 13 "digits" often.
Then I just chop it off at 12 to satisfy checkpatch...
  
Linus Torvalds Feb. 5, 2023, 8:38 p.m. UTC | #6
On Sat, Feb 4, 2023 at 8:58 AM Joe Perches <joe@perches.com> wrote:
>
> btw: it looks like 12 will still be sufficient for awhile yet

To be honest, that's actually closer to the 12-digit limit than I was expecting.

The git heuristics are pretty good, and it sounds like 13 hex digits
is already starting to happen, so maybe we should relax things.

That said, "up to 16" does sound questionable.

We're talking exponential growth by number of digits, so saying "let's
go from 12 to 16" is a *huge* jump. And I'd like to keep people doing
fewer digits just because these things get used in free-flowing prose,
and we have the whole line wrapping issue and things just get uglier
at some point.

So we're closing in on two decades of git use, and we are not that far
from having 10 million objects in our git database (for the base
tree). Sure, that's  a lot of objects, but to a close approximation
the object count grows _largely_ linearly with time.

Considering that git is actually pretty good at handling the ambiguous
case anyway, I'd say go up at *most* to 14 digits.

I just checked my current tip-of-tree, and I needed to go down to
*five* digits to have git start complaining about ambiguous object
names:

  [torvalds@ryzen linux]$ git show c608f
  error: short object ID c608f is ambiguous
  hint: The candidates are:
  hint:   c608f6b58f30 commit 2023-02-05 - Merge tag 'usb-6.2-rc7' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
  hint:   c608f14fb0ee tree
  hint:   c608fccf692f tree
  hint:   c608f76e5753 blob
  hint:   c608fa168fe6 blob
  hint:   c608fd96771c blob

and maybe that was pure luck, but looking at your stats it does look
like "6 digits is still unique for most objects", I really think that
we're better off with shorter and visually easier numbers than going
overboard.

Note above how even with just 5 digits, it's still unique in actual
commits, so from a *practical* standpoint even five digits are fine
(because normal human communication doesn't talk about the blob or
tree commits).

If this was some case of "when you hit the limit, things break
horribly badly", that would be one thing. But that not even being true
means that things like line wrapping and just visuals matter.

So I think 12 digits likely still work just fine for another decade or
two, but yes, we're at the point where we might want to start thinking
about 13 or 14.

                Linus
  
Geert Uytterhoeven Feb. 6, 2023, 8:38 a.m. UTC | #7
Hi Joe,

On Sat, Feb 4, 2023 at 5:59 PM Joe Perches <joe@perches.com> wrote:
> On Sun, 2023-01-29 at 09:52 -0800, Joe Perches wrote:
> > On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
> > > By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
> > > hashes with 13 digits, because of the number of objects.
> > >
> > > Relax the rule in checkpatch.pl to allow a few more digits (up to 16).
> >
> > NAK without updating the process docs first.
>
> btw: it looks like 12 will still be sufficient for awhile yet
>
> $ git count
> total 1154908

Hmm, Ubuntu git too old?

> $ git -c core.abbrev=5 log --pretty=format:%h | \
>   perl -nE 'chomp;say length' | sort | uniq -c | sort -n -k2
>     198 5
>  664613 6
>  450955 7
>   36667 8
>    2312 9
>     155 10
>       8 11

I'm already at twelve:

 433752 6
 640819 7
  62759 8
   3998 9
    261 10
     12 11
      1 12

I've been using core.abbrev=16 for a while, and some maintainers
reject my patches with Fixes: tags because of that...

Is it really worthwhile to save on the number of hexits, making lookup
of some commits more inconvenient?

Note that while "git show edb9b8" suggests edb9b8f[...],
gitweb says bad object id:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=edb9b8

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
  
Joe Perches Feb. 6, 2023, 11:09 a.m. UTC | #8
On Mon, 2023-02-06 at 09:38 +0100, Geert Uytterhoeven wrote:
> Hi Joe,
> 
> On Sat, Feb 4, 2023 at 5:59 PM Joe Perches <joe@perches.com> wrote:
> > On Sun, 2023-01-29 at 09:52 -0800, Joe Perches wrote:
> > > On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
> > > > By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
> > > > hashes with 13 digits, because of the number of objects.
> > > > 
> > > > Relax the rule in checkpatch.pl to allow a few more digits (up to 16).
> > > 
> > > NAK without updating the process docs first.
> > 
> > btw: it looks like 12 will still be sufficient for awhile yet
> > 
> > $ git count
> > total 1154908
> 
> Hmm, Ubuntu git too old?

Don't think so

$ git --version
git version 2.39.1

More likely just using Linus' tree and not a
development tree with a bunch of branches.

I've got a -next tree with history back to next-20151106
with a bunch of missing dates because I don't fetch it
every day.  It has:

$ git tag | grep next | wc -l
1134

There I get:

$ git -c core.abbrev=5 log --pretty=format:%h | \
  perl -nE 'chomp;say length' | sort | uniq -c | sort -n -k2
      6 5
 542082 6
 568573 7
  51124 8
   3249 9
    217 10
     14 11
      1 12

> I've been using core.abbrev=16 for a while, and some maintainers
> reject my patches with Fixes: tags because of that...

Perhaps because that's not the documented format?

> Is it really worthwhile to save on the number of hexits, making lookup
> of some commits more inconvenient?
> 
> Note that while "git show edb9b8" suggests edb9b8f[...],
> gitweb says bad object id:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=edb9b8

hmm.  Not here.

$ git show edb9b8
tree edb9b8

Kconfig
Makefile
fmvj18x_cs.c
  
Geert Uytterhoeven Feb. 6, 2023, 11:54 a.m. UTC | #9
Hi Joe,

On Mon, Feb 6, 2023 at 12:09 PM Joe Perches <joe@perches.com> wrote:
> On Mon, 2023-02-06 at 09:38 +0100, Geert Uytterhoeven wrote:
> > On Sat, Feb 4, 2023 at 5:59 PM Joe Perches <joe@perches.com> wrote:
> > > On Sun, 2023-01-29 at 09:52 -0800, Joe Perches wrote:
> > > > On Sun, 2023-01-29 at 13:34 +0100, Jonathan Neuschäfer wrote:
> > > > > By now, `git log --pretty=%h` (on my copy of linux.git) prints commit
> > > > > hashes with 13 digits, because of the number of objects.
> > > > >
> > > > > Relax the rule in checkpatch.pl to allow a few more digits (up to 16).
> > > >
> > > > NAK without updating the process docs first.
> > >
> > > btw: it looks like 12 will still be sufficient for awhile yet
> > >
> > > $ git count
> > > total 1154908
> >
> > Hmm, Ubuntu git too old?
>
> Don't think so
>
> $ git --version
> git version 2.39.1

Exactly, Ubuntu 22.04LTS only has

$ git --version
git version 2.34.1

i.e. no git count.

> > I've been using core.abbrev=16 for a while, and some maintainers
> > reject my patches with Fixes: tags because of that...
>
> Perhaps because that's not the documented format?

Right.  I look a lot at history, and don't want to become slowed down
by ambiguous Fixes: tags anytime soon (or later).

> > Is it really worthwhile to save on the number of hexits, making lookup
> > of some commits more inconvenient?
> >
> > Note that while "git show edb9b8" suggests edb9b8f[...],
> > gitweb says bad object id:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=edb9b8
>
> hmm.  Not here.
>
> $ git show edb9b8
> tree edb9b8

Yeah, I also have that tree object.  But I don't want to see the tree
object; I want to see the commit object, which is in v6.2-rc7:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=edb9b8f

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
  
Joe Perches Feb. 6, 2023, 12:25 p.m. UTC | #10
On Mon, 2023-02-06 at 12:54 +0100, Geert Uytterhoeven wrote:
> Hi Joe,

rehi Geert

maybe:
---
 scripts/checkpatch.pl | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index bd44d12965c98..55267ee6b1190 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -36,6 +36,8 @@ my $showfile = 0;
 my $file = 0;
 my $git = 0;
 my %git_commits = ();
+my $git_sha1_min = 12;
+my $git_sha1_max = 14;
 my $check = 0;
 my $check_orig = 0;
 my $summary = 1;
@@ -1230,7 +1232,13 @@ sub git_commit_info {
 		 $lines[0] =~ /^fatal: bad object $commit/) {
 		$id = undef;
 	} else {
-		$id = substr($lines[0], 0, 12);
+		my $len = length($commit);
+		if ($len < $git_sha1_min) {
+			$len = $git_sha1_min;
+		} elsif ($len > $git_sha1_max) {
+			$len = $git_sha1_max;
+		}
+		$id = substr($lines[0], 0, $len);
 		$desc = substr($lines[0], 41);
 	}
 
@@ -1297,7 +1305,7 @@ for my $filename (@ARGV) {
 	if ($filename eq '-') {
 		$vname = 'Your patch';
 	} elsif ($git) {
-		$vname = "Commit " . substr($filename, 0, 12) . ' ("' . $git_commits{$filename} . '")';
+		$vname = "Commit " . substr($filename, 0, $git_sha1_min) . ' ("' . $git_commits{$filename} . '")';
 	} else {
 		$vname = $filename;
 	}
@@ -3191,7 +3199,7 @@ sub process {
 				$tag_case = 0 if $tag eq "Fixes:";
 				$tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);
 
-				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
+				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{$git_sha1_min,$git_sha1_max}$/i);
 				$id_case = 0 if ($orig_commit !~ /[A-F]/);
 
 				# Always strip leading/trailing parens then double quotes if existing
@@ -3208,7 +3216,7 @@ sub process {
 			if ($ctitle ne $title || $tag_case || $tag_space ||
 			    $id_length || $id_case || !$title_has_quotes) {
 				if (WARN("BAD_FIXES_TAG",
-				     "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
+				     "Please use correct Fixes: style 'Fixes: <$git_sha1_min to $git_sha1_max chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
 				    $fix) {
 					$fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")";
 				}
@@ -3300,9 +3308,9 @@ sub process {
 		    $line !~ /^This reverts commit [0-9a-f]{7,40}/ &&
 		    (($line =~ /\bcommit\s+[0-9a-f]{5,}\b/i ||
 		      ($line =~ /\bcommit\s*$/i && defined($rawlines[$linenr]) && $rawlines[$linenr] =~ /^\s*[0-9a-f]{5,}\b/i)) ||
-		     ($line =~ /(?:\s|^)[0-9a-f]{12,40}(?:[\s"'\(\[]|$)/i &&
-		      $line !~ /[\<\[][0-9a-f]{12,40}[\>\]]/i &&
-		      $line !~ /\bfixes:\s*[0-9a-f]{12,40}/i))) {
+		     ($line =~ /(?:\s|^)[0-9a-f]{$git_sha1_min,40}(?:[\s"'\(\[]|$)/i &&
+		      $line !~ /[\<\[][0-9a-f]{$git_sha1_min,40}[\>\]]/i &&
+		      $line !~ /\bfixes:\s*[0-9a-f]{$git_sha1_min,40}/i))) {
 			my $init_char = "c";
 			my $orig_commit = "";
 			my $short = 1;
@@ -3340,11 +3348,11 @@ sub process {
 			if ($input =~ /\b(c)ommit\s+([0-9a-f]{5,})\b/i) {
 				$init_char = $1;
 				$orig_commit = lc($2);
-				$short = 0 if ($input =~ /\bcommit\s+[0-9a-f]{12,40}/i);
+				$short = 0 if ($input =~ /\bcommit\s+[0-9a-f]{$git_sha1_min,40}/i);
 				$long = 1 if ($input =~ /\bcommit\s+[0-9a-f]{41,}/i);
 				$space = 0 if ($input =~ /\bcommit [0-9a-f]/i);
 				$case = 0 if ($input =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/);
-			} elsif ($input =~ /\b([0-9a-f]{12,40})\b/i) {
+			} elsif ($input =~ /\b([0-9a-f]{$git_sha1_min,40})\b/i) {
 				$orig_commit = lc($1);
 			}
 
@@ -3355,7 +3363,7 @@ sub process {
 			    ($short || $long || $space || $case || ($orig_desc ne $description) || !$has_quotes) &&
 			    $last_git_commit_id_linenr != $linenr - 1) {
 				ERROR("GIT_COMMIT_ID",
-				      "Please use git commit description style 'commit <12+ chars of sha1> (\"<title line>\")' - ie: '${init_char}ommit $id (\"$description\")'\n" . $herectx);
+				      "Please use git commit description style 'commit <$git_sha1_min to $git_sha1_max chars of sha1> (\"<title line>\")' - ie: '${init_char}ommit $id (\"$description\")'\n" . $herectx);
 			}
 			#don't report the next line if this line ends in commit and the sha1 hash is the next line
 			$last_git_commit_id_linenr = $linenr if ($line =~ /\bcommit\s*$/i);
  
Jonathan Neuschäfer Feb. 7, 2023, 2:47 p.m. UTC | #11
On Mon, Feb 06, 2023 at 04:25:19AM -0800, Joe Perches wrote:
> On Mon, 2023-02-06 at 12:54 +0100, Geert Uytterhoeven wrote:
> > Hi Joe,
> 
> rehi Geert
> 
> maybe:
> ---

At a quick glance, this looks reasonable to me. Feel free to take over
the patch and send a real v2.


Thanks,
Jonathan


>  scripts/checkpatch.pl | 28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index bd44d12965c98..55267ee6b1190 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -36,6 +36,8 @@ my $showfile = 0;
>  my $file = 0;
>  my $git = 0;
>  my %git_commits = ();
> +my $git_sha1_min = 12;
> +my $git_sha1_max = 14;
>  my $check = 0;
>  my $check_orig = 0;
>  my $summary = 1;
> @@ -1230,7 +1232,13 @@ sub git_commit_info {
>  		 $lines[0] =~ /^fatal: bad object $commit/) {
>  		$id = undef;
>  	} else {
> -		$id = substr($lines[0], 0, 12);
> +		my $len = length($commit);
> +		if ($len < $git_sha1_min) {
> +			$len = $git_sha1_min;
> +		} elsif ($len > $git_sha1_max) {
> +			$len = $git_sha1_max;
> +		}
> +		$id = substr($lines[0], 0, $len);
>  		$desc = substr($lines[0], 41);
>  	}
>  
> @@ -1297,7 +1305,7 @@ for my $filename (@ARGV) {
>  	if ($filename eq '-') {
>  		$vname = 'Your patch';
>  	} elsif ($git) {
> -		$vname = "Commit " . substr($filename, 0, 12) . ' ("' . $git_commits{$filename} . '")';
> +		$vname = "Commit " . substr($filename, 0, $git_sha1_min) . ' ("' . $git_commits{$filename} . '")';
>  	} else {
>  		$vname = $filename;
>  	}
> @@ -3191,7 +3199,7 @@ sub process {
>  				$tag_case = 0 if $tag eq "Fixes:";
>  				$tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);
>  
> -				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
> +				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{$git_sha1_min,$git_sha1_max}$/i);
>  				$id_case = 0 if ($orig_commit !~ /[A-F]/);
>  
>  				# Always strip leading/trailing parens then double quotes if existing
> @@ -3208,7 +3216,7 @@ sub process {
>  			if ($ctitle ne $title || $tag_case || $tag_space ||
>  			    $id_length || $id_case || !$title_has_quotes) {
>  				if (WARN("BAD_FIXES_TAG",
> -				     "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
> +				     "Please use correct Fixes: style 'Fixes: <$git_sha1_min to $git_sha1_max chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
>  				    $fix) {
>  					$fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")";
>  				}
> @@ -3300,9 +3308,9 @@ sub process {
>  		    $line !~ /^This reverts commit [0-9a-f]{7,40}/ &&
>  		    (($line =~ /\bcommit\s+[0-9a-f]{5,}\b/i ||
>  		      ($line =~ /\bcommit\s*$/i && defined($rawlines[$linenr]) && $rawlines[$linenr] =~ /^\s*[0-9a-f]{5,}\b/i)) ||
> -		     ($line =~ /(?:\s|^)[0-9a-f]{12,40}(?:[\s"'\(\[]|$)/i &&
> -		      $line !~ /[\<\[][0-9a-f]{12,40}[\>\]]/i &&
> -		      $line !~ /\bfixes:\s*[0-9a-f]{12,40}/i))) {
> +		     ($line =~ /(?:\s|^)[0-9a-f]{$git_sha1_min,40}(?:[\s"'\(\[]|$)/i &&
> +		      $line !~ /[\<\[][0-9a-f]{$git_sha1_min,40}[\>\]]/i &&
> +		      $line !~ /\bfixes:\s*[0-9a-f]{$git_sha1_min,40}/i))) {
>  			my $init_char = "c";
>  			my $orig_commit = "";
>  			my $short = 1;
> @@ -3340,11 +3348,11 @@ sub process {
>  			if ($input =~ /\b(c)ommit\s+([0-9a-f]{5,})\b/i) {
>  				$init_char = $1;
>  				$orig_commit = lc($2);
> -				$short = 0 if ($input =~ /\bcommit\s+[0-9a-f]{12,40}/i);
> +				$short = 0 if ($input =~ /\bcommit\s+[0-9a-f]{$git_sha1_min,40}/i);
>  				$long = 1 if ($input =~ /\bcommit\s+[0-9a-f]{41,}/i);
>  				$space = 0 if ($input =~ /\bcommit [0-9a-f]/i);
>  				$case = 0 if ($input =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/);
> -			} elsif ($input =~ /\b([0-9a-f]{12,40})\b/i) {
> +			} elsif ($input =~ /\b([0-9a-f]{$git_sha1_min,40})\b/i) {
>  				$orig_commit = lc($1);
>  			}
>  
> @@ -3355,7 +3363,7 @@ sub process {
>  			    ($short || $long || $space || $case || ($orig_desc ne $description) || !$has_quotes) &&
>  			    $last_git_commit_id_linenr != $linenr - 1) {
>  				ERROR("GIT_COMMIT_ID",
> -				      "Please use git commit description style 'commit <12+ chars of sha1> (\"<title line>\")' - ie: '${init_char}ommit $id (\"$description\")'\n" . $herectx);
> +				      "Please use git commit description style 'commit <$git_sha1_min to $git_sha1_max chars of sha1> (\"<title line>\")' - ie: '${init_char}ommit $id (\"$description\")'\n" . $herectx);
>  			}
>  			#don't report the next line if this line ends in commit and the sha1 hash is the next line
>  			$last_git_commit_id_linenr = $linenr if ($line =~ /\bcommit\s*$/i);
  

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 78cc595b98ce1..3a2c8b5426191 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3177,7 +3177,7 @@  sub process {
 				$tag_case = 0 if $tag eq "Fixes:";
 				$tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);

-				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
+				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,16}$/i);
 				$id_case = 0 if ($orig_commit !~ /[A-F]/);

 				# Always strip leading/trailing parens then double quotes if existing
@@ -3194,7 +3194,7 @@  sub process {
 			if ($ctitle ne $title || $tag_case || $tag_space ||
 			    $id_length || $id_case || !$title_has_quotes) {
 				if (WARN("BAD_FIXES_TAG",
-				     "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
+				     "Please use correct Fixes: style 'Fixes: <12-16 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
 				    $fix) {
 					$fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")";
 				}