[5/6] kbuild: rust_is_available: fix confusion when a version appears in the path

Message ID 20230109204520.539080-5-ojeda@kernel.org
State New
Headers
Series [1/6] docs: rust: add paragraph about finding a suitable `libclang` |

Commit Message

Miguel Ojeda Jan. 9, 2023, 8:45 p.m. UTC
  `bindgen`'s output for `libclang`'s version check contains paths, which
in turn may contain strings that look like version numbers [1]:

    .../6.1.0-dev/.../rust_is_available_bindgen_libclang.h:2:9: warning: clang version 11.1.0  [-W#pragma-messages], err: false

which the script will pick up as the version instead of the latter.

It is also the case that versions may appear after the actual version
(e.g. distribution's version text), which was the reason behind `head` [2]:

    .../rust-is-available-bindgen-libclang.h:2:9: warning: clang version 13.0.0 (Fedora 13.0.0-3.fc35) [-W#pragma-messages], err: false

Thus instead ask for a match after the `clang version` string.

Reported-by: Jordan (@jordanisaacs)
Link: https://github.com/Rust-for-Linux/linux/issues/942 [1]
Link: https://github.com/Rust-for-Linux/linux/pull/789 [2]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
---
 scripts/rust_is_available.sh | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
  

Comments

Masahiro Yamada Jan. 12, 2023, 5:32 a.m. UTC | #1
On Tue, Jan 10, 2023 at 5:46 AM Miguel Ojeda <ojeda@kernel.org> wrote:
>
> `bindgen`'s output for `libclang`'s version check contains paths, which
> in turn may contain strings that look like version numbers [1]:
>
>     .../6.1.0-dev/.../rust_is_available_bindgen_libclang.h:2:9: warning: clang version 11.1.0  [-W#pragma-messages], err: false
>
> which the script will pick up as the version instead of the latter.
>
> It is also the case that versions may appear after the actual version
> (e.g. distribution's version text), which was the reason behind `head` [2]:
>
>     .../rust-is-available-bindgen-libclang.h:2:9: warning: clang version 13.0.0 (Fedora 13.0.0-3.fc35) [-W#pragma-messages], err: false
>
> Thus instead ask for a match after the `clang version` string.
>
> Reported-by: Jordan (@jordanisaacs)
> Link: https://github.com/Rust-for-Linux/linux/issues/942 [1]
> Link: https://github.com/Rust-for-Linux/linux/pull/789 [2]
> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
> ---
>  scripts/rust_is_available.sh | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/scripts/rust_is_available.sh b/scripts/rust_is_available.sh
> index 0c082a248f15..a86659410e48 100755
> --- a/scripts/rust_is_available.sh
> +++ b/scripts/rust_is_available.sh
> @@ -141,9 +141,7 @@ fi
>  # of the `libclang` found by the Rust bindings generator is suitable.
>  bindgen_libclang_version=$( \
>         echo "$bindgen_libclang_output" \
> -               | grep -F 'clang version ' \
> -               | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
> -               | head -n 1 \
> +               | sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
>  )
>  bindgen_libclang_min_version=$($min_tool_version llvm)
>  bindgen_libclang_cversion=$(get_canonical_version $bindgen_libclang_version)
> --
> 2.39.0
>



You do not need to fork sed.




diff --git a/scripts/rust_is_available.sh b/scripts/rust_is_available.sh
index 1f478d7e0f77..ebe427e27379 100755
--- a/scripts/rust_is_available.sh
+++ b/scripts/rust_is_available.sh
@@ -137,14 +137,9 @@ fi

 # `bindgen` returned successfully, thus use the output to check that
the version
 # of the `libclang` found by the Rust bindings generator is suitable.
-bindgen_libclang_version=$( \
-       echo "$bindgen_libclang_output" \
-               | grep -F 'clang version ' \
-               | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
-               | head -n 1 \
-)
+set -- ${bindgen_libclang_output#**clang version}
+bindgen_libclang_cversion=$(get_canonical_version $1)
 bindgen_libclang_min_version=$($min_tool_version llvm)
-bindgen_libclang_cversion=$(get_canonical_version $bindgen_libclang_version)
 bindgen_libclang_min_cversion=$(get_canonical_version
$bindgen_libclang_min_version)
 if [ "$bindgen_libclang_cversion" -lt "$bindgen_libclang_min_cversion" ]; then
        echo >&2 "***"
  
Miguel Ojeda Jan. 13, 2023, 11:12 p.m. UTC | #2
On Thu, Jan 12, 2023 at 6:32 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> +set -- ${bindgen_libclang_output#**clang version}
> +bindgen_libclang_cversion=$(get_canonical_version $1)
>  bindgen_libclang_min_version=$($min_tool_version llvm)
> -bindgen_libclang_cversion=$(get_canonical_version $bindgen_libclang_version)

Nice trick :) To be honest, I am not really fond of `set`, and in this
case it means the command is not symmetric (we remove the prefix using
parameter expansion, and the suffix via positional argument
selection), but if you prefer it that way, I think it would be fine.

However, why the double asterisk? One already matches any string,
including spaces, no?

Cheers,
Miguel
  
Miguel Ojeda Jan. 13, 2023, 11:30 p.m. UTC | #3
On Mon, Jan 9, 2023 at 9:46 PM Miguel Ojeda <ojeda@kernel.org> wrote:
>
> Reported-by: Jordan (@jordanisaacs)

Cc'ing Jordan who gave us the email address in GitHub and wants to
send a `Tested-by` tag.

Cheers,
Miguel
  
Masahiro Yamada Jan. 15, 2023, 2:39 a.m. UTC | #4
On Sat, Jan 14, 2023 at 8:13 AM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Thu, Jan 12, 2023 at 6:32 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > +set -- ${bindgen_libclang_output#**clang version}
> > +bindgen_libclang_cversion=$(get_canonical_version $1)
> >  bindgen_libclang_min_version=$($min_tool_version llvm)
> > -bindgen_libclang_cversion=$(get_canonical_version $bindgen_libclang_version)
>
> Nice trick :) To be honest, I am not really fond of `set`, and in this
> case it means the command is not symmetric (we remove the prefix using
> parameter expansion, and the suffix via positional argument
> selection), but if you prefer it that way, I think it would be fine.


I just tend to write efficient code.
(scripts/{cc,ld,as}-version.sh do not use sed or grep at all.)

Especially, I avoid unneeded process forks
in the process forks.






> However, why the double asterisk? One already matches any string,
> including spaces, no?


Sorry, it is my mistake.

I meant double pound.


  set -- ${bindgen_libclang_output##*clang version}



The double pound strips "the longest matching pattern",
just in case "clang version" is contained in the file path.
(but if a space is contained in the directory path,
it would have failed earlier.






>
> Cheers,
> Miguel




--
Best Regards
Masahiro Yamada
  

Patch

diff --git a/scripts/rust_is_available.sh b/scripts/rust_is_available.sh
index 0c082a248f15..a86659410e48 100755
--- a/scripts/rust_is_available.sh
+++ b/scripts/rust_is_available.sh
@@ -141,9 +141,7 @@  fi
 # of the `libclang` found by the Rust bindings generator is suitable.
 bindgen_libclang_version=$( \
 	echo "$bindgen_libclang_output" \
-		| grep -F 'clang version ' \
-		| grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
-		| head -n 1 \
+		| sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
 )
 bindgen_libclang_min_version=$($min_tool_version llvm)
 bindgen_libclang_cversion=$(get_canonical_version $bindgen_libclang_version)