[6/6] kbuild: rust_is_available: normalize version matching

Message ID 20230109204520.539080-6-ojeda@kernel.org
State New
Headers
Series [1/6] docs: rust: add paragraph about finding a suitable `libclang` |

Commit Message

Miguel Ojeda Jan. 9, 2023, 8:45 p.m. UTC
  In order to match the version string, `sed` is used in a couple
cases, and `grep` and `head` in a couple others.

Make the script more consistent and easier to understand by
using the same method, `sed`, for all of them.

This makes the version matching also a bit more strict for
the changed cases, since the strings `rustc ` and `bindgen `
will now be required, which should be fine since `rustc`
complains if one attempts to call it with another program
name, and `bindgen` uses a hardcoded string.

In addition, clarify why one of the existing `sed` commands
does not provide an address like the others.

Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
---
 scripts/rust_is_available.sh | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
  

Comments

Masahiro Yamada Jan. 12, 2023, 6:22 a.m. UTC | #1
On Tue, Jan 10, 2023 at 5:46 AM Miguel Ojeda <ojeda@kernel.org> wrote:
>
> In order to match the version string, `sed` is used in a couple
> cases, and `grep` and `head` in a couple others.
>
> Make the script more consistent and easier to understand by
> using the same method, `sed`, for all of them.
>
> This makes the version matching also a bit more strict for
> the changed cases, since the strings `rustc ` and `bindgen `
> will now be required, which should be fine since `rustc`
> complains if one attempts to call it with another program
> name, and `bindgen` uses a hardcoded string.
>
> In addition, clarify why one of the existing `sed` commands
> does not provide an address like the others.
>
> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>


Maybe, your purpose is to use sed consistently, but
perhaps you can avoid forking sed if you know the
format of the first line.


BTW, what is missing here is, you do not check if
${RUSTC} is really rustc.


I can fool this script to print
"arithmetic expression: expecting primary: "100000 *  + 100 *  + "



$ make RUSTC=true rustavailable
./scripts/rust_is_available.sh: 19: arithmetic expression: expecting
primary: "100000 *  + 100 *  + "
***
*** Please see Documentation/rust/quick-start.rst for details
*** on how to setup Rust support.
***
make: *** [Makefile:1809: rustavailable] Error 2






scripts/{as,ld}-version.sh tried their best to
parse the line with shell syntax only, and
print "unknown assembler invoked" if the given
tool does not seem to be a supported assembler.







> ---
>  scripts/rust_is_available.sh | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/rust_is_available.sh b/scripts/rust_is_available.sh
> index a86659410e48..99811842b61f 100755
> --- a/scripts/rust_is_available.sh
> +++ b/scripts/rust_is_available.sh
> @@ -66,8 +66,7 @@ fi
>  # Non-stable and distributions' versions may have a version suffix, e.g. `-dev`.
>  rust_compiler_version=$( \
>         LC_ALL=C "$RUSTC" --version 2>/dev/null \
> -               | head -n 1 \
> -               | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
> +               | sed -nE '1s:.*rustc ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
>  )
>  rust_compiler_min_version=$($min_tool_version rustc)
>  rust_compiler_cversion=$(get_canonical_version $rust_compiler_version)
> @@ -94,8 +93,7 @@ fi
>  # Non-stable and distributions' versions may have a version suffix, e.g. `-dev`.
>  rust_bindings_generator_version=$( \
>         LC_ALL=C "$BINDGEN" --version 2>/dev/null \
> -               | head -n 1 \
> -               | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
> +               | sed -nE '1s:.*bindgen ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
>  )
>  rust_bindings_generator_min_version=$($min_tool_version bindgen)
>  rust_bindings_generator_cversion=$(get_canonical_version $rust_bindings_generator_version)
> @@ -139,6 +137,9 @@ fi
>
>  # `bindgen` returned successfully, thus use the output to check that the version
>  # of the `libclang` found by the Rust bindings generator is suitable.
> +#
> +# Unlike other version checks, note that this one does not necessarily appear
> +# in the first line of the output, thus no `sed` address is provided.
>  bindgen_libclang_version=$( \
>         echo "$bindgen_libclang_output" \
>                 | sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
> --
> 2.39.0
>
  
Miguel Ojeda Jan. 13, 2023, 11:15 p.m. UTC | #2
On Thu, Jan 12, 2023 at 7:23 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> Maybe, your purpose is to use sed consistently, but
> perhaps you can avoid forking sed if you know the
> format of the first line.

The most unknown format would be the one of the libclang check, where
there may be other lines before the one we are interested in. However,
the pattern expansion would still match newlines, right?

> BTW, what is missing here is, you do not check if
> ${RUSTC} is really rustc.
>
> I can fool this script to print
> "arithmetic expression: expecting primary: "100000 *  + 100 *  + "

We can test if nothing was printed by `sed` for that (or do it with
shell builtins).

Having said that, I would say fooling the script on purpose is an more
of an oddity compared to the case `MAKEFLAGS` attempts to cover
(please see my reply on the other patch). So if we cover this, then I
would say we should really cover the other one.

Cheers,
Miguel
  
Masahiro Yamada Jan. 15, 2023, 2:48 a.m. UTC | #3
On Sat, Jan 14, 2023 at 8:15 AM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Thu, Jan 12, 2023 at 7:23 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > Maybe, your purpose is to use sed consistently, but
> > perhaps you can avoid forking sed if you know the
> > format of the first line.
>
> The most unknown format would be the one of the libclang check, where
> there may be other lines before the one we are interested in. However,
> the pattern expansion would still match newlines, right?
>
> > BTW, what is missing here is, you do not check if
> > ${RUSTC} is really rustc.
> >
> > I can fool this script to print
> > "arithmetic expression: expecting primary: "100000 *  + 100 *  + "
>
> We can test if nothing was printed by `sed` for that (or do it with
> shell builtins).
>
> Having said that, I would say fooling the script on purpose is an more
> of an oddity compared to the case `MAKEFLAGS` attempts to cover
> (please see my reply on the other patch). So if we cover this, then I
> would say we should really cover the other one.



get_canonical_version() in scripts/as-version.sh has
a little more trick to avoid
"arithmetic expression: expecting primary: "100000 *  + 100 *  + "
but it is up to you.




> Cheers,
> Miguel
  
Masahiro Yamada Jan. 15, 2023, 10:48 a.m. UTC | #4
On Sun, Jan 15, 2023 at 11:48 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Sat, Jan 14, 2023 at 8:15 AM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
> >
> > On Thu, Jan 12, 2023 at 7:23 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> > >
> > > Maybe, your purpose is to use sed consistently, but
> > > perhaps you can avoid forking sed if you know the
> > > format of the first line.
> >
> > The most unknown format would be the one of the libclang check, where
> > there may be other lines before the one we are interested in. However,
> > the pattern expansion would still match newlines, right?
> >
> > > BTW, what is missing here is, you do not check if
> > > ${RUSTC} is really rustc.
> > >
> > > I can fool this script to print
> > > "arithmetic expression: expecting primary: "100000 *  + 100 *  + "
> >
> > We can test if nothing was printed by `sed` for that (or do it with
> > shell builtins).
> >
> > Having said that, I would say fooling the script on purpose is an more
> > of an oddity compared to the case `MAKEFLAGS` attempts to cover
> > (please see my reply on the other patch). So if we cover this, then I
> > would say we should really cover the other one.
>
>
>
> get_canonical_version() in scripts/as-version.sh has
> a little more trick to avoid
> "arithmetic expression: expecting primary: "100000 *  + 100 *  + "
> but it is up to you.



My code accepts anything that is separated by dots
(and non-numerical strings are silently turned into zero).

Your code takes exactly the "([0-9]+\.[0-9]+\.[0-9]+)" pattern,
so it works very safely.

I think using sed is fine.
  

Patch

diff --git a/scripts/rust_is_available.sh b/scripts/rust_is_available.sh
index a86659410e48..99811842b61f 100755
--- a/scripts/rust_is_available.sh
+++ b/scripts/rust_is_available.sh
@@ -66,8 +66,7 @@  fi
 # Non-stable and distributions' versions may have a version suffix, e.g. `-dev`.
 rust_compiler_version=$( \
 	LC_ALL=C "$RUSTC" --version 2>/dev/null \
-		| head -n 1 \
-		| grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
+		| sed -nE '1s:.*rustc ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
 )
 rust_compiler_min_version=$($min_tool_version rustc)
 rust_compiler_cversion=$(get_canonical_version $rust_compiler_version)
@@ -94,8 +93,7 @@  fi
 # Non-stable and distributions' versions may have a version suffix, e.g. `-dev`.
 rust_bindings_generator_version=$( \
 	LC_ALL=C "$BINDGEN" --version 2>/dev/null \
-		| head -n 1 \
-		| grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
+		| sed -nE '1s:.*bindgen ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
 )
 rust_bindings_generator_min_version=$($min_tool_version bindgen)
 rust_bindings_generator_cversion=$(get_canonical_version $rust_bindings_generator_version)
@@ -139,6 +137,9 @@  fi
 
 # `bindgen` returned successfully, thus use the output to check that the version
 # of the `libclang` found by the Rust bindings generator is suitable.
+#
+# Unlike other version checks, note that this one does not necessarily appear
+# in the first line of the output, thus no `sed` address is provided.
 bindgen_libclang_version=$( \
 	echo "$bindgen_libclang_output" \
 		| sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'