diff mbox series

selftests/nolibc: Fix up compile error for rv32

Message ID	20230520120254.66315-1-falcon@tinylab.org
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; From: Zhangjin Wu <falcon@tinylab.org> To: linux@weissschuh.net Cc: aou@eecs.berkeley.edu, falcon@tinylab.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-riscv@lists.infradead.org, palmer@dabbelt.com, paul.walmsley@sifive.com, shuah@kernel.org, w@1wt.eu Subject: [PATCH] selftests/nolibc: Fix up compile error for rv32 Date: Sat, 20 May 2023 20:02:53 +0800 Message-Id: <20230520120254.66315-1-falcon@tinylab.org> In-Reply-To: <20230520-nolibc-stackprotector-riscv-v1-1-d8912012a034@weissschuh.net> References: <20230520-nolibc-stackprotector-riscv-v1-1-d8912012a034@weissschuh.net> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Feedback-ID: bizesmtp:tinylab.org:qybglogicsvrsz:qybglogicsvrsz3a-3 Precedence: bulk
Series	selftests/nolibc: Fix up compile error for rv32 \| selftests/nolibc: Fix up compile error for rv32

Commit Message

Zhangjin Wu May 20, 2023, 12:02 p.m. UTC

  When compile nolibc-test.c for rv32, we got such error:

    tools/testing/selftests/nolibc/nolibc-test.c:599:57: error: ‘__NR_fstat’ undeclared (first use in this function)
      599 |   CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;

The generic include/uapi/asm-generic/unistd.h used by rv32 doesn't
support __NR_fstat, using the common __NR_read functions as expected.

    Running test 'syscall'
    69 syscall_noargs = 1                                            [OK]
    70 syscall_args = -1 EBADF                                       [OK]

Btw, the latest riscv libc6-dev package is required, otherwise, we would
also get such error:

    In file included from /usr/riscv64-linux-gnu/include/sys/cdefs.h:452,
                     from /usr/riscv64-linux-gnu/include/features.h:461,
                     from /usr/riscv64-linux-gnu/include/bits/libc-header-start.h:33,
                     from /usr/riscv64-linux-gnu/include/limits.h:26,
                     from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:194,
                     from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/syslimits.h:7,
                     from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:34,
                     from /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/nolibc-test.c:6:
    /usr/riscv64-linux-gnu/include/bits/wordsize.h:28:3: error: #error "rv32i-based targets are not supported"
       28 | # error "rv32i-based targets are not supported"

The glibc commit 5b6113d62efa ("RISC-V: Support the 32-bit ABI
implementation") fixed up above error, so, glibc >= 2.33 (who includes
this commit) is required.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
---
 tools/testing/selftests/nolibc/nolibc-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Willy Tarreau May 20, 2023, 1:32 p.m. UTC | #1

Thomas, Zhangjin,

I've merged your latest patches in my branch 20230520-nolibc-rv32+stkp2,
which was rebased to integrate the updated commit messages and a few
missing s-o-b from mine. Please have a look:

https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git

However, Thomas, I noticed something puzzling me. While I tested with
gcc-9.5 (that I have here along my toolchains) I found that it would
systematically fail:

sysroot/x86/include/stackprotector.h:46:1: warning: 'no_stack_protector' attribute directive ignored [-Wattributes]
46 | {
| ^
!!Stack smashing detected!!
qemu: uncaught target signal 6 (Aborted) - core dumped
0 test(s) passed.

The reason is that it doesn't support the attribute "no_stack_protector".
Upon closer investigation, I noticed that _start() on x86_64 doens't have
it, yet it works on more recent compilers! So I don't understand why it
works with more recent compilers.

I managed to avoid the crash by enclosing the __stack_chk_init() function
in a #pragma GCC optimize("-fno-stack-protector") while removing the
attribute (though Clang and more recent gcc use this attribute so we
shouldn't completely drop it either).

I consider this non-critical as we can expect that regtests are run with
a reasonably recent compiler version, but if in the long term we can find
a more reliable detection for this, it would be nice.

For example I found that gcc defines __SSP_ALL__ to 1 when
-fstack-protector is used, and 2 when -fstack-protector-all is used.
With clang, it's 1 and 3 respectively. Maybe we should use that and
drop NOLIBC_STACKPROTECTOR, that would be one less variable to deal
with: the code would automatically adapt to whatever cflags the user
sets on the compiler, which is generally better.

Regards,
Willy

Thomas Weißschuh May 20, 2023, 2 p.m. UTC | #2

Hi Willy, Zhangjin,

On 2023-05-20 20:02:53+0800, Zhangjin Wu wrote:
> When compile nolibc-test.c for rv32, we got such error:
> 
>     tools/testing/selftests/nolibc/nolibc-test.c:599:57: error: ‘__NR_fstat’ undeclared (first use in this function)
>       599 |   CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
> 
> The generic include/uapi/asm-generic/unistd.h used by rv32 doesn't
> support __NR_fstat, using the common __NR_read functions as expected.
> 
>     Running test 'syscall'
>     69 syscall_noargs = 1                                            [OK]
>     70 syscall_args = -1 EBADF                                       [OK]
> 
> Btw, the latest riscv libc6-dev package is required, otherwise, we would
> also get such error:
> 
>     In file included from /usr/riscv64-linux-gnu/include/sys/cdefs.h:452,
>                      from /usr/riscv64-linux-gnu/include/features.h:461,
>                      from /usr/riscv64-linux-gnu/include/bits/libc-header-start.h:33,
>                      from /usr/riscv64-linux-gnu/include/limits.h:26,
>                      from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:194,
>                      from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/syslimits.h:7,
>                      from /usr/lib/gcc-cross/riscv64-linux-gnu/9/include/limits.h:34,
>                      from /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/nolibc-test.c:6:
>     /usr/riscv64-linux-gnu/include/bits/wordsize.h:28:3: error: #error "rv32i-based targets are not supported"
>        28 | # error "rv32i-based targets are not supported"
> 
> The glibc commit 5b6113d62efa ("RISC-V: Support the 32-bit ABI
> implementation") fixed up above error, so, glibc >= 2.33 (who includes
> this commit) is required.
> 
> Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> ---
>  tools/testing/selftests/nolibc/nolibc-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
> index 063f9959ac44..d8b59c8f6c03 100644
> --- a/tools/testing/selftests/nolibc/nolibc-test.c
> +++ b/tools/testing/selftests/nolibc/nolibc-test.c
> @@ -596,7 +596,7 @@ int run_syscall(int min, int max)
>  		CASE_TEST(write_badf);        EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break;
>  		CASE_TEST(write_zero);        EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
>  		CASE_TEST(syscall_noargs);    EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break;
> -		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
> +		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_read, -1, &tmp, 1), -1, EBADF); break;

The goal of this second test was to make sure that arguments are passed
in the correct order. For this I tried to have a syscall were the
checked error is generated from a non-first argument.
(The NULL generating the EFAULT).

So the new check does not fullfil this goal anymore.

Maybe we can find a new syscall to test with?

The code should have had a comment I guess.

>  		case __LINE__:
>  			return ret; /* must be last */
>  		/* note: do not set any defaults so as to permit holes above */
> -- 
> 2.25.1
>

Thomas Weißschuh May 20, 2023, 2:07 p.m. UTC | #3

Hi Willy!

On 2023-05-20 15:32:37+0200, Willy Tarreau wrote:
> Thomas, Zhangjin,
> 
> I've merged your latest patches in my branch 20230520-nolibc-rv32+stkp2,
> which was rebased to integrate the updated commit messages and a few
> missing s-o-b from mine. Please have a look:
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
> 
> However, Thomas, I noticed something puzzling me. While I tested with
> gcc-9.5 (that I have here along my toolchains) I found that it would
> systematically fail:
> 
>   sysroot/x86/include/stackprotector.h:46:1: warning: 'no_stack_protector' attribute directive ignored [-Wattributes]
>      46 | {
>         | ^
>   !!Stack smashing detected!!
>   qemu: uncaught target signal 6 (Aborted) - core dumped
>   0 test(s) passed.
> 
> The reason is that it doesn't support the attribute "no_stack_protector".
> Upon closer investigation, I noticed that _start() on x86_64 doens't have
> it, yet it works on more recent compilers! So I don't understand why it
> works with more recent compilers.

_start() not having the attribute is indeed an oversight.
No idea how it worked before.

> I managed to avoid the crash by enclosing the __stack_chk_init() function
> in a #pragma GCC optimize("-fno-stack-protector") while removing the
> attribute (though Clang and more recent gcc use this attribute so we
> shouldn't completely drop it either).

I would like to first align x86 to __attribute__((no_stack_protector))
for uniformity and then figure out on how to make it nicer.

> I consider this non-critical as we can expect that regtests are run with
> a reasonably recent compiler version, but if in the long term we can find
> a more reliable detection for this, it would be nice.
> 
> For example I found that gcc defines __SSP_ALL__ to 1 when
> -fstack-protector is used, and 2 when -fstack-protector-all is used.
> With clang, it's 1 and 3 respectively. Maybe we should use that and
> drop NOLIBC_STACKPROTECTOR, that would be one less variable to deal
> with: the code would automatically adapt to whatever cflags the user
> sets on the compiler, which is generally better.

That sounds great!

I explicitly looked for something like this before, dumping preprocessor
directives and comparing.
It seems the fact that my compilers enable this feature by default made
me miss it.

I'll send patches.

Thomas

Willy Tarreau May 20, 2023, 2:09 p.m. UTC | #4

On Sat, May 20, 2023 at 04:00:54PM +0200, Thomas Weißschuh wrote:
> > diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
> > index 063f9959ac44..d8b59c8f6c03 100644
> > --- a/tools/testing/selftests/nolibc/nolibc-test.c
> > +++ b/tools/testing/selftests/nolibc/nolibc-test.c
> > @@ -596,7 +596,7 @@ int run_syscall(int min, int max)
> >  		CASE_TEST(write_badf);        EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break;
> >  		CASE_TEST(write_zero);        EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
> >  		CASE_TEST(syscall_noargs);    EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break;
> > -		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
> > +		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_read, -1, &tmp, 1), -1, EBADF); break;
> 
> The goal of this second test was to make sure that arguments are passed
> in the correct order. For this I tried to have a syscall were the
> checked error is generated from a non-first argument.
> (The NULL generating the EFAULT).
> So the new check does not fullfil this goal anymore.

Ah OK good to know.

> Maybe we can find a new syscall to test with?

Maybe it would be worth considering pselect() or equivalent which
involve many arguments. I don't know if rv32 has fstatat() or
lstat() for example, that could be used as alternatives ?

> The code should have had a comment I guess.

Indeed ;-)

With that said, if rv32 is missing some essential syscalls, my question
regarding its relevance here still holds!

Willy

Willy Tarreau May 20, 2023, 2:13 p.m. UTC | #5

On Sat, May 20, 2023 at 04:07:34PM +0200, Thomas Weißschuh wrote:
> Hi Willy!
> 
> On 2023-05-20 15:32:37+0200, Willy Tarreau wrote:
> > Thomas, Zhangjin,
> > 
> > I've merged your latest patches in my branch 20230520-nolibc-rv32+stkp2,
> > which was rebased to integrate the updated commit messages and a few
> > missing s-o-b from mine. Please have a look:
> > 
> >    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
> > 
> > However, Thomas, I noticed something puzzling me. While I tested with
> > gcc-9.5 (that I have here along my toolchains) I found that it would
> > systematically fail:
> > 
> >   sysroot/x86/include/stackprotector.h:46:1: warning: 'no_stack_protector' attribute directive ignored [-Wattributes]
> >      46 | {
> >         | ^
> >   !!Stack smashing detected!!
> >   qemu: uncaught target signal 6 (Aborted) - core dumped
> >   0 test(s) passed.
> > 
> > The reason is that it doesn't support the attribute "no_stack_protector".
> > Upon closer investigation, I noticed that _start() on x86_64 doens't have
> > it, yet it works on more recent compilers! So I don't understand why it
> > works with more recent compilers.
> 
> _start() not having the attribute is indeed an oversight.
> No idea how it worked before.

No problem, I preferred to mention it anyway.

> > I managed to avoid the crash by enclosing the __stack_chk_init() function
> > in a #pragma GCC optimize("-fno-stack-protector") while removing the
> > attribute (though Clang and more recent gcc use this attribute so we
> > shouldn't completely drop it either).
> 
> I would like to first align x86 to __attribute__((no_stack_protector))
> for uniformity and then figure out on how to make it nicer.

I agree.

> > I consider this non-critical as we can expect that regtests are run with
> > a reasonably recent compiler version, but if in the long term we can find
> > a more reliable detection for this, it would be nice.
> > 
> > For example I found that gcc defines __SSP_ALL__ to 1 when
> > -fstack-protector is used, and 2 when -fstack-protector-all is used.
> > With clang, it's 1 and 3 respectively. Maybe we should use that and
> > drop NOLIBC_STACKPROTECTOR, that would be one less variable to deal
> > with: the code would automatically adapt to whatever cflags the user
> > sets on the compiler, which is generally better.
> 
> That sounds great!
> 
> I explicitly looked for something like this before, dumping preprocessor
> directives and comparing.
> It seems the fact that my compilers enable this feature by default made
> me miss it.

Hmmm that's indeed possible. With -fno-stack-protector it should disappear:

  $ gcc -fno-stack-protector -dM -E -xc - < /dev/null |grep SSP
  $ gcc -fstack-protector -dM -E -xc - < /dev/null |grep SSP
  #define __SSP__ 1
  $ gcc -fstack-protector-all -dM -E -xc - < /dev/null |grep SSP
  #define __SSP_ALL__ 2
  $ clang -fstack-protector-all -dM -E -xc - < /dev/null |grep SSP
  #define __SSP_ALL__ 3

> I'll send patches.

OK thanks. Just be aware that I'll be less responsive this week-end from
now on.

Willy

Willy Tarreau May 21, 2023, 3:58 a.m. UTC | #6

On Sun, May 21, 2023 at 02:30:22AM +0800, Zhangjin Wu wrote:
> > > The goal of this second test was to make sure that arguments are passed
> > > in the correct order. For this I tried to have a syscall were the
> > > checked error is generated from a non-first argument.
> > > (The NULL generating the EFAULT).
> > > So the new check does not fullfil this goal anymore.
> > 
> > Ah OK good to know.
> >
> 
> does this meet the requirement? the 3rd argument shouldn't < 0, otherwise, there is an EFAULT. 
> 
>     CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_read, 1, &tmp, -1), -1, EFAULT); break;
> 
> It get such test result:
> 
>     70 syscall_args = -1 EFAULT                                      [OK]

I think what Thomas meant is that he wants to be sure the call doesn't
end up as read(-1, &tmp, 1). Here you could have -EBADF or -EFAULT, it
depends. Anyway other solutions can be found if necessary. Another
approach could be to switch back to __NR_fstat and condition it to its
definition.

> > > Maybe we can find a new syscall to test with?
> > 
> > Maybe it would be worth considering pselect() or equivalent which
> > involve many arguments. I don't know if rv32 has fstatat() or
> > lstat() for example, that could be used as alternatives ?
> >
> 
> Unfortuantely, none of them is available in rv32, we have the same tricks you used in another reply:
> 
>     $ echo "#include <asm/unistd.h>" | \
>         riscv64-linux-gnu-gcc -march=rv32im -mabi=ilp32 -Wl,-melf32lriscv_ilp32 -xc - -E -dM | \
>         grep -E "pselect|fstat|lstat"
>     #define __NR_pselect6_time64 413
>     #define __NR3264_fstatfs 44
>     #define __NR_fstatfs64 __NR3264_fstatfs

Then probably fstatfs should work equally for this test.

> Or, use the rv32 test result as a crude reference:
(... trimmed to keep only the failed ones ...)
> 
>     15 chmod_net = -1 ENOENT                                        [FAIL]
>     16 chmod_self = -1 ENOENT  != (-1 EPERM)                        [FAIL]
>     17 chown_self = -1 ENOENT  != (-1 EPERM)                        [FAIL]
>     20 chroot_exe = -1 ENOENT  != (-1 ENOTDIR)                      [FAIL]
>     30 fork = 1 ENOSYS                                              [FAIL]
>     33 gettimeofday_null = -1 ENOSYS                                [FAIL]
>     35 gettimeofday_bad1 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
>     36 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
>     37 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
>     45 link_cross = -1 ENOENT  != (-1 EXDEV)                        [FAIL]
>     51 poll_null = -1 ENOSYS                                        [FAIL]
>     52 poll_stdout = -1 ENOSYS                                      [FAIL]
>     53 poll_fault = -1 ENOSYS  != (-1 EFAULT)                       [FAIL]
>     56 select_null = -1 ENOSYS                                      [FAIL]
>     57 select_stdout = -1 ENOSYS                                    [FAIL]
>     58 select_fault = -1 ENOSYS  != (-1 EFAULT)                     [FAIL]
>     64 wait_child = -1 ENOSYS  != (-1 ECHILD)                       [FAIL]
>     65 waitpid_min = -1 ENOSYS  != (-1 ESRCH)                       [FAIL]
>     66 waitpid_child = -1 ENOSYS  != (-1 ECHILD)                    [FAIL]
>     Errors during this test: 19

So that's a lot of failures and we should start to blindly degrade other
tests just for the sake of fixing these ones here, it should be done more
carefully.

> As my latest reply here [1] explains, this error is not really rv32 specific,
> all of the time32 based syscalls and even the other 32bit syscalls have been
> disabled by default for new architectures (add the author of commit
> "c8ce48f06503" in the cc list), rv32 here is a very good test case for such
> trend.

I'm fine with going in that direction if that's the future, and using
rv32 as a guide towards this. *BUT* it doesn't mean we have to break
the rest that currently works on existing platforms and is currently
used by various programs. Maybe it means that some of these tests
should be grouped together into a time32_syscall category that's only
tested when __ARCH_WANT_TIME32_SYSCALLS is defined. It would also help
figure what is wrong for some of them. For example chmod/chown/link above
seem to indicate that /proc is not mounted in your test config, not that
the syscalls are not supported. This it seems to me that on this platform
we should still see these syscalls fail and the other ones not executed.

Another approach would be to just group them for easier detection of
the __ARCH_WANT_TIME32_SYSCALLS vs other ones, but not disable them, so
that we can watch the progress made on supporting the mapping of these
ones on new syscalls instead.

And for the one that you changed from __NR_stat to __NR_read, I'm
proposing that either we find another one that works everywhere, or
that we just revert the change and guard it under an ifdef, because
having a direct reference to a syscall number requires that it exists
and I agree that we must not break the build in such a case.

My preference for the short term would be the following:
  1) make sure we fix build issues for all platforms, including rv32
  2) make sure Thomas' work on syscall() and STKP works fine where it
     should, as it used to till now on other platforms

  => this should be added to the 6.5 queue, and for this I don't want
     to make this series regress as it should be queued quickly so that
     test code used by other developers working on 6.5 is reasonably
     stabilized.

  3) evaluate what needs to be done regarding time32, this implies
     working in the lower abstraction layers to depend on
     __ARCH_WANT_TIME32_SYSCALLS and use the new syscalls instead.

  => I don't know how much work it requires; if it's trivial this
     could possibly be for 6.5, otherwise it will have to be postponed.

Thanks,
Willy

Zhangjin Wu May 21, 2023, 6:08 p.m. UTC | #7

Willy, Thomas,

> On Sun, May 21, 2023 at 02:30:22AM +0800, Zhangjin Wu wrote:
> > ...
>
> I think what Thomas meant is that he wants to be sure the call doesn't
> end up as read(-1, &tmp, 1). Here you could have -EBADF or -EFAULT, it
> depends. Anyway other solutions can be found if necessary. Another
> approach could be to switch back to __NR_fstat and condition it to its
> definition.
>
> > > > Maybe we can find a new syscall to test with?
> > >
> > > Maybe it would be worth considering pselect() or equivalent which
> > > involve many arguments. I don't know if rv32 has fstatat() or
> > > lstat() for example, that could be used as alternatives ?
> > >
> >
> > Unfortuantely, none of them is available in rv32, we have the same tricks you used in another reply:
> >
> >     $ echo "#include <asm/unistd.h>" | \
> >         riscv64-linux-gnu-gcc -march=rv32im -mabi=ilp32 -Wl,-melf32lriscv_ilp32 -xc - -E -dM | \
> >         grep -E "pselect|fstat|lstat"
> >     #define __NR_pselect6_time64 413
> >     #define __NR3264_fstatfs 44
> >     #define __NR_fstatfs64 __NR3264_fstatfs
>
> Then probably fstatfs should work equally for this test.
>

I tested a change like this (try __NR_fstatfs64, __NR_fstatfs, and __NR_fstat one by one):

    diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
    index 063f9959ac44..0a60e8ca5756 100644
    --- a/tools/testing/selftests/nolibc/nolibc-test.c
    +++ b/tools/testing/selftests/nolibc/nolibc-test.c
    @@ -500,6 +500,16 @@ static int test_fork(void)
     	}
     }

    +#ifdef __NR_fstatfs64
    +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstatfs64, 0, 0, NULL), -1, EINVAL)
    +#elif defined(__NR_fstatfs)
    +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstatfs, 0, NULL), -1, EFAULT)
    +#elif defined(__NR_fstat)
    +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT)
    +#else
    +#error None of __NR_fstatfs64, __NR_fstatfs and __NR_fstat defined, cannot implement syscall_args test
    +#endif
    +
     /* Run syscall tests between IDs <min> and <max>.
      * Return 0 on success, non-zero on failure.
      */
    @@ -596,7 +606,7 @@ int run_syscall(int min, int max)
     		CASE_TEST(write_badf);        EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break;
     		CASE_TEST(write_zero);        EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
     		CASE_TEST(syscall_noargs);    EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break;
    -		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
    +		CASE_TEST(syscall_args);      EXPECT_SYSER_SYSCALL_ARGS(); break;
     		case __LINE__:
     			return ret; /* must be last */
     		/* note: do not set any defaults so as to permit holes above */

And another change like this (try __NR_statx and then __NR_fstat):

    diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
    index 063f9959ac44..d740ba405cb2 100644
    --- a/tools/testing/selftests/nolibc/nolibc-test.c
    +++ b/tools/testing/selftests/nolibc/nolibc-test.c
    @@ -500,6 +500,17 @@ static int test_fork(void)
     	}
     }

    +static int test_syscall_args(void)
    +{
    +#ifdef __NR_statx
    +	return syscall(__NR_statx, 0, NULL, 0, 0, NULL);
    +#elif defined(__NR_fstat)
    +	return syscall(__NR_fstat, 0, NULL);
    +#else
    +#error Neither __NR_statx nor __NR_fstat defined, cannot implement syscall_args test
    +#endif
    +}
    +
     /* Run syscall tests between IDs <min> and <max>.
      * Return 0 on success, non-zero on failure.
      */
    @@ -596,7 +607,7 @@ int run_syscall(int min, int max)
     		CASE_TEST(write_badf);        EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break;
     		CASE_TEST(write_zero);        EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
     		CASE_TEST(syscall_noargs);    EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break;
    -		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
    +		CASE_TEST(syscall_args);      EXPECT_SYSER(1, test_syscall_args(), -1, EFAULT); break;
     		case __LINE__:
     			return ret; /* must be last */
     		/* note: do not set any defaults so as to permit holes above */

Which one is acceptable?

> > Or, use the rv32 test result as a crude reference:
> (... trimmed to keep only the failed ones ...)
> >
> >     15 chmod_net = -1 ENOENT                                        [FAIL]
> >     16 chmod_self = -1 ENOENT  != (-1 EPERM)                        [FAIL]
> >     17 chown_self = -1 ENOENT  != (-1 EPERM)                        [FAIL]
> >     20 chroot_exe = -1 ENOENT  != (-1 ENOTDIR)                      [FAIL]
> >     30 fork = 1 ENOSYS                                              [FAIL]
> >     33 gettimeofday_null = -1 ENOSYS                                [FAIL]
> >     35 gettimeofday_bad1 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
> >     36 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
> >     37 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
> >     45 link_cross = -1 ENOENT  != (-1 EXDEV)                        [FAIL]
> >     51 poll_null = -1 ENOSYS                                        [FAIL]
> >     52 poll_stdout = -1 ENOSYS                                      [FAIL]
> >     53 poll_fault = -1 ENOSYS  != (-1 EFAULT)                       [FAIL]
> >     56 select_null = -1 ENOSYS                                      [FAIL]
> >     57 select_stdout = -1 ENOSYS                                    [FAIL]
> >     58 select_fault = -1 ENOSYS  != (-1 EFAULT)                     [FAIL]
> >     64 wait_child = -1 ENOSYS  != (-1 ECHILD)                       [FAIL]
> >     65 waitpid_min = -1 ENOSYS  != (-1 ESRCH)                       [FAIL]
> >     66 waitpid_child = -1 ENOSYS  != (-1 ECHILD)                    [FAIL]
> >     Errors during this test: 19
>
> So that's a lot of failures and we should start to blindly degrade other
> tests just for the sake of fixing these ones here, it should be done more
> carefully.
>

Removed the config options related failures (will use defconfig to re-check
them, I did use a tiny config for faster compile):

    30 fork = 1 ENOSYS                                              [FAIL]
    33 gettimeofday_null = -1 ENOSYS                                [FAIL]
    35 gettimeofday_bad1 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
    36 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
    37 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
    51 poll_null = -1 ENOSYS                                        [FAIL]
    52 poll_stdout = -1 ENOSYS                                      [FAIL]
    53 poll_fault = -1 ENOSYS  != (-1 EFAULT)                       [FAIL]
    56 select_null = -1 ENOSYS                                      [FAIL]
    57 select_stdout = -1 ENOSYS                                    [FAIL]
    58 select_fault = -1 ENOSYS  != (-1 EFAULT)                     [FAIL]
    64 wait_child = -1 ENOSYS  != (-1 ECHILD)                       [FAIL]
    65 waitpid_min = -1 ENOSYS  != (-1 ESRCH)                       [FAIL]
    66 waitpid_child = -1 ENOSYS  != (-1 ECHILD)                    [FAIL]

The left failed syscalls may be waitpid, wait4, gettimeofday, poll and select,
it is not too many.

>
> My preference for the short term would be the following:
>   1) make sure we fix build issues for all platforms, including rv32
>   2) make sure Thomas' work on syscall() and STKP works fine where it
>      should, as it used to till now on other platforms
>
>   => this should be added to the 6.5 queue, and for this I don't want
>      to make this series regress as it should be queued quickly so that
>      test code used by other developers working on 6.5 is reasonably
>      stabilized.

Hope one of the above changes meets this goal, if no, perhaps we can simply
revert my '__NR_read' patch and left it together with the other rv32 failures
to next merge window.

>
>   3) evaluate what needs to be done regarding time32, this implies
>      working in the lower abstraction layers to depend on
>      __ARCH_WANT_TIME32_SYSCALLS and use the new syscalls instead.
>
>   => I don't know how much work it requires; if it's trivial this
>      could possibly be for 6.5, otherwise it will have to be postponed.

My suggestion is directly fix up the failures one by one in current stage,
because nolibc currently only uses part of the time32 syscalls, it may be not
that complex to fix up them. Have read the code of musl and glibc today, both
of them have good time64 syscalls support, I plan to fix up the above failures
in several days, hope so but no promise ;-)

And another question: for the new changes, If a platform support both of the
32bit and 64bit syscalls, do we need to put the 64bit syscalls at first?

For example, the __NR_llseek we just added, do we need to check __NR_llseek
before check __NR_lseek? this may support 64bit better but also may generate
bigger size of code, currently, my patch checks __NR_lseek at first and then
__NR_llseek to get smaller size of code, but as I read from the musl and glibc
code, both of them put the 64bit syscalls path before others.

Thanks,
Zhangjin

>
> Thanks,
> Willy

Zhangjin Wu May 23, 2023, 6:03 p.m. UTC | #8

Hi, Willy, Thomas

Good news, I just fixed up all of the time32 syscalls related build and
test failures for rv32 and plan to send out the whole patchset tomorrow.

The patchset is based on 20230520-nolibc-rv32+stkp2 of [1] and the new
statckprotect patchset [2] (If v2 is ready, I prefer to rebase on v2).

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
[2]: https://lore.kernel.org/linux-riscv/20230521100818.GA29408@1wt.eu/T/#t

For the fstat compile issue, I prefer to use a __NR_statx for rv32
instead of using fstatfs, because of different fstatfs* have different
errno's, it is ugly to use lots of macros ;-) what's your suggestion?

Here is the __NR_statx version:

    +static int test_syscall_args(void)
    +{
    +#ifdef __NR_statx
    +	return syscall(__NR_statx, 0, NULL, 0, 0, NULL);
    +#elif defined(__NR_fstat)
    +	return syscall(__NR_fstat, 0, NULL);
    +#else
    +#error Neither __NR_statx nor __NR_fstat defined, cannot implement syscall_args test
    +#endif
    +}

And the fstatfs version:

    +#ifdef __NR_fstatfs64
    +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstatfs64, 0, 0, NULL), -1, EINVAL)
    +#elif defined(__NR_fstatfs)
    +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstatfs, 0, NULL), -1, EFAULT)
    +#elif defined(__NR_fstat)
    +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT)
    +#else
    +#error None of __NR_fstatfs64, __NR_fstatfs and __NR_fstat defined, cannot implement syscall_args test
    +#endif

And I plan to move __NR_fstat as the first branch to make sure it works
as before on the other platforms.

>     30 fork = 1 ENOSYS                                              [FAIL]
>     33 gettimeofday_null = -1 ENOSYS                                [FAIL]
>     35 gettimeofday_bad1 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
>     36 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
>     37 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
>     51 poll_null = -1 ENOSYS                                        [FAIL]
>     52 poll_stdout = -1 ENOSYS                                      [FAIL]
>     53 poll_fault = -1 ENOSYS  != (-1 EFAULT)                       [FAIL]
>     56 select_null = -1 ENOSYS                                      [FAIL]
>     57 select_stdout = -1 ENOSYS                                    [FAIL]
>     58 select_fault = -1 ENOSYS  != (-1 EFAULT)                     [FAIL]
>     64 wait_child = -1 ENOSYS  != (-1 ECHILD)                       [FAIL]
>     65 waitpid_min = -1 ENOSYS  != (-1 ESRCH)                       [FAIL]
>     66 waitpid_child = -1 ENOSYS  != (-1 ECHILD)                    [FAIL]
>

All of the above failures have been fixed up, some of them are very
easy, some of them are a little hard.

1. 30, 64-66 depends on wait4, use waitid instead
2. 33-37 depends on gettimeofday, use clock_gettime64 instead
3. 51-53 depends on ppoll, use ppoll_time64 instead
4. 56-58 depends on pselect*, use pselect_time64 instead

And I have found there are two same 'gettimeofday_bad2', is it designed?
If no, I will send a patch to dedup it:

    CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
    CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;

> 
> My suggestion is directly fix up the failures one by one in current stage,
> because nolibc currently only uses part of the time32 syscalls, it may be not
> that complex to fix up them. Have read the code of musl and glibc today, both
> of them have good time64 syscalls support, I plan to fix up the above failures
> in several days, hope so but no promise ;-)
>

both musl and glibc help a lot, but also encounter some critical issues, for
example, to pass some test cases, it is required to emulate the same path of
the target syscalls and return the same errno's for them, the code comment will
exaplain the details.

> And another question: for the new changes, If a platform support both of the
> 32bit and 64bit syscalls, do we need to put the 64bit syscalls at first?
>

To make sure not touch too much on the code and reduce test cost, the patchset
just kept the default code order and let the old code in the first branch.

I plan to send the whole patchset tomorrow.

Thanks,
Zhangjin

Willy Tarreau May 23, 2023, 6:56 p.m. UTC | #9

Hi Zhangjin,

On Wed, May 24, 2023 at 02:03:40AM +0800, Zhangjin Wu wrote:
> Hi, Willy, Thomas
> 
> Good news, I just fixed up all of the time32 syscalls related build and
> test failures for rv32 and plan to send out the whole patchset tomorrow.

Ah that's excellent!

> The patchset is based on 20230520-nolibc-rv32+stkp2 of [1] and the new
> statckprotect patchset [2] (If v2 is ready, I prefer to rebase on v2).

I'm really sorry for still failing to merge Thomas' branch but I'm on
the last mile to a release (next week) and collecting last minute stuff
from everywhere and my days and nights are really too short these days :-(

> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
> [2]: https://lore.kernel.org/linux-riscv/20230521100818.GA29408@1wt.eu/T/#t
> 
> For the fstat compile issue, I prefer to use a __NR_statx for rv32
> instead of using fstatfs, because of different fstatfs* have different
> errno's, it is ugly to use lots of macros ;-) what's your suggestion?
> 
> Here is the __NR_statx version:
> 
>     +static int test_syscall_args(void)
>     +{
>     +#ifdef __NR_statx
>     +	return syscall(__NR_statx, 0, NULL, 0, 0, NULL);
>     +#elif defined(__NR_fstat)
>     +	return syscall(__NR_fstat, 0, NULL);
>     +#else
>     +#error Neither __NR_statx nor __NR_fstat defined, cannot implement syscall_args test
>     +#endif
>     +}
> 
> And the fstatfs version:
> 
>     +#ifdef __NR_fstatfs64
>     +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstatfs64, 0, 0, NULL), -1, EINVAL)
>     +#elif defined(__NR_fstatfs)
>     +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstatfs, 0, NULL), -1, EFAULT)
>     +#elif defined(__NR_fstat)
>     +#define EXPECT_SYSER_SYSCALL_ARGS() EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT)
>     +#else
>     +#error None of __NR_fstatfs64, __NR_fstatfs and __NR_fstat defined, cannot implement syscall_args test
>     +#endif

The first one indeed looks cleaner, I agree! Can't we just use statx
for all archs then and further simplify this ?

> And I plan to move __NR_fstat as the first branch to make sure it works
> as before on the other platforms.

Note that it was very recently added (recent batch of updates from
Thomas) and still only queued in my branch, so we can reorder everything
as we want and if in the end we drop some of these patches or change some
approaches to be more portable from the beginning, that's always better.

> >     30 fork = 1 ENOSYS                                              [FAIL]
> >     33 gettimeofday_null = -1 ENOSYS                                [FAIL]
> >     35 gettimeofday_bad1 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
> >     36 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
> >     37 gettimeofday_bad2 = -1 ENOSYS  != (-1 EFAULT)                [FAIL]
> >     51 poll_null = -1 ENOSYS                                        [FAIL]
> >     52 poll_stdout = -1 ENOSYS                                      [FAIL]
> >     53 poll_fault = -1 ENOSYS  != (-1 EFAULT)                       [FAIL]
> >     56 select_null = -1 ENOSYS                                      [FAIL]
> >     57 select_stdout = -1 ENOSYS                                    [FAIL]
> >     58 select_fault = -1 ENOSYS  != (-1 EFAULT)                     [FAIL]
> >     64 wait_child = -1 ENOSYS  != (-1 ECHILD)                       [FAIL]
> >     65 waitpid_min = -1 ENOSYS  != (-1 ESRCH)                       [FAIL]
> >     66 waitpid_child = -1 ENOSYS  != (-1 ECHILD)                    [FAIL]
> >
> 
> All of the above failures have been fixed up, some of them are very
> easy, some of them are a little hard.
> 
> 1. 30, 64-66 depends on wait4, use waitid instead
> 2. 33-37 depends on gettimeofday, use clock_gettime64 instead
> 3. 51-53 depends on ppoll, use ppoll_time64 instead
> 4. 56-58 depends on pselect*, use pselect_time64 instead

Awesome!

> And I have found there are two same 'gettimeofday_bad2', is it designed?
> If no, I will send a patch to dedup it:
> 
>     CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
>     CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;

I remember this one and I couldn't spot it last time. We had two accidental
duplicates, I searched them and managed to find the first one (which I
dropped a version or two ago) and didn't spot the remaining one so I
thought it was already gone. Apparently not! Fell free to send a patch
to kill it, of course!

> > My suggestion is directly fix up the failures one by one in current stage,
> > because nolibc currently only uses part of the time32 syscalls, it may be not
> > that complex to fix up them. Have read the code of musl and glibc today, both
> > of them have good time64 syscalls support, I plan to fix up the above failures
> > in several days, hope so but no promise ;-)
> >
> 
> both musl and glibc help a lot, but also encounter some critical issues, for
> example, to pass some test cases, it is required to emulate the same path of
> the target syscalls and return the same errno's for them, the code comment will
> exaplain the details.

OK but you know, we're only entitled to respect the documented syscall
errors. If we've used some in our tests because "it's just how they happen
to work" and the behavior changes on a different implementation while still
matching the man page, we may have to adjust some of our tests. On the
other hand if a remapped syscall doesn't respect the man page, it will
break some existing code and needs to be fixed (which is exactly the
purpose of nolibc-test in the first place).

> > And another question: for the new changes, If a platform support both of the
> > 32bit and 64bit syscalls, do we need to put the 64bit syscalls at first?
> >
> 
> To make sure not touch too much on the code and reduce test cost, the patchset
> just kept the default code order and let the old code in the first branch.

I'm fine with this. I absolutely don't care a single second about the
tests performance (they're instantaneous right now, an extra "if" will
not even be noticeable). Similarly we'll care about the code size when
the resulting binary reaches hundreds of kB. What matters for now is
that regular contributors like Thomas, Mark and you find it easy enough
to add or fix some entries without having to think about tons of stuff
to do it properly. The rest is just accessory.

> I plan to send the whole patchset tomorrow.

OK thank you! I prefer to warn you that it's unlikely that I'll be able
to work on it before the week-end, though, so take your time.

Thanks again,
Willy

diff mbox series

Patch

diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 063f9959ac44..d8b59c8f6c03 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -596,7 +596,7 @@  int run_syscall(int min, int max)
 		CASE_TEST(write_badf);        EXPECT_SYSER(1, write(-1, &tmp, 1), -1, EBADF); break;
 		CASE_TEST(write_zero);        EXPECT_SYSZR(1, write(1, &tmp, 0)); break;
 		CASE_TEST(syscall_noargs);    EXPECT_SYSEQ(1, syscall(__NR_getpid), getpid()); break;
-		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
+		CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_read, -1, &tmp, 1), -1, EBADF); break;
 		case __LINE__:
 			return ret; /* must be last */
 		/* note: do not set any defaults so as to permit holes above */