Kconfig: Explicitly disable asm goto w/ outputs on gcc-11 (and earlier)

Message ID 20240208220604.140859-1-seanjc@google.com
State New
Headers
Series Kconfig: Explicitly disable asm goto w/ outputs on gcc-11 (and earlier) |

Commit Message

Sean Christopherson Feb. 8, 2024, 10:06 p.m. UTC
  Explicitly require gcc-12+ to enable asm goto with outputs on gcc to avoid
what is effectively a data corruption bug on gcc-11.  As per
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, "asm goto" is
*supposed* be implicitly volatile, but gcc-11 fails to treat it as such.
When compiling with -O2, failure to treat the asm block as volatile can
result in the entire block being discarded during optimization.

Even worse, forcing "asm volatile goto" keeps the block, but generates
completely bogus code.

Hardcode the gcc-12 or later requirement as trying to pipe the assembled
output to stdout, e.g. to query the generated code via objdump, doesn't
work due to the assembler wanting to seek throughout the output file.

Note, gcc-11 is the first gcc version that supports goto w/ outputs
(obviously with a loose definition of "supports").

E.g. given KVM's code sequence:

  vmcs12->guest_pdptr0 = vmcs_read64(GUEST_PDPTR0);
  vmcs12->guest_pdptr1 = vmcs_read64(GUEST_PDPTR1);
  vmcs12->guest_pdptr2 = vmcs_read64(GUEST_PDPTR2);
  vmcs12->guest_pdptr3 = vmcs_read64(GUEST_PDPTR3);

where vmcs_read64() eventually becomes:

	asm_volatile_goto("1: vmread %[field], %[output]\n\t"
			  "jna %l[do_fail]\n\t"

			  _ASM_EXTABLE(1b, %l[do_exception])

			  : [output] "=r" (value)
			  : [field] "r" (field)
			  : "cc"
			  : do_fail, do_exception);

	return value;

  do_fail:
	instrumentation_begin();
	vmread_error(field);
	instrumentation_end();
	return 0;

  do_exception:
	kvm_spurious_fault();
	return 0;

the sequence of VMREADs should generate:

   nopl   0x0(%rax,%rax,1)
   mov    $0x280a,%eax
   vmread %rax,%rax
   jbe    0xffffffff81099849 <sync_vmcs02_to_vmcs12+1929>
   mov    %rax,0xd8(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x280c,%eax
   vmread %rax,%rax
   jbe    0xffffffff8109982c <sync_vmcs02_to_vmcs12+1900>
   mov    %rax,0xe0(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x280e,%eax
   vmread %rax,%rax
   jbe    0xffffffff8109980f <sync_vmcs02_to_vmcs12+1871>
   mov    %rax,0xe8(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x2810,%eax
   vmread %rax,%rax
   jbe    0xffffffff810997f2 <sync_vmcs02_to_vmcs12+1842>
   mov    %rax,0xf0(%rbx)
   jmp    0xffffffff81099297 <sync_vmcs02_to_vmcs12+471>

but gcc-11 will omit the asm block for the VMREAD to GUEST_PDPTR3 and skip
straight to one of the "return 0" statements:

   nopl   0x0(%rax,%rax,1)
   mov    $0x280a,%r13d
   vmread %r13,%r13
   jbe    0xffffffff810996cd <sync_vmcs02_to_vmcs12+1949>
   mov    %r13,0xd8(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x280c,%r13d
   vmread %r13,%r13
   jbe    0xffffffff810996ae <sync_vmcs02_to_vmcs12+1918>
   mov    %r13,0xe0(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x280e,%r13d
   vmread %r13,%r13
   jbe    0xffffffff8109968f <sync_vmcs02_to_vmcs12+1887>
   mov    %r13,0xe8(%rbx)
   nopl   0x0(%rax,%rax,1)
   xor    %r12d,%r12d      <= return 0
   mov    %r12,0xf0(%rbx)  <= store result to vmcs12->guest_pdptr3
   jmp    0xffffffff8109912c <sync_vmcs02_to_vmcs12+508>

and with "volatile" forced, gcc-11 generates the correct-at-first-glance,
but terribly broken sequence of:

   nopl   0x0(%rax,%rax,1)
   mov    $0x280a,%r13d
   vmread %r13,%r13
   jbe    0xffffffff810999a4 <sync_vmcs02_to_vmcs12+1988>
   mov    %r13,0xd8(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x280c,%r13d
   vmread %r13,%r13
   jbe    0xffffffff81099985 <sync_vmcs02_to_vmcs12+1957>
   mov    %r13,0xe0(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x280e,%r13d
   vmread %r13,%r13
   jbe    0xffffffff81099966 <sync_vmcs02_to_vmcs12+1926>
   mov    %r13,0xe8(%rbx)
   nopl   0x0(%rax,%rax,1)
   mov    $0x2810,%eax
   vmread %rax,%rax
   jbe    0xffffffff8109994a <sync_vmcs02_to_vmcs12+1898>
   xor    %r12d,%r12d     <= WTF gcc!?!?!
   mov    %r12,0xf0(%rbx)

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979
Fixes: 587f17018a2c ("Kconfig: add config option for asm goto w/ outputs")
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---

Linus, I'm sending to you directly as this seems urgent enough to apply
straightaway, and this obviously affects much more than the build system.

 init/Kconfig | 5 +++++
 1 file changed, 5 insertions(+)


base-commit: 047371968ffc470769f541d6933e262dc7085456
  

Comments

Nick Desaulniers Feb. 9, 2024, 5:03 p.m. UTC | #1
+ Andrew

Andrew, here's the full thread, I cut most of it out:
https://lore.kernel.org/lkml/20240208220604.140859-1-seanjc@google.com/
.

On Thu, Feb 8, 2024 at 2:06 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Explicitly require gcc-12+ to enable asm goto with outputs on gcc to avoid
> what is effectively a data corruption bug on gcc-11.  As per
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, "asm goto" is
> *supposed* be implicitly volatile, but gcc-11 fails to treat it as such.
> When compiling with -O2, failure to treat the asm block as volatile can
> result in the entire block being discarded during optimization.
>
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979

Shoot, I was cc'ed on that bug (I think I noticed it in testing as
well, and pointed it out to Andrew on IRC who then cc'ed me to it). I
probably should have asked if that would cause issues at some point
for the kernel.

I took a look at the test case added in that bug; it doesn't compile
until gcc-13 (specifically gcc 13.2, not gcc 13.1).  I'm curious since
the bug says the fix was backported to gcc-12 and gcc-13. Are there
specific versions of those that contain the fix? If so, should Sean
amend his version checks below? For instance, was the fix backported
to gcc 12.3, so users of gcc 12.2 would still have issues?  I can't
tell in godbolt since the added test case doesn't compile until gcc
13.2. https://godbolt.org/z/eqaa7dfo3

My implementation in clang still has some issues, too.  It's hard to
get new control flow right, and there are minimal users outside the
kernel to help us validate.

So as much of a fan of feature detection as I am, I admit some of
these edge cases aren't perfect, and we may need to result to version
detection when such bugs become observable to users.

I'm happy to ack this patch, but I would like to wait for feedback
from Andrew as to whether we can be even more precise with avoiding
more specific versions of gcc 12 and 13 (if necessary).

> Fixes: 587f17018a2c ("Kconfig: add config option for asm goto w/ outputs")
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: kvm@vger.kernel.org
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>
> Linus, I'm sending to you directly as this seems urgent enough to apply
> straightaway, and this obviously affects much more than the build system.
>
>  init/Kconfig | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index deda3d14135b..f4e46d64c1e7 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -82,6 +82,11 @@ config CC_CAN_LINK_STATIC
>         default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m32-flag) -static)
>
>  config CC_HAS_ASM_GOTO_OUTPUT
> +       # gcc-11 has a nasty bug where it doesn't treat asm goto as volatile,
> +       # which can result in asm blocks being dropped when compiling with -02.
> +       # Note, explicitly forcing volatile doesn't entirely fix the bug!
> +       # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979
> +       depends on !CC_IS_GCC || GCC_VERSION >= 120000

LGTM; but we might need to be more specific about avoiding certain min
versions of gcc 13 and 12.

>         def_bool $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
>
>  config CC_HAS_ASM_GOTO_TIED_OUTPUT
>
> base-commit: 047371968ffc470769f541d6933e262dc7085456
> --
> 2.43.0.687.g38aa6559b0-goog
>
  
Andrew Pinski (QUIC) Feb. 9, 2024, 5:14 p.m. UTC | #2
> -----Original Message-----
> From: Nick Desaulniers <ndesaulniers@google.com>
> Sent: Friday, February 9, 2024 9:03 AM
> To: Sean Christopherson <seanjc@google.com>; Andrew Pinski (QUIC)
> <quic_apinski@quicinc.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>; linux-
> kernel@vger.kernel.org; Masahiro Yamada <masahiroy@kernel.org>; Peter
> Zijlstra <peterz@infradead.org>; kvm@vger.kernel.org
> Subject: Re: [PATCH] Kconfig: Explicitly disable asm goto w/ outputs on gcc-11
> (and earlier)
> 
> + Andrew
> 
> Andrew, here's the full thread, I cut most of it out:
> https://lore.kernel.org/lkml/20240208220604.140859-1-
> seanjc@google.com/

So the exact versions of GCC where this is/was fixed are:
12.4.0 (not released yet)
13.2.0
14.1.0 (not released yet)

This information is listed in the bug report via the "known to work" field though since 12.4.0 is not released yet, it is listed as 12.3.1. (13.2.0 was not released at the time it was backported so it is listed as 13.1.1). The target milestone in this case lists the lowest version # where the fix is/will be included.

Thanks,
Andrew Pinski

> .
> 
> On Thu, Feb 8, 2024 at 2:06 PM Sean Christopherson <seanjc@google.com>
> wrote:
> >
> > Explicitly require gcc-12+ to enable asm goto with outputs on gcc to
> > avoid what is effectively a data corruption bug on gcc-11.  As per
> > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, "asm goto" is
> > *supposed* be implicitly volatile, but gcc-11 fails to treat it as such.
> > When compiling with -O2, failure to treat the asm block as volatile
> > can result in the entire block being discarded during optimization.
> >
> > Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979
> 
> Shoot, I was cc'ed on that bug (I think I noticed it in testing as well, and
> pointed it out to Andrew on IRC who then cc'ed me to it). I probably should
> have asked if that would cause issues at some point for the kernel.
> 
> I took a look at the test case added in that bug; it doesn't compile until gcc-13
> (specifically gcc 13.2, not gcc 13.1).  I'm curious since the bug says the fix was
> backported to gcc-12 and gcc-13. Are there specific versions of those that
> contain the fix? If so, should Sean amend his version checks below? For
> instance, was the fix backported to gcc 12.3, so users of gcc 12.2 would still
> have issues?  I can't tell in godbolt since the added test case doesn't compile
> until gcc 13.2. https://godbolt.org/z/eqaa7dfo3
> 
> My implementation in clang still has some issues, too.  It's hard to get new
> control flow right, and there are minimal users outside the kernel to help us
> validate.
> 
> So as much of a fan of feature detection as I am, I admit some of these edge
> cases aren't perfect, and we may need to result to version detection when
> such bugs become observable to users.
> 
> I'm happy to ack this patch, but I would like to wait for feedback from Andrew
> as to whether we can be even more precise with avoiding more specific
> versions of gcc 12 and 13 (if necessary).
> 
> > Fixes: 587f17018a2c ("Kconfig: add config option for asm goto w/
> > outputs")
> > Cc: Nick Desaulniers <ndesaulniers@google.com>
> > Cc: Masahiro Yamada <masahiroy@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: kvm@vger.kernel.org
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >
> > Linus, I'm sending to you directly as this seems urgent enough to
> > apply straightaway, and this obviously affects much more than the build
> system.
> >
> >  init/Kconfig | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/init/Kconfig b/init/Kconfig index
> > deda3d14135b..f4e46d64c1e7 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -82,6 +82,11 @@ config CC_CAN_LINK_STATIC
> >         default $(success,$(srctree)/scripts/cc-can-link.sh $(CC)
> > $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m32-flag) -static)
> >
> >  config CC_HAS_ASM_GOTO_OUTPUT
> > +       # gcc-11 has a nasty bug where it doesn't treat asm goto as volatile,
> > +       # which can result in asm blocks being dropped when compiling with -
> 02.
> > +       # Note, explicitly forcing volatile doesn't entirely fix the bug!
> > +       # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979
> > +       depends on !CC_IS_GCC || GCC_VERSION >= 120000
> 
> LGTM; but we might need to be more specific about avoiding certain min
> versions of gcc 13 and 12.
> 
> >         def_bool $(success,echo 'int foo(int x) { asm goto ("":
> > "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o
> > /dev/null)
> >
> >  config CC_HAS_ASM_GOTO_TIED_OUTPUT
> >
> > base-commit: 047371968ffc470769f541d6933e262dc7085456
> > --
> > 2.43.0.687.g38aa6559b0-goog
> >
> 
> 
> --
> Thanks,
> ~Nick Desaulniers
  
Linus Torvalds Feb. 9, 2024, 5:55 p.m. UTC | #3
On Fri, 9 Feb 2024 at 09:14, Andrew Pinski (QUIC)
<quic_apinski@quicinc.com> wrote:
>
> So the exact versions of GCC where this is/was fixed are:
> 12.4.0 (not released yet)
> 13.2.0
> 14.1.0 (not released yet)

Looking at the patch that the bugzilla says is the fix, it *looks*
like it's just the "mark volatile" that is missing.

But Sean says that  even if we mark "asm goto" as volatile manually,
it still fails.

So there seems to be something else going on in addition to just the volatile.

Side note: the reason we have that "asm_volatile_goto()" define in the
kernel is that we *used* to have a _different_ workaround for a gcc
bug in this area:

 /*
  * GCC 'asm goto' miscompiles certain code sequences:
  *
  *   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
  *
  * Work it around via a compiler barrier quirk suggested by Jakub Jelinek.
  *
  * (asm goto is automatically volatile - the naming reflects this.)
  */
 #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)

and looking at that (old) bugzilla there seems to be a lot of "seems
to be fixed", but it's not entirely clear.

We've removed that workaround in commit 43c249ea0b1e ("compiler-gcc.h:
remove ancient workaround for gcc PR 58670"), I'm wondering if maybe
that removal was a bit optimistic.

                Linus
  
Sean Christopherson Feb. 9, 2024, 6:43 p.m. UTC | #4
On Fri, Feb 09, 2024, Linus Torvalds wrote:
> On Fri, 9 Feb 2024 at 09:14, Andrew Pinski (QUIC)
> <quic_apinski@quicinc.com> wrote:
> >
> > So the exact versions of GCC where this is/was fixed are:
> > 12.4.0 (not released yet)
> > 13.2.0
> > 14.1.0 (not released yet)
> 
> Looking at the patch that the bugzilla says is the fix, it *looks*
> like it's just the "mark volatile" that is missing.
> 
> But Sean says that  even if we mark "asm goto" as volatile manually,
> it still fails.
> 
> So there seems to be something else going on in addition to just the volatile.

Aha!  Yeah, there's a second bug that set things up so that the "not implicitly
volatile" bug could rear its head.  (And now I feel less bad for not suspecting
the compiler sooner, because it didn't occur to me that gcc could possibly think
the asm blob had no used outputs).

With "volatile" forced, gcc generates code for the asm blob, but doesn't actually
consume the output of the VMREAD.  As a result, the optimization pass sees the
unused output and throws it away because the blob isn't treated as volatile.

   vmread %rax,%rax       <= output register is unused
   jbe    0xffffffff8109994a <sync_vmcs02_to_vmcs12+1898>
   xor    %r12d,%r12d     <= one of the "return 0" statements
   mov    %r12,0xf0(%rbx) <= store the output

> Side note: the reason we have that "asm_volatile_goto()" define in the
> kernel is that we *used* to have a _different_ workaround for a gcc
> bug in this area:
> 
>  /*
>   * GCC 'asm goto' miscompiles certain code sequences:
>   *
>   *   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
>   *
>   * Work it around via a compiler barrier quirk suggested by Jakub Jelinek.
>   *
>   * (asm goto is automatically volatile - the naming reflects this.)
>   */
>  #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
> 
> and looking at that (old) bugzilla there seems to be a lot of "seems
> to be fixed", but it's not entirely clear.
> 
> We've removed that workaround in commit 43c249ea0b1e ("compiler-gcc.h:
> remove ancient workaround for gcc PR 58670"), I'm wondering if maybe
> that removal was a bit optimistic.

FWIW, reverting that does restore correct behavior on gcc-11.

Note, this is 100% reproducible across multiple systems, though AFAICT it's
somewhat dependent on the .config.  Holler if anyone wants the .config.
  
Nick Desaulniers Feb. 9, 2024, 6:55 p.m. UTC | #5
On Fri, Feb 9, 2024 at 10:43 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Feb 09, 2024, Linus Torvalds wrote:
> > On Fri, 9 Feb 2024 at 09:14, Andrew Pinski (QUIC)
> > <quic_apinski@quicinc.com> wrote:
> > >
> > > So the exact versions of GCC where this is/was fixed are:
> > > 12.4.0 (not released yet)
> > > 13.2.0
> > > 14.1.0 (not released yet)
> >
> > Looking at the patch that the bugzilla says is the fix, it *looks*
> > like it's just the "mark volatile" that is missing.
> >
> > But Sean says that  even if we mark "asm goto" as volatile manually,
> > it still fails.
> >
> > So there seems to be something else going on in addition to just the volatile.
>
> Aha!  Yeah, there's a second bug that set things up so that the "not implicitly
> volatile" bug could rear its head.  (And now I feel less bad for not suspecting
> the compiler sooner, because it didn't occur to me that gcc could possibly think
> the asm blob had no used outputs).
>
> With "volatile" forced, gcc generates code for the asm blob, but doesn't actually
> consume the output of the VMREAD.  As a result, the optimization pass sees the
> unused output and throws it away because the blob isn't treated as volatile.
>
>    vmread %rax,%rax       <= output register is unused
>    jbe    0xffffffff8109994a <sync_vmcs02_to_vmcs12+1898>
>    xor    %r12d,%r12d     <= one of the "return 0" statements
>    mov    %r12,0xf0(%rbx) <= store the output
>
> > Side note: the reason we have that "asm_volatile_goto()" define in the
> > kernel is that we *used* to have a _different_ workaround for a gcc
> > bug in this area:
> >
> >  /*
> >   * GCC 'asm goto' miscompiles certain code sequences:
> >   *
> >   *   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
> >   *
> >   * Work it around via a compiler barrier quirk suggested by Jakub Jelinek.
> >   *
> >   * (asm goto is automatically volatile - the naming reflects this.)
> >   */
> >  #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
> >
> > and looking at that (old) bugzilla there seems to be a lot of "seems
> > to be fixed", but it's not entirely clear.
> >
> > We've removed that workaround in commit 43c249ea0b1e ("compiler-gcc.h:

Interesting, I thought I had proposed removing that earlier and Linus
had yelled about doing so.
https://lore.kernel.org/lkml/20180907222109.163802-1-ndesaulniers@google.com/
seems like in ~2018 I was trying to, but can't seem to find the thread
where Linus pushed back.

> > remove ancient workaround for gcc PR 58670"), I'm wondering if maybe
> > that removal was a bit optimistic.
>
> FWIW, reverting that does restore correct behavior on gcc-11.

And also pessimizes all asm gotos (for gcc) including ones that don't
contain output as described in 43c249ea0b1e.  The version checks would
at least not pessimize those.

>
> Note, this is 100% reproducible across multiple systems, though AFAICT it's
> somewhat dependent on the .config.  Holler if anyone wants the .config.
  
Linus Torvalds Feb. 9, 2024, 6:56 p.m. UTC | #6
On Fri, 9 Feb 2024 at 10:43, Sean Christopherson <seanjc@google.com> wrote:
>
> > We've removed that workaround in commit 43c249ea0b1e ("compiler-gcc.h:
> > remove ancient workaround for gcc PR 58670"), I'm wondering if maybe
> > that removal was a bit optimistic.
>
> FWIW, reverting that does restore correct behavior on gcc-11.

Hmm. I suspect we'll just have to revert that commit then - although
probably in some form where it only affects the "this has outputs"
case.

Which is much less common than the non-output "asm goto" case.

It does cause gcc to generate fairly horrific code (as noted in the
commit), but it's almost certainly still better code than what the
non-asm-goto case results in.

We have very few uses of CC_HAS_ASM_GOTO_OUTPUT (and the related
CC_HAS_ASM_GOTO_TIED_OUTPUT), but unsafe_get_user() in particular
generates horrid code without it.

But it would be really good to understand *what* that secondary bug
is, and the fix for it. Just to make sure that gcc is really fixed,
and there isn't some pending bug that we just keep hiding with that
extra empty volatile asm.

Andrew?

                   Linus
  
Linus Torvalds Feb. 9, 2024, 7:01 p.m. UTC | #7
On Fri, 9 Feb 2024 at 10:55, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> And also pessimizes all asm gotos (for gcc) including ones that don't
> contain output as described in 43c249ea0b1e.  The version checks would
> at least not pessimize those.

Yeah, no, we should limit that workaround only to the asm goto with
outputs case.

We should also probably get rid of the existing "asm_volatile_goto()"
macro name entirely. That name was always pretty horrific, in that it
didn't even mark the asm as volatile even in the case where it did
anything.

So the name of that macro made little sense, and the new workaround
should be only for asm goto with outputs. So I'd suggest jmaking a new
macro with that name:

   #define asm_goto_output(x...)

and on gcc use that old workaround, and on clang just make it be a
plain "asm goto".

Hmm?

            Linus
  
Nick Desaulniers Feb. 9, 2024, 7:20 p.m. UTC | #8
On Fri, Feb 9, 2024 at 11:01 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, 9 Feb 2024 at 10:55, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > And also pessimizes all asm gotos (for gcc) including ones that don't
> > contain output as described in 43c249ea0b1e.  The version checks would
> > at least not pessimize those.
>
> Yeah, no, we should limit that workaround only to the asm goto with
> outputs case.
>
> We should also probably get rid of the existing "asm_volatile_goto()"
> macro name entirely. That name was always pretty horrific, in that it
> didn't even mark the asm as volatile even in the case where it did
> anything.

+1.

43c249ea0b1e could have done so, but I'm guessing "tree wide changes
in Linux are not fun" was perhaps the reason it wasn't done so then.

I also think folks are too aggressive putting volatile on asm
statements that might not need them; it's definitely less cognitive
burden to just always put `volatile` on inline asm but I suspect
that's leaving some performance on the floor in certain cases.  (I had
a patch to clang to warn when the volatile was unnecessary in cases
where it was explicit, but that was shot down in code review as being
harassing to users).

>
> So the name of that macro made little sense, and the new workaround
> should be only for asm goto with outputs. So I'd suggest jmaking a new
> macro with that name:
>
>    #define asm_goto_output(x...)
>
> and on gcc use that old workaround, and on clang just make it be a
> plain "asm goto".
>
> Hmm?

Thinking through the tradeoffs, the Kconfig approach would pessimize
GCC versions with the "lack of implicit volatile" bug to not use asm
goto w/ outputs at all.

Having a new macro would just make the `volatile` qualifier explicit,
which is a no-op on gcc versions that don't contain the bug (while
allowing them to use asm goto with outputs, which is probably better
for codegen).

So I agree a new macro seems better than disabling things for kconfig.
(Assuming the only issue is the need to make `volatile` explicit for a
few GCC versions; it's not clear to me from Sean's latest response if
there's more than one bug here).  I'm not signing up to shave either
yak though.
  
Andrew Pinski (QUIC) Feb. 9, 2024, 7:26 p.m. UTC | #9
> -----Original Message-----
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Friday, February 9, 2024 10:56 AM
> To: Sean Christopherson <seanjc@google.com>
> Cc: Andrew Pinski (QUIC) <quic_apinski@quicinc.com>; Nick Desaulniers
> <ndesaulniers@google.com>; linux-kernel@vger.kernel.org; Masahiro Yamada
> <masahiroy@kernel.org>; Peter Zijlstra <peterz@infradead.org>;
> kvm@vger.kernel.org
> Subject: Re: [PATCH] Kconfig: Explicitly disable asm goto w/ outputs on gcc-11
> (and earlier)
> 
> On Fri, 9 Feb 2024 at 10:43, Sean Christopherson <seanjc@google.com>
> wrote:
> >
> > > We've removed that workaround in commit 43c249ea0b1e ("compiler-
> gcc.h:
> > > remove ancient workaround for gcc PR 58670"), I'm wondering if maybe
> > > that removal was a bit optimistic.
> >
> > FWIW, reverting that does restore correct behavior on gcc-11.
> 
> Hmm. I suspect we'll just have to revert that commit then - although probably
> in some form where it only affects the "this has outputs"
> case.
> 
> Which is much less common than the non-output "asm goto" case.
> 
> It does cause gcc to generate fairly horrific code (as noted in the commit), but
> it's almost certainly still better code than what the non-asm-goto case results
> in.
> 
> We have very few uses of CC_HAS_ASM_GOTO_OUTPUT (and the related
> CC_HAS_ASM_GOTO_TIED_OUTPUT), but unsafe_get_user() in particular
> generates horrid code without it.
> 
> But it would be really good to understand *what* that secondary bug is, and
> the fix for it. Just to make sure that gcc is really fixed, and there isn't some
> pending bug that we just keep hiding with that extra empty volatile asm.
> 
> Andrew?
> 

Let me first state inline-asm without output has always been volatile; `asm goto` or not (this has been true since 2.95.3 or even before).
The issue is with an output inline-asm is no longer volatile. But the documentation of GCC stated all `asm goto` was volatile. 
So the side effect of making all `asm goto` as volatile does have an effect on the ones without an output since they were already treated internally and treated similarly to the inline-asm without an output.
The fix that was done to GCC was now treat all `asm goto` as volatile instead of just the ones without output.
So basically what would happen is internally GCC would remove the `asm goto` with an output if the output was unused. This would cause havoc in the internal state of GCC because the CFG would not be up to date to what the rest of the IR was. 

Oh and there was a bug in GCC 12.1.0-12.3.0, and GCC 13.1.0 releases which might be the reason why folks are hitting issues even with the volatile added back, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420 was the fix there. 

Note there was/is a new different  for `asm goto` was fixed in the past few weeks where some of the optimizations was not ready to handle them (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422); this was just fixed on all of the open release branches but NOT in any release yet; it will be in 11.5.0, 12.4.0, 13.3.0 and 14.1.0.



>                    Linus
  
Linus Torvalds Feb. 9, 2024, 8:39 p.m. UTC | #10
On Fri, 9 Feb 2024 at 11:01, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> We should also probably get rid of the existing "asm_volatile_goto()"
> macro name entirely. That name was always pretty horrific, in that it
> didn't even mark the asm as volatile even in the case where it did
> anything.
>
> So the name of that macro made little sense, and the new workaround
> should be only for asm goto with outputs. So I'd suggest jmaking a new
> macro with that name:
>
>    #define asm_goto_output(x...)
>
> and on gcc use that old workaround, and on clang just make it be a
> plain "asm goto".

So here's a suggested patch that does this.

It's largely done with "git grep" and "sed -i", plus some manual
fixups for the (few) cases where we have outputs.

It looks superficially sane to me, and it passed an allmodconfig build
with gcc, but I'm not going to claim that it is really tested.

Sean? Does this work for the case you noticed?

Basically this gets rid of the old "asm_volatile_goto()" entirely as
useless, but replaces it with "asm_goto_outputs()" for the places
where we have outputs.

And then for gcc, it makes those cases

 (a) use "asm volatile goto" to fix the fact that some versions of gcc
will have missed the "volatile"

 (b) adds that extra empty asm as a second barrier after the "real"
asm goto statement

That (b) is very much voodoo programming, but it matches the old magic
barrier thing that Jakub Jelinek suggested for the really *old* gcc
bug wrt plain (non-output) "asm goto". The underlying bug for _that_
was fixed long ago:

    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670

We removed that for plain "asm goto" workaround a couple of years ago,
so "asm_volatile_goto()" has been a no-op since June 2022, but this
now resurrects that hack for the output case.

I'm not loving it, but Sean seemed to confirm that it fixes the code
generation problem, so ...

Adding Uros to the cc, since he is both involved with gcc and with the
previous asm goto workaround removal, so maybe he has other
suggestions. Uros, see

    https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/

for background.

Also adding Jakub since I'm re-using the hack he suggested for a
different - but similar - case. He may have strong opinions too, and
may object to that particular monkey-see-monkey-do voodoo programming.

                Linus
  
Sean Christopherson Feb. 9, 2024, 9:46 p.m. UTC | #11
On Fri, Feb 09, 2024, Linus Torvalds wrote:
> Sean? Does this work for the case you noticed?

Yep.  You can quite literally see the effect of the asm("").  A "good" sequence
directly propagates the result from the VMREAD's destination register to its
final destination

  <+1756>:  mov    $0x280e,%r13d
  <+1762>:  vmread %r13,%r13
  <+1766>:  jbe    0x209fa <sync_vmcs02_to_vmcs12+1834>
  <+1768>:  mov    %r13,0xe8(%rbx)

whereas the "bad" sequence bounces through a different register.

  <+1780>:  mov    $0x2810,%eax
  <+1785>:  vmread %rax,%rax
  <+1788>:  jbe    0x209e4 <sync_vmcs02_to_vmcs12+1812>
  <+1790>:  mov    %rax,%r12
  <+1793>:  mov    %r12,0xf0(%rbx)

> That (b) is very much voodoo programming, but it matches the old magic
> barrier thing that Jakub Jelinek suggested for the really *old* gcc
> bug wrt plain (non-output) "asm goto". The underlying bug for _that_
> was fixed long ago:
> 
>     http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
> 
> We removed that for plain "asm goto" workaround a couple of years ago,
> so "asm_volatile_goto()" has been a no-op since June 2022, but this
> now resurrects that hack for the output case.
> 
> I'm not loving it, but Sean seemed to confirm that it fixes the code
> generation problem, so ...

Yeah, I'm in the same boat.
  
David Laight Feb. 10, 2024, 5:21 p.m. UTC | #12
From: Sean Christopherson
> Sent: 09 February 2024 21:47
> 
> On Fri, Feb 09, 2024, Linus Torvalds wrote:
> > Sean? Does this work for the case you noticed?
> 
> Yep.  You can quite literally see the effect of the asm("").  A "good" sequence
> directly propagates the result from the VMREAD's destination register to its
> final destination
> 
>   <+1756>:  mov    $0x280e,%r13d
>   <+1762>:  vmread %r13,%r13
>   <+1766>:  jbe    0x209fa <sync_vmcs02_to_vmcs12+1834>
>   <+1768>:  mov    %r13,0xe8(%rbx)
> 
> whereas the "bad" sequence bounces through a different register.
> 
>   <+1780>:  mov    $0x2810,%eax
>   <+1785>:  vmread %rax,%rax
>   <+1788>:  jbe    0x209e4 <sync_vmcs02_to_vmcs12+1812>
>   <+1790>:  mov    %rax,%r12
>   <+1793>:  mov    %r12,0xf0(%rbx)
..

Annoying, but I doubt it is measurable in this case.
Firstly it could easily be a 'free' register rename.
Secondly isn't vmread horribly slow anyway, so an extra
clock or two won't matter?

The double register move that OPTIMER_HIDE_VAR() often
generates is another matter entirely :-)
In the old days the peephole optimiser would (should?)
have removed most of these.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
  
Uros Bizjak Feb. 11, 2024, 11:12 a.m. UTC | #13
On Fri, Feb 9, 2024 at 9:39 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, 9 Feb 2024 at 11:01, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > We should also probably get rid of the existing "asm_volatile_goto()"
> > macro name entirely. That name was always pretty horrific, in that it
> > didn't even mark the asm as volatile even in the case where it did
> > anything.
> >
> > So the name of that macro made little sense, and the new workaround
> > should be only for asm goto with outputs. So I'd suggest jmaking a new
> > macro with that name:
> >
> >    #define asm_goto_output(x...)
> >
> > and on gcc use that old workaround, and on clang just make it be a
> > plain "asm goto".
>
> So here's a suggested patch that does this.
>
> It's largely done with "git grep" and "sed -i", plus some manual
> fixups for the (few) cases where we have outputs.
>
> It looks superficially sane to me, and it passed an allmodconfig build
> with gcc, but I'm not going to claim that it is really tested.
>
> Sean? Does this work for the case you noticed?
>
> Basically this gets rid of the old "asm_volatile_goto()" entirely as
> useless, but replaces it with "asm_goto_outputs()" for the places
> where we have outputs.
>
> And then for gcc, it makes those cases
>
>  (a) use "asm volatile goto" to fix the fact that some versions of gcc
> will have missed the "volatile"
>
>  (b) adds that extra empty asm as a second barrier after the "real"
> asm goto statement
>
> That (b) is very much voodoo programming, but it matches the old magic
> barrier thing that Jakub Jelinek suggested for the really *old* gcc
> bug wrt plain (non-output) "asm goto". The underlying bug for _that_
> was fixed long ago:
>
>     http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
>
> We removed that for plain "asm goto" workaround a couple of years ago,
> so "asm_volatile_goto()" has been a no-op since June 2022, but this
> now resurrects that hack for the output case.
>
> I'm not loving it, but Sean seemed to confirm that it fixes the code
> generation problem, so ...
>
> Adding Uros to the cc, since he is both involved with gcc and with the
> previous asm goto workaround removal, so maybe he has other
> suggestions. Uros, see
>
>     https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/

I'd suggest the original poster to file a bug report in the GCC
bugzilla. This way, the bug can be properly analysed and eventually
fixed. The detailed instructions are available at
https://gcc.gnu.org/bugs/

> for background.
>
> Also adding Jakub since I'm re-using the hack he suggested for a
> different - but similar - case. He may have strong opinions too, and
> may object to that particular monkey-see-monkey-do voodoo programming.

This big-hammer approach will effectively disable all optimizations
around the asm, and should really be used only as a last resort. As
learned from the PR58670 (linked above), these workarounds can stay in
the kernel forever, and even when the compiler is fixed, there is
little incentive to remove them - developers from the kernel world do
not want to touch them, because "the workaround just works", and
developers from the compiler world do not want to touch them either,
because of unknown hidden consequences of removal.

Please note, that the kernel is a heavy user of asm statements, and in
GCC some forms of asm statements were developed specifically for
kernel use, e.g. CC-output, and asm goto in all of its flavors. They
have little or no use outside the kernel. If the kernel disables all
optimizations around these statements, there actually remain no
real-world testcases of how compiler optimizations interact with asm
statements, and when new optimizations are introduced to the compiler,
asm statemets are left behind (*).

(*) When the GCC compiler nears its release, people start to test it
on their code bases, and the kernel is one of the most important code
bases as far as the compiler is concerned. We count on bugreports from
these tests to fix possible fall-out from new optimizations, and the
big-hammer workaround will just hide the possible problems.

So, I'd suggest at least limit the workaround to known-bad compilers.

Thanks,
Uros.
  
Linus Torvalds Feb. 11, 2024, 7:59 p.m. UTC | #14
On Sun, 11 Feb 2024 at 03:12, Uros Bizjak <ubizjak@gmail.com> wrote:
>
> I'd suggest the original poster to file a bug report in the GCC
> bugzilla. This way, the bug can be properly analysed and eventually
> fixed. The detailed instructions are available at
> https://gcc.gnu.org/bugs/

Yes, please. Sean?

In order to *not* confuse it with the "asm goto with output doesn't
imply volatile" bugs, could you make a bug report that talks purely
about the code generation issue that happens even with a manually
added volatile (your third code sequence in your original email)?

> So, I'd suggest at least limit the workaround to known-bad compilers.

Right now, I don't think we know which ones are bad.

If it was just the "add a volatile by hand" issue, we'd have a good
pointer to exactly which issue it is.

The other bugzilla entries I saw talked about ICE errors - and one of
them is quite likely to be the cause of this, because it really looks
like the code issue is some internal confusion. The asm that Sean uses
doesn't even have an unused result, so even the volatile shouldn't
matter, and seems purely a case of "limiting optimizations partially
hides the issue".

And then us adding another empty asm obviously does just even more of
the same, to the point that now the code works.

So I don't think we actually know which compiler version is the fixed
one. We only know that gcc-11 doesn't work for Sean.

                  Linus
  
Jakub Jelinek Feb. 11, 2024, 8:12 p.m. UTC | #15
On Sun, Feb 11, 2024 at 11:59:49AM -0800, Linus Torvalds wrote:
> On Sun, 11 Feb 2024 at 03:12, Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > I'd suggest the original poster to file a bug report in the GCC
> > bugzilla. This way, the bug can be properly analysed and eventually
> > fixed. The detailed instructions are available at
> > https://gcc.gnu.org/bugs/
> 
> Yes, please. Sean?
> 
> In order to *not* confuse it with the "asm goto with output doesn't
> imply volatile" bugs, could you make a bug report that talks purely
> about the code generation issue that happens even with a manually
> added volatile (your third code sequence in your original email)?

Preferably for all the different cases where you suspect a compiler bug.
At minimum preprocessed source + compiler options + detailed description
where do you think the bug is (or small runtime testcase but that is
harder for issues derived from the kernel obviously).
GCC 11 is still supported upstream, so bugs reproduceable even with just
that should be filed in gcc.gnu.org/bugzilla/, if something is only
reproduceable with older compilers, guess it belongs in some distribution's
bugtrackers if those still support those compilers.
Once filed we can bisect, analyze them, fix.
ICE bugs are even easier to file, all we need is preprocessed source,
command line options and gcc version/architecture.  gcc -freport-bug
in most cases should be able to create everything in one file for the
bugreport.

As for the workarounds in the kernel, I'd also like to see only workarounds
for specific compiler versions (once filed, bisected and analyzed,
workaround could be either based on affected compiler versions, or
kernel could try to check for the compiler bug in question and only add
workaround if that bug reproduces on a short testcase.  Sure, this would
be easier in autoconf style checks, but could be done even in kernel's
makefiles.

	Jakub
  
Sean Christopherson Feb. 13, 2024, 12:15 a.m. UTC | #16
On Sun, Feb 11, 2024, Linus Torvalds wrote:
> On Sun, 11 Feb 2024 at 03:12, Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > I'd suggest the original poster to file a bug report in the GCC
> > bugzilla. This way, the bug can be properly analysed and eventually
> > fixed. The detailed instructions are available at
> > https://gcc.gnu.org/bugs/
> 
> Yes, please. Sean?
> 
> In order to *not* confuse it with the "asm goto with output doesn't
> imply volatile" bugs, could you make a bug report that talks purely
> about the code generation issue that happens even with a manually
> added volatile (your third code sequence in your original email)?

Will do.  Got a bug report ready, just waiting for a GCC Bugzilla account to be
created for me so I can file it...
  
Sean Christopherson Feb. 14, 2024, 5:20 p.m. UTC | #17
On Mon, Feb 12, 2024, Sean Christopherson wrote:
> On Sun, Feb 11, 2024, Linus Torvalds wrote:
> > On Sun, 11 Feb 2024 at 03:12, Uros Bizjak <ubizjak@gmail.com> wrote:
> > >
> > > I'd suggest the original poster to file a bug report in the GCC
> > > bugzilla. This way, the bug can be properly analysed and eventually
> > > fixed. The detailed instructions are available at
> > > https://gcc.gnu.org/bugs/
> > 
> > Yes, please. Sean?
> > 
> > In order to *not* confuse it with the "asm goto with output doesn't
> > imply volatile" bugs, could you make a bug report that talks purely
> > about the code generation issue that happens even with a manually
> > added volatile (your third code sequence in your original email)?
> 
> Will do.  Got a bug report ready, just waiting for a GCC Bugzilla account to be
> created for me so I can file it...

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
  
Linus Torvalds Feb. 14, 2024, 6:43 p.m. UTC | #18
On Sun, 11 Feb 2024 at 03:12, Uros Bizjak <ubizjak@gmail.com> wrote:
>
> So, I'd suggest at least limit the workaround to known-bad compilers.

Based on the current state of

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921

I would suggest this attached kernel patch, which makes the manual
"volatile" the default case (since it should make no difference except
for the known gcc issue), and limits the extra empty asm serialization
to gcc versions older than 12.1.0.

But Jakub is clearly currently trying to figure out exactly what was
going wrong, so things may change. Maybe the commit he bisected to
happened to just accidentally hide the real issue.

                Linus
  
Linus Torvalds Feb. 15, 2024, 12:11 a.m. UTC | #19
On Wed, 14 Feb 2024 at 10:43, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Based on the current state of
>
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
>
> I would suggest this attached kernel patch [...]

Well, that "current state" didn't last long, and it looks like Jakub
found the real issue and posted a suggested fix.

Anyway, the end result is that the current kernel situation - that
adds the workaround for all gcc versions - is the best that we can do
for now.

             Linus
  
Jakub Jelinek Feb. 15, 2024, 8:39 a.m. UTC | #20
On Wed, Feb 14, 2024 at 04:11:05PM -0800, Linus Torvalds wrote:
> On Wed, 14 Feb 2024 at 10:43, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Based on the current state of
> >
> >     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
> >
> > I would suggest this attached kernel patch [...]
> 
> Well, that "current state" didn't last long, and it looks like Jakub
> found the real issue and posted a suggested fix.
> 
> Anyway, the end result is that the current kernel situation - that
> adds the workaround for all gcc versions - is the best that we can do
> for now.

Can it be guarded with
#if GCC_VERSION < 140100
so that it isn't forgotten?  GCC 14.1 certainly will have a fix for this
(so will GCC 13.3, 12.4 and 11.5).
Maybe it would be helpful to use
#if GCC_VERSION < 140000
while it is true that no GCC 14 snapshots until today (or whenever the fix
will be committed) have the fix, for GCC trunk it is up to the distros
to use the latest snapshot if they use it at all and would allow better
testing of the kernel code without the workaround, so that if there are
other issues they won't be discovered years later.  Most userland code
doesn't actually use asm goto with outputs...

	Jakub
  
Linus Torvalds Feb. 15, 2024, 6:25 p.m. UTC | #21
On Thu, 15 Feb 2024 at 00:39, Jakub Jelinek <jakub@redhat.com> wrote:
>
> Can it be guarded with
> #if GCC_VERSION < 140100

Ack. I'll update the workaround to do that, and add the new and
improved bugzilla pointer.

Thanks for fixing this so quickly.

                Linus
  
Linus Torvalds Feb. 15, 2024, 7:17 p.m. UTC | #22
On Thu, 15 Feb 2024 at 10:25, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, 15 Feb 2024 at 00:39, Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > Can it be guarded with
> > #if GCC_VERSION < 140100
>
> Ack. I'll update the workaround to do that, and add the new and
> improved bugzilla pointer.

. and I also followed your suggestion to just consider any gcc-14
snapshots as fixed. That seemed safe, considering that in practice the
actual bug that Sean reported seems to not actually trigger with any
gcc version 12.1+ as per your bisect (and my minimal testing).

HOWEVER, when I was working through this, I noted that the *other*
part of the workaround (the "missing volatile") doesn't seem to have
been backported as aggressively.

IOW, I find that "Mark asm goto with outputs as volatile" in the
gcc-12 and gcc-13 branches, but not in gcc-11.

So I did end up making the default "asm_goto_output()" macro always
use "asm volatile goto()", so that we don't have to worry about the
other gcc issue.

End result: the extra empty asm barrier is now dependent on gcc
version in my tree, but the manual addition of 'volatile' is
unconditional.

Because it looks to me like gcc-11.5 will have your fix for the pseudo
ordering, but not Andrew Pinski's fix for missing a volatile.

Anyway, since the version dependencies were complex enough, I ended up
going with putting that logic in our Kconfig files, rather than making
the gcc-specific header file an ugly mess of #if's.

Our Kconfig files are pretty much designed for having complicated
configuration dependencies, so it ends up being quite natural there:

  config GCC_ASM_GOTO_OUTPUT_WORKAROUND
        bool
        depends on CC_IS_GCC && CC_HAS_ASM_GOTO_OUTPUT
        # Fixed in GCC 14.1, 13.3, 12.4 and 11.5
        # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
        default y if GCC_VERSION < 110500
        default y if GCC_VERSION >= 120000 && GCC_VERSION < 120400
        default y if GCC_VERSION >= 130000 && GCC_VERSION < 130300

and having those kinds of horrid expressions as preprocessor code
included in every single compilation unit seemed just nasty.

              Linus
  
Jakub Jelinek Feb. 15, 2024, 7:26 p.m. UTC | #23
On Thu, Feb 15, 2024 at 11:17:44AM -0800, Linus Torvalds wrote:
> but the manual addition of 'volatile' is unconditional.

That is just fine.  The extra keyword is not wrong, it is just redundant
with fixed compilers, where it doesn't change anything in the behavior.

	Jakub
  

Patch

diff --git a/init/Kconfig b/init/Kconfig
index deda3d14135b..f4e46d64c1e7 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -82,6 +82,11 @@  config CC_CAN_LINK_STATIC
 	default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m32-flag) -static)
 
 config CC_HAS_ASM_GOTO_OUTPUT
+	# gcc-11 has a nasty bug where it doesn't treat asm goto as volatile,
+	# which can result in asm blocks being dropped when compiling with -02.
+	# Note, explicitly forcing volatile doesn't entirely fix the bug!
+	# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979
+	depends on !CC_IS_GCC || GCC_VERSION >= 120000
 	def_bool $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
 
 config CC_HAS_ASM_GOTO_TIED_OUTPUT