[v2,3/4] x86/entry: Disable IA32 syscall if ia32_disabled is true

Message ID 20230609111311.4110901-4-nik.borisov@suse.com
State New
Headers
Series Make IA32_EMULATION boot time overridable |

Commit Message

Nikolay Borisov June 9, 2023, 11:13 a.m. UTC
  First stage of disabling ia32 compat layer is to disable 32bit syscall
entry points. Legacy int 0x80 vector is disabled by zeroing out its gate
descriptor in the idt and the sysenter vector is disabled by re-using
the existing code in case IA32_EMULATION is disabled.

Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
---
 arch/x86/include/asm/desc.h  |  1 +
 arch/x86/kernel/cpu/common.c | 37 ++++++++++++++++++------------------
 arch/x86/kernel/idt.c        |  7 +++++++
 3 files changed, 27 insertions(+), 18 deletions(-)
  

Comments

Thomas Gleixner June 9, 2023, 3:22 p.m. UTC | #1
On Fri, Jun 09 2023 at 14:13, Nikolay Borisov wrote:
> First stage of disabling ia32 compat layer is to disable 32bit syscall
> entry points. Legacy int 0x80 vector is disabled by zeroing out its gate
> descriptor in the idt and the sysenter vector is disabled by re-using
> the existing code in case IA32_EMULATION is disabled.

This describes WHAT the patch does without providing any context.

> +	if ((IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) ||
> +	    !IS_ENABLED(CONFIG_IA32_EMULATION)) {

I told you before that my brain based compiler complains about your
patches not building with CONFIG_IA32_EMULATION=n. The above still fails
to build.

Aside of that this condition is convoluted and can be simplified to
exactly a simple and understandable

        if (foo)

which is actually the obvious solution to make it compile in all
configurations.

It's not too much asked to flip the config switch which affects the code
you are changing for a test.

> @@ -226,6 +226,13 @@ void __init idt_setup_early_traps(void)
>  void __init idt_setup_traps(void)
>  {
>  	idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), true);
> +
> +	if (IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) {

Ditto.

> +		gate_desc null_desc = {};

Lacks a newline between declaration and code. It's documented to be
required, no?

> +		write_idt_entry(idt_table, IA32_SYSCALL_VECTOR, &null_desc);
> +		clear_bit(IA32_SYSCALL_VECTOR, system_vectors);
> +	}

That aside, I asked you to split IA32_SYSCALL_VECTOR out of def_idts[]
and handle it separately, no? If you disagree with me then reply to my
review first instead of ignoring me silently.

I don't care about you wasting your time, but I very much care about you
wasting my time.

Stop this right now before it becomes a habit.

Thanks,

        tglx
  
Nikolay Borisov June 9, 2023, 4:03 p.m. UTC | #2
On 9.06.23 г. 18:22 ч., Thomas Gleixner wrote:
> On Fri, Jun 09 2023 at 14:13, Nikolay Borisov wrote:
>> First stage of disabling ia32 compat layer is to disable 32bit syscall
>> entry points. Legacy int 0x80 vector is disabled by zeroing out its gate
>> descriptor in the idt and the sysenter vector is disabled by re-using
>> the existing code in case IA32_EMULATION is disabled.
> 
> This describes WHAT the patch does without providing any context.
> 
>> +	if ((IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) ||
>> +	    !IS_ENABLED(CONFIG_IA32_EMULATION)) {
> 
> I told you before that my brain based compiler complains about your
> patches not building with CONFIG_IA32_EMULATION=n. The above still fails
> to build.

Yes, it does. My bad.

> 
> Aside of that this condition is convoluted and can be simplified to
> exactly a simple and understandable
> 
>          if (foo)
> 
> which is actually the obvious solution to make it compile in all
> configurations.

I fail to see how this can be done the way you suggest given that 
ia32_disabled is visible iff IA32_EMULATION is selected, this means an 
#ifdef is mandatory so that ia32_disabled is checked when we know it 
will exist as a symbol, the same applies for the entry_SYSCALL_compat 
symbol which has to be used iff IA32_EMULATION is defined. I.e the 
ignore code should also be duplicated in the #ifdef IA32_EMULATION && 
ia32_disabled and in the #else  condition.

> 
> It's not too much asked to flip the config switch which affects the code
> you are changing for a test.

Sorry, missed it the first time.

> 
>> @@ -226,6 +226,13 @@ void __init idt_setup_early_traps(void)
>>   void __init idt_setup_traps(void)
>>   {
>>   	idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), true);
>> +
>> +	if (IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) {
> 
> Ditto.

This actually doesn't fail, because if IA32_EMULATION is n that 
conditional expands to 'if (0 && ia32_disabled)' which is eliminated by 
the compiler.

> 
>> +		gate_desc null_desc = {};
> 
> Lacks a newline between declaration and code. It's documented to be
> required, no?
> 
>> +		write_idt_entry(idt_table, IA32_SYSCALL_VECTOR, &null_desc);
>> +		clear_bit(IA32_SYSCALL_VECTOR, system_vectors);
>> +	}
> 
> That aside, I asked you to split IA32_SYSCALL_VECTOR out of def_idts[]
> and handle it separately, no? If you disagree with me then reply to my
> review first instead of ignoring me silently.

I tried doing this but it's no go since def_its is defined statically. 
Since tha IA32_SYSCALL_VECTOR is the last one it can't simply be tacked 
at the end of this array in a separate place. Hence the only viable 
solution ( apart from making def_its a dynamically sized array) was to 
simply overwrite IA32_SYSCALL_VECTOR in idt_table before it's being 
loaded into the ldtr.

<snip>
  
Nikolay Borisov June 9, 2023, 4:13 p.m. UTC | #3
On 9.06.23 г. 19:03 ч., Nikolay Borisov wrote:
> 

<snip>

> 
>>
>>> @@ -226,6 +226,13 @@ void __init idt_setup_early_traps(void)
>>>   void __init idt_setup_traps(void)
>>>   {
>>>       idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), 
>>> true);
>>> +
>>> +    if (IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) {
>>
>> Ditto.
> 
> This actually doesn't fail, because if IA32_EMULATION is n that 
> conditional expands to 'if (0 && ia32_disabled)' which is eliminated by 
> the compiler.

Forget this one, it works because I made ia32_disabled always compiled, 
hoowever, the problem in syscall_init() remains because now 
entry_SYSCALL_compat is also compiled in iff CONFIG_IA32_EMULATION is 
selected.

<snip>
  
Thomas Gleixner June 10, 2023, 11:26 a.m. UTC | #4
On Fri, Jun 09 2023 at 19:03, Nikolay Borisov wrote:
> On 9.06.23 г. 18:22 ч., Thomas Gleixner wrote:
>>> +	if ((IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) ||
>>> +	    !IS_ENABLED(CONFIG_IA32_EMULATION)) {
>> 
>> Aside of that this condition is convoluted and can be simplified to
>> exactly a simple and understandable
>> 
>>          if (foo)
>> 
>> which is actually the obvious solution to make it compile in all
>> configurations.
>
> I fail to see how this can be done the way you suggest given that 
> ia32_disabled is visible iff IA32_EMULATION is selected, this means an 
> #ifdef is mandatory so that ia32_disabled is checked when we know it 
> will exist as a symbol, the same applies for the entry_SYSCALL_compat 
> symbol which has to be used iff IA32_EMULATION is defined. I.e the 
> ignore code should also be duplicated in the #ifdef IA32_EMULATION && 
> ia32_disabled and in the #else  condition.

That's wrong in every aspect. Neither the #ifdeffery nor the code
duplication is mandatory.

arch/x86/include/asm/XXXX.h

#ifdef CONFIG_IA32_EMULATION
extern bool __ia32_enabled;

static inline bool ia32_enabled(void)
{
        return __ia32_enabled;
}
#else
static inline bool ia32_enabled(void)
{
        return IS_ENABLED(CONFIG_X86_32);
}
#endif

        if (ia32_enabled()) {
           ...
        } else {
           ...
        }

Which avoids the #ifdeffery in the code _and_ the code duplication.

All it needs aside of the above is to make sure that the other two
things which the compiler complains about being undeclared in the
EMULATION=n case are treated differently in the relevant header
file. It's not rocket science. See also below.

If you chose $XXXX.h carefully it simply works for everything including
asm/elf.h.

I surely wouldn't have suggested it if it wouldn't be feasible and
reasonably trivial. I wanted you to figure it out yourself instead of
serving you the solution on a silver tablet. There are tons of examples
in the code.

>>> @@ -226,6 +226,13 @@ void __init idt_setup_early_traps(void)
>>>   void __init idt_setup_traps(void)
>>>   {
>>>   	idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), true);
>>> +
>>> +	if (IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) {
>> 
>> Ditto.
>
> This actually doesn't fail, because if IA32_EMULATION is n that 
> conditional expands to 'if (0 && ia32_disabled)' which is eliminated by 
> the compiler.

Making uninformed claims does not make it more correct.

This _cannot_ compile because the dead code elimination pass is not even
reached due to ia32_disabled being undeclared.

        declaration != definition
and
        #ifdef CONSTANT != if (CONSTANT)

Compiler basics.

You could have spared yourself the embarrassment and the lecture by
compiling that file with IA32_EMULATION=n or by carefully analysing the
compile fail of cpu/common.c. That code is not any different:

>>> +	if ((IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) ||
>>> +	    !IS_ENABLED(CONFIG_IA32_EMULATION)) {

and the compiler also complains rightfully about ia32_disabled being
undeclared _before_ complaining about entry_*_compat.

So:

// Declaration
extern bool foo;

       if (0 && foo)
          ....

compiles and links without ever defining 'foo'.

While:

#if 0
extern bool foo;
#endif

        if (0 && foo)
          ....

already fails to compile due to -Werror

See?

>>> +		gate_desc null_desc = {};
>> 
>> Lacks a newline between declaration and code. It's documented to be
>> required, no?
>> 
>>> +		write_idt_entry(idt_table, IA32_SYSCALL_VECTOR, &null_desc);
>>> +		clear_bit(IA32_SYSCALL_VECTOR, system_vectors);
>>> +	}
>> 
>> That aside, I asked you to split IA32_SYSCALL_VECTOR out of def_idts[]
>> and handle it separately, no? If you disagree with me then reply to my
>> review first instead of ignoring me silently.
>
> I tried doing this but it's no go since def_its is defined statically. 
> Since tha IA32_SYSCALL_VECTOR is the last one it can't simply be tacked 
> at the end of this array in a separate place. Hence the only viable 
> solution ( apart from making def_its a dynamically sized array) was to 
> simply overwrite IA32_SYSCALL_VECTOR in idt_table before it's being 
> loaded into the ldtr.

Obviously we have fundamentally different interpretations of the phrase
'split IA32_SYSCALL_VECTOR out of def_idts[] and handle it separately':

This splits it out:

 def_idts[] = {
 ... 
-#if defined(CONFIG_IA32_EMULATION)
-	SYSG(IA32_SYSCALL_VECTOR,	entry_INT80_compat),
-#elif defined(CONFIG_X86_32)
-	SYSG(IA32_SYSCALL_VECTOR,	entry_INT80_32),
-#endif
 };

+ia32_idt[] = {
+#if defined(CONFIG_IA32_EMULATION)
+	SYSG(IA32_SYSCALL_VECTOR,	entry_INT80_compat),
+#elif defined(CONFIG_X86_32)
+	SYSG(IA32_SYSCALL_VECTOR,	entry_INT80_32),
+#endif
+};

This handles it separately:

	idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), true);

+	if (ia32_enabled())
+		idt_setup_from_table(idt_table, ia32_idt,....);

Which makes it bloody obvious what this is about.

Are you still convinced that the only viable solution is clearing it
after the fact?

So please go and ensure that all config combinations work and that it
builds and works for 32bit too. The latter fails to build because of
including traps.h into an header for no reason.

There is absolutely no urgency to get this into the tree, so please take
your time and do not rush out the next half baken version of this,
unless you aim for the fast path to my ignore list.

Thanks,

        tglx
  

Patch

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index ab97b22ac04a..1182a5b10be9 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -8,6 +8,7 @@ 
 #include <asm/fixmap.h>
 #include <asm/irq_vectors.h>
 #include <asm/cpu_entry_area.h>
+#include <asm/traps.h>
 
 #include <linux/debug_locks.h>
 #include <linux/smp.h>
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b20774181e1a..3c4055184d0f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2053,24 +2053,25 @@  void syscall_init(void)
 	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
 	wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
 
-#ifdef CONFIG_IA32_EMULATION
-	wrmsrl_cstar((unsigned long)entry_SYSCALL_compat);
-	/*
-	 * This only works on Intel CPUs.
-	 * On AMD CPUs these MSRs are 32-bit, CPU truncates MSR_IA32_SYSENTER_EIP.
-	 * This does not cause SYSENTER to jump to the wrong location, because
-	 * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit).
-	 */
-	wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
-	wrmsrl_safe(MSR_IA32_SYSENTER_ESP,
-		    (unsigned long)(cpu_entry_stack(smp_processor_id()) + 1));
-	wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
-#else
-	wrmsrl_cstar((unsigned long)entry_SYSCALL32_ignore);
-	wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG);
-	wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL);
-	wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL);
-#endif
+	if ((IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) ||
+	    !IS_ENABLED(CONFIG_IA32_EMULATION)) {
+		wrmsrl_cstar((unsigned long)entry_SYSCALL32_ignore);
+		wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG);
+		wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL);
+		wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL);
+	} else {
+		wrmsrl_cstar((unsigned long)entry_SYSCALL_compat);
+		/*
+		 * This only works on Intel CPUs.
+		 * On AMD CPUs these MSRs are 32-bit, CPU truncates MSR_IA32_SYSENTER_EIP.
+		 * This does not cause SYSENTER to jump to the wrong location, because
+		 * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit).
+		 */
+		wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
+		wrmsrl_safe(MSR_IA32_SYSENTER_ESP,
+			    (unsigned long)(cpu_entry_stack(smp_processor_id()) + 1));
+		wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
+	}
 
 	/*
 	 * Flags to clear on syscall; clear as much as possible
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index a58c6bc1cd68..d1f388ef2e66 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -226,6 +226,13 @@  void __init idt_setup_early_traps(void)
 void __init idt_setup_traps(void)
 {
 	idt_setup_from_table(idt_table, def_idts, ARRAY_SIZE(def_idts), true);
+
+	if (IS_ENABLED(CONFIG_IA32_EMULATION) && ia32_disabled) {
+		gate_desc null_desc = {};
+		write_idt_entry(idt_table, IA32_SYSCALL_VECTOR, &null_desc);
+		clear_bit(IA32_SYSCALL_VECTOR, system_vectors);
+	}
+
 }
 
 #ifdef CONFIG_X86_64