[1/1] x86/vdso: use the right GDT_ENTRY_CPUNODE for 32-bit getcpu() on 64-bit kernel

Message ID 20230311084824.2340-1-xin3.li@intel.com
State New
Headers
Series [1/1] x86/vdso: use the right GDT_ENTRY_CPUNODE for 32-bit getcpu() on 64-bit kernel |

Commit Message

Li, Xin3 March 11, 2023, 8:48 a.m. UTC
  The vDSO getcpu() reads CPU ID from the GDT_ENTRY_CPUNODE entry when the RDPID
instruction is not available. And GDT_ENTRY_CPUNODE is defined as 28 on 32-bit
Linux kernel and 15 on 64-bit. But the 32-bit getcpu() on 64-bit Linux kernel
is compiled with 32-bit Linux kernel GDT_ENTRY_CPUNODE, i.e., 28, beyond the
64-bit Linux kernel GDT limit. Thus, it just fails _silently_.

Compile the 32-bit getcpu() on 64-bit Linux kernel with the 64-bit Linux kernel
GDT_ENTRY_CPUNODE.

Fixes: 877cff5296faa6e ("x86/vdso: Fake 32bit VDSO build on 64bit compile for vgetcpu")
Reported-by: kernel test robot <yujie.liu@intel.com>
https://lore.kernel.org/oe-lkp/202303020903.b01fd1de-yujie.liu@intel.com/
Reported-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/segment.h | 4 ++++
 1 file changed, 4 insertions(+)
  

Comments

Dave Hansen March 13, 2023, 3:22 p.m. UTC | #1
On 3/11/23 00:48, Xin Li wrote:
>  #define GDT_ENTRY_ESPFIX_SS		26
>  #define GDT_ENTRY_PERCPU		27
> +#ifndef BUILD_VDSO32_64
>  #define GDT_ENTRY_CPUNODE		28
> +#else
> +#define GDT_ENTRY_CPUNODE		15
> +#endif

Isn't this kinda a hack?

First, it means that we'll now have two duplicate versions of this:

	#define GDT_ENTRY_CPUNODE		15

in the same file.

Second, if any other users of fake_32bit_build.h for the VDSO show up,
they'll need a similar #ifdef.

I think I'd much rather if we define all of the GDT_ENTRY_* macros in
*one* place, then make that *one* place depend on BUILD_VDSO32_64.

Also, about the *silent* failure...  Do we not have a selftest for this
somewhere?
  
Li, Xin3 March 13, 2023, 5:42 p.m. UTC | #2
> > +#ifndef BUILD_VDSO32_64
> >  #define GDT_ENTRY_CPUNODE		28
> > +#else
> > +#define GDT_ENTRY_CPUNODE		15
> > +#endif
> 
> Isn't this kinda a hack?
> 
> First, it means that we'll now have two duplicate versions of this:
> 
> 	#define GDT_ENTRY_CPUNODE		15
> 
> in the same file.
> 
> Second, if any other users of fake_32bit_build.h for the VDSO show up, they'll
> need a similar #ifdef.
> 
> I think I'd much rather if we define all of the GDT_ENTRY_* macros in
> *one* place, then make that *one* place depend on BUILD_VDSO32_64.

Sounds a better way, let me try.

> Also, about the *silent* failure...  Do we not have a selftest for this somewhere?

When lsl is used, we should check ZF which indicates whether the segment limit
is loaded successfully.  Seems we need to refactor vdso_read_cpunode() a bit.

  Xin
  
Li, Xin3 March 18, 2023, 8:14 a.m. UTC | #3
> > I think I'd much rather if we define all of the GDT_ENTRY_* macros in
> > *one* place, then make that *one* place depend on BUILD_VDSO32_64.
> 
> Sounds a better way, let me try.

Hi Dave,

I tried the following patch, and it works:

diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
index 794f69625780..9d6411c65920 100644
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -56,7 +56,7 @@

 #define GDT_ENTRY_INVALID_SEG  0

-#ifdef CONFIG_X86_32
+#if defined(CONFIG_X86_32) && !defined(BUILD_VDSO32_64)
 /*
  * The layout of the per-CPU GDT under Linux:
  *


It's simpler and looks reasonable to me. Is it what you suggested?

Thanks!
  Xin


> 
> > Also, about the *silent* failure...  Do we not have a selftest for this
> somewhere?
> 
> When lsl is used, we should check ZF which indicates whether the segment limit is
> loaded successfully.  Seems we need to refactor vdso_read_cpunode() a bit.
> 
>   Xin
  
Li, Xin3 March 29, 2023, 8:50 a.m. UTC | #4
> -----Original Message-----
> From: Li, Xin3 <xin3.li@intel.com>
> Sent: Monday, March 13, 2023 10:43 AM
> To: Hansen, Dave <dave.hansen@intel.com>; linux-kernel@vger.kernel.org;
> x86@kernel.org
> Cc: tglx@linutronix.de; mingo@redhat.com; bp@alien8.de;
> dave.hansen@linux.intel.com; hpa@zytor.com; bigeasy@linutronix.de; Liu, Yujie
> <yujie.liu@intel.com>; Kang, Shan <shan.kang@intel.com>
> Subject: RE: [PATCH 1/1] x86/vdso: use the right GDT_ENTRY_CPUNODE for 32-bit
> getcpu() on 64-bit kernel
> 
> > > +#ifndef BUILD_VDSO32_64
> > >  #define GDT_ENTRY_CPUNODE		28
> > > +#else
> > > +#define GDT_ENTRY_CPUNODE		15
> > > +#endif
> >
> > Isn't this kinda a hack?
> >
> > First, it means that we'll now have two duplicate versions of this:
> >
> > 	#define GDT_ENTRY_CPUNODE		15
> >
> > in the same file.
> >
> > Second, if any other users of fake_32bit_build.h for the VDSO show up, they'll
> > need a similar #ifdef.
> >
> > I think I'd much rather if we define all of the GDT_ENTRY_* macros in
> > *one* place, then make that *one* place depend on BUILD_VDSO32_64.
> 
> Sounds a better way, let me try.
> 
> > Also, about the *silent* failure...  Do we not have a selftest for this somewhere?
> 
> When lsl is used, we should check ZF which indicates whether the segment limit
> is loaded successfully.  Seems we need to refactor vdso_read_cpunode() a bit.

Hi Dave,

How about the following patch to address the silent failure?

diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
index 794f69625780..d75ce4afff5b 100644
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -254,7 +254,10 @@ static inline void vdso_read_cpunode(unsigned *cpu, unsigned *node)
         *
         * If RDPID is available, use it.
         */
-       alternative_io ("lsl %[seg],%[p]",
+       alternative_io ("lsl %[seg],%[p]\n"
+                       "jz 1f\n"
+                       "mov $-1,%[p]\n"
+                       "1:",
                        ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
                        X86_FEATURE_RDPID,
                        [p] "=a" (p), [seg] "r" (__CPUNODE_SEG));


And the test then reports CPU id 4095 (not a big enough #), which can be
used to indicate a failure of the lsl instruction:

~/selftests$ sudo ./run_kselftest.sh -t x86:test_vsyscall_32
TAP version 13
1..1
# selftests: x86: test_vsyscall_32
# [RUN] test gettimeofday()
#       vDSO time offsets: 0.000028 0.000000
# [OK]  vDSO gettimeofday()'s timeval was okay
# [RUN] test time()
# [OK]  vDSO time() is okay
# [RUN] getcpu() on CPU 0
# [FAIL]        vDSO reported CPU 4095 but should be 0
# [FAIL]        vDSO reported node 65535 but should be 0
# [RUN] getcpu() on CPU 1
# [FAIL]        vDSO reported CPU 4095 but should be 1
# [FAIL]        vDSO reported node 65535 but should be 0
not ok 1 selftests: x86: test_vsyscall_32 # exit=1

Thanks!
  Xin
  
Li, Xin3 March 29, 2023, 11:11 p.m. UTC | #5
> > > > +#ifndef BUILD_VDSO32_64
> > > >  #define GDT_ENTRY_CPUNODE		28
> > > > +#else
> > > > +#define GDT_ENTRY_CPUNODE		15
> > > > +#endif
> > >
> > > Isn't this kinda a hack?
> > >
> > > First, it means that we'll now have two duplicate versions of this:
> > >
> > > 	#define GDT_ENTRY_CPUNODE		15
> > >
> > > in the same file.
> > >
> > > Second, if any other users of fake_32bit_build.h for the VDSO show
> > > up, they'll need a similar #ifdef.
> > >
> > > I think I'd much rather if we define all of the GDT_ENTRY_* macros
> > > in
> > > *one* place, then make that *one* place depend on BUILD_VDSO32_64.
> >
> > Sounds a better way, let me try.
> >
> > > Also, about the *silent* failure...  Do we not have a selftest for this somewhere?
> >
> > When lsl is used, we should check ZF which indicates whether the
> > segment limit is loaded successfully.  Seems we need to refactor
> vdso_read_cpunode() a bit.
> 
> Hi Dave,
> 
> How about the following patch to address the silent failure?
> 
> diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
> index 794f69625780..d75ce4afff5b 100644
> --- a/arch/x86/include/asm/segment.h
> +++ b/arch/x86/include/asm/segment.h
> @@ -254,7 +254,10 @@ static inline void vdso_read_cpunode(unsigned *cpu,
> unsigned *node)
>          *
>          * If RDPID is available, use it.
>          */
> -       alternative_io ("lsl %[seg],%[p]",
> +       alternative_io ("lsl %[seg],%[p]\n"
> +                       "jz 1f\n"
> +                       "mov $-1,%[p]\n"
> +                       "1:",
>                         ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
>                         X86_FEATURE_RDPID,
>                         [p] "=a" (p), [seg] "r" (__CPUNODE_SEG));
> 
> 
> And the test then reports CPU id 4095 (not a big enough #), which can be used to
> indicate a failure of the lsl instruction:

Having to say, it's a *bad* idea to use a special # to indicate an error.
But there is also no reasonable errno for getcpu() to return to its caller,
saying "we had a problem in the syscall's kernel implementation".

> 
> ~/selftests$ sudo ./run_kselftest.sh -t x86:test_vsyscall_32 TAP version 13
> 1..1
> # selftests: x86: test_vsyscall_32
> # [RUN] test gettimeofday()
> #       vDSO time offsets: 0.000028 0.000000
> # [OK]  vDSO gettimeofday()'s timeval was okay # [RUN] test time() # [OK]  vDSO
> time() is okay # [RUN] getcpu() on CPU 0
> # [FAIL]        vDSO reported CPU 4095 but should be 0
> # [FAIL]        vDSO reported node 65535 but should be 0
> # [RUN] getcpu() on CPU 1
> # [FAIL]        vDSO reported CPU 4095 but should be 1
> # [FAIL]        vDSO reported node 65535 but should be 0
> not ok 1 selftests: x86: test_vsyscall_32 # exit=1
> 
> Thanks!
>   Xin
  
H. Peter Anvin March 29, 2023, 11:59 p.m. UTC | #6
On March 29, 2023 4:11:09 PM PDT, "Li, Xin3" <xin3.li@intel.com> wrote:
>> > > > +#ifndef BUILD_VDSO32_64
>> > > >  #define GDT_ENTRY_CPUNODE		28
>> > > > +#else
>> > > > +#define GDT_ENTRY_CPUNODE		15
>> > > > +#endif
>> > >
>> > > Isn't this kinda a hack?
>> > >
>> > > First, it means that we'll now have two duplicate versions of this:
>> > >
>> > > 	#define GDT_ENTRY_CPUNODE		15
>> > >
>> > > in the same file.
>> > >
>> > > Second, if any other users of fake_32bit_build.h for the VDSO show
>> > > up, they'll need a similar #ifdef.
>> > >
>> > > I think I'd much rather if we define all of the GDT_ENTRY_* macros
>> > > in
>> > > *one* place, then make that *one* place depend on BUILD_VDSO32_64.
>> >
>> > Sounds a better way, let me try.
>> >
>> > > Also, about the *silent* failure...  Do we not have a selftest for this somewhere?
>> >
>> > When lsl is used, we should check ZF which indicates whether the
>> > segment limit is loaded successfully.  Seems we need to refactor
>> vdso_read_cpunode() a bit.
>> 
>> Hi Dave,
>> 
>> How about the following patch to address the silent failure?
>> 
>> diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
>> index 794f69625780..d75ce4afff5b 100644
>> --- a/arch/x86/include/asm/segment.h
>> +++ b/arch/x86/include/asm/segment.h
>> @@ -254,7 +254,10 @@ static inline void vdso_read_cpunode(unsigned *cpu,
>> unsigned *node)
>>          *
>>          * If RDPID is available, use it.
>>          */
>> -       alternative_io ("lsl %[seg],%[p]",
>> +       alternative_io ("lsl %[seg],%[p]\n"
>> +                       "jz 1f\n"
>> +                       "mov $-1,%[p]\n"
>> +                       "1:",
>>                         ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
>>                         X86_FEATURE_RDPID,
>>                         [p] "=a" (p), [seg] "r" (__CPUNODE_SEG));
>> 
>> 
>> And the test then reports CPU id 4095 (not a big enough #), which can be used to
>> indicate a failure of the lsl instruction:
>
>Having to say, it's a *bad* idea to use a special # to indicate an error.
>But there is also no reasonable errno for getcpu() to return to its caller,
>saying "we had a problem in the syscall's kernel implementation".
>
>> 
>> ~/selftests$ sudo ./run_kselftest.sh -t x86:test_vsyscall_32 TAP version 13
>> 1..1
>> # selftests: x86: test_vsyscall_32
>> # [RUN] test gettimeofday()
>> #       vDSO time offsets: 0.000028 0.000000
>> # [OK]  vDSO gettimeofday()'s timeval was okay # [RUN] test time() # [OK]  vDSO
>> time() is okay # [RUN] getcpu() on CPU 0
>> # [FAIL]        vDSO reported CPU 4095 but should be 0
>> # [FAIL]        vDSO reported node 65535 but should be 0
>> # [RUN] getcpu() on CPU 1
>> # [FAIL]        vDSO reported CPU 4095 but should be 1
>> # [FAIL]        vDSO reported node 65535 but should be 0
>> not ok 1 selftests: x86: test_vsyscall_32 # exit=1
>> 
>> Thanks!
>>   Xin
>

I don't think we should put a test in the vdso implementation. We need a self test, to be sure, and we should check that LSL works standalone (because of Oracle et al), but putting an assert like this in the vdso is like of odd at best.

If we *do* put in a test, it should trap to the kernel, not return an errno, because it is the equivalent of putting a BUG() in the kernel. In this case we can even do better, because we can execute the getcpu system call at that point.

But... why? We generally trust the kernel implementation.
  

Patch

diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
index 794f69625780..d3b4f8797054 100644
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -119,7 +119,11 @@ 
 
 #define GDT_ENTRY_ESPFIX_SS		26
 #define GDT_ENTRY_PERCPU		27
+#ifndef BUILD_VDSO32_64
 #define GDT_ENTRY_CPUNODE		28
+#else
+#define GDT_ENTRY_CPUNODE		15
+#endif
 
 #define GDT_ENTRY_DOUBLEFAULT_TSS	31