[RESEND] riscv: asid: Fixup stale TLB entry cause application crash

Message ID 20221108102044.3317793-1-guoren@kernel.org
State New
Headers
Series [RESEND] riscv: asid: Fixup stale TLB entry cause application crash |

Commit Message

Guo Ren Nov. 8, 2022, 10:20 a.m. UTC
  From: Guo Ren <guoren@linux.alibaba.com>

After use_asid_allocator enabled, the userspace application will
crash for stale tlb entry. Because only using cpumask_clear_cpu without
local_flush_tlb_all couldn't guarantee CPU's tlb entries fresh. Then
set_mm_asid would cause user space application get a stale value by
the stale tlb entry, but set_mm_noasid is okay.

Here is the symptom of the bug:
unhandled signal 11 code 0x1 (coredump)
   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
(gdb) i r s0
s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
(gdb) x/16 0x3fd6d92490
0x3fd6d92490:   0xd80ac8a8      0x0000003f
The core dump file shows that the value of register s0 is wrong, but the
value in memory is right. This is because 'ld s0, -148(s0)' use a stale
mapping entry in TLB and got a wrong value from a stale physical
address.

When task run on CPU0, the task loaded/speculative-loaded the value of
address(0x3fd6d92490), and the first version of tlb mapping entry was
PTWed into CPU0's tlb.
When the task switched from CPU0 to CPU1 without local_tlb_flush_all
(because of asid), the task happened to write a value on address
(0x3fd6d92490). It caused do_page_fault -> wp_page_copy ->
ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need
tlb flush, but CPU0 had cleared the CPU0's mm_cpumask in previous switch_mm.
So we only flushed the CPU1 tlb, and setted second version mapping
of the pte. When the task switch from CPU1 to CPU0 again, CPU0 still used a
stale tlb mapping entry which contained a wrong target physical address.
When the task happened to read that value, the bug would be raised.

Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Anup Patel <apatel@ventanamicro.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
---
 arch/riscv/mm/context.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Conor Dooley Nov. 8, 2022, 10:27 a.m. UTC | #1
On 8 November 2022 10:20:44 GMT, guoren@kernel.org wrote:
>From: Guo Ren <guoren@linux.alibaba.com>
>
>After use_asid_allocator enabled, the userspace application will
>crash for stale tlb entry. Because only using cpumask_clear_cpu without
>local_flush_tlb_all couldn't guarantee CPU's tlb entries fresh. Then
>set_mm_asid would cause user space application get a stale value by
>the stale tlb entry, but set_mm_noasid is okay.
>
>Here is the symptom of the bug:
>unhandled signal 11 code 0x1 (coredump)
>   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
>   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
>=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
>(gdb) i r s0
>s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
>(gdb) x/16 0x3fd6d92490
>0x3fd6d92490:   0xd80ac8a8      0x0000003f
>The core dump file shows that the value of register s0 is wrong, but the
>value in memory is right. This is because 'ld s0, -148(s0)' use a stale
>mapping entry in TLB and got a wrong value from a stale physical
>address.
>
>When task run on CPU0, the task loaded/speculative-loaded the value of
>address(0x3fd6d92490), and the first version of tlb mapping entry was
>PTWed into CPU0's tlb.
>When the task switched from CPU0 to CPU1 without local_tlb_flush_all
>(because of asid), the task happened to write a value on address
>(0x3fd6d92490). It caused do_page_fault -> wp_page_copy ->
>ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
>The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need
>tlb flush, but CPU0 had cleared the CPU0's mm_cpumask in previous switch_mm.
>So we only flushed the CPU1 tlb, and setted second version mapping
>of the pte. When the task switch from CPU1 to CPU0 again, CPU0 still used a
>stale tlb mapping entry which contained a wrong target physical address.
>When the task happened to read that value, the bug would be raised.
>
>Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
>Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
>Signed-off-by: Guo Ren <guoren@kernel.org>
>Cc: Anup Patel <apatel@ventanamicro.com>
>Cc: Palmer Dabbelt <palmer@rivosinc.com>
>---
> arch/riscv/mm/context.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
>index 7acbfbd14557..8ad6c2493e93 100644
>--- a/arch/riscv/mm/context.c
>+++ b/arch/riscv/mm/context.c
>@@ -317,7 +317,9 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> 	 */
> 	cpu = smp_processor_id();
> 
>-	cpumask_clear_cpu(cpu, mm_cpumask(prev));
>+	if (!static_branch_unlikely(&use_asid_allocator))
>+		cpumask_clear_cpu(cpu, mm_cpumask(prev));
>+
> 	cpumask_set_cpu(cpu, mm_cpumask(next));
> 
> 	set_mm(next, cpu);

This is a completely different patch to what you already sent. Why have you marked it RESEND rather than v2?
  
Conor Dooley Nov. 8, 2022, 2:22 p.m. UTC | #2
On Tue, Nov 08, 2022 at 10:27:51AM +0000, Conor Dooley wrote:
> 
> 
> On 8 November 2022 10:20:44 GMT, guoren@kernel.org wrote:
> >From: Guo Ren <guoren@linux.alibaba.com>
> >
> >After use_asid_allocator enabled, the userspace application will
> >crash for stale tlb entry. Because only using cpumask_clear_cpu without
> >local_flush_tlb_all couldn't guarantee CPU's tlb entries fresh. Then
> >set_mm_asid would cause user space application get a stale value by
> >the stale tlb entry, but set_mm_noasid is okay.
> >
> >Here is the symptom of the bug:
> >unhandled signal 11 code 0x1 (coredump)
> >   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
> >   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
> >=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
> >(gdb) i r s0
> >s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
> >(gdb) x/16 0x3fd6d92490
> >0x3fd6d92490:   0xd80ac8a8      0x0000003f
> >The core dump file shows that the value of register s0 is wrong, but the
> >value in memory is right. This is because 'ld s0, -148(s0)' use a stale
> >mapping entry in TLB and got a wrong value from a stale physical
> >address.
> >
> >When task run on CPU0, the task loaded/speculative-loaded the value of
> >address(0x3fd6d92490), and the first version of tlb mapping entry was
> >PTWed into CPU0's tlb.
> >When the task switched from CPU0 to CPU1 without local_tlb_flush_all
> >(because of asid), the task happened to write a value on address
> >(0x3fd6d92490). It caused do_page_fault -> wp_page_copy ->
> >ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
> >The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need
> >tlb flush, but CPU0 had cleared the CPU0's mm_cpumask in previous switch_mm.
> >So we only flushed the CPU1 tlb, and setted second version mapping
> >of the pte. When the task switch from CPU1 to CPU0 again, CPU0 still used a
> >stale tlb mapping entry which contained a wrong target physical address.
> >When the task happened to read that value, the bug would be raised.
> >
> >Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
> >Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> >Signed-off-by: Guo Ren <guoren@kernel.org>
> >Cc: Anup Patel <apatel@ventanamicro.com>
> >Cc: Palmer Dabbelt <palmer@rivosinc.com>
> >---
> > arch/riscv/mm/context.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> >diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
> >index 7acbfbd14557..8ad6c2493e93 100644
> >--- a/arch/riscv/mm/context.c
> >+++ b/arch/riscv/mm/context.c
> >@@ -317,7 +317,9 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> > 	 */
> > 	cpu = smp_processor_id();
> > 
> >-	cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >+	if (!static_branch_unlikely(&use_asid_allocator))
> >+		cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >+
> > 	cpumask_set_cpu(cpu, mm_cpumask(next));
> > 
> > 	set_mm(next, cpu);
> 
> This is a completely different patch to what you already sent.
> Why have you marked it RESEND rather than v2?

In addition, it seems to break the build for the nommu defconfigs.
  
Guo Ren Nov. 9, 2022, 12:30 a.m. UTC | #3
On Tue, Nov 8, 2022 at 6:27 PM Conor Dooley <conor@kernel.org> wrote:
>
>
>
> On 8 November 2022 10:20:44 GMT, guoren@kernel.org wrote:
> >From: Guo Ren <guoren@linux.alibaba.com>
> >
> >After use_asid_allocator enabled, the userspace application will
> >crash for stale tlb entry. Because only using cpumask_clear_cpu without
> >local_flush_tlb_all couldn't guarantee CPU's tlb entries fresh. Then
> >set_mm_asid would cause user space application get a stale value by
> >the stale tlb entry, but set_mm_noasid is okay.
> >
> >Here is the symptom of the bug:
> >unhandled signal 11 code 0x1 (coredump)
> >   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
> >   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
> >=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
> >(gdb) i r s0
> >s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
> >(gdb) x/16 0x3fd6d92490
> >0x3fd6d92490:   0xd80ac8a8      0x0000003f
> >The core dump file shows that the value of register s0 is wrong, but the
> >value in memory is right. This is because 'ld s0, -148(s0)' use a stale
> >mapping entry in TLB and got a wrong value from a stale physical
> >address.
> >
> >When task run on CPU0, the task loaded/speculative-loaded the value of
> >address(0x3fd6d92490), and the first version of tlb mapping entry was
> >PTWed into CPU0's tlb.
> >When the task switched from CPU0 to CPU1 without local_tlb_flush_all
> >(because of asid), the task happened to write a value on address
> >(0x3fd6d92490). It caused do_page_fault -> wp_page_copy ->
> >ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
> >The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need
> >tlb flush, but CPU0 had cleared the CPU0's mm_cpumask in previous switch_mm.
> >So we only flushed the CPU1 tlb, and setted second version mapping
> >of the pte. When the task switch from CPU1 to CPU0 again, CPU0 still used a
> >stale tlb mapping entry which contained a wrong target physical address.
> >When the task happened to read that value, the bug would be raised.
> >
> >Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
> >Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> >Signed-off-by: Guo Ren <guoren@kernel.org>
> >Cc: Anup Patel <apatel@ventanamicro.com>
> >Cc: Palmer Dabbelt <palmer@rivosinc.com>
> >---
> > arch/riscv/mm/context.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> >diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
> >index 7acbfbd14557..8ad6c2493e93 100644
> >--- a/arch/riscv/mm/context.c
> >+++ b/arch/riscv/mm/context.c
> >@@ -317,7 +317,9 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> >        */
> >       cpu = smp_processor_id();
> >
> >-      cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >+      if (!static_branch_unlikely(&use_asid_allocator))
> >+              cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >+
> >       cpumask_set_cpu(cpu, mm_cpumask(next));
> >
> >       set_mm(next, cpu);
>
> This is a completely different patch to what you already sent. Why have you marked it RESEND rather than v2?
Okay, I should send v2.
  
Guo Ren Nov. 9, 2022, 12:30 a.m. UTC | #4
Yes, thx for pointing that out.

On Tue, Nov 8, 2022 at 10:23 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Tue, Nov 08, 2022 at 10:27:51AM +0000, Conor Dooley wrote:
> >
> >
> > On 8 November 2022 10:20:44 GMT, guoren@kernel.org wrote:
> > >From: Guo Ren <guoren@linux.alibaba.com>
> > >
> > >After use_asid_allocator enabled, the userspace application will
> > >crash for stale tlb entry. Because only using cpumask_clear_cpu without
> > >local_flush_tlb_all couldn't guarantee CPU's tlb entries fresh. Then
> > >set_mm_asid would cause user space application get a stale value by
> > >the stale tlb entry, but set_mm_noasid is okay.
> > >
> > >Here is the symptom of the bug:
> > >unhandled signal 11 code 0x1 (coredump)
> > >   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
> > >   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
> > >=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
> > >(gdb) i r s0
> > >s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
> > >(gdb) x/16 0x3fd6d92490
> > >0x3fd6d92490:   0xd80ac8a8      0x0000003f
> > >The core dump file shows that the value of register s0 is wrong, but the
> > >value in memory is right. This is because 'ld s0, -148(s0)' use a stale
> > >mapping entry in TLB and got a wrong value from a stale physical
> > >address.
> > >
> > >When task run on CPU0, the task loaded/speculative-loaded the value of
> > >address(0x3fd6d92490), and the first version of tlb mapping entry was
> > >PTWed into CPU0's tlb.
> > >When the task switched from CPU0 to CPU1 without local_tlb_flush_all
> > >(because of asid), the task happened to write a value on address
> > >(0x3fd6d92490). It caused do_page_fault -> wp_page_copy ->
> > >ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
> > >The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need
> > >tlb flush, but CPU0 had cleared the CPU0's mm_cpumask in previous switch_mm.
> > >So we only flushed the CPU1 tlb, and setted second version mapping
> > >of the pte. When the task switch from CPU1 to CPU0 again, CPU0 still used a
> > >stale tlb mapping entry which contained a wrong target physical address.
> > >When the task happened to read that value, the bug would be raised.
> > >
> > >Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
> > >Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > >Signed-off-by: Guo Ren <guoren@kernel.org>
> > >Cc: Anup Patel <apatel@ventanamicro.com>
> > >Cc: Palmer Dabbelt <palmer@rivosinc.com>
> > >---
> > > arch/riscv/mm/context.c | 4 +++-
> > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > >diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
> > >index 7acbfbd14557..8ad6c2493e93 100644
> > >--- a/arch/riscv/mm/context.c
> > >+++ b/arch/riscv/mm/context.c
> > >@@ -317,7 +317,9 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> > >      */
> > >     cpu = smp_processor_id();
> > >
> > >-    cpumask_clear_cpu(cpu, mm_cpumask(prev));
> > >+    if (!static_branch_unlikely(&use_asid_allocator))
> > >+            cpumask_clear_cpu(cpu, mm_cpumask(prev));
> > >+
> > >     cpumask_set_cpu(cpu, mm_cpumask(next));
> > >
> > >     set_mm(next, cpu);
> >
> > This is a completely different patch to what you already sent.
> > Why have you marked it RESEND rather than v2?
>
> In addition, it seems to break the build for the nommu defconfigs.
  
kernel test robot Nov. 9, 2022, 1:42 a.m. UTC | #5
Hi,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.1-rc4 next-20221108]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/guoren-kernel-org/riscv-asid-Fixup-stale-TLB-entry-cause-application-crash/20221108-182151
patch link:    https://lore.kernel.org/r/20221108102044.3317793-1-guoren%40kernel.org
patch subject: [PATCH RESEND] riscv: asid: Fixup stale TLB entry cause application crash
config: riscv-nommu_virt_defconfig
compiler: riscv64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/025af39c1eb149e584705dfc72ac117812c82f3e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review guoren-kernel-org/riscv-asid-Fixup-stale-TLB-entry-cause-application-crash/20221108-182151
        git checkout 025af39c1eb149e584705dfc72ac117812c82f3e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   riscv64-linux-ld: arch/riscv/mm/context.o: in function `.L0 ':
>> context.c:(__jump_table+0x8): undefined reference to `use_asid_allocator'
  
kernel test robot Nov. 9, 2022, 2:33 a.m. UTC | #6
Hi,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.1-rc4 next-20221108]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/guoren-kernel-org/riscv-asid-Fixup-stale-TLB-entry-cause-application-crash/20221108-182151
patch link:    https://lore.kernel.org/r/20221108102044.3317793-1-guoren%40kernel.org
patch subject: [PATCH RESEND] riscv: asid: Fixup stale TLB entry cause application crash
config: riscv-nommu_k210_defconfig
compiler: riscv64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/025af39c1eb149e584705dfc72ac117812c82f3e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review guoren-kernel-org/riscv-asid-Fixup-stale-TLB-entry-cause-application-crash/20221108-182151
        git checkout 025af39c1eb149e584705dfc72ac117812c82f3e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   riscv64-linux-ld: arch/riscv/mm/context.o: in function `.L0 ':
>> context.c:(.text+0x2e): undefined reference to `use_asid_allocator'
  

Patch

diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 7acbfbd14557..8ad6c2493e93 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -317,7 +317,9 @@  void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	 */
 	cpu = smp_processor_id();
 
-	cpumask_clear_cpu(cpu, mm_cpumask(prev));
+	if (!static_branch_unlikely(&use_asid_allocator))
+		cpumask_clear_cpu(cpu, mm_cpumask(prev));
+
 	cpumask_set_cpu(cpu, mm_cpumask(next));
 
 	set_mm(next, cpu);