Message ID | 20230112090603.1295340-2-guoren@kernel.org |
---|---|
State | New |
Headers |
From: guoren@kernel.org
To: anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com, conor.dooley@microchip.com, heiko@sntech.de, rostedt@goodmis.org, mhiramat@kernel.org, jolsa@redhat.com, bp@suse.de, jpoimboe@kernel.org, suagrfillet@gmail.com, andy.chiu@sifive.com, e.shatokhin@yadro.com, guoren@kernel.org
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH -next V7 1/7] riscv: ftrace: Fixup panic by disabling preemption
Date: Thu, 12 Jan 2023 04:05:57 -0500
Message-Id: <20230112090603.1295340-2-guoren@kernel.org>
X-Mailer: git-send-email 2.36.1
In-Reply-To: <20230112090603.1295340-1-guoren@kernel.org>
References: <20230112090603.1295340-1-guoren@kernel.org>
Series | riscv: Optimize function trace
Commit Message
Guo Ren
Jan. 12, 2023, 9:05 a.m. UTC
From: Andy Chiu <andy.chiu@sifive.com>

In RISC-V, we must use an AUIPC + JALR pair to encode an immediate forming a jump to an address more than 4K away. This may cause errors if we want to enable kernel preemption and remove the dependency on patching code with stop_machine(). For example, a task could be switched out right after executing the auipc; if the ftrace function is changed before the task is switched back in, the jalr would combine the new bits 11:0 with the previous bits XLEN-1:12 and jump to a bogus address.

p: patched area performed by dynamic ftrace

ftrace_prologue:
p|  REG_S  ra, -SZREG(sp)
p|  auipc  ra, 0x?        ------------> preempted
                          ...
                          change ftrace function
                          ...
p|  jalr   -?(ra)         <------------ switched back
p|  REG_L  ra, -SZREG(sp)
func:
    xxx
    ret

Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
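To make the mixed-halves hazard concrete, here is a small user-space sketch (an editorial illustration, not kernel code; the addresses are made up). It splits a pc-relative offset into the hi20/lo12 halves the way an auipc+jalr pair encodes it, then shows the bogus target that results when the auipc half comes from the old patch generation and the jalr half from the new one:

#include <stdint.h>
#include <stdio.h>

/* Split a pc-relative offset the way an auipc+jalr pair encodes it:
 * auipc adds hi20 << 12 to pc, jalr adds the sign-extended lo12. */
static void split_offset(int64_t off, int64_t *hi20, int64_t *lo12)
{
	/* The +0x800 keeps lo12 inside the signed [-2048, 2047] range. */
	*hi20 = (off + 0x800) >> 12;
	*lo12 = off - (*hi20 << 12);
}

int main(void)
{
	int64_t pc = 0x80200000;           /* patched call site (made up)  */
	int64_t old_target = 0x80a01234;   /* trampoline before the update */
	int64_t new_target = 0x80bfff00;   /* trampoline after the update  */
	int64_t old_hi, old_lo, new_hi, new_lo;

	split_offset(old_target - pc, &old_hi, &old_lo);
	split_offset(new_target - pc, &new_hi, &new_lo);

	/* The task executed the old auipc, was switched out across the
	 * patch, and resumed on the new jalr: old hi20 + new lo12.      */
	int64_t mixed = pc + (old_hi << 12) + new_lo;

	printf("old target: 0x%llx\n", (long long)old_target);
	printf("new target: 0x%llx\n", (long long)new_target);
	printf("mixed jump: 0x%llx  (neither of the above)\n", (long long)mixed);
	return 0;
}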
Comments
Hi Guo,

On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@kernel.org wrote:
> From: Andy Chiu <andy.chiu@sifive.com>
>
> In RISCV, we must use an AUIPC + JALR pair to encode an immediate, forming a jump that jumps to an address over 4K. This may cause errors if we want to enable kernel preemption and remove dependency from patching code with stop_machine(). For example, if a task was switched out on auipc. And, if we changed the ftrace function before it was switched back, then it would jump to an address that has updated 11:0 bits mixing with previous XLEN:12 part.
>
> p: patched area performed by dynamic ftrace
> ftrace_prologue:
> p| REG_S ra, -SZREG(sp)
> p| auipc ra, 0x? ------------> preempted
> ...
> change ftrace function
> ...
> p| jalr -?(ra) <------------- switched back
> p| REG_L ra, -SZREG(sp)
> func:
> xxx
> ret

As mentioned on the last posting, I don't think this is sufficient to fix the issue. I've replied with more detail there:

https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/

Even in a non-preemptible SMP kernel, if one CPU can be in the middle of executing the ftrace_prologue while another CPU is patching the ftrace_prologue, you have the exact same issue.

For example, if CPU X, in the prologue, fetches the old AUIPC and the new JALR (because it races with CPU Y modifying those), CPU X will branch to the wrong address. The race window is much smaller in the absence of preemption, but it's still there (and will be exacerbated in virtual machines since the hypervisor can preempt a vCPU at any time).

Note that the above is even assuming that instruction fetches are atomic, which I'm not sure is the case; for example arm64 has special CMODX / "Concurrent MODification and eXecution of instructions" rules which mean only certain instructions can be patched atomically.

Either I'm missing something that provides mutual exclusion between the patching and execution of the ftrace_prologue, or this patch is not sufficient.

Thanks,
Mark.

> Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> ---
>  arch/riscv/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index e2b656043abf..ee0d39b26794 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -138,7 +138,7 @@ config RISCV
>  	select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
>  	select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
>  	select HAVE_FUNCTION_GRAPH_TRACER
> -	select HAVE_FUNCTION_TRACER if !XIP_KERNEL
> +	select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
>
>  config ARCH_MMAP_RND_BITS_MIN
>  	default 18 if 64BIT
> --
> 2.36.1
>
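Mark's non-preemptible SMP point can be reproduced as a plain user-space analogy: when two adjacent 32-bit slots are rewritten with two separate stores, a concurrent reader can observe one old and one new word. A minimal sketch of that analogy (all names and values are invented for illustration; this is not how the kernel patches text):

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Two "instruction" slots standing in for the AUIPC and JALR words. */
static _Atomic uint32_t site[2] = { 0x11111111u, 0x22222222u };

static void *patcher(void *unused)
{
	/* Two separate stores, like patching an 8-byte call site that is
	 * only guaranteed to be 4-byte aligned. */
	atomic_store(&site[0], 0xaaaaaaaau);	/* new "auipc" */
	atomic_store(&site[1], 0xbbbbbbbbu);	/* new "jalr"  */
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, patcher, NULL);

	/* Another CPU "fetching" through the call site meanwhile. */
	uint32_t insn0 = atomic_load(&site[0]);
	uint32_t insn1 = atomic_load(&site[1]);

	pthread_join(t, NULL);

	/* Depending on timing this prints old/old, new/new, or the torn
	 * old/new pair that corresponds to branching to a bogus address. */
	printf("fetched: 0x%x 0x%x\n", insn0, insn1);
	return 0;
}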
On Thu, Jan 12, 2023 at 12:16:02PM +0000, Mark Rutland wrote:
> Hi Guo,
>
> On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@kernel.org wrote:
> > From: Andy Chiu <andy.chiu@sifive.com>
> >
> > In RISCV, we must use an AUIPC + JALR pair to encode an immediate, forming a jump that jumps to an address over 4K. This may cause errors if we want to enable kernel preemption and remove dependency from patching code with stop_machine(). For example, if a task was switched out on auipc. And, if we changed the ftrace function before it was switched back, then it would jump to an address that has updated 11:0 bits mixing with previous XLEN:12 part.
> >
> > p: patched area performed by dynamic ftrace
> > ftrace_prologue:
> > p| REG_S ra, -SZREG(sp)
> > p| auipc ra, 0x? ------------> preempted
> > ...
> > change ftrace function
> > ...
> > p| jalr -?(ra) <------------- switched back
> > p| REG_L ra, -SZREG(sp)
> > func:
> > xxx
> > ret
>
> As mentioned on the last posting, I don't think this is sufficient to fix the issue. I've replied with more detail there:
>
> https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/
>
> Even in a non-preemptible SMP kernel, if one CPU can be in the middle of executing the ftrace_prologue while another CPU is patching the ftrace_prologue, you have the exact same issue.
>
> For example, if CPU X, in the prologue, fetches the old AUIPC and the new JALR (because it races with CPU Y modifying those), CPU X will branch to the wrong address. The race window is much smaller in the absence of preemption, but it's still there (and will be exacerbated in virtual machines since the hypervisor can preempt a vCPU at any time).

With that in mind, I think your current implementation of ftrace_make_call() and ftrace_make_nop() has a similar bug. A caller might execute:

	NOP                                        // not yet patched to AUIPC
	< AUIPC and JALR instructions both patched >
	JALR

... and go to the wrong place.

Assuming individual instruction fetches are atomic, and that you only ever branch to the same trampoline, you could fix that by always leaving the AUIPC in place, so that you only patch the JALR to enable/disable the callsite.

Depending on your calling convention, if you have two free GPRs, you might be able to avoid the stacking of RA by always saving it to a GPR in the callsite, using a different GPR for the address generation, and having the ftrace trampoline restore the original RA value, e.g.

	MV    GPR1, ra
	AUIPC GPR2, high_bits_of(ftrace_caller)
	JALR  ra, high_bits(GPR2)   // only patch this

... which'd save an instruction per callsite.

Thanks,
Mark.
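A sketch of what "only patch the JALR" could look like for a call site laid out as "auipc t1, hi20 ; jalr ra, lo12(t1)". The layout, the choice of t1, and the patch_u32() helper are assumptions made for illustration; this is not the existing riscv ftrace code. Because the AUIPC only clobbers a caller-saved temporary, it can stay in place while tracing is off, so enabling or disabling a call site is a single 32-bit update:

#include <stdint.h>

#define RISCV_INSN_NOP	0x00000013u		/* addi x0, x0, 0 */

/* Encode "jalr rd, lo12(rs1)" (opcode 0x67, funct3 0). */
static uint32_t riscv_insn_jalr(uint32_t rd, uint32_t rs1, int32_t lo12)
{
	return ((uint32_t)lo12 & 0xfff) << 20 | rs1 << 15 | rd << 7 | 0x67;
}

/* Placeholder for the port's real text-patching primitive; this plain
 * store does no icache maintenance and is for illustration only. */
static int patch_u32(uint32_t *addr, uint32_t insn)
{
	*addr = insn;
	return 0;
}

/* Call-site layout assumed here:
 *   callsite + 0: auipc t1, hi20       (written once, never repatched)
 *   callsite + 4: jalr  ra, lo12(t1)  <->  nop                         */
int ftrace_callsite_enable(uint32_t *callsite, int32_t lo12)
{
	return patch_u32(callsite + 1,
			 riscv_insn_jalr(1 /* ra */, 6 /* t1 */, lo12));
}

int ftrace_callsite_disable(uint32_t *callsite)
{
	return patch_u32(callsite + 1, RISCV_INSN_NOP);
}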
On Thu, Jan 12, 2023 at 8:16 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Hi Guo,
>
> On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@kernel.org wrote:
> > From: Andy Chiu <andy.chiu@sifive.com>
> >
> > In RISCV, we must use an AUIPC + JALR pair to encode an immediate, forming a jump that jumps to an address over 4K. This may cause errors if we want to enable kernel preemption and remove dependency from patching code with stop_machine(). For example, if a task was switched out on auipc. And, if we changed the ftrace function before it was switched back, then it would jump to an address that has updated 11:0 bits mixing with previous XLEN:12 part.
> >
> > p: patched area performed by dynamic ftrace
> > ftrace_prologue:
> > p| REG_S ra, -SZREG(sp)
> > p| auipc ra, 0x? ------------> preempted
> > ...
> > change ftrace function
> > ...
> > p| jalr -?(ra) <------------- switched back
> > p| REG_L ra, -SZREG(sp)
> > func:
> > xxx
> > ret
>
> As mentioned on the last posting, I don't think this is sufficient to fix the issue. I've replied with more detail there:
>
> https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/
>
> Even in a non-preemptible SMP kernel, if one CPU can be in the middle of executing the ftrace_prologue while another CPU is patching the ftrace_prologue, you have the exact same issue.
>
> For example, if CPU X, in the prologue, fetches the old AUIPC and the new JALR (because it races with CPU Y modifying those), CPU X will branch to the wrong address. The race window is much smaller in the absence of preemption, but it's still there (and will be exacerbated in virtual machines since the hypervisor can preempt a vCPU at any time).
>
> Note that the above is even assuming that instruction fetches are atomic, which I'm not sure is the case; for example arm64 has special CMODX / "Concurrent MODification and eXecution of instructions" rules which mean only certain instructions can be patched atomically.
>
> Either I'm missing something that provides mutual exclusion between the patching and execution of the ftrace_prologue, or this patch is not sufficient.

This patch is sufficient because riscv isn't the same as arm64. It uses the default arch_ftrace_update_code(), which uses stop_machine(). See kernel/trace/ftrace.c:

	void __weak arch_ftrace_update_code(int command)
	{
		ftrace_run_stop_machine(command);
	}

ps: Yes, it's not good, and it's expensive.

>
> Thanks,
> Mark.
>
> > Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")
> > Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > ---
> >  arch/riscv/Kconfig | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index e2b656043abf..ee0d39b26794 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -138,7 +138,7 @@ config RISCV
> >  	select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
> >  	select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> >  	select HAVE_FUNCTION_GRAPH_TRACER
> > -	select HAVE_FUNCTION_TRACER if !XIP_KERNEL
> > +	select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
> >
> >  config ARCH_MMAP_RND_BITS_MIN
> >  	default 18 if 64BIT
> > --
> > 2.36.1
> >
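For readers following along, the generic path Guo Ren refers to looks roughly like the following (condensed from kernel/trace/ftrace.c; check the tree in question for the exact code). Every CPU is parked inside stop_machine() while the call sites are rewritten, so no core can be executing through a half-patched AUIPC/JALR pair:

	/* Condensed from kernel/trace/ftrace.c. */
	static int __ftrace_modify_code(void *data)
	{
		int *command = data;

		/* Runs with all other CPUs stopped. */
		ftrace_modify_all_code(*command);
		return 0;
	}

	void ftrace_run_stop_machine(int command)
	{
		stop_machine(__ftrace_modify_code, &command, NULL);
	}

	void __weak arch_ftrace_update_code(int command)
	{
		ftrace_run_stop_machine(command);
	}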
On Thu, Jan 12, 2023 at 8:57 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Thu, Jan 12, 2023 at 12:16:02PM +0000, Mark Rutland wrote:
> > Hi Guo,
> >
> > On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@kernel.org wrote:
> > > From: Andy Chiu <andy.chiu@sifive.com>
> > >
> > > In RISCV, we must use an AUIPC + JALR pair to encode an immediate, forming a jump that jumps to an address over 4K. This may cause errors if we want to enable kernel preemption and remove dependency from patching code with stop_machine(). For example, if a task was switched out on auipc. And, if we changed the ftrace function before it was switched back, then it would jump to an address that has updated 11:0 bits mixing with previous XLEN:12 part.
> > >
> > > p: patched area performed by dynamic ftrace
> > > ftrace_prologue:
> > > p| REG_S ra, -SZREG(sp)
> > > p| auipc ra, 0x? ------------> preempted
> > > ...
> > > change ftrace function
> > > ...
> > > p| jalr -?(ra) <------------- switched back
> > > p| REG_L ra, -SZREG(sp)
> > > func:
> > > xxx
> > > ret
> >
> > As mentioned on the last posting, I don't think this is sufficient to fix the issue. I've replied with more detail there:
> >
> > https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/
> >
> > Even in a non-preemptible SMP kernel, if one CPU can be in the middle of executing the ftrace_prologue while another CPU is patching the ftrace_prologue, you have the exact same issue.
> >
> > For example, if CPU X, in the prologue, fetches the old AUIPC and the new JALR (because it races with CPU Y modifying those), CPU X will branch to the wrong address. The race window is much smaller in the absence of preemption, but it's still there (and will be exacerbated in virtual machines since the hypervisor can preempt a vCPU at any time).
>
> With that in mind, I think your current implementation of ftrace_make_call() and ftrace_make_nop() has a similar bug. A caller might execute:
>
>	NOP                                        // not yet patched to AUIPC
>	< AUIPC and JALR instructions both patched >
>	JALR
>
> ... and go to the wrong place.
>
> Assuming individual instruction fetches are atomic, and that you only ever branch to the same trampoline, you could fix that by always leaving the AUIPC in place, so that you only patch the JALR to enable/disable the callsite.

Yes, the same trampoline is one of the antidotes.

>
> Depending on your calling convention, if you have two free GPRs, you might be able to avoid the stacking of RA by always saving it to a GPR in the callsite, using a different GPR for the address generation, and having the ftrace trampoline restore the original RA value, e.g.
>
>	MV    GPR1, ra
>	AUIPC GPR2, high_bits_of(ftrace_caller)
>	JALR  ra, high_bits(GPR2)   // only patch this

I think you mean temp registers here. We are at the prologue of a function, so we have all of them. But why do you need another "MV GPR1, ra"?

	AUIPC GPR2, high_bits_of(ftrace_caller)
	JALR  GPR2, high_bits(GPR2)   // only patch this

We could reserve ra on the trampoline:

	MV XX, ra

>
> ... which'd save an instruction per callsite.
>
> Thanks,
> Mark.
On Sat, Jan 28, 2023 at 05:37:46PM +0800, Guo Ren wrote:
> On Thu, Jan 12, 2023 at 8:16 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > Hi Guo,
> >
> > On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@kernel.org wrote:
> > > From: Andy Chiu <andy.chiu@sifive.com>
> > >
> > > In RISCV, we must use an AUIPC + JALR pair to encode an immediate, forming a jump that jumps to an address over 4K. This may cause errors if we want to enable kernel preemption and remove dependency from patching code with stop_machine(). For example, if a task was switched out on auipc. And, if we changed the ftrace function before it was switched back, then it would jump to an address that has updated 11:0 bits mixing with previous XLEN:12 part.
> > >
> > > p: patched area performed by dynamic ftrace
> > > ftrace_prologue:
> > > p| REG_S ra, -SZREG(sp)
> > > p| auipc ra, 0x? ------------> preempted
> > > ...
> > > change ftrace function
> > > ...
> > > p| jalr -?(ra) <------------- switched back
> > > p| REG_L ra, -SZREG(sp)
> > > func:
> > > xxx
> > > ret
> >
> > As mentioned on the last posting, I don't think this is sufficient to fix the issue. I've replied with more detail there:
> >
> > https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/
> >
> > Even in a non-preemptible SMP kernel, if one CPU can be in the middle of executing the ftrace_prologue while another CPU is patching the ftrace_prologue, you have the exact same issue.
> >
> > For example, if CPU X, in the prologue, fetches the old AUIPC and the new JALR (because it races with CPU Y modifying those), CPU X will branch to the wrong address. The race window is much smaller in the absence of preemption, but it's still there (and will be exacerbated in virtual machines since the hypervisor can preempt a vCPU at any time).
> >
> > Note that the above is even assuming that instruction fetches are atomic, which I'm not sure is the case; for example arm64 has special CMODX / "Concurrent MODification and eXecution of instructions" rules which mean only certain instructions can be patched atomically.
> >
> > Either I'm missing something that provides mutual exclusion between the patching and execution of the ftrace_prologue, or this patch is not sufficient.
> This patch is sufficient because riscv isn't the same as arm64. It uses the default arch_ftrace_update_code(), which uses stop_machine(). See kernel/trace/ftrace.c:
>
>	void __weak arch_ftrace_update_code(int command)
>	{
>		ftrace_run_stop_machine(command);
>	}

Ah; sorry, I had misunderstood here, since the commit message spoke in terms of removing that.

As long as stop_machine() is used I agree this is safe; sorry for the noise.

> ps:
> Yes, it's not good, and it's expensive.

We can't have everything! :)

Thanks,
Mark.
On Mon, Jan 30, 2023 at 6:54 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Sat, Jan 28, 2023 at 05:37:46PM +0800, Guo Ren wrote:
> > On Thu, Jan 12, 2023 at 8:16 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > Hi Guo,
> > >
> > > On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@kernel.org wrote:
> > > > From: Andy Chiu <andy.chiu@sifive.com>
> > > >
> > > > In RISCV, we must use an AUIPC + JALR pair to encode an immediate, forming a jump that jumps to an address over 4K. This may cause errors if we want to enable kernel preemption and remove dependency from patching code with stop_machine(). For example, if a task was switched out on auipc. And, if we changed the ftrace function before it was switched back, then it would jump to an address that has updated 11:0 bits mixing with previous XLEN:12 part.
> > > >
> > > > p: patched area performed by dynamic ftrace
> > > > ftrace_prologue:
> > > > p| REG_S ra, -SZREG(sp)
> > > > p| auipc ra, 0x? ------------> preempted
> > > > ...
> > > > change ftrace function
> > > > ...
> > > > p| jalr -?(ra) <------------- switched back
> > > > p| REG_L ra, -SZREG(sp)
> > > > func:
> > > > xxx
> > > > ret
> > >
> > > As mentioned on the last posting, I don't think this is sufficient to fix the issue. I've replied with more detail there:
> > >
> > > https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/
> > >
> > > Even in a non-preemptible SMP kernel, if one CPU can be in the middle of executing the ftrace_prologue while another CPU is patching the ftrace_prologue, you have the exact same issue.
> > >
> > > For example, if CPU X, in the prologue, fetches the old AUIPC and the new JALR (because it races with CPU Y modifying those), CPU X will branch to the wrong address. The race window is much smaller in the absence of preemption, but it's still there (and will be exacerbated in virtual machines since the hypervisor can preempt a vCPU at any time).
> > >
> > > Note that the above is even assuming that instruction fetches are atomic, which I'm not sure is the case; for example arm64 has special CMODX / "Concurrent MODification and eXecution of instructions" rules which mean only certain instructions can be patched atomically.
> > >
> > > Either I'm missing something that provides mutual exclusion between the patching and execution of the ftrace_prologue, or this patch is not sufficient.
> > This patch is sufficient because riscv isn't the same as arm64. It uses the default arch_ftrace_update_code(), which uses stop_machine(). See kernel/trace/ftrace.c:
> >
> >	void __weak arch_ftrace_update_code(int command)
> >	{
> >		ftrace_run_stop_machine(command);
> >	}
>
> Ah; sorry, I had misunderstood here, since the commit message spoke in terms of removing that.
>
> As long as stop_machine() is used I agree this is safe; sorry for the noise.

Okay.

Hi Palmer,

Please take Andy's fixup patch. We will keep looking for a way to support PREEMPTION.

>
> > ps:
> > Yes, it's not good, and it's expensive.
>
> We can't have everything! :)
>
> Thanks,
> Mark.
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index e2b656043abf..ee0d39b26794 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -138,7 +138,7 @@ config RISCV
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
 	select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
 	select HAVE_FUNCTION_GRAPH_TRACER
-	select HAVE_FUNCTION_TRACER if !XIP_KERNEL
+	select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
 
 config ARCH_MMAP_RND_BITS_MIN
 	default 18 if 64BIT