Message ID | 20230201163420.1579014-7-revest@chromium.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp381148wrn; Wed, 1 Feb 2023 08:39:13 -0800 (PST) X-Google-Smtp-Source: AK7set8793keAv32h3yzkZrjztmsKDAN2WePVjHiuc4h3NXoqDPh+WDbwsXYPprFqiy4hlvvIPMZ X-Received: by 2002:a50:ce47:0:b0:4a2:45e3:ede3 with SMTP id k7-20020a50ce47000000b004a245e3ede3mr2672592edj.14.1675269553698; Wed, 01 Feb 2023 08:39:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1675269553; cv=none; d=google.com; s=arc-20160816; b=Q2FHEEV9Nhqk+Rq6L/VzwNOMUasalpcY/4lNoXMLewi6KHyfhC5SHkpQKIizFKxCW2 0olISqzTmqU8oESp1R7c/BVXHVnPIHtYDojHLqT04WYtJ/9+0rQaePMhAh647bpKk92G KEd0alKtEwq3DGZsVxXWW+jb4LG4xEwO01h47xD7OA+v7xiQTbjnPHTTgfUUQfqWMYAv CyjF/jE8YTyuX2TGvtrIojJtQz5i/psO5pYP84lBqYBuuEsBLkifC2OuqMWJsXSFqu5s J3JRmyZZOhTZejfLISgvutHBhfEPepyO7fFnEVnn75OlgFDUUVmYetfxUOgVYYrPm4KA Ughw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4h7L/6sUB2KnxbssfPmMIi6qgi9N82FF7O8XsVF6WqE=; b=aQPrFk7cDdUCqNSbgJbyZh49b2v6kIQe6jVgFGS4VhGzQk2FVPy9rKlwQz7WTpcexH z0vPg/Ua8/9LVI/gICHDeGIAbBinpY8O2n190XnHTgNDosriT3jrT7NAbTqZ06EPQpcX nh8EWgvaBKGNv6lldF8lM027L+yT1btnUj636B+6FFdujGqaQ53GPGzzpIuLli0osv8j +wYx+MQE7ckBzUX8DyfdPTRigmTNNY7ESlVAQBB91LZ6qQ2OaOr5JnB9d622GR7NoR3F eP32IU8eotGhmAiHzgeZV94K/EtzkGQ0OLTIC1mn0t+xL6NXPzDtZ6XtV0mPHHOYolP0 7YNg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=eO7ef0lb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z16-20020a05640235d000b0049e1782c0edsi27210704edc.296.2023.02.01.08.38.50; Wed, 01 Feb 2023 08:39:13 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=eO7ef0lb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232755AbjBAQf4 (ORCPT <rfc822;duw91626@gmail.com> + 99 others); Wed, 1 Feb 2023 11:35:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41068 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232345AbjBAQfr (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 1 Feb 2023 11:35:47 -0500 Received: from mail-wm1-x330.google.com (mail-wm1-x330.google.com [IPv6:2a00:1450:4864:20::330]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BCBBB65F2C for <linux-kernel@vger.kernel.org>; Wed, 1 Feb 2023 08:35:32 -0800 (PST) Received: by mail-wm1-x330.google.com with SMTP id hn2-20020a05600ca38200b003dc5cb96d46so1823832wmb.4 for <linux-kernel@vger.kernel.org>; Wed, 01 Feb 2023 08:35:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4h7L/6sUB2KnxbssfPmMIi6qgi9N82FF7O8XsVF6WqE=; b=eO7ef0lbntzUy8m51n14OyG3x4x9Mi3sefAioFTJImzU5ZFg5Vs78dtPVoYekbT3pk Cg3uxLoGda0UZ7Xy5jaLG8tClEL1RJYhO9a8twyYmoUmUPMXXAoMJjtxTKVjATmXRm3i jc39euXLJjlYSDkRk5RHgqVAXCCYmDq7Le2k8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4h7L/6sUB2KnxbssfPmMIi6qgi9N82FF7O8XsVF6WqE=; b=XVi9WQx3oO2aaEAi7B56X47ISIW9J3mrqYFEWuQWfRY3bbpg4GWiYoITahop9TvnNS wz5Il9hAJ/6/3YNJ0wJCnaJoyGENz2ArrumYrU/OW2gbqBW9F8gxjlQv4dlD0lRTIke5 F6NHi9x8dgoJZLJh5Qb1CVMxvRe0aLlM19OqyITdPWE0TA4YT6U5XGXBJup2G9cJHu+I /092TSbzdi7cTLUBgy/ZKymee4jQDMaF0LyLZeCeV7NOGQJXHfl6uNq0x9eWhDsXxmlY A81GcTyEVBV9OZGCFmY+jP8dv9yK1dZDT0/C+rAwhg25yHp+Uek2XrCBXkqTKcZHpezt ZSkg== X-Gm-Message-State: AO0yUKVSMgXGS8BIyjEsbQIrTnmrjzajZQKJhygKKHHfA3iaLV27mQZ5 GEKLBkKVMafnPa67k4Qew9qAoA== X-Received: by 2002:a05:600c:4f12:b0:3de:271:7b24 with SMTP id l18-20020a05600c4f1200b003de02717b24mr2802649wmq.16.1675269331250; Wed, 01 Feb 2023 08:35:31 -0800 (PST) Received: from revest.zrh.corp.google.com ([2a00:79e0:9d:6:4399:89a1:4a86:9630]) by smtp.gmail.com with ESMTPSA id r38-20020a05600c322600b003dd7edcc960sm2058522wmp.45.2023.02.01.08.35.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Feb 2023 08:35:30 -0800 (PST) From: Florent Revest <revest@chromium.org> To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, bpf@vger.kernel.org Cc: catalin.marinas@arm.com, will@kernel.org, rostedt@goodmis.org, mhiramat@kernel.org, mark.rutland@arm.com, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kpsingh@kernel.org, jolsa@kernel.org, xukuohai@huaweicloud.com, Xu Kuohai <xukuohai@huawei.com>, Li Huafei <lihuafei1@huawei.com>, Florent Revest <revest@chromium.org> Subject: [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest Date: Wed, 1 Feb 2023 17:34:18 +0100 Message-Id: <20230201163420.1579014-7-revest@chromium.org> X-Mailer: git-send-email 2.39.1.519.gcb327c4b5f-goog In-Reply-To: <20230201163420.1579014-1-revest@chromium.org> References: <20230201163420.1579014-1-revest@chromium.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1756647447894887643?= X-GMAIL-MSGID: =?utf-8?q?1756647447894887643?= |
Series |
Add ftrace direct call for arm64
|
|
Commit Message
Florent Revest
Feb. 1, 2023, 4:34 p.m. UTC
From: Xu Kuohai <xukuohai@huawei.com> After direct call is enabled for arm64, ftrace selftest enters a dead loop: <trace_selftest_dynamic_test_func>: 00 bti c 01 mov x9, x30 <trace_direct_tramp>: 02 bl <trace_direct_tramp> ----------> ret | lr/x30 is 03, return to 03 | 03 mov w0, #0x0 <-----------------------------| | | | dead loop! | | | 04 ret ---- lr/x30 is still 03, go back to 03 ----| The reason is that when the direct caller trace_direct_tramp() returns to the patched function trace_selftest_dynamic_test_func(), lr is still the address after the instrumented instruction in the patched function, so when the patched function exits, it returns to itself! To fix this issue, we need to restore lr before trace_direct_tramp() exits, so use a dedicated trace_direct_tramp() for arm64. Reported-by: Li Huafei <lihuafei1@huawei.com> Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Florent Revest <revest@chromium.org> --- arch/arm64/include/asm/ftrace.h | 7 +++++++ arch/arm64/kernel/entry-ftrace.S | 10 ++++++++++ kernel/trace/trace_selftest.c | 2 ++ 3 files changed, 19 insertions(+)
Comments
On Wed, Feb 01, 2023 at 05:34:18PM +0100, Florent Revest wrote: > From: Xu Kuohai <xukuohai@huawei.com> > > After direct call is enabled for arm64, ftrace selftest enters a > dead loop: > > <trace_selftest_dynamic_test_func>: > 00 bti c > 01 mov x9, x30 <trace_direct_tramp>: > 02 bl <trace_direct_tramp> ----------> ret > | > lr/x30 is 03, return to 03 > | > 03 mov w0, #0x0 <-----------------------------| > | | > | dead loop! | > | | > 04 ret ---- lr/x30 is still 03, go back to 03 ----| > > The reason is that when the direct caller trace_direct_tramp() returns > to the patched function trace_selftest_dynamic_test_func(), lr is still > the address after the instrumented instruction in the patched function, > so when the patched function exits, it returns to itself! > > To fix this issue, we need to restore lr before trace_direct_tramp() > exits, so use a dedicated trace_direct_tramp() for arm64. > > Reported-by: Li Huafei <lihuafei1@huawei.com> > Signed-off-by: Xu Kuohai <xukuohai@huawei.com> > Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> > Signed-off-by: Florent Revest <revest@chromium.org> Looking at this, I don't think that the existing trace_direct_tramp makes sense -- in general a C function doesn't follow the direct call trampoline calling convention, and cannot be used as a direct call trampoline. Looking further, there's a distinct latent issue on s390, where the mismatch between their regular calling convention and their direct call trampoline calling convention means that trace_direct_tramp() returns into the *caller* of the instrumented function, skipping that entirely (verified locally with QEMU and printk()s added to DYN_FTRACE_TEST_NAME() / DYN_FTRACE_TEST_NAME2() functions). I think it'd be much better to do something like the below as a preparatory cleanup (tested on s390 under QEMU). Thanks, Mark. ---->8---- From 3b3abc89fe014ea49282622c4312521b710d1463 Mon Sep 17 00:00:00 2001 From: Mark Rutland <mark.rutland@arm.com> Date: Thu, 2 Feb 2023 18:37:40 +0000 Subject: [PATCH] ftrace: selftest: remove broken trace_direct_tramp The ftrace selftest code has a trace_direct_tramp() function which it uses as a direct call trampoline. This happens to work on x86, since the direct call's return address is in the usual place, and can be returned to via a RET, but in general the calling convention for direct calls is different from regular function calls, and requires a trampoline written in assembly. On s390, regular function calls place the return address in %r14, and an ftrace patch-site in an instrumented function places the trampoline's return address (which is within the instrumented function) in %r0, preserving the original %r14 value in-place. As a regular C function will return to the address in %r14, using a C function as the trampoline results in the trampoline returning to the caller of the instrumented function, skipping the body of the instrumented function. Note that the s390 issue is not detcted by the ftrace selftest code, as the instrumented function is trivial, and returning back into the caller happens to be equivalent. On arm64, regular function calls place the return address in x30, and an ftrace patch-site in an instrumented function saves this into r9 and places the trampoline's return address (within the instrumented function) in x30. A regular C function will return to the address in x30, but will not restore x9 into x30. Consequently, using a C function as the trampoline results in returning to the trampoline's return address having corrupted x30, such that when the instrumented function returns, it will return back into itself. To avoid future issues in this area, remove the trace_direct_tramp() function, and require that each architecture with direct calls provides a stub trampoline, named ftrace_stub_direct_tramp. This can be written to handle the architecture's trampoline calling convention, and in future could be used elsewhere (e.g. in the ftrace ops sample, to measure the overhead of direct calls), so we may as well always build it in. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Xu Kuohai <xukuohai@huawei.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Florent Revest <revest@chromium.org> --- arch/s390/kernel/mcount.S | 5 +++++ arch/x86/kernel/ftrace_32.S | 5 +++++ arch/x86/kernel/ftrace_64.S | 4 ++++ include/linux/ftrace.h | 2 ++ kernel/trace/trace_selftest.c | 15 ++------------- 5 files changed, 18 insertions(+), 13 deletions(-) diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S index 4786bfe02144..ad13a0e2c307 100644 --- a/arch/s390/kernel/mcount.S +++ b/arch/s390/kernel/mcount.S @@ -32,6 +32,11 @@ ENTRY(ftrace_stub) BR_EX %r14 ENDPROC(ftrace_stub) +SYM_CODE_START(ftrace_stub_direct_tramp) + lgr %r1, %r0 + BR_EX %r1 +SYM_CODE_END(ftrace_stub_direct_tramp) + .macro ftrace_regs_entry, allregs=0 stg %r14,(__SF_GPRS+8*8)(%r15) # save traced function caller diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S index a0ed0e4a2c0c..0d9a14528176 100644 --- a/arch/x86/kernel/ftrace_32.S +++ b/arch/x86/kernel/ftrace_32.S @@ -163,6 +163,11 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL) jmp .Lftrace_ret SYM_CODE_END(ftrace_regs_caller) +SYM_FUNC_START(ftrace_stub_direct_tramp) + CALL_DEPTH_ACCOUNT + RET +SYM_FUNC_END(ftrace_stub_direct_tramp) + #ifdef CONFIG_FUNCTION_GRAPH_TRACER SYM_CODE_START(ftrace_graph_caller) pushl %eax diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S index 1265ad519249..8fc77e3e039c 100644 --- a/arch/x86/kernel/ftrace_64.S +++ b/arch/x86/kernel/ftrace_64.S @@ -307,6 +307,10 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL) SYM_FUNC_END(ftrace_regs_caller) STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller) +SYM_FUNC_START(ftrace_stub_direct_tramp) + CALL_DEPTH_ACCOUNT + RET +SYM_FUNC_END(ftrace_stub_direct_tramp) #else /* ! CONFIG_DYNAMIC_FTRACE */ diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 3d2156e335d7..a9836b40630e 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -412,6 +412,8 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr); int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr); int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr); +void ftrace_stub_direct_tramp(void *); + #else struct ftrace_ops; # define ftrace_direct_func_count 0 diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c index 06218fc9374b..e6530b7b42e4 100644 --- a/kernel/trace/trace_selftest.c +++ b/kernel/trace/trace_selftest.c @@ -784,17 +784,6 @@ static struct fgraph_ops fgraph_ops __initdata = { .retfunc = &trace_graph_return, }; -#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS -#ifndef CALL_DEPTH_ACCOUNT -#define CALL_DEPTH_ACCOUNT "" -#endif - -noinline __noclone static void trace_direct_tramp(void) -{ - asm(CALL_DEPTH_ACCOUNT); -} -#endif - /* * Pretty much the same than for the function tracer from which the selftest * has been borrowed. @@ -875,7 +864,7 @@ trace_selftest_startup_function_graph(struct tracer *trace, */ ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0); ret = register_ftrace_direct(&direct, - (unsigned long)trace_direct_tramp); + (unsigned long)ftrace_stub_direct_tramp); if (ret) goto out; @@ -896,7 +885,7 @@ trace_selftest_startup_function_graph(struct tracer *trace, unregister_ftrace_graph(&fgraph_ops); ret = unregister_ftrace_direct(&direct, - (unsigned long)trace_direct_tramp); + (unsigned long)ftrace_stub_direct_tramp); if (ret) goto out;
On Thu, Feb 2, 2023 at 8:03 PM Mark Rutland <mark.rutland@arm.com> wrote: > > On Wed, Feb 01, 2023 at 05:34:18PM +0100, Florent Revest wrote: > > From: Xu Kuohai <xukuohai@huawei.com> > > > > After direct call is enabled for arm64, ftrace selftest enters a > > dead loop: > > > > <trace_selftest_dynamic_test_func>: > > 00 bti c > > 01 mov x9, x30 <trace_direct_tramp>: > > 02 bl <trace_direct_tramp> ----------> ret > > | > > lr/x30 is 03, return to 03 > > | > > 03 mov w0, #0x0 <-----------------------------| > > | | > > | dead loop! | > > | | > > 04 ret ---- lr/x30 is still 03, go back to 03 ----| > > > > The reason is that when the direct caller trace_direct_tramp() returns > > to the patched function trace_selftest_dynamic_test_func(), lr is still > > the address after the instrumented instruction in the patched function, > > so when the patched function exits, it returns to itself! > > > > To fix this issue, we need to restore lr before trace_direct_tramp() > > exits, so use a dedicated trace_direct_tramp() for arm64. > > > > Reported-by: Li Huafei <lihuafei1@huawei.com> > > Signed-off-by: Xu Kuohai <xukuohai@huawei.com> > > Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> > > Signed-off-by: Florent Revest <revest@chromium.org> > > Looking at this, I don't think that the existing trace_direct_tramp makes > sense -- in general a C function doesn't follow the direct call trampoline > calling convention, and cannot be used as a direct call trampoline. Agreed. > Looking further, there's a distinct latent issue on s390, where the mismatch > between their regular calling convention and their direct call trampoline > calling convention means that trace_direct_tramp() returns into the *caller* of > the instrumented function, skipping that entirely (verified locally with QEMU > and printk()s added to DYN_FTRACE_TEST_NAME() / DYN_FTRACE_TEST_NAME2() > functions). Good catch! I didn't dare to adventure that far into s390 :) > I think it'd be much better to do something like the below as a preparatory > cleanup (tested on s390 under QEMU). Thanks, that looks great to me. I'll make it a part of the series in v2 then. Unless it's preferred that this gets merged separately? > Thanks, > Mark. > > ---->8---- > From 3b3abc89fe014ea49282622c4312521b710d1463 Mon Sep 17 00:00:00 2001 > From: Mark Rutland <mark.rutland@arm.com> > Date: Thu, 2 Feb 2023 18:37:40 +0000 > Subject: [PATCH] ftrace: selftest: remove broken trace_direct_tramp > > The ftrace selftest code has a trace_direct_tramp() function which it > uses as a direct call trampoline. This happens to work on x86, since the > direct call's return address is in the usual place, and can be returned > to via a RET, but in general the calling convention for direct calls is > different from regular function calls, and requires a trampoline written > in assembly. > > On s390, regular function calls place the return address in %r14, and an > ftrace patch-site in an instrumented function places the trampoline's > return address (which is within the instrumented function) in %r0, > preserving the original %r14 value in-place. As a regular C function > will return to the address in %r14, using a C function as the trampoline > results in the trampoline returning to the caller of the instrumented > function, skipping the body of the instrumented function. > > Note that the s390 issue is not detcted by the ftrace selftest code, as > the instrumented function is trivial, and returning back into the caller > happens to be equivalent. > > On arm64, regular function calls place the return address in x30, and > an ftrace patch-site in an instrumented function saves this into r9 > and places the trampoline's return address (within the instrumented > function) in x30. A regular C function will return to the address in > x30, but will not restore x9 into x30. Consequently, using a C function > as the trampoline results in returning to the trampoline's return > address having corrupted x30, such that when the instrumented function > returns, it will return back into itself. > > To avoid future issues in this area, remove the trace_direct_tramp() > function, and require that each architecture with direct calls provides > a stub trampoline, named ftrace_stub_direct_tramp. This can be written > to handle the architecture's trampoline calling convention, and in > future could be used elsewhere (e.g. in the ftrace ops sample, to > measure the overhead of direct calls), so we may as well always build it > in. > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > Cc: Li Huafei <lihuafei1@huawei.com> > Cc: Xu Kuohai <xukuohai@huawei.com> > Cc: Steven Rostedt (Google) <rostedt@goodmis.org> > Cc: Florent Revest <revest@chromium.org> > --- > arch/s390/kernel/mcount.S | 5 +++++ > arch/x86/kernel/ftrace_32.S | 5 +++++ > arch/x86/kernel/ftrace_64.S | 4 ++++ > include/linux/ftrace.h | 2 ++ > kernel/trace/trace_selftest.c | 15 ++------------- > 5 files changed, 18 insertions(+), 13 deletions(-) > > diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S > index 4786bfe02144..ad13a0e2c307 100644 > --- a/arch/s390/kernel/mcount.S > +++ b/arch/s390/kernel/mcount.S > @@ -32,6 +32,11 @@ ENTRY(ftrace_stub) > BR_EX %r14 > ENDPROC(ftrace_stub) > > +SYM_CODE_START(ftrace_stub_direct_tramp) > + lgr %r1, %r0 > + BR_EX %r1 > +SYM_CODE_END(ftrace_stub_direct_tramp) > + > .macro ftrace_regs_entry, allregs=0 > stg %r14,(__SF_GPRS+8*8)(%r15) # save traced function caller > > diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S > index a0ed0e4a2c0c..0d9a14528176 100644 > --- a/arch/x86/kernel/ftrace_32.S > +++ b/arch/x86/kernel/ftrace_32.S > @@ -163,6 +163,11 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL) > jmp .Lftrace_ret > SYM_CODE_END(ftrace_regs_caller) > > +SYM_FUNC_START(ftrace_stub_direct_tramp) > + CALL_DEPTH_ACCOUNT > + RET > +SYM_FUNC_END(ftrace_stub_direct_tramp) > + > #ifdef CONFIG_FUNCTION_GRAPH_TRACER > SYM_CODE_START(ftrace_graph_caller) > pushl %eax > diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S > index 1265ad519249..8fc77e3e039c 100644 > --- a/arch/x86/kernel/ftrace_64.S > +++ b/arch/x86/kernel/ftrace_64.S > @@ -307,6 +307,10 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL) > SYM_FUNC_END(ftrace_regs_caller) > STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller) > > +SYM_FUNC_START(ftrace_stub_direct_tramp) > + CALL_DEPTH_ACCOUNT > + RET > +SYM_FUNC_END(ftrace_stub_direct_tramp) > > #else /* ! CONFIG_DYNAMIC_FTRACE */ > > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h > index 3d2156e335d7..a9836b40630e 100644 > --- a/include/linux/ftrace.h > +++ b/include/linux/ftrace.h > @@ -412,6 +412,8 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr); > int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr); > int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr); > > +void ftrace_stub_direct_tramp(void *); > + > #else > struct ftrace_ops; > # define ftrace_direct_func_count 0 > diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c > index 06218fc9374b..e6530b7b42e4 100644 > --- a/kernel/trace/trace_selftest.c > +++ b/kernel/trace/trace_selftest.c > @@ -784,17 +784,6 @@ static struct fgraph_ops fgraph_ops __initdata = { > .retfunc = &trace_graph_return, > }; > > -#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS > -#ifndef CALL_DEPTH_ACCOUNT > -#define CALL_DEPTH_ACCOUNT "" > -#endif > - > -noinline __noclone static void trace_direct_tramp(void) > -{ > - asm(CALL_DEPTH_ACCOUNT); > -} > -#endif > - > /* > * Pretty much the same than for the function tracer from which the selftest > * has been borrowed. > @@ -875,7 +864,7 @@ trace_selftest_startup_function_graph(struct tracer *trace, > */ > ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0); > ret = register_ftrace_direct(&direct, > - (unsigned long)trace_direct_tramp); > + (unsigned long)ftrace_stub_direct_tramp); > if (ret) > goto out; > > @@ -896,7 +885,7 @@ trace_selftest_startup_function_graph(struct tracer *trace, > unregister_ftrace_graph(&fgraph_ops); > > ret = unregister_ftrace_direct(&direct, > - (unsigned long)trace_direct_tramp); > + (unsigned long)ftrace_stub_direct_tramp); > if (ret) > goto out; > > -- > 2.30.2 >
On Fri, Feb 03, 2023 at 01:35:00PM +0100, Florent Revest wrote: > On Thu, Feb 2, 2023 at 8:03 PM Mark Rutland <mark.rutland@arm.com> wrote: > > I think it'd be much better to do something like the below as a preparatory > > cleanup (tested on s390 under QEMU). > > Thanks, that looks great to me. I'll make it a part of the series in v2 then. > Unless it's preferred that this gets merged separately? I reckon put it in the series for v2, and if Steve or Masami want to pick it up beforehand, they can choose to do so from there? Since it's not currently exploding, I suspect it's not urgent. Mark. > > > Thanks, > > Mark. > > > > ---->8---- > > From 3b3abc89fe014ea49282622c4312521b710d1463 Mon Sep 17 00:00:00 2001 > > From: Mark Rutland <mark.rutland@arm.com> > > Date: Thu, 2 Feb 2023 18:37:40 +0000 > > Subject: [PATCH] ftrace: selftest: remove broken trace_direct_tramp > > > > The ftrace selftest code has a trace_direct_tramp() function which it > > uses as a direct call trampoline. This happens to work on x86, since the > > direct call's return address is in the usual place, and can be returned > > to via a RET, but in general the calling convention for direct calls is > > different from regular function calls, and requires a trampoline written > > in assembly. > > > > On s390, regular function calls place the return address in %r14, and an > > ftrace patch-site in an instrumented function places the trampoline's > > return address (which is within the instrumented function) in %r0, > > preserving the original %r14 value in-place. As a regular C function > > will return to the address in %r14, using a C function as the trampoline > > results in the trampoline returning to the caller of the instrumented > > function, skipping the body of the instrumented function. > > > > Note that the s390 issue is not detcted by the ftrace selftest code, as > > the instrumented function is trivial, and returning back into the caller > > happens to be equivalent. > > > > On arm64, regular function calls place the return address in x30, and > > an ftrace patch-site in an instrumented function saves this into r9 > > and places the trampoline's return address (within the instrumented > > function) in x30. A regular C function will return to the address in > > x30, but will not restore x9 into x30. Consequently, using a C function > > as the trampoline results in returning to the trampoline's return > > address having corrupted x30, such that when the instrumented function > > returns, it will return back into itself. > > > > To avoid future issues in this area, remove the trace_direct_tramp() > > function, and require that each architecture with direct calls provides > > a stub trampoline, named ftrace_stub_direct_tramp. This can be written > > to handle the architecture's trampoline calling convention, and in > > future could be used elsewhere (e.g. in the ftrace ops sample, to > > measure the overhead of direct calls), so we may as well always build it > > in. > > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > > Cc: Li Huafei <lihuafei1@huawei.com> > > Cc: Xu Kuohai <xukuohai@huawei.com> > > Cc: Steven Rostedt (Google) <rostedt@goodmis.org> > > Cc: Florent Revest <revest@chromium.org> > > --- > > arch/s390/kernel/mcount.S | 5 +++++ > > arch/x86/kernel/ftrace_32.S | 5 +++++ > > arch/x86/kernel/ftrace_64.S | 4 ++++ > > include/linux/ftrace.h | 2 ++ > > kernel/trace/trace_selftest.c | 15 ++------------- > > 5 files changed, 18 insertions(+), 13 deletions(-) > > > > diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S > > index 4786bfe02144..ad13a0e2c307 100644 > > --- a/arch/s390/kernel/mcount.S > > +++ b/arch/s390/kernel/mcount.S > > @@ -32,6 +32,11 @@ ENTRY(ftrace_stub) > > BR_EX %r14 > > ENDPROC(ftrace_stub) > > > > +SYM_CODE_START(ftrace_stub_direct_tramp) > > + lgr %r1, %r0 > > + BR_EX %r1 > > +SYM_CODE_END(ftrace_stub_direct_tramp) > > + > > .macro ftrace_regs_entry, allregs=0 > > stg %r14,(__SF_GPRS+8*8)(%r15) # save traced function caller > > > > diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S > > index a0ed0e4a2c0c..0d9a14528176 100644 > > --- a/arch/x86/kernel/ftrace_32.S > > +++ b/arch/x86/kernel/ftrace_32.S > > @@ -163,6 +163,11 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL) > > jmp .Lftrace_ret > > SYM_CODE_END(ftrace_regs_caller) > > > > +SYM_FUNC_START(ftrace_stub_direct_tramp) > > + CALL_DEPTH_ACCOUNT > > + RET > > +SYM_FUNC_END(ftrace_stub_direct_tramp) > > + > > #ifdef CONFIG_FUNCTION_GRAPH_TRACER > > SYM_CODE_START(ftrace_graph_caller) > > pushl %eax > > diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S > > index 1265ad519249..8fc77e3e039c 100644 > > --- a/arch/x86/kernel/ftrace_64.S > > +++ b/arch/x86/kernel/ftrace_64.S > > @@ -307,6 +307,10 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL) > > SYM_FUNC_END(ftrace_regs_caller) > > STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller) > > > > +SYM_FUNC_START(ftrace_stub_direct_tramp) > > + CALL_DEPTH_ACCOUNT > > + RET > > +SYM_FUNC_END(ftrace_stub_direct_tramp) > > > > #else /* ! CONFIG_DYNAMIC_FTRACE */ > > > > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h > > index 3d2156e335d7..a9836b40630e 100644 > > --- a/include/linux/ftrace.h > > +++ b/include/linux/ftrace.h > > @@ -412,6 +412,8 @@ int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr); > > int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr); > > int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr); > > > > +void ftrace_stub_direct_tramp(void *); > > + > > #else > > struct ftrace_ops; > > # define ftrace_direct_func_count 0 > > diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c > > index 06218fc9374b..e6530b7b42e4 100644 > > --- a/kernel/trace/trace_selftest.c > > +++ b/kernel/trace/trace_selftest.c > > @@ -784,17 +784,6 @@ static struct fgraph_ops fgraph_ops __initdata = { > > .retfunc = &trace_graph_return, > > }; > > > > -#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS > > -#ifndef CALL_DEPTH_ACCOUNT > > -#define CALL_DEPTH_ACCOUNT "" > > -#endif > > - > > -noinline __noclone static void trace_direct_tramp(void) > > -{ > > - asm(CALL_DEPTH_ACCOUNT); > > -} > > -#endif > > - > > /* > > * Pretty much the same than for the function tracer from which the selftest > > * has been borrowed. > > @@ -875,7 +864,7 @@ trace_selftest_startup_function_graph(struct tracer *trace, > > */ > > ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0); > > ret = register_ftrace_direct(&direct, > > - (unsigned long)trace_direct_tramp); > > + (unsigned long)ftrace_stub_direct_tramp); > > if (ret) > > goto out; > > > > @@ -896,7 +885,7 @@ trace_selftest_startup_function_graph(struct tracer *trace, > > unregister_ftrace_graph(&fgraph_ops); > > > > ret = unregister_ftrace_direct(&direct, > > - (unsigned long)trace_direct_tramp); > > + (unsigned long)ftrace_stub_direct_tramp); > > if (ret) > > goto out; > > > > -- > > 2.30.2 > >
On Fri, Feb 3, 2023 at 4:37 PM Mark Rutland <mark.rutland@arm.com> wrote: > > On Fri, Feb 03, 2023 at 01:35:00PM +0100, Florent Revest wrote: > > On Thu, Feb 2, 2023 at 8:03 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > I think it'd be much better to do something like the below as a preparatory > > > cleanup (tested on s390 under QEMU). > > > > Thanks, that looks great to me. I'll make it a part of the series in v2 then. > > Unless it's preferred that this gets merged separately? > > I reckon put it in the series for v2, and if Steve or Masami want to pick it up > beforehand, they can choose to do so from there? > > Since it's not currently exploding, I suspect it's not urgent. > > Mark. Ack
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h index 1c2672bbbf37..cf6d9c42ff36 100644 --- a/arch/arm64/include/asm/ftrace.h +++ b/arch/arm64/include/asm/ftrace.h @@ -168,6 +168,13 @@ static inline bool arch_syscall_match_sym_name(const char *sym, */ return !strcmp(sym + 8, name); } + +#if defined(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) && \ + defined(CONFIG_FTRACE_SELFTEST) +extern void ftrace_dummy_tramp(void); +#define trace_direct_tramp ftrace_dummy_tramp +#endif + #endif /* ifndef __ASSEMBLY__ */ #endif /* __ASM_FTRACE_H */ diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S index 350ed81324ac..9869debd22fb 100644 --- a/arch/arm64/kernel/entry-ftrace.S +++ b/arch/arm64/kernel/entry-ftrace.S @@ -118,6 +118,16 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL) ret x9 SYM_CODE_END(ftrace_caller) +#if defined(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) && \ + defined(CONFIG_FTRACE_SELFTEST) +SYM_CODE_START(ftrace_dummy_tramp) + bti c + mov x10, x30 + mov x30, x9 + ret x10 +SYM_CODE_END(ftrace_dummy_tramp) +#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */ + #else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */ /* diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c index 06218fc9374b..f9f5d4e8ab50 100644 --- a/kernel/trace/trace_selftest.c +++ b/kernel/trace/trace_selftest.c @@ -789,11 +789,13 @@ static struct fgraph_ops fgraph_ops __initdata = { #define CALL_DEPTH_ACCOUNT "" #endif +#ifndef trace_direct_tramp noinline __noclone static void trace_direct_tramp(void) { asm(CALL_DEPTH_ACCOUNT); } #endif +#endif /* * Pretty much the same than for the function tracer from which the selftest