Message ID | 20230525184244.2311-1-namit@vmware.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:3046:b0:115:7a1d:dabb with SMTP id p6csp724526rwl; Thu, 25 May 2023 12:41:28 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7imVpuPmOdzgfPlu+0ok36APv7nPTYJ1LXWaVVxgI+0n1EXgFR1RmgQKBinisviYUSrEIv X-Received: by 2002:a17:902:d384:b0:1af:e8cf:7004 with SMTP id e4-20020a170902d38400b001afe8cf7004mr2777929pld.15.1685043687856; Thu, 25 May 2023 12:41:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685043687; cv=none; d=google.com; s=arc-20160816; b=yovsudt+8jc6bb7f2mYe3sFrfFpelOU9NPGbAO8/Y6UN79PXgBFl55arSxZjQ4N0d9 dltGKWFRBGabATF+erSesTdPzQWuECvk1Uy1SHuuNreEntXxUUsla8UdZysCuiss9Woo hk8C/3GMyN0OT4qhR3Pn8qOHD1Nk08U4LU2kt8bOEUhG5GfoVUMxf0yrW/HwXoo+zoo9 l6MUdPMCRxLvPlTx8GDLFbLHjIm4x/3WnTXsDmZSRBxxPDxUznH/0Rnm6FFRtirmRic7 VEavSnTIV99Ab+w/dFdqNlYANHsCt5qQg/HF8AuVRuzRO6d4uAK1/Nmjrbzocd2ELj3C ffGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=fjWbDEw2QH6XF35tVFZ0HAaapmFpGTcu4LzSvUqVlcI=; b=JjyEzOMW6vhEPW8E4HqGZZNBKHfN4XTNYd3mILJiPowvtK445OI//Vl5D5HpB4acJY YxvtwAkMrwSLziTysf5hdF7/IRjdyHSlRGU5VDuYLN5HACNN+SRXU5jqyJ17tSgMyW8b L1pYjadl2W3rWMeNNeSGn1d9WsqvBPxRF2uJqJsMflDFWaP5CU2Bzca3jqcnrdhO1EeU CwEdJ1sJcUpT2V4dP81eX+imr/dKJLkdlRK1dF7liuSAspwPR7gWXQIzJr9QnfEJVL2t BZyBxVHkdsMTDswnrSWUToDr01Jt4fwMC2f3LBqoYPBbEslsJC1bhbTlq0VwL702PYPx AboQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b="JKNxP5/A"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id v10-20020a1709029a0a00b001ab1411f3e6si1999828plp.260.2023.05.25.12.41.12; Thu, 25 May 2023 12:41:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b="JKNxP5/A"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243025AbjEYSwn (ORCPT <rfc822;ahmedalshaiji.dev@gmail.com> + 99 others); Thu, 25 May 2023 14:52:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243338AbjEYSuf (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 25 May 2023 14:50:35 -0400 Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C42101984 for <linux-kernel@vger.kernel.org>; Thu, 25 May 2023 11:43:58 -0700 (PDT) Received: by mail-pf1-x42c.google.com with SMTP id d2e1a72fcca58-64d2da69fdfso106402b3a.0 for <linux-kernel@vger.kernel.org>; Thu, 25 May 2023 11:43:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685040173; x=1687632173; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=fjWbDEw2QH6XF35tVFZ0HAaapmFpGTcu4LzSvUqVlcI=; b=JKNxP5/AUR37jOIze+kYEBgzrgyVEZSKrKb9xeg1aLJVBudQqTOThOxq9gD+G7LTeJ spRfv2ER5vf/xdAzAWyxuILc+HicrO9W5u/HfVcEQcpHZGI6dzjpz/0jZ35+eIHLA4K3 ebxFvs1WhQ8IjUQSJGNYJbaONXJmCIIWS1/15TVXXLFch4ZKk+QAS4sNWiI9d7ky5ICA kbFiMc4jxXWe2Ycaj3vF/pViEMlNDQSiJInExqwTTV6hmBxolNZucxC/ywxfuGbRxJAr DhW5/WIcnVQZo+/6eRKPFe/eH4uTI6wngQ7W7nUqAO/Bg7D2biMZ4mtXST81OQ9zrPOp vBoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685040173; x=1687632173; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=fjWbDEw2QH6XF35tVFZ0HAaapmFpGTcu4LzSvUqVlcI=; b=Zho273hoZti0fWC6z5IMuvKZWQDEx85cC8Omn5PXCDN1PabSCOWyjjG0Hef5cTVTcX /OA/qXZFprd3HksR75zD13KlOgz5NTIqsUr+OZ/GgTMs+9cm/1ELGd5fMf/p2LyIhZgk Xdt/kCcznSOiWTNeP6iTx0OfURuMHaYrwOhYkqzK2uvUoN4hfhrJsaiG6490RE+CY1zM 1JVs7gvGm9dalKByceAtmk2ljyhm7tV6BQoNs0dXRM5LNESVkvAjGPsKoxZFOtG2olJW Q5AsMru2BN0ZahhSFoJOU0ta91yaBzZjZaFyqkuA09L50EGL7qswWfReiPzbmX3gfEZ3 g21w== X-Gm-Message-State: AC+VfDyt3VKS8DXzfOeDSVZXyLTmnFIQ2fBvDqGLM3OPL7VKndWcP01h 2TgXSw75iFML5csohDBIKJuE7EPZVRI= X-Received: by 2002:a05:6a00:2284:b0:646:5041:9729 with SMTP id f4-20020a056a00228400b0064650419729mr9764123pfe.0.1685040172885; Thu, 25 May 2023 11:42:52 -0700 (PDT) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id c19-20020aa78813000000b0064d6b6aac5dsm1474267pfo.73.2023.05.25.11.42.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 25 May 2023 11:42:52 -0700 (PDT) From: Nadav Amit <nadav.amit@gmail.com> X-Google-Original-From: Nadav Amit To: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, x86@kernel.org, linux-kernel@vger.kernel.org, Nadav Amit <namit@vmware.com> Subject: [PATCH v2] x86/lib: Do not use local symbols with SYM_CODE_START_LOCAL() Date: Thu, 25 May 2023 11:42:44 -0700 Message-Id: <20230525184244.2311-1-namit@vmware.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766896369659820054?= X-GMAIL-MSGID: =?utf-8?q?1766896369659820054?= |
Series |
[v2] x86/lib: Do not use local symbols with SYM_CODE_START_LOCAL()
|
|
Commit Message
Nadav Amit
May 25, 2023, 6:42 p.m. UTC
From: Nadav Amit <namit@vmware.com> When SYM_CODE_START_LOCAL() is used, the symbols are local but need to be preserved in the object. However, using the ".L" label prefix does not retain the symbol in the object. It is beneficial to be able to map instruction pointers back to symbols, for instance for profiling. Otherwise, there are code addresses that do not map back to any symbol. Consequently, the ".L" label prefix should not be used when SYM_CODE_START_LOCAL() is used. Few symbols, such as .Lbad_put_user_clac and currently have both the SYM_CODE_START_LOCAL() invocation and the ".L" prefix. This commit removes the ".L" prefix from these symbols. No functional change, other then emitting these symbols into the object, is intended. Signed-off-by: Nadav Amit <namit@vmware.com> --- v1 -> v2: * Rebase on 6.4 (the affected symbols have changed) --- arch/x86/lib/getuser.S | 32 ++++++++++++++++---------------- arch/x86/lib/putuser.S | 24 ++++++++++++------------ 2 files changed, 28 insertions(+), 28 deletions(-)
Comments
On 5/25/23 11:42, Nadav Amit wrote: > From: Nadav Amit <namit@vmware.com> > > When SYM_CODE_START_LOCAL() is used, the symbols are local but need to > be preserved in the object. However, using the ".L" label prefix does > not retain the symbol in the object. > > It is beneficial to be able to map instruction pointers back to symbols, > for instance for profiling. Otherwise, there are code addresses that do > not map back to any symbol. Consequently, the ".L" label prefix should > not be used when SYM_CODE_START_LOCAL() is used. > > Few symbols, such as .Lbad_put_user_clac and currently have both the > SYM_CODE_START_LOCAL() invocation and the ".L" prefix. This commit > removes the ".L" prefix from these symbols. > > No functional change, other then emitting these symbols into the object, > is intended. Nadav, could you perhaps do a bit of research on how this situation came to be? Was it an accident or on purpose that these symbols came to be .L? Then, could you CC the folks who made this change and ask them directly if they intended to induce the effects that you find undesirable?
> On May 25, 2023, at 12:05 PM, Dave Hansen <dave.hansen@intel.com> wrote: > > On 5/25/23 11:42, Nadav Amit wrote: >> From: Nadav Amit <namit@vmware.com> >> >> When SYM_CODE_START_LOCAL() is used, the symbols are local but need to >> be preserved in the object. However, using the ".L" label prefix does >> not retain the symbol in the object. >> >> It is beneficial to be able to map instruction pointers back to symbols, >> for instance for profiling. Otherwise, there are code addresses that do >> not map back to any symbol. Consequently, the ".L" label prefix should >> not be used when SYM_CODE_START_LOCAL() is used. >> >> Few symbols, such as .Lbad_put_user_clac and currently have both the >> SYM_CODE_START_LOCAL() invocation and the ".L" prefix. This commit >> removes the ".L" prefix from these symbols. >> >> No functional change, other then emitting these symbols into the object, >> is intended. > > Nadav, could you perhaps do a bit of research on how this situation came > to be? Was it an accident or on purpose that these symbols came to be > .L? Then, could you CC the folks who made this change and ask them > directly if they intended to induce the effects that you find undesirable? Fair enough. I actually thought it is an oversight, but it now seems intentional (although I am not sure I understand/agree with the reason). So apparently, for one of the symbols from my v1 (which was already removed), I see that Borislav Petkov suggested to prepend the “.L” in order to for them not to be visible [1]. The reason that is given for not making the functions visible is that these are "functions with very local names”. I do not think in this tradeoff not exposing local names worth preventing profilers (and their users) from understanding where a sample/trace is was taken. If for instance you look at a branch trace (e.g., using Intel PT) you want to see the symbol to which a branch goes to. Borislav, Jiri - do you agree? [1] https://lore.kernel.org/all/20190906075550.23435-2-jslaby@suse.cz/
On 25. 05. 23, 21:39, Nadav Amit wrote: > >> On May 25, 2023, at 12:05 PM, Dave Hansen <dave.hansen@intel.com> wrote: >> >> On 5/25/23 11:42, Nadav Amit wrote: >>> From: Nadav Amit <namit@vmware.com> >>> >>> When SYM_CODE_START_LOCAL() is used, the symbols are local but need to >>> be preserved in the object. However, using the ".L" label prefix does >>> not retain the symbol in the object. >>> >>> It is beneficial to be able to map instruction pointers back to symbols, >>> for instance for profiling. Otherwise, there are code addresses that do >>> not map back to any symbol. Consequently, the ".L" label prefix should >>> not be used when SYM_CODE_START_LOCAL() is used. >>> >>> Few symbols, such as .Lbad_put_user_clac and currently have both the >>> SYM_CODE_START_LOCAL() invocation and the ".L" prefix. This commit >>> removes the ".L" prefix from these symbols. >>> >>> No functional change, other then emitting these symbols into the object, >>> is intended. >> >> Nadav, could you perhaps do a bit of research on how this situation came >> to be? Was it an accident or on purpose that these symbols came to be >> .L? Then, could you CC the folks who made this change and ask them >> directly if they intended to induce the effects that you find undesirable? > > Fair enough. I actually thought it is an oversight, but it now seems > intentional (although I am not sure I understand/agree with the reason). > > So apparently, for one of the symbols from my v1 (which was already > removed), I see that Borislav Petkov suggested to prepend the “.L” in > order to for them not to be visible [1]. > > The reason that is given for not making the functions visible is that > these are "functions with very local names”. > > I do not think in this tradeoff not exposing local names worth > preventing profilers (and their users) from understanding where a > sample/trace is was taken. If for instance you look at a branch > trace (e.g., using Intel PT) you want to see the symbol to which a > branch goes to. > > Borislav, Jiri - do you agree? Ah, if it makes tools' output harder to follow, I would indeed switch back to emitting these symbols, i.e. remove the .L prefix from them. That said: Acked-by: Jiri Slaby <jirislaby@kernel.org> thanks,
On Thu, May 25, 2023 at 12:39:47PM -0700, Nadav Amit wrote: > I do not think in this tradeoff not exposing local names worth > preventing profilers (and their users) from understanding where a > sample/trace is was taken. If for instance you look at a branch > trace (e.g., using Intel PT) you want to see the symbol to which a > branch goes to. If those functions were written in C, you wouldn't see any exception-handling symbols either. It is the fact that they're asm and the exception labels are defined "out-of-line" so that you don't have code duplication and thus are symbols outside of the respective functions. So you'd have to give a lot more detailed example where making those symbols global, helps. And if those symbols are going to be global, then they better have more descriptive names as they're gonna be pretty much independent functions. Something like __get_user_handle_exception() or so. Thx.
> On May 26, 2023, at 8:53 AM, Borislav Petkov <bp@alien8.de> wrote: > > On Thu, May 25, 2023 at 12:39:47PM -0700, Nadav Amit wrote: >> I do not think in this tradeoff not exposing local names worth >> preventing profilers (and their users) from understanding where a >> sample/trace is was taken. If for instance you look at a branch >> trace (e.g., using Intel PT) you want to see the symbol to which a >> branch goes to. > > If those functions were written in C, you wouldn't see any > exception-handling symbols either. It is the fact that they're asm > and the exception labels are defined "out-of-line" so that you don't > have code duplication and thus are symbols outside of the respective > functions. According to my experience any or virtually any code address, C or asm, can be mapped back to a symbol. I say virtually all, but it is actually all the code addresses that I encountered. Can you give me some examples for code whose address cannot be mapped back to a symbol? > So you'd have to give a lot more detailed example where making those > symbols global, helps. I did not ask to make them global. Just to keep them as local after linkage in the executable, like all other functions in the kernel. > And if those symbols are going to be global, then they better have more > descriptive names as they're gonna be pretty much independent functions. > Something like __get_user_handle_exception() or so. I can do that.
On Fri, May 26, 2023 at 10:29:29AM -0700, Nadav Amit wrote: > Can you give me some examples for code whose address cannot be mapped > back to a symbol? No, this is not what I'm talking about. I'm talking about all the local labels the compiler uses. For example: $ make kernel/sched/core.s $ grep -E "^\.L" kernel/sched/core.s | wc -l 2799 All those local labels are not in the symbol table (get discarded) and the addresses they represent are shown as belonging to the containing function. > I did not ask to make them global. Just to keep them as local after > linkage in the executable, like all other functions in the kernel. Ok, not global. But local and present in the symbol table: 105185: ffffffff81b89330 17 NOTYPE LOCAL DEFAULT 1 bad_get_user_clac And again, this helps how exactly?
> On May 26, 2023, at 1:45 PM, Borislav Petkov <bp@alien8.de> wrote: > > On Fri, May 26, 2023 at 10:29:29AM -0700, Nadav Amit wrote: >> Can you give me some examples for code whose address cannot be mapped >> back to a symbol? > > No, this is not what I'm talking about. > > I'm talking about all the local labels the compiler uses. For example: > > $ make kernel/sched/core.s > $ grep -E "^\.L" kernel/sched/core.s | wc -l > 2799 > > All those local labels are not in the symbol table (get discarded) and > the addresses they represent are shown as belonging to the containing > function. Right. But the symbols I mentioned are not contained in any other symbol. If you run gdb and try to disasm this bad_get_user_clac (its address), you’d currently get "No function contains specified address”. That what makes these 2 symbols different than the others. > >> I did not ask to make them global. Just to keep them as local after >> linkage in the executable, like all other functions in the kernel. > > Ok, not global. But local and present in the symbol table: > > 105185: ffffffff81b89330 17 NOTYPE LOCAL DEFAULT 1 bad_get_user_clac > > And again, this helps how exactly? Allowing debuggers, tracers, disassemblers and instrumentation tools to work the same way they work as they work with any other piece of code in the kernel. I personally work on code instrumentation and this makes my life hard for no good reason. [ Perhaps the question should go the other way around: why addresses of code in these functions should not be mapped to any symbol? ]
On 5/26/23 14:10, Nadav Amit wrote: >>> I did not ask to make them global. Just to keep them as local after >>> linkage in the executable, like all other functions in the kernel. >> Ok, not global. But local and present in the symbol table: >> >> 105185: ffffffff81b89330 17 NOTYPE LOCAL DEFAULT 1 bad_get_user_clac >> >> And again, this helps how exactly? > Allowing debuggers, tracers, disassemblers and instrumentation tools to > work the same way they work as they work with any other piece of code in > the kernel. > > I personally work on code instrumentation and this makes my life hard for > no good reason. > > [ Perhaps the question should go the other way around: why addresses of > code in these functions should not be mapped to any symbol? ] Nadav, is there a chance you could give us a real-life example of how this affects you as an end user? What's a specific tool that you were using or a specific problem that you were trying to solve where these local symbols caused a problem? How would the global symbol have helped? I can certainly _imagine_ some, but I'm curious what you saw that prompted you to send this patch.
> On May 26, 2023, at 2:17 PM, Dave Hansen <dave.hansen@intel.com> wrote: > > On 5/26/23 14:10, Nadav Amit wrote: >>>> I did not ask to make them global. Just to keep them as local after >>>> linkage in the executable, like all other functions in the kernel. >>> Ok, not global. But local and present in the symbol table: >>> >>> 105185: ffffffff81b89330 17 NOTYPE LOCAL DEFAULT 1 bad_get_user_clac >>> >>> And again, this helps how exactly? >> Allowing debuggers, tracers, disassemblers and instrumentation tools to >> work the same way they work as they work with any other piece of code in >> the kernel. >> >> I personally work on code instrumentation and this makes my life hard for >> no good reason. >> >> [ Perhaps the question should go the other way around: why addresses of >> code in these functions should not be mapped to any symbol? ] > > Nadav, is there a chance you could give us a real-life example of how > this affects you as an end user? What's a specific tool that you were > using or a specific problem that you were trying to solve where these > local symbols caused a problem? How would the global symbol have helped? > > I can certainly _imagine_ some, but I'm curious what you saw that > prompted you to send this patch. So my tool takes a branch trace and then simulates the code execution. As a preparatory step I need to disassemble the code, yet as I do not know where the symbol starts and its size, I can only disassemble one instruction at a time. [ I prefer to disassemble the whole symbol at once not just for performance, but also to figure out if it includes some instructions that my simulator does not know to simulate correctly. ] In addition, as I read the code from kcore and the binary keeps changing, I want to assume that if I do not find an address in the symbol table [*] then it means this is some dynamically generated code that is no longer available through kcore (eBPF, ftrace, etc.). These are only 2 things that break to one extent or another. I can have workarounds for them (I already do). I just see no reason to treat these two symbols differently. I would also note that I can think of many many additional reasons to have each piece of code mapped back to a symbol (besides debuggers, tracers, etc.) For instance, security monitoring tools should prefer to be able to check what code is running in the kernel. I seriously see no downside here and only benefit in consistency and usability. I have no hidden agenda if for some reason you suspect that I do. I don’t want to start talking too much about the tool I work on, as I am afraid it is off-topic, but I hope to open source it soon. -- [*] I know kallsyms does not give sizes, but I make some reasonable assumptions and augment kallsyms with the symbols from the binary.
On Fri, May 26, 2023 at 02:55:21PM -0700, Nadav Amit wrote: > So my tool takes a branch trace and then simulates the code execution. > As a preparatory step I need to disassemble the code, yet as I do not > know where the symbol starts and its size, I can only disassemble one > instruction at a time. [ I prefer to disassemble the whole symbol at once So in this particular case, the exception handling ends up being part of __get_user_nocheck_8, see the end of this mail. However, the symbol table says it is of size 19: 123630: ffffffff81b89310 19 FUNC GLOBAL DEFAULT 1 __get_user_nocheck_8 which means the trailing exception handling is not part of that symbol when looking at the size. And that's where your tool fails. Close? And if so, your tool could simply look at the next symbol's address and attribute the trailing bytes to the previous symbol. If you look at the disassembly at the end, some other option has added INT3 padding to the end of the symbol so that the next one is aligned. But you can simply skip over those 0xcc insn bytes. And skipping over those 0xcc bytes is something your tool needs to handle anyway because they're not part of the symbol either. > These are only 2 things that break to one extent or another. I can > have workarounds for them (I already do). I just see no reason to > treat these two symbols differently. Right, the kernel should not dance just because some tool says so. And every time a new tool pops up, there are patches to "fix" the kernel. > I seriously see no downside The downside is polluting the symbol table unnecessarily. If it doesn't have to be there then it shouldn't be. And yeah yeah, this particular case can be fixed easily but the bigger issue remains: we have a lot of local symbols which get discarded around the tree. Does that mean that because your tool cannot handle that, we have to stop using local symbols? What happens if we do something else weird in the future and your tool breaks again? Also, why can't your tool handle such cases gracefully instead of having to "fix" the kernel each time? Thx. ffffffff81b89310 <__get_user_nocheck_8>: ffffffff81b89310: 90 nop ffffffff81b89311: 90 nop ffffffff81b89312: 90 nop ffffffff81b89313: 90 nop ffffffff81b89314: 90 nop ffffffff81b89315: 90 nop ffffffff81b89316: 48 8b 10 mov (%rax),%rdx ffffffff81b89319: 31 c0 xor %eax,%eax ffffffff81b8931b: 90 nop ffffffff81b8931c: 90 nop ffffffff81b8931d: 90 nop ffffffff81b8931e: e9 9d a3 01 00 jmp ffffffff81ba36c0 <__x86_return_thunk> ffffffff81b89323: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) ffffffff81b8932a: 00 00 00 00 ffffffff81b8932e: 66 90 xchg %ax,%ax ffffffff81b89330: 90 nop ffffffff81b89331: 90 nop ffffffff81b89332: 90 nop ffffffff81b89333: 31 d2 xor %edx,%edx ffffffff81b89335: 48 c7 c0 f2 ff ff ff mov $0xfffffffffffffff2,%rax ffffffff81b8933c: e9 7f a3 01 00 jmp ffffffff81ba36c0 <__x86_return_thunk> ffffffff81b89341: cc int3 ffffffff81b89342: cc int3 ffffffff81b89343: cc int3 ffffffff81b89344: cc int3 ffffffff81b89345: cc int3 ffffffff81b89346: cc int3 ffffffff81b89347: cc int3 ffffffff81b89348: cc int3 ffffffff81b89349: cc int3 ffffffff81b8934a: cc int3 ffffffff81b8934b: cc int3 ffffffff81b8934c: cc int3 ffffffff81b8934d: cc int3 ffffffff81b8934e: cc int3 ffffffff81b8934f: cc int3 ffffffff81b89350 <__pfx_inat_get_opcode_attribute>
> On May 27, 2023, at 12:23 AM, Borislav Petkov <bp@alien8.de> wrote: > > On Fri, May 26, 2023 at 02:55:21PM -0700, Nadav Amit wrote: >> So my tool takes a branch trace and then simulates the code execution. >> As a preparatory step I need to disassemble the code, yet as I do not >> know where the symbol starts and its size, I can only disassemble one >> instruction at a time. [ I prefer to disassemble the whole symbol at once > > So in this particular case, the exception handling ends up being part of > __get_user_nocheck_8, see the end of this mail. That’s not according to the symbol table - that’s in your mind. Anyhow, the argument that __get_user_nocheck_8 and bad_get_user_clac are related makes no sense even conceptually. __get_user_nocheck_8 jumps in the case of an exception to bad_get_user_8_clac, not to bad_get_user_clac. So even conceptually this notion that these two symbols are connected makes no sense. > > However, the symbol table says it is of size 19: > > 123630: ffffffff81b89310 19 FUNC GLOBAL DEFAULT 1 __get_user_nocheck_8 > > which means the trailing exception handling is not part of that symbol > when looking at the size. And that's where your tool fails. > > Close? Some people would even say “elementary”. I was sure it was already clear. > > And if so, your tool could simply look at the next symbol's address and > attribute the trailing bytes to the previous symbol. > > If you look at the disassembly at the end, some other option has added > INT3 padding to the end of the symbol so that the next one is aligned. > But you can simply skip over those 0xcc insn bytes. > > And skipping over those 0xcc bytes is something your tool needs to > handle anyway because they're not part of the symbol either. I appreciate your help, but I have reasonable workarounds for my use-case (and for the record, no, I don’t think that this solution that you propose is reasonable). > >> These are only 2 things that break to one extent or another. I can >> have workarounds for them (I already do). I just see no reason to >> treat these two symbols differently. > > Right, the kernel should not dance just because some tool says so. And > every time a new tool pops up, there are patches to "fix" the kernel. It is not “a new tool". You screw up every tool that tries to understand the context of an instruction pointer - gdb (that people use to debug VMs) and probably perf, crash, drgn and many others. > >> I seriously see no downside > > The downside is polluting the symbol table unnecessarily. If it doesn't > have to be there then it shouldn't be. That’s a tautology, not a downside. And it is not “unnecessarily” if it helps debugging and profiling. > > And yeah yeah, this particular case can be fixed easily but the bigger > issue remains: we have a lot of local symbols which get discarded around > the tree. Does that mean that because your tool cannot handle that, we > have to stop using local symbols? All the other local symbols are irrelevant to the discussion as they fall within some other symbol's range. > > What happens if we do something else weird in the future and your tool > breaks again? > > Also, why can't your tool handle such cases gracefully instead of having > to "fix" the kernel each time? As I stated multiple times (which are even quoted in this email): I have a workaround. You are not (not) helping me. I am trying to help you (and other users). The kernel right now messes up with people's debugging and profiling tools. So allow me to rehash, since I thought that we have already agreed on the details of the problem, but I see again that various arguments are although they are either incorrect or not relevant for this discussion: 1. It is *not* about global symbols. It is just about exposing symbols. 2. It is *not* about symbols that fall within other symbols. Therefore, all the other local symbols you grep’d are irrelevant for this discussion. 3. There are exactly 2 symbols we discuss (SYM_CODE_START_LOCAL + .L). 4. These 2 cases are the only ones that I know of in which code address does not fall into any symbol. 5. It is not about “my tool”. It is about gdb (think about people debug their VMs), perf and most likely crash, drgn, and others. Now, as for your question “what happens if we do something else weird”: If a tool relies on internal kernel data structures, it’s its own problem. But if you decide to do “something else weird” with standard executable data structures - such as DWARF or symbol table - you mess up with debuggers and profilers. So just don’t do such weird things.
On Sat, May 27, 2023 at 02:17:43AM -0700, Nadav Amit wrote: > That’s not according to the symbol table - that’s in your mind. s/your mind/objdump/ Objdump takes the next symbol's address as the end of the previous one. > Anyhow, the argument that __get_user_nocheck_8 and bad_get_user_clac are > related makes no sense even conceptually. I don't think anyone's making that argument. Maybe you should read again what I said: "the exception handling ends up being part of __get_user_nocheck_8" > Some people would even say “elementary”. I was sure it was already clear. Your cocky attitude will get you nowhere. But whatever you prefer. > I appreciate your help, but I have reasonable workarounds for my use-case > (and for the record, no, I don’t think that this solution that you > propose is reasonable). I'm simply stating what objdump does. I guess objdump is not good enough for you. > It is not “a new tool". You screw up every tool that tries to understand I'm not screwing up anything - that's your claim. > All the other local symbols are irrelevant to the discussion as they fall > within some other symbol's range. As does this one if you deal with it just like objdump does. > You are not (not) helping me. I am trying to help you (and other users). Gee, thanks. I didn't know this needed any help. > So just don’t do such weird things. Yah, good luck with that. If it needs to be done in a weird way and it is the *right* thing to do for the kernel, I couldn't care less about some external tools. As to what you want to address, I'll talk to toolchain folks first and get back to you. Thx.
> On May 27, 2023, at 5:29 AM, Borislav Petkov <bp@alien8.de> wrote: > > Your cocky attitude will get you nowhere. But whatever you prefer. I am sorry if it came this way, and most like it was cocky. I apologize. I felt the volume is turned up from your side and I turned it up even further. I felt that I fell into Dave’s “trap” (metaphorically) and it turned to be me adapting the kernel for my needs - when it really isn’t. Try to use gdb to disasm to address. It does not work. It’s an exception (to the rule). The fix is easy. Anyhow, that’s not the point - I just want to apologize. Thanks for your involvement and support.
> On May 27, 2023, at 5:29 AM, Borislav Petkov <bp@alien8.de> wrote: > > As to what you want to address, I'll talk to toolchain folks first and > get back to you. I hope you got the answer you were looking for. I am not sure what is holding this simple patch back. To rehash - we are talking about local two symbols that are not exposed. Based on my search they cover the only region of the kernel text (on x86) that is not covered by any symbol. Doing so have two types of impacts. Some tools are affected by the fact the closest previous symbol is not related, and as a result the symbol they show when they unwind the stack is unrelated. So instead of seeing “bad_get_user_clac”, you may see __get_user_nocheck_8 . This is confusing and misleading users. This should impact perf, ftrace, /proc/[pid]/stack, dump_stack(), BUG(). The second type of impact occurs since certain code addresses is not covered by any symbol. This mostly results in reduced functionality of tools. This includes for instance gdb that cannot “disas” addresses in bad_get_user_clac (you can x/i for reduced functionality) or crash in which “dis” only disassembles a single instruction. It might also have impact on backtraces - I did not try it. addr2line and llvm-symbolizer also seem to be affected in such a way and they do not find the symbol that is associated with addresses in bad_get_user_clac. This means that tools that rely such tools, including decode_stacktrace.sh are also affected. [*] There might be other impacts, for instance on kpatch. Overall, as a general rule, I think it would be best not to have code that is not covered by any symbol. It can result in misleading output from the kernel or related tools, and in addition in more limited functionality from tools such as gdb and crash. More concretely, I think these two symbols should not be stripped. The fact that the code of these symbols runs under relatively complicated conditions (exception tables), makes it even more important to let debug/tracing tools to see them. I you wish, I can include the gist of these points in the commit log, although I think it might be an overkill. [*] It is worth noting that tools such as addrline and llvm-symbolizer are able to use DWARF to find the source code location, yet they do not appear to be able to find the relevant symbol.
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S index b64a2bd1a1ef..d98a80e0cdaf 100644 --- a/arch/x86/lib/getuser.S +++ b/arch/x86/lib/getuser.S @@ -143,43 +143,43 @@ SYM_FUNC_END(__get_user_nocheck_8) EXPORT_SYMBOL(__get_user_nocheck_8) -SYM_CODE_START_LOCAL(.Lbad_get_user_clac) +SYM_CODE_START_LOCAL(bad_get_user_clac) ASM_CLAC .Lbad_get_user: xor %edx,%edx mov $(-EFAULT),%_ASM_AX RET -SYM_CODE_END(.Lbad_get_user_clac) +SYM_CODE_END(bad_get_user_clac) #ifdef CONFIG_X86_32 -SYM_CODE_START_LOCAL(.Lbad_get_user_8_clac) +SYM_CODE_START_LOCAL(bad_get_user_8_clac) ASM_CLAC bad_get_user_8: xor %edx,%edx xor %ecx,%ecx mov $(-EFAULT),%_ASM_AX RET -SYM_CODE_END(.Lbad_get_user_8_clac) +SYM_CODE_END(bad_get_user_8_clac) #endif /* get_user */ - _ASM_EXTABLE(1b, .Lbad_get_user_clac) - _ASM_EXTABLE(2b, .Lbad_get_user_clac) - _ASM_EXTABLE(3b, .Lbad_get_user_clac) + _ASM_EXTABLE(1b, bad_get_user_clac) + _ASM_EXTABLE(2b, bad_get_user_clac) + _ASM_EXTABLE(3b, bad_get_user_clac) #ifdef CONFIG_X86_64 - _ASM_EXTABLE(4b, .Lbad_get_user_clac) + _ASM_EXTABLE(4b, bad_get_user_clac) #else - _ASM_EXTABLE(4b, .Lbad_get_user_8_clac) - _ASM_EXTABLE(5b, .Lbad_get_user_8_clac) + _ASM_EXTABLE(4b, bad_get_user_8_clac) + _ASM_EXTABLE(5b, bad_get_user_8_clac) #endif /* __get_user */ - _ASM_EXTABLE(6b, .Lbad_get_user_clac) - _ASM_EXTABLE(7b, .Lbad_get_user_clac) - _ASM_EXTABLE(8b, .Lbad_get_user_clac) + _ASM_EXTABLE(6b, bad_get_user_clac) + _ASM_EXTABLE(7b, bad_get_user_clac) + _ASM_EXTABLE(8b, bad_get_user_clac) #ifdef CONFIG_X86_64 - _ASM_EXTABLE(9b, .Lbad_get_user_clac) + _ASM_EXTABLE(9b, bad_get_user_clac) #else - _ASM_EXTABLE(9b, .Lbad_get_user_8_clac) - _ASM_EXTABLE(10b, .Lbad_get_user_8_clac) + _ASM_EXTABLE(9b, bad_get_user_8_clac) + _ASM_EXTABLE(10b, bad_get_user_8_clac) #endif diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S index 3062d09a776d..f0c80e07229d 100644 --- a/arch/x86/lib/putuser.S +++ b/arch/x86/lib/putuser.S @@ -131,22 +131,22 @@ SYM_FUNC_START(__put_user_nocheck_8) SYM_FUNC_END(__put_user_nocheck_8) EXPORT_SYMBOL(__put_user_nocheck_8) -SYM_CODE_START_LOCAL(.Lbad_put_user_clac) +SYM_CODE_START_LOCAL(bad_put_user_clac) ASM_CLAC .Lbad_put_user: movl $-EFAULT,%ecx RET -SYM_CODE_END(.Lbad_put_user_clac) +SYM_CODE_END(bad_put_user_clac) - _ASM_EXTABLE(1b, .Lbad_put_user_clac) - _ASM_EXTABLE(2b, .Lbad_put_user_clac) - _ASM_EXTABLE(3b, .Lbad_put_user_clac) - _ASM_EXTABLE(4b, .Lbad_put_user_clac) - _ASM_EXTABLE(5b, .Lbad_put_user_clac) - _ASM_EXTABLE(6b, .Lbad_put_user_clac) - _ASM_EXTABLE(7b, .Lbad_put_user_clac) - _ASM_EXTABLE(9b, .Lbad_put_user_clac) + _ASM_EXTABLE(1b, bad_put_user_clac) + _ASM_EXTABLE(2b, bad_put_user_clac) + _ASM_EXTABLE(3b, bad_put_user_clac) + _ASM_EXTABLE(4b, bad_put_user_clac) + _ASM_EXTABLE(5b, bad_put_user_clac) + _ASM_EXTABLE(6b, bad_put_user_clac) + _ASM_EXTABLE(7b, bad_put_user_clac) + _ASM_EXTABLE(9b, bad_put_user_clac) #ifdef CONFIG_X86_32 - _ASM_EXTABLE(8b, .Lbad_put_user_clac) - _ASM_EXTABLE(10b, .Lbad_put_user_clac) + _ASM_EXTABLE(8b, bad_put_user_clac) + _ASM_EXTABLE(10b, bad_put_user_clac) #endif