From patchwork Thu Feb 23 17:26:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Clifton X-Patchwork-Id: 61037 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp432416wrd; Thu, 23 Feb 2023 09:26:34 -0800 (PST) X-Google-Smtp-Source: AK7set9tHKr61tb9/w6uDEddlVDeEMvD3OebWeqQCP2OUdvBomvoZErfM+oUeD7mkpEje1GIa+bi X-Received: by 2002:a17:907:33cc:b0:895:58be:94a with SMTP id zk12-20020a17090733cc00b0089558be094amr21728270ejb.14.1677173194780; Thu, 23 Feb 2023 09:26:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677173194; cv=none; d=google.com; s=arc-20160816; b=IQsAlACeKMPxgVtT8kZQz6lCD9z4N04xHwuW6wGmyaV9cgTk/8w3kIRE0wU0yFuyH0 FwOwycOdWgaDs+gRFIKREps8gnIchpXf4k2IfOVzYcJUPfU63ZMa2E06OfAPOMp4ELe4 wZUQs91vnMrnXE+KDJcAq2+g/rQRKHyhiBpfphoa8CYk23jvUsiF8Uxf7byhtxj+k13h mxK6b2mnfai6kRyRYD0F/cjqhs+8y4sJzHN4fTyJE0FDSqPJcKskngML+VkZYw4q2hSh 4gEfjVJg0j3n1TDuEFAFEUZxaIYRbkMmTWzkrTmrMkHTB5LqcWirDYR3B29jklnHmQps rFVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:mime-version :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=ajMViL8RzphD559PClOXsOBmw1l3cqV3ONrW+M586Gk=; b=dJHPfkIfEZVRAyMpv7FdSjH7DiYwiFavtKV5ZIGBHVSIzmjrrQpQrHhCU/3b+YWyXb WoMKqSTzZlJkXZ24DdT79cAZW+a9FaDxwiTlmG0K1/GxD130xMfvj4yab82QuIuS04yO pcXWxWUBE/xOT4Pfd0Hwp3DrvmQNH4tCGZUFQaFCUspBvsAY5Z6Np2KyDbf5Y4Z0sjfL ttMbS4ZQ6QLTY1poIXpANJIp5FbKMkaNyZjMNjXGL0d4Uq5BWZXLHzXiZZ40Knjz8sT6 9itpmqljKJ0mD7pddG0JXWASZJtwz+omeyNUyZzPKfKifQYc5R0WgBlMF46d4wGoHg67 jUmg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=L02xZlUn; spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id eu23-20020a170907299700b008d2b0211398si11579321ejc.534.2023.02.23.09.26.34 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Feb 2023 09:26:34 -0800 (PST) Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=L02xZlUn; spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4DCF7385800A for ; Thu, 23 Feb 2023 17:26:32 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4DCF7385800A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1677173192; bh=ajMViL8RzphD559PClOXsOBmw1l3cqV3ONrW+M586Gk=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=L02xZlUnTSlLlsN0wsOcguTnpPT9YGC55M0Q1Q45tsNTtJm1QLQIBr0qf7neFGVb8 heahJ3cjF/9+71KqWcq5kISjNLCPOGELQI2S4DDMs82FubMZUrJzrW26DhgD+ejkRy IlPh29JnuOEOZcGclf+nDCSj2GJdTSlUukM/DgM4= X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 8AB4F3858425 for ; Thu, 23 Feb 2023 17:26:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8AB4F3858425 Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-459-HJSJa1J9M2255q71552wsg-1; Thu, 23 Feb 2023 12:26:21 -0500 X-MC-Unique: HJSJa1J9M2255q71552wsg-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 18808280604D for ; Thu, 23 Feb 2023 17:26:21 +0000 (UTC) Received: from prancer.redhat.com (unknown [10.33.36.51]) by smtp.corp.redhat.com (Postfix) with ESMTPS id A6BB4492C14 for ; Thu, 23 Feb 2023 17:26:20 +0000 (UTC) To: binutils@sourceware.org Subject: Commit: Better caching in elf_find_function Date: Thu, 23 Feb 2023 17:26:19 +0000 Message-ID: <87bklkths4.fsf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-10.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Nick Clifton via Binutils From: Nick Clifton Reply-To: Nick Clifton Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org Sender: "Binutils" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758643560088483601?= X-GMAIL-MSGID: =?utf-8?q?1758643560088483601?= Hi Guys, I Following on from the fix for addr2line inconsistent behaviour, as reported in PR 30150, I am applying the attached patch to correct a related problem. The issue is _bfd_elf_find_function() and its attempts to cache a previous result in order to speed up future lookups. If the function is called successively with two addresses and the second address happens to lie within the region of the symbol discovered for the first address, then the cached value will be used, even if there is a better match available. For example: % addr2line -fipae vmlinux 0xffffffff81002000 0xffffffff81002020 0xffffffff81002000: hypercall_page at /tmp/linux-5.15.92/linux-5.15.92/arch/x86/kernel/../../x86/xen/xen-head.S:75 0xffffffff81002020: hypercall_page at /tmp/linux-5.15.92/linux-5.15.92/arch/x86/kernel/../../x86/xen/xen-head.S:75 The second result is wrong as there is an exact match for the 0xffffffff8100202 address: % readelf --syms --wide vmlinux | grep -e hypercall_page -e xen_hypercall_mmu_update 82: ffffffff81002020 32 FUNC LOCAL DEFAULT 1 xen_hypercall_mmu_update 117144: ffffffff81002000 4096 NOTYPE GLOBAL DEFAULT 1 hypercall_page The patch fixes the problem by checking to see if symbols beyond the target address lie within the region covered by the current best-fit, and if they do, reducing the size of the best fit so that it no longer overlaps. In addition the patch moves the logic for choosing a best fit into a separate inline function in order to make it simpler and easier to understand. Tested with no regressions on a large number of different toolchains. Cheers Nick diff --git a/bfd/dwarf2.c b/bfd/dwarf2.c index 8c9bd7a3026..ac7c4f63c57 100644 --- a/bfd/dwarf2.c +++ b/bfd/dwarf2.c @@ -4681,6 +4681,7 @@ comp_unit_find_nearest_line (struct comp_unit *unit, *function_ptr = NULL; func_p = lookup_address_in_function_table (unit, addr, function_ptr); + if (func_p && (*function_ptr)->tag == DW_TAG_inlined_subroutine) unit->stash->inliner_chain = *function_ptr; @@ -6134,6 +6135,60 @@ _bfd_dwarf2_cleanup_debug_info (bfd *abfd, void **pinfo) bfd_close (stash->alt.bfd_ptr); } +typedef struct elf_find_function_cache +{ + asection * last_section; + asymbol * func; + const char * filename; + bfd_size_type code_size; + bfd_vma code_off; + +} elf_find_function_cache; + + +/* Returns TRUE if the symbol with address CODE_OFF and size CODE_SIZE + is a better fit to match OFFSET than whatever is currenly stored in + CACHE. */ + +static inline bool +better_fit (elf_find_function_cache * cache, + bfd_vma code_off, + bfd_size_type code_size, + bfd_vma offset) +{ + /* If the symbol is beyond the desired offset, ignore it. */ + if (code_off > offset) + return false; + + /* If the symbol is further away from the desired + offset than our current best, then ignore it. */ + if (code_off < cache->code_off) + return false; + + /* On the other hand, if it is closer, then use it. */ + if (code_off > cache->code_off) + return true; + + /* If our current best fit does not actually reach the desired + offset... */ + if (cache->code_off + cache->code_size <= offset) + { + /* Then return whichever candidate covers more area. */ + return code_size > cache->code_size; + } + + /* If the new symbol also covers the desired offset... */ + if (code_off + code_size > offset) + { + /* Then choose whichever is smaller. */ + /* FIXME: Maybe prefer LOCAL over GLOBAL or something else here ? */ + return code_size < cache->code_size; + } + + /* Otherwise the cached symbol is better. */ + return false; +} + /* Find the function to a particular section and offset, for error reporting. */ @@ -6145,21 +6200,14 @@ _bfd_elf_find_function (bfd *abfd, const char **filename_ptr, const char **functionname_ptr) { - struct elf_find_function_cache - { - asection *last_section; - asymbol *func; - const char *filename; - bfd_size_type func_size; - } *cache; - if (symbols == NULL) return NULL; if (bfd_get_flavour (abfd) != bfd_target_elf_flavour) return NULL; - cache = elf_tdata (abfd)->elf_find_function_cache; + elf_find_function_cache * cache = elf_tdata (abfd)->elf_find_function_cache; + if (cache == NULL) { cache = bfd_zalloc (abfd, sizeof (*cache)); @@ -6167,13 +6215,13 @@ _bfd_elf_find_function (bfd *abfd, if (cache == NULL) return NULL; } + if (cache->last_section != section || cache->func == NULL || offset < cache->func->value - || offset >= cache->func->value + cache->func_size) + || offset >= cache->func->value + cache->code_size) { asymbol *file; - bfd_vma low_func; asymbol **p; /* ??? Given multiple file symbols, it is impossible to reliably choose the right file name for global symbols. File symbols are @@ -6187,11 +6235,11 @@ _bfd_elf_find_function (bfd *abfd, const struct elf_backend_data *bed = get_elf_backend_data (abfd); file = NULL; - low_func = 0; state = nothing_seen; cache->filename = NULL; cache->func = NULL; - cache->func_size = 0; + cache->code_size = 0; + cache->code_off = 0; cache->last_section = section; for (p = symbols; *p != NULL; p++) @@ -6208,24 +6256,36 @@ _bfd_elf_find_function (bfd *abfd, continue; } + if (state == nothing_seen) + state = symbol_seen; + size = bed->maybe_function_sym (sym, section, &code_off); - if (size != 0 - && code_off <= offset - && (code_off > low_func - || (code_off == low_func - && size > cache->func_size))) + + if (size == 0) + continue; + + if (better_fit (cache, code_off, size, offset)) { cache->func = sym; - cache->func_size = size; + cache->code_size = size; + cache->code_off = code_off; cache->filename = NULL; - low_func = code_off; + if (file != NULL && ((sym->flags & BSF_LOCAL) != 0 || state != file_after_symbol_seen)) cache->filename = bfd_asymbol_name (file); } - if (state == nothing_seen) - state = symbol_seen; + /* Otherwise, if the symbol is beyond the desired offset but it + lies within the bounds of the current best match then reduce + the size of the current best match so that future searches + will not not used the cached symbol by mistake. */ + else if (code_off > offset + && code_off > cache->code_off + && code_off < cache->code_off + cache->code_size) + { + cache->code_size = code_off - cache->code_off; + } } }