From patchwork Mon Nov 28 04:47:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tsukasa OI
X-Patchwork-Id: 2289
To: Tsukasa OI, Nelson Chu, Kito Cheng, Palmer Dabbelt
Cc: binutils@sourceware.org
Subject: [PATCH v2 0/3] RISC-V: Disassembler Core Optimization 1-2 (Mapping symbols)
Date: Mon, 28 Nov 2022 04:47:20 +0000
List-Id: Binutils mailing list
From: Tsukasa OI

Hello,

This is Part 4 of a 4-part project to drastically improve disassembler
performance:

** This patchset does not apply to master directly. **

This patchset requires the following patchset(s) to be applied first:

[Changes: v1 -> v2]
- Rebased against commit 97f006bc56af:
  "RISC-V: Better support for long instructions (disassembler)"

The following is basically a copy of the PATCH 3/3 commit message.

For ELF files with many symbols and/or sections (static libraries,
partially linked files [e.g. vmlinux.o] or large object files), the
disassembler is drastically slowed down by looking up the suitable
mapping symbol.

This is because:

- It used an inefficient linear search to find the suitable mapping
  symbol,
- symtab_pos is not always a good hint for a forward linear search, and
- The symbol table accessible to the disassembler is sorted by address
  and then section (not section, then address).

Together, these sometimes force O(n^2) time when searching for the
suitable mapping symbol for a given address.

This commit implements:

- A binary search to look up the suitable mapping symbol
  (O(log(n)) time per lookup call, O(m + n*log(n)) time on
  initialization where n < m),
- A separate mapping symbol table, sorted by section and then address
  (unless the section to disassemble is NULL),
- A very short linear search, even faster than the binary search, when
  disassembling consecutive addresses (usually traverses only 1 or 2
  symbols; O(n) in the worst case, but this is only expected on
  adversarial samples), and
- Efficient tracking of mapping symbols with an ISA string (by
  propagating the arch field of "$x+(arch)" to succeeding "$x" symbols).

It also changes when the disassembler reuses the last mapping symbol.
This commit uses only the last disassembled address to determine whether
the last mapping symbol should be reused.

This commit doesn't improve disassembler performance much on regular
programs in general.  However, a disassembler performance improvement of
more than 50% is expected on some files for which "RISC-V: Use faster
hash table on disassembling" was not effective enough.
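
For readers who want a concrete picture of the lookup strategy described
above, here is a minimal standalone sketch.  It is not the code from
PATCH 3/3: struct map_sym, map_sym_sort, map_sym_find and map_sym_next
are hypothetical names, sections are modelled as plain integers, and ISA
string ("$x+(arch)") propagation is left out.  It only illustrates the
section-then-address ordering, the binary search, and the short forward
walk used for consecutive addresses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified mapping symbol entry.  The real structures in
   opcodes/riscv-dis.c are different.  */
struct map_sym
{
  int section;            /* Assumed integer section index.  */
  unsigned long address;  /* Symbol address within that section.  */
  int is_insn;            /* 1 for "$x" (instructions), 0 for "$d" (data).  */
};

/* Order entries by section first, then by address.  */
int
map_sym_cmp (const void *pa, const void *pb)
{
  const struct map_sym *a = pa;
  const struct map_sym *b = pb;

  if (a->section != b->section)
    return a->section < b->section ? -1 : 1;
  if (a->address != b->address)
    return a->address < b->address ? -1 : 1;
  return 0;
}

/* Sort once on initialization.  */
void
map_sym_sort (struct map_sym *tab, size_t n)
{
  qsort (tab, n, sizeof (*tab), map_sym_cmp);
}

/* Binary search: return the index of the last entry in SECTION whose
   address is <= ADDR, or -1 if there is no such entry.  */
long
map_sym_find (const struct map_sym *tab, size_t n, int section,
              unsigned long addr)
{
  size_t lo = 0, hi = n;  /* Half-open range [lo, hi).  */

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (tab[mid].section < section
          || (tab[mid].section == section && tab[mid].address <= addr))
        lo = mid + 1;
      else
        hi = mid;
    }
  /* tab[lo - 1] is the last entry ordered at or before (SECTION, ADDR).  */
  if (lo == 0 || tab[lo - 1].section != section)
    return -1;
  return (long) (lo - 1);
}

/* Fast path for consecutive addresses: start at the previous hit LAST
   (a valid index in SECTION with address <= ADDR) and walk forward.
   Typically this visits only one or two entries.  */
long
map_sym_next (const struct map_sym *tab, size_t n, long last, int section,
              unsigned long addr)
{
  size_t i = (size_t) last;

  while (i + 1 < n && tab[i + 1].section == section
         && tab[i + 1].address <= addr)
    i++;
  return (long) i;
}
```

In this sketch a caller would use map_sym_find for the first address in a
section (or after a jump) and map_sym_next while addresses keep
increasing.  Because the table is ordered by section before address, the
entries of one section are contiguous, so neither function ever has to
skip over symbols from other sections.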

On bigger libraries, the following numbers were observed while
benchmarking during development:

- x 2.13 - 2.22   : Static library : Newlib (libc.a)
- x 5.67 - 6.09   : Static library : GNU libc (libc.a)
- x 11.72 - 12.04 : Shared library : OpenSSL (libcrypto.so)
- x 96.29         : Shared library : LLVM 14 (libLLVM-14.so)

Thanks,
Tsukasa

Tsukasa OI (3):
  RISC-V: Easy optimization on riscv_search_mapping_symbol
  RISC-V: Per-section private data initialization
  RISC-V: Optimized search on mapping symbols

 opcodes/disassemble.c |   1 +
 opcodes/disassemble.h |   2 +
 opcodes/riscv-dis.c   | 443 +++++++++++++++++++++++++++++------------
 3 files changed, 311 insertions(+), 135 deletions(-)

base-commit: 5b561967091a59d0052bd717d1b9f3e31ef841cc