From patchwork Mon Nov 28 04:46:19 2022
X-Patchwork-Submitter: Tsukasa OI
X-Patchwork-Id: 2288
From: Tsukasa OI
To: Tsukasa OI, Nelson Chu, Kito Cheng, Palmer Dabbelt
Cc: binutils@sourceware.org
Subject: [PATCH v2 0/3] RISC-V: Disassembler Core Optimization 1-1 (Hash table and Caching)
Date: Mon, 28 Nov 2022 04:46:19 +0000

Hello,

This is Part 3 of a 4-part project to improve disassembler performance
drastically:

** This patchset does not apply to master directly. **

This patchset requires the following patchset(s) to be applied first:

[Changes: v1 -> v2]
- Rebased against commit 97f006bc56af:
  "RISC-V: Better support for long instructions (disassembler)"

PATCH 1/3 improves performance when disassembling RISC-V code (which may
also contain invalid data). It replaces riscv_hash (in
opcodes/riscv-dis.c) with a much faster data structure: a sorted and
partitioned hash table.
This technique is actually used for the SPARC architecture
(opcodes/sparc-dis.c), and the author simplified the algorithm even
further. Unlike SPARC, RISC-V's hashed opcode table is not a table of
linked lists; it is just a table pointing to the "start" elements in the
sorted opcode list (one per hash code), plus a global tail.

PATCH 3/3 takes care of the per-instruction support probing problem.
By caching which instruction classes have already been queried, we no
longer have to call the riscv_multi_subset_supports function for every
instruction. It speeds up disassembling even further.

PATCH 2/3 is not part of the optimization but a safety net to
complement PATCH 1/3. It enables implementing custom instructions that
span multiple major opcodes (such as both CUSTOM_0 and CUSTOM_1
**in a single instruction**) without causing disassembler functionality
problems. Note that it has a big performance penalty if a vendor
implements such an instruction, so if such an instruction is implemented
in the mainline, a separate solution will be required.

I benchmarked some programs and I usually get 20-50% performance
improvements when disassembling the code section of compiled RISC-V ELF
programs ("objdump -d $FILE"). That is significant and pretty nice for
such a small modification (with about 12KB of heap memory allocated in a
64-bit environment). On libraries and big programs with many debug
symbols, the improvements are not that high, but this is to be taken
care of in the next part (the mapping symbol optimization).

This is not the end. This structure significantly improves plain binary
file handling (with objdump, "objdump -b binary -m riscv:rv[32|64] -D
$FILE"). I tested on various binary files, including a random one and
big vmlinux images, and I confirmed significant performance improvements
(over 70% in many cases).
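To illustrate the "sorted and partitioned" table described for PATCH 1/3,
here is a minimal sketch in C. It is not the code from the patch: struct
toy_opcode, the five-entry table, the 7-bit HASH_IDX and all function
names are assumptions made up for illustration. The sketch also assumes
that each bucket ends where the next one starts, with the last slot
acting as the global tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified opcode entry (not the real struct riscv_opcode). */
struct toy_opcode {
    const char *name;
    uint32_t match;   /* bits that must be set as given */
    uint32_t mask;    /* bits that are compared at all */
};

/* Assumed hash: the low 7 bits of a 32-bit instruction (major opcode). */
#define HASH_LEN 128u
#define HASH_IDX(x) ((x) & (HASH_LEN - 1u))

/* Unsorted toy table; the order inside one hash code is the match priority. */
static const struct toy_opcode toy_opcodes[] = {
    { "add",  0x00000033u, 0xfe00707fu },
    { "addi", 0x00000013u, 0x0000707fu },
    { "sub",  0x40000033u, 0xfe00707fu },
    { "lui",  0x00000037u, 0x0000007fu },
    { "andi", 0x00007013u, 0x0000707fu },
};
#define N_OPCODES (sizeof toy_opcodes / sizeof toy_opcodes[0])

/* Entries sorted by hash code, and the start index of every partition.
   bucket_start[HASH_LEN] is the global tail. */
static const struct toy_opcode *sorted[N_OPCODES];
static size_t bucket_start[HASH_LEN + 1];

/* Build the table once with a stable counting sort, so entries that
   share a hash code keep their original relative order. */
static void build_table(void)
{
    size_t count[HASH_LEN] = {0};
    size_t next[HASH_LEN];
    size_t i, h;

    for (i = 0; i < N_OPCODES; i++)
        count[HASH_IDX(toy_opcodes[i].match)]++;

    bucket_start[0] = 0;
    for (h = 0; h < HASH_LEN; h++)
        bucket_start[h + 1] = bucket_start[h] + count[h];
    assert(bucket_start[HASH_LEN] == N_OPCODES);

    for (h = 0; h < HASH_LEN; h++)
        next[h] = bucket_start[h];
    for (i = 0; i < N_OPCODES; i++)
        sorted[next[HASH_IDX(toy_opcodes[i].match)]++] = &toy_opcodes[i];
}

/* Scan only the partition that belongs to the instruction's hash code.
   Returns NULL if nothing in that partition matches. */
static const struct toy_opcode *lookup(uint32_t insn)
{
    size_t h = HASH_IDX(insn);
    size_t i;

    for (i = bucket_start[h]; i < bucket_start[h + 1]; i++) {
        const struct toy_opcode *op = sorted[i];
        if ((insn & op->mask) == op->match)
            return op;
    }
    return NULL;
}
```

After one call to build_table(), lookup() only touches entries whose
hash code equals that of the instruction word. A word whose partition is
empty (for example 0xffffffff in this toy table) is rejected without
comparing against any entry, which is the effect the description above
relies on for invalid data.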
The speedup on plain binary files is partially due to the fact that
disassembling about one quarter of invalid "instruction" words required
iterating over one thousand opcode entries (348 or more being vector
instructions with OP-V, which can be easily skipped with this new data
structure). Another reason for this significance is that plain binary
input does not have the various ELF overheads.

It also has great synergy with the commit "RISC-V: One time CSR hash
table initialization", and disassembling many CSR instructions is now
over 6 times faster (in contrast to only about 30% faster as of part 2
of the patchset).

Thanks,
Tsukasa

Tsukasa OI (3):
  RISC-V: Use faster hash table on disassembling
  RISC-V: Fallback on faster hash table
  RISC-V: Cache instruction support

 include/opcode/riscv.h |   2 +
 opcodes/riscv-dis.c    | 129 +++++++++++++++++++++++++++++++++++------
 2 files changed, 113 insertions(+), 18 deletions(-)

base-commit: 75d9e5b0cfebcce34c855e4c98a956de4d7d7753
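
For reference, the caching idea described for PATCH 3/3 can be sketched
in C as below. This is only an illustration under assumptions, not the
patch itself: enum toy_insn_class, slow_supports (a stand-in for an
expensive probe such as riscv_multi_subset_supports), the tri-state
cache and reset_support_cache are all hypothetical names. The reset
helper reflects an assumption that cached answers would have to be
dropped whenever the set of enabled extensions changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical instruction classes; the real enumeration is much larger. */
enum toy_insn_class {
    TOY_CLASS_I,
    TOY_CLASS_M,
    TOY_CLASS_V,
    TOY_CLASS_COUNT
};

/* Counts how often the expensive probe really runs (for demonstration). */
static unsigned slow_probe_calls;

/* Stand-in for the expensive support query.  In this toy setup
   everything except the vector class is reported as supported. */
static bool slow_supports(enum toy_insn_class c)
{
    slow_probe_calls++;
    return c != TOY_CLASS_V;
}

/* One cache slot per class: 0 = not queried yet, 1 = supported,
   -1 = not supported. */
static signed char support_cache[TOY_CLASS_COUNT];

/* Answer from the cache, calling the slow probe only on the first
   query for each class. */
static bool cached_supports(enum toy_insn_class c)
{
    assert(c < TOY_CLASS_COUNT);
    if (support_cache[c] == 0)
        support_cache[c] = slow_supports(c) ? 1 : -1;
    return support_cache[c] > 0;
}

/* Forget all cached answers (assumed to be needed when the enabled
   extensions change). */
static void reset_support_cache(void)
{
    memset(support_cache, 0, sizeof support_cache);
}
```

With this shape, repeated queries for the same class cost one array
read instead of one probe per instruction, and a negative answer is
cached just like a positive one.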