From patchwork Sun Nov 20 01:08:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tsukasa OI X-Patchwork-Id: 23362 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp928173wrr; Sat, 19 Nov 2022 17:10:53 -0800 (PST) X-Google-Smtp-Source: AA0mqf7nuHKabGZ0TAiyDwrJXHX8bI9goyacPo3wYUHZf0zODrMtucmsX0vQSjf6vdpOfczXyH+m X-Received: by 2002:a05:6402:1f88:b0:461:7a9d:c2ee with SMTP id c8-20020a0564021f8800b004617a9dc2eemr11412361edc.36.1668906652932; Sat, 19 Nov 2022 17:10:52 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668906652; cv=none; d=google.com; s=arc-20160816; b=E0qr14ofD2KBN+T541FAZ6ls/qpB6cX2GSiLimSqUyLsVTogt9UYbSH6CsMQOnNOHg NKdiBpQ7HIsWyBNZSXo8c1Kja2nLCzijMgIBvaEuHzEEbXbANabqBQefvIx+Jwas5ArY WML54CFRDChgCMuiMnY0eF4YlGWQBjcnowVCZRFjVFYFpIr08OYObqDwU3jZNchrt05K HR8lRJUeLczOcIxzKN4/kjkmbLY9pgVlqpuNDcU8kkqlwsGUFNop5JSC/ijmIjuvZ73N 8PXM9DSEJX+uyh9ByJk1LnY47ui9Fcex5pBepC+k2/y3QQ+rptq/SJRM6UJiTKDG+4DT ey0Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=1hdMo+Rh52E9j8/7q5aWzPlJiGOAaU7VsBma9epqZo0=; b=YcdRUJBtCjIpd82GDLgMV09YgN6tzuuhrMafVTrcrv2pBP+zCyK5n2phGNTZJHcQwD e4Z8X84YY6Du4q/HW36co8PJGX+7/cJnwzbsK2/irDnpxrWi59yUB79kjEr78/diZMDo 5QgKMFrP285hZLLW1RkNy6vE+COpON2dwdRwhQjxMb/ELbO+M3Pg1Vmha6kyz0wvDE3U JNO0LwuXYJNiQK6i8cZLGvd3x2Vg+jzhgPi8odzstZivkpGXN7TS2IM333dCkLQZRXHk NBLgYkhEc7mIYulvVEmU+aybntXxyEayZQMKa9IB9yDVPH53xF0tKBL0CrhpZmwbig/c 4Vpg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=gM2n+IMR; spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id k9-20020a056402048900b004621e2c473dsi5771293edv.354.2022.11.19.17.10.52 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Nov 2022 17:10:52 -0800 (PST) Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=gM2n+IMR; spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A17B03889E27 for ; Sun, 20 Nov 2022 01:10:42 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A17B03889E27 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1668906642; bh=1hdMo+Rh52E9j8/7q5aWzPlJiGOAaU7VsBma9epqZo0=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=gM2n+IMRQR+9u1Dx5MICRT61ExHHLwg0AjU27gR4WxwJgreZBZtEjAwUor2EH2zq3 EniiC9NhFsp4URCNRNkpVIl2+AFw6EfJIDiUcbykeX4LyvT0znFFVFGclcRiHfceFW JgB81U2Lp0MRzsNkznJl6BPHhlGx+oXFTjRfwEDw= X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from mail-sender-0.a4lg.com (mail-sender-0.a4lg.com [IPv6:2401:2500:203:30b:4000:6bfe:4757:0]) by sourceware.org (Postfix) with ESMTPS id BA3763889E2A for ; Sun, 20 Nov 2022 01:09:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BA3763889E2A Received: from [127.0.0.1] (localhost [127.0.0.1]) by mail-sender-0.a4lg.com (Postfix) with ESMTPSA id E7928300089; Sun, 20 Nov 2022 01:09:00 +0000 (UTC) To: Tsukasa OI , Nelson Chu , Kito Cheng , Palmer Dabbelt Cc: binutils@sourceware.org Subject: [PATCH 1/3] RISC-V: Use faster hash table on disassembling Date: Sun, 20 Nov 2022 01:08:40 +0000 Message-Id: <7487608537fcd71f322e56d40bfb2cc605cee89a.1668906514.git.research_trasio@irq.a4lg.com> In-Reply-To: References: Mime-Version: 1.0 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tsukasa OI via Binutils From: Tsukasa OI Reply-To: Tsukasa OI Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org Sender: "Binutils" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749975462541234034?= X-GMAIL-MSGID: =?utf-8?q?1749975462541234034?= From: Tsukasa OI This commit improves performance on disassembling RISC-V code. It replaces riscv_hash (in opcodes/riscv-dis.c) with much faster data structure: a sorted and partitioned hash table. This is a technique actually used on SPARC architecture (opcodes/sparc-dis.c) and the author simplified the algorithm even further. Unlike SPARC, RISC-V's hashed opcode table is not a table to linked lists, it's just a table, pointing "start" elements in the sorted opcode list (per hash code) and a global tail. It is expected to have 20-40% performance improvements when disassembling linked RISC-V ELF programs using objdump. That is a significant improvement and pretty nice for such a small modification (with about 12KB heap memory allocation on 64-bit environment). This is not the end. This structure significantly improves plain binary file handling (on objdump, "objdump -b binary -m riscv:rv[32|64] -D $FILE"). The author tested on various binary files including random one and big vmlinux images and confirmed significant performance improvements (>70% on many cases). This is partially due to the fact that, disassembling about one quarter of invalid "instruction" words required iterating over one thousand opcode entries (348 or more being vector instructions with OP-V, that can be easily skipped with this new data structure). Another reason for this significance is it doesn't have various ELF overhead. opcodes/ChangeLog: * riscv-dis.c (init_riscv_dis_state_for_arch_and_options): Build the hash table on the first run. (OP_HASH_LEN): Move from riscv_disassemble_insn. (OP_HASH_IDX): Move from riscv_disassemble_insn and mask by OP_MASK_OP2 == 0x03 for only real 16-bit instructions. (riscv_hash): New sorted and partitioned hash table. (riscv_opcodes_sorted): New sorted opcode table. (compare_opcodes): New function to compare RISC-V opcode entries. (build_riscv_opcodes_hash_table): New function to build faster hash table to disassemble. (riscv_disassemble_insn): Use sorted and partitioned hash table. --- opcodes/riscv-dis.c | 93 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 76 insertions(+), 17 deletions(-) diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c index 328a34501549..a4a74e5733a5 100644 --- a/opcodes/riscv-dis.c +++ b/opcodes/riscv-dis.c @@ -162,6 +162,8 @@ set_riscv_dis_arch_context (riscv_dis_arch_context_t *context, } +static void build_riscv_opcodes_hash_table (void); + /* Guess and update current XLEN. */ static void @@ -205,6 +207,12 @@ init_riscv_dis_state_for_arch (void) static void init_riscv_dis_state_for_arch_and_options (void) { + static bool init = false; + if (!init) + { + build_riscv_opcodes_hash_table (); + init = true; + } /* If the architecture string is changed, update XLEN. */ if (is_arch_changed) update_riscv_dis_xlen (NULL); @@ -818,6 +826,69 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info } } +/* Build a hash table for the disassembler to shorten the search time. + We sort riscv_opcodes entry pointers for further performance. + Hash index is computed by masking the instruction with... + - 0x03 (OP_MASK_OP2) for real 16-bit instructions + - 0x7f (OP_MASK_OP) for all other instructions. */ + +#define OP_HASH_LEN (OP_MASK_OP + 1) +#define OP_HASH_IDX(i) \ + ((i) & (((i & OP_MASK_OP2) != OP_MASK_OP2) ? OP_MASK_OP2 : OP_MASK_OP)) +static const struct riscv_opcode **riscv_hash[OP_HASH_LEN + 1]; +static const struct riscv_opcode **riscv_opcodes_sorted; + +/* Compare two riscv_opcode* objects to sort by hash index. */ + +static int +compare_opcodes (const void *ap, const void *bp) +{ + const struct riscv_opcode *a = *(const struct riscv_opcode **) ap; + const struct riscv_opcode *b = *(const struct riscv_opcode **) bp; + int ai = (int) OP_HASH_IDX (a->match); + int bi = (int) OP_HASH_IDX (b->match); + if (ai != bi) + return ai - bi; + /* Stable sort (on riscv_opcodes entry order) is required. */ + if (a < b) + return -1; + if (a > b) + return +1; + return 0; +} + +/* Build riscv_opcodes-based hash table. */ + +static void +build_riscv_opcodes_hash_table (void) +{ + const struct riscv_opcode *op; + const struct riscv_opcode **pop, **pop_end; + size_t len = 0; + + /* Sort riscv_opcodes entry pointers (except macros). */ + for (op = riscv_opcodes; op->name; op++) + if (op->pinfo != INSN_MACRO) + len++; + riscv_opcodes_sorted = xcalloc (len, sizeof (struct riscv_opcode *)); + pop_end = riscv_opcodes_sorted; + for (op = riscv_opcodes; op->name; op++) + if (op->pinfo != INSN_MACRO) + *pop_end++ = op; + qsort (riscv_opcodes_sorted, len, sizeof (struct riscv_opcode *), + compare_opcodes); + + /* Initialize faster hash table. */ + pop = riscv_opcodes_sorted; + for (unsigned i = 0; i < OP_HASH_LEN; i++) + { + riscv_hash[i] = pop; + while (pop != pop_end && OP_HASH_IDX ((*pop)->match) == i) + pop++; + } + riscv_hash[OP_HASH_LEN] = pop_end; +} + /* Print the RISC-V instruction at address MEMADDR in debugged memory, on using INFO. Returns length of the instruction, in bytes. BIGENDIAN must be 1 if this is big-endian code, 0 if @@ -826,24 +897,11 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info static int riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) { + const struct riscv_opcode **pop, **pop_end; const struct riscv_opcode *op, *matched_op; - static bool init = false; - static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1]; struct riscv_private_data *pd = info->private_data; int insnlen; -#define OP_HASH_IDX(i) ((i) & (riscv_insn_length (i) == 2 ? 0x3 : OP_MASK_OP)) - - /* Build a hash table to shorten the search time. */ - if (! init) - { - for (op = riscv_opcodes; op->name; op++) - if (!riscv_hash[OP_HASH_IDX (op->match)]) - riscv_hash[OP_HASH_IDX (op->match)] = op; - - init = true; - } - insnlen = riscv_insn_length (word); /* RISC-V instructions are always little-endian. */ @@ -861,10 +919,11 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) info->target2 = 0; matched_op = NULL; - op = riscv_hash[OP_HASH_IDX (word)]; - - for (; op && op->name; op++) + pop = riscv_hash[OP_HASH_IDX (word)]; + pop_end = riscv_hash[OP_HASH_IDX (word) + 1]; + for (; pop != pop_end; pop++) { + op = *pop; /* Does the opcode match? */ if (!(op->match_func) (op, word)) continue;