bfd: optimize bfd_elf_hash

Message ID 745eb337-facd-244a-bd02-d9b8b1f653a5@acm.org
State Accepted
Headers
Series bfd: optimize bfd_elf_hash |

Checks

Context Check Description
snail/binutils-gdb-check success Github commit url

Commit Message

Nathan Sidwell April 9, 2023, 10:55 p.m. UTC
  I happened to be poking at llvm's implementation of the sysv hash and notice 
binutils' version had similar optimization issues.

neither gcc nor llvm can spot these xforms, and the if (...) at least obscures 
the data flow IMHO anyway.

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop. On x86 (at least) we actually compute the xor'd
value and then cmov.  Removing the if test removes the cmov.

2) The 'h ^ g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration.  All we need to do is sink
a mask of those 4 bits out of the loop.

3) anding with 0xf0 after shifting by 24 bits can allow betterin
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

nathan
  

Comments

Alan Modra April 11, 2023, 12:36 p.m. UTC | #1
On Sun, Apr 09, 2023 at 06:55:13PM -0400, Nathan Sidwell via Binutils wrote:
> 	* elf.c (bfd_elf_hash): Refactor to optimize loop.
> 	(bfd_elf_gnu_hash): Refactor to use 32-bit type.

Looks good to me.
  
Fangrui Song April 12, 2023, 5:33 a.m. UTC | #2
On Tue, Apr 11, 2023 at 5:36 AM Alan Modra via Binutils
<binutils@sourceware.org> wrote:
>
> On Sun, Apr 09, 2023 at 06:55:13PM -0400, Nathan Sidwell via Binutils wrote:
> >       * elf.c (bfd_elf_hash): Refactor to optimize loop.
> >       (bfd_elf_gnu_hash): Refactor to use 32-bit type.
>
> Looks good to me.
>
> --
> Alan Modra
> Australia Development Lab, IBM

Thanks for the patch! The generic ABI code fragment may have an oversight.
When long represents a 64-bit integer, elf_hash((const unsigned char
*)"\xff\x0f\x0f\x0f\x0f\x0f\x12") returns 0x100000002, larger than
UINT32_MAX.
It is unclear whether a value larger than UINT32_MAX is intended.

I then investigated the binutils-gdb implementation and found that in
a 2003 commit (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=32dfa85d9015feea4a06d423fe58f6eaf841456e),
Andrew Haley appeared to have noticed the issue and made a change to
"Mask lower 32 bits of hash."
This change made me more convinced that the generic ABI code fragment
had an oversight.

If the generic ABI code fragment is indeed an oversight, this
optimized version looks good to me.

I asked https://groups.google.com/g/generic-abi/c/8J_jtjsonrE ("What
if the result of elf_hash is larger than UINT32_MAX?").
  

Patch

From 4fc6668fcbee2945d79b40f51cdcaeac78f08aa3 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell <nathan@acm.org>
Date: Sun, 9 Apr 2023 18:44:07 -0400
Subject: [PATCH] bfd: optimize bfd_elf_hash

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop. On x86 (at least) we actually compute the xor'd
value and then cmov.  Removing the if test removes the cmov.

2) The 'h ^ g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration.  All we need to do is sink
a mask of those 4 bits out of the loop.

3) anding with 0xf0 after shifting by 24 bits can allow betterin
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

	bfd/
	* elf.c (bfd_elf_hash): Refactor to optimize loop.
	(bfd_elf_gnu_hash): Refactor to use 32-bit type.
---
 bfd/elf.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/bfd/elf.c b/bfd/elf.c
index 87ec1623313..bda83469edd 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -196,23 +196,15 @@  _bfd_elf_swap_versym_out (bfd *abfd,
 unsigned long
 bfd_elf_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 0;
-  unsigned long g;
-  int ch;
+  uint32_t h = 0;
 
-  while ((ch = *name++) != '\0')
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
     {
-      h = (h << 4) + ch;
-      if ((g = (h & 0xf0000000)) != 0)
-	{
-	  h ^= g >> 24;
-	  /* The ELF ABI says `h &= ~g', but this is equivalent in
-	     this case and on some machines one insn instead of two.  */
-	  h ^= g;
-	}
+      h = (h << 4) + *name;
+      h ^= (h >> 24) & 0xf0;
     }
-  return h & 0xffffffff;
+  return h & 0x0fffffff;
 }
 
 /* DT_GNU_HASH hash function.  Do not change this function; you will
@@ -221,13 +213,12 @@  bfd_elf_hash (const char *namearg)
 unsigned long
 bfd_elf_gnu_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 5381;
-  unsigned char ch;
+  uint32_t h = 5381;
 
-  while ((ch = *name++) != '\0')
-    h = (h << 5) + h + ch;
-  return h & 0xffffffff;
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
+    h = (h << 5) + h + *name;
+  return h;
 }
 
 /* Create a tdata field OBJECT_SIZE bytes in length, zeroed out and with
-- 
2.39.2