Support objcopy changing compression to or from zstd

Message ID YzvrMEdkXjIn5Lfz@squeak.grove.modra.org
State New, archived
Headers
Series Support objcopy changing compression to or from zstd |

Commit Message

Alan Modra Oct. 4, 2022, 8:13 a.m. UTC
  Commit 2cac01e3ffff lacked support for objcopy changing compression
style.  Add that support, which meant a rewrite of
bfd_compress_section_contents.  In the process I've fixed some memory
leaks.

	* compress.c (bfd_is_section_compressed_info): Rename from
	bfd_is_section_compressed_with_header and add ch_type param
	to return compression header ch_type field.
	Update all callers.
	(decompress_section_contents): Remove buffer and size params.
	Rewrite.  Update callers.
	(bfd_init_section_compress_status): Free contents on failure.
	(bfd_compress_section): Likewise.
	* elf.c (_bfd_elf_make_section_from_shdr): Support objcopy
	changing between any of the three compression schemes.  Report
	"unable to compress/decompress" rather than "unable to
	initialize compress/decompress status" on compress/decompress
	failures.
	* bfd-in2.h: Regenerate.
  

Comments

Fangrui Song Oct. 4, 2022, 10:01 p.m. UTC | #1
On 2022-10-04, Alan Modra via Binutils wrote:
>Commit 2cac01e3ffff lacked support for objcopy changing compression
>style.  Add that support, which meant a rewrite of
>bfd_compress_section_contents.  In the process I've fixed some memory
>leaks.

For objcopy --compress-debug-sections=zstd , I know that omitting
recompression of zlib into zstd was intentional.  --compress-debug-sections=zstd 
does not specify what to do when a section is compressed, so both (a)
do nothing (b) re-compression with zstd are fine but I think
avoiding recompression can avoid some complexity...

If a .o is compressed with zlib level 5, should --compress-debug-sections=zlib
re-compress it or leave it as-is?

At any rate, I think the  objcopy --compress-debug-sections=zstd
behavior with zlib compressed sections is mostly not interesting to a
user.
  
Jan Beulich Oct. 5, 2022, 6:02 a.m. UTC | #2
On 05.10.2022 00:01, Fangrui Song wrote:
> On 2022-10-04, Alan Modra via Binutils wrote:
>> Commit 2cac01e3ffff lacked support for objcopy changing compression
>> style.  Add that support, which meant a rewrite of
>> bfd_compress_section_contents.  In the process I've fixed some memory
>> leaks.
> 
> For objcopy --compress-debug-sections=zstd , I know that omitting
> recompression of zlib into zstd was intentional.  --compress-debug-sections=zstd 
> does not specify what to do when a section is compressed, so both (a)
> do nothing (b) re-compression with zstd are fine but I think
> avoiding recompression can avoid some complexity...
> 
> If a .o is compressed with zlib level 5, should --compress-debug-sections=zlib
> re-compress it or leave it as-is?
> 
> At any rate, I think the  objcopy --compress-debug-sections=zstd
> behavior with zlib compressed sections is mostly not interesting to a
> user.

Considering that --compress-debug-sections=none means "decompress"
rather than "leave alone" (which I would consider more reasonable),
I think it is a logical consequence that =zstd means "(re)compress"
(even if, like for =none, I'm not convinced this is the best
possible behavior). In any event, even more so now that there are
two truly distinct compression methods, I think a way to express
such variants is quite desirable, perhaps even necessary.

Jan
  
Alan Modra Oct. 5, 2022, 9:04 p.m. UTC | #3
On Tue, Oct 04, 2022 at 03:01:34PM -0700, Fangrui Song wrote:
> On 2022-10-04, Alan Modra via Binutils wrote:
> > Commit 2cac01e3ffff lacked support for objcopy changing compression
> > style.  Add that support, which meant a rewrite of
> > bfd_compress_section_contents.  In the process I've fixed some memory
> > leaks.
> 
> For objcopy --compress-debug-sections=zstd , I know that omitting
> recompression of zlib into zstd was intentional.
> 
> --compress-debug-sections=zstd does not specify what to do when a section is
> compressed, so both (a)
> do nothing (b) re-compression with zstd are fine but I think
> avoiding recompression can avoid some complexity...
> 
> If a .o is compressed with zlib level 5, should --compress-debug-sections=zlib
> re-compress it or leave it as-is?
> 
> At any rate, I think the  objcopy --compress-debug-sections=zstd
> behavior with zlib compressed sections is mostly not interesting to a
> user.

I could have lived with objcopy --compress-debug-sections=zstd doing
nothing on already compressed sections.  After all, users could change
compression schemes by a two step process, first objcopy
--decompress-debug-sections then compress with the desired scheme.

However, I discovered that objcopy --compress-debug-sections=zstd
on a zlib-gnu compressed file copied the zlib data and tacked on a
header specifying zstd.  Which of course then causes errors on
attempted decompression.  That needed fixing regardless of what anyone
thinks objcopy --compress-debug-sections=zstd should do with already
compressed debug sections.
  

Patch

diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index 5c80956c79c..d9b49a8c820 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -7953,11 +7953,12 @@  bool bfd_get_full_section_contents
 void bfd_cache_section_contents
    (asection *sec, void *contents);
 
-bool bfd_is_section_compressed_with_header
+bool bfd_is_section_compressed_info
    (bfd *abfd, asection *section,
     int *compression_header_size_p,
     bfd_size_type *uncompressed_size_p,
-    unsigned int *uncompressed_alignment_power_p);
+    unsigned int *uncompressed_alignment_power_p,
+    unsigned int *ch_type);
 
 bool bfd_is_section_compressed
    (bfd *abfd, asection *section);
diff --git a/bfd/compress.c b/bfd/compress.c
index 0e75b687013..364df14142b 100644
--- a/bfd/compress.c
+++ b/bfd/compress.c
@@ -81,8 +81,7 @@  decompress_contents (bool is_zstd, bfd_byte *compressed_buffer,
   return inflateEnd (&strm) == Z_OK && rc == Z_OK && strm.avail_out == 0;
 }
 
-/* Compress data of the size specified in @var{uncompressed_size}
-   and pointed to by @var{uncompressed_buffer} using zlib/zstd and store
+/* Compress section contents using zlib/zstd and store
    as the contents field.  This function assumes the contents
    field was allocated using bfd_malloc() or equivalent.
 
@@ -90,111 +89,99 @@  decompress_contents (bool is_zstd, bfd_byte *compressed_buffer,
    compressed successfully.  Otherwise return 0.  */
 
 static bfd_size_type
-bfd_compress_section_contents (bfd *abfd, sec_ptr sec,
-			       bfd_byte *uncompressed_buffer,
-			       bfd_size_type uncompressed_size)
+bfd_compress_section_contents (bfd *abfd, sec_ptr sec)
 {
+  bfd_byte *input_buffer;
   uLong compressed_size;
   bfd_byte *buffer;
   bfd_size_type buffer_size;
-  bool decompress;
   int zlib_size = 0;
-  int orig_compression_header_size;
-  bfd_size_type orig_uncompressed_size;
-  unsigned int orig_uncompressed_alignment_pow;
-  int header_size = bfd_get_compression_header_size (abfd, NULL);
+  int orig_header_size;
+  bfd_size_type uncompressed_size;
+  unsigned int uncompressed_alignment_pow;
+  unsigned int ch_type = 0;
+  int new_header_size = bfd_get_compression_header_size (abfd, NULL);
   bool compressed
-    = bfd_is_section_compressed_with_header (abfd, sec,
-					     &orig_compression_header_size,
-					     &orig_uncompressed_size,
-					     &orig_uncompressed_alignment_pow);
+    = bfd_is_section_compressed_info (abfd, sec,
+				      &orig_header_size,
+				      &uncompressed_size,
+				      &uncompressed_alignment_pow,
+				      &ch_type);
+  bool update = false;
+
+  /* We shouldn't be trying to decompress unsupported compressed sections.  */
+  if (compressed && orig_header_size < 0)
+    abort ();
 
   /* Either ELF compression header or the 12-byte, "ZLIB" + 8-byte size,
      overhead in .zdebug* section.  */
-  if (!header_size)
-     header_size = 12;
+  if (!new_header_size)
+    new_header_size = 12;
+  if (ch_type == 0)
+    orig_header_size = 12;
 
+  input_buffer = sec->contents;
   if (compressed)
     {
-      /* We shouldn't decompress unsupported compressed section.  */
-      if (orig_compression_header_size < 0)
-	abort ();
-
-      /* Different compression schemes.  Just move the compressed section
-	 contents to the right position. */
-      if (orig_compression_header_size == 0)
-	{
-	  /* Convert it from .zdebug* section.  Get the uncompressed
-	     size first.  We need to subtract the 12-byte overhead in
-	     .zdebug* section.  Set orig_compression_header_size to
-	     the 12-bye overhead.  */
-	  orig_compression_header_size = 12;
-	  zlib_size = uncompressed_size - 12;
-	}
-      else
-	{
-	  /* Convert it to .zdebug* section.  */
-	  zlib_size = uncompressed_size - orig_compression_header_size;
-	}
-
-      /* Add the header size.  */
-      compressed_size = zlib_size + header_size;
-    }
-  else
-    compressed_size = compressBound (uncompressed_size) + header_size;
+      zlib_size = sec->size - orig_header_size;
+      compressed_size = zlib_size + new_header_size;
 
-  /* Uncompress if it leads to smaller size.  */
-  if (compressed && compressed_size > orig_uncompressed_size)
-    {
-      decompress = true;
-      buffer_size = orig_uncompressed_size;
-    }
-  else
-    {
-      decompress = false;
-      buffer_size = compressed_size;
-    }
-  buffer = (bfd_byte *) bfd_alloc (abfd, buffer_size);
-  if (buffer == NULL)
-    return 0;
+      /* If we are converting between zlib-gnu and zlib-gabi then the
+	 compressed contents just need to be moved.  */
+      update = (ch_type < ELFCOMPRESS_ZSTD
+		&& (abfd->flags & BFD_COMPRESS_ZSTD) == 0);
 
-  if (compressed)
-    {
-      sec->size = orig_uncompressed_size;
-      if (decompress)
+      /* Uncompress when not just moving contents or when compressed
+	 is not smaller than uncompressed.  */
+      if (!update || compressed_size >= uncompressed_size)
 	{
-	  if (!decompress_contents (
-		  sec->compress_status == DECOMPRESS_SECTION_ZSTD,
-		  uncompressed_buffer + orig_compression_header_size,
-		  zlib_size, buffer, buffer_size))
+	  buffer_size = uncompressed_size;
+	  buffer = bfd_malloc (buffer_size);
+	  if (buffer == NULL)
+	    return 0;
+
+	  if (!decompress_contents (ch_type == ELFCOMPRESS_ZSTD,
+				    input_buffer + orig_header_size,
+				    zlib_size, buffer, buffer_size))
 	    {
 	      bfd_set_error (bfd_error_bad_value);
-	      bfd_release (abfd, buffer);
+	      free (buffer);
 	      return 0;
 	    }
-	  free (uncompressed_buffer);
-	  bfd_set_section_alignment (sec, orig_uncompressed_alignment_pow);
-
+	  free (input_buffer);
+	  bfd_set_section_alignment (sec, uncompressed_alignment_pow);
 	  sec->contents = buffer;
 	  sec->compress_status = COMPRESS_SECTION_DONE;
-	  return orig_uncompressed_size;
-	}
-      else
-	{
-	  bfd_update_compression_header (abfd, buffer, sec);
-	  memmove (buffer + header_size,
-		   uncompressed_buffer + orig_compression_header_size,
-		   zlib_size);
+	  sec->size = uncompressed_size;
+	  input_buffer = buffer;
 	}
     }
+
+  if (!update)
+    compressed_size = compressBound (uncompressed_size) + new_header_size;
+
+  buffer_size = compressed_size;
+  buffer = bfd_alloc (abfd, buffer_size);
+  if (buffer == NULL)
+    return 0;
+
+  if (update)
+    {
+      if (compressed_size < uncompressed_size)
+	memcpy (buffer + new_header_size,
+		input_buffer + orig_header_size,
+		zlib_size);
+    }
   else
     {
       if (abfd->flags & BFD_COMPRESS_ZSTD)
 	{
 #if HAVE_ZSTD
-	  compressed_size = ZSTD_compress (
-		  buffer + header_size, compressed_size, uncompressed_buffer,
-		  uncompressed_size, ZSTD_CLEVEL_DEFAULT);
+	  compressed_size = ZSTD_compress (buffer + new_header_size,
+					   compressed_size,
+					   input_buffer,
+					   uncompressed_size,
+					   ZSTD_CLEVEL_DEFAULT);
 	  if (ZSTD_isError (compressed_size))
 	    {
 	      bfd_release (abfd, buffer);
@@ -203,8 +190,8 @@  bfd_compress_section_contents (bfd *abfd, sec_ptr sec,
 	    }
 #endif
 	}
-      else if (compress ((Bytef *)buffer + header_size, &compressed_size,
-			 (const Bytef *)uncompressed_buffer, uncompressed_size)
+      else if (compress ((Bytef *) buffer + new_header_size, &compressed_size,
+			 (const Bytef *) input_buffer, uncompressed_size)
 	       != Z_OK)
 	{
 	  bfd_release (abfd, buffer);
@@ -212,27 +199,24 @@  bfd_compress_section_contents (bfd *abfd, sec_ptr sec,
 	  return 0;
 	}
 
-      compressed_size += header_size;
-      /* PR binutils/18087: If compression didn't make the section smaller,
-	 just keep it uncompressed.  */
-      if (compressed_size < uncompressed_size)
-	bfd_update_compression_header (abfd, buffer, sec);
-      else
-	{
-	  /* NOTE: There is a small memory leak here since
-	     uncompressed_buffer is malloced and won't be freed.  */
-	  bfd_release (abfd, buffer);
-	  sec->contents = uncompressed_buffer;
-	  sec->compress_status = COMPRESS_SECTION_NONE;
-	  return uncompressed_size;
-	}
+      compressed_size += new_header_size;
     }
 
-  free (uncompressed_buffer);
+  /* If compression didn't make the section smaller, keep it uncompressed.  */
+  if (compressed_size >= uncompressed_size)
+    {
+      memcpy (buffer, input_buffer, uncompressed_size);
+      sec->compress_status = COMPRESS_SECTION_NONE;
+    }
+  else
+    {
+      sec->size = uncompressed_size;
+      bfd_update_compression_header (abfd, buffer, sec);
+      sec->size = compressed_size;
+      sec->compress_status = COMPRESS_SECTION_DONE;
+    }
   sec->contents = buffer;
-  sec->size = compressed_size;
-  sec->compress_status = COMPRESS_SECTION_DONE;
-
+  free (input_buffer);
   return uncompressed_size;
 }
 
@@ -421,14 +405,15 @@  bfd_cache_section_contents (asection *sec, void *contents)
 
 /*
 FUNCTION
-	bfd_is_section_compressed_with_header
+	bfd_is_section_compressed_info
 
 SYNOPSIS
-	bool bfd_is_section_compressed_with_header
+	bool bfd_is_section_compressed_info
 	  (bfd *abfd, asection *section,
-	  int *compression_header_size_p,
-	  bfd_size_type *uncompressed_size_p,
-	  unsigned int *uncompressed_alignment_power_p);
+	   int *compression_header_size_p,
+	   bfd_size_type *uncompressed_size_p,
+	   unsigned int *uncompressed_alignment_power_p,
+	   unsigned int *ch_type);
 
 DESCRIPTION
 	Return @code{TRUE} if @var{section} is compressed.  Compression
@@ -441,16 +426,16 @@  DESCRIPTION
 */
 
 bool
-bfd_is_section_compressed_with_header (bfd *abfd, sec_ptr sec,
-				       int *compression_header_size_p,
-				       bfd_size_type *uncompressed_size_p,
-				       unsigned int *uncompressed_align_pow_p)
+bfd_is_section_compressed_info (bfd *abfd, sec_ptr sec,
+				int *compression_header_size_p,
+				bfd_size_type *uncompressed_size_p,
+				unsigned int *uncompressed_align_pow_p,
+				unsigned int *ch_type)
 {
   bfd_byte header[MAX_COMPRESSION_HEADER_SIZE];
   int compression_header_size;
   int header_size;
   unsigned int saved = sec->compress_status;
-  unsigned int ch_type;
   bool compressed;
 
   *uncompressed_align_pow_p = 0;
@@ -481,7 +466,7 @@  bfd_is_section_compressed_with_header (bfd *abfd, sec_ptr sec,
     {
       if (compression_header_size != 0)
 	{
-	  if (!bfd_check_compression_header (abfd, header, sec, &ch_type,
+	  if (!bfd_check_compression_header (abfd, header, sec, ch_type,
 					     uncompressed_size_p,
 					     uncompressed_align_pow_p))
 	    compression_header_size = -1;
@@ -521,10 +506,12 @@  bfd_is_section_compressed (bfd *abfd, sec_ptr sec)
   int compression_header_size;
   bfd_size_type uncompressed_size;
   unsigned int uncompressed_align_power;
-  return (bfd_is_section_compressed_with_header (abfd, sec,
-						 &compression_header_size,
-						 &uncompressed_size,
-						 &uncompressed_align_power)
+  unsigned int ch_type;
+  return (bfd_is_section_compressed_info (abfd, sec,
+					  &compression_header_size,
+					  &uncompressed_size,
+					  &uncompressed_align_power,
+					  &ch_type)
 	  && compression_header_size >= 0
 	  && uncompressed_size > 0);
 }
@@ -654,10 +641,14 @@  bfd_init_section_compress_status (bfd *abfd, sec_ptr sec)
 				 0, uncompressed_size))
     return false;
 
-  uncompressed_size = bfd_compress_section_contents (abfd, sec,
-						     uncompressed_buffer,
-						     uncompressed_size);
-  return uncompressed_size != 0;
+  sec->contents = uncompressed_buffer;
+  if (bfd_compress_section_contents (abfd, sec) == 0)
+    {
+      free (sec->contents);
+      sec->contents = NULL;
+      return false;
+    }
+  return true;
 }
 
 /*
@@ -673,7 +664,7 @@  DESCRIPTION
 	compressed size and set compress_status to COMPRESS_SECTION_DONE.
 
 	Return @code{FALSE} if compression fail.  Otherwise, return
-	@code{TRUE}.
+	@code{TRUE}.  UNCOMPRESSED_BUFFER is freed in both cases.
 */
 
 bool
@@ -693,7 +684,12 @@  bfd_compress_section (bfd *abfd, sec_ptr sec, bfd_byte *uncompressed_buffer)
       return false;
     }
 
-  /* Compress it.  */
-  return bfd_compress_section_contents (abfd, sec, uncompressed_buffer,
-					uncompressed_size) != 0;
+  sec->contents = uncompressed_buffer;
+  if (bfd_compress_section_contents (abfd, sec) == 0)
+    {
+      free (sec->contents);
+      sec->contents = NULL;
+      return false;
+    }
+  return true;
 }
diff --git a/bfd/elf.c b/bfd/elf.c
index 420b524aae8..fe00e0f9189 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -1209,32 +1209,35 @@  _bfd_elf_make_section_from_shdr (bfd *abfd,
       int compression_header_size;
       bfd_size_type uncompressed_size;
       unsigned int uncompressed_align_power;
+      unsigned int ch_type = 0;
       bool compressed
-	= bfd_is_section_compressed_with_header (abfd, newsect,
-						 &compression_header_size,
-						 &uncompressed_size,
-						 &uncompressed_align_power);
-      if (compressed)
-	{
-	  /* Compressed section.  Check if we should decompress.  */
-	  if ((abfd->flags & BFD_DECOMPRESS))
-	    action = decompress;
-	}
-
-      /* Compress the uncompressed section or convert from/to .zdebug*
-	 section.  Check if we should compress.  */
-      if (action == nothing)
-	{
-	  if (newsect->size != 0
-	      && (abfd->flags & BFD_COMPRESS)
-	      && compression_header_size >= 0
-	      && uncompressed_size > 0
-	      && (!compressed
-		  || ((compression_header_size > 0)
-		      != ((abfd->flags & BFD_COMPRESS_GABI) != 0))))
+	= bfd_is_section_compressed_info (abfd, newsect,
+					  &compression_header_size,
+					  &uncompressed_size,
+					  &uncompressed_align_power,
+					  &ch_type);
+
+      /* Should we decompress?  */
+      if ((abfd->flags & BFD_DECOMPRESS) != 0 && compressed)
+	action = decompress;
+
+      /* Should we compress?  Or convert to a different compression?  */
+      else if ((abfd->flags & BFD_COMPRESS) != 0
+	       && newsect->size != 0
+	       && compression_header_size >= 0
+	       && uncompressed_size > 0)
+	{
+	  if (!compressed)
 	    action = compress;
 	  else
-	    return true;
+	    {
+	      unsigned int new_ch_type = 0;
+	      if ((abfd->flags & BFD_COMPRESS_GABI) != 0)
+		new_ch_type = ((abfd->flags & BFD_COMPRESS_ZSTD) != 0
+			       ? ELFCOMPRESS_ZSTD : ELFCOMPRESS_ZLIB);
+	      if (new_ch_type != ch_type)
+		action = compress;
+	    }
 	}
 
       if (action == compress)
@@ -1243,20 +1246,17 @@  _bfd_elf_make_section_from_shdr (bfd *abfd,
 	    {
 	      _bfd_error_handler
 		/* xgettext:c-format */
-		(_("%pB: unable to initialize compress status for section %s"),
-		 abfd, name);
+		(_("%pB: unable to compress section %s"), abfd, name);
 	      return false;
 	    }
 	}
-      else
+      else if (action == decompress)
 	{
 	  if (!bfd_init_section_decompress_status (abfd, newsect))
 	    {
 	      _bfd_error_handler
 		/* xgettext:c-format */
-		(_("%pB: unable to initialize decompress status"
-		   " for section %s"),
-		 abfd, name);
+		(_("%pB: unable to decompress section %s"), abfd, name);
 	      return false;
 	    }
 #ifndef HAVE_ZSTD
@@ -1273,26 +1273,29 @@  _bfd_elf_make_section_from_shdr (bfd *abfd,
 #endif
 	}
 
-      if (abfd->is_linker_input)
+      if (action != nothing)
 	{
-	  if (name[1] == 'z'
-	      && (action == decompress
-		  || (action == compress
-		      && (abfd->flags & BFD_COMPRESS_GABI) != 0)))
+	  if (abfd->is_linker_input)
 	    {
-	      /* Convert section name from .zdebug_* to .debug_* so
-		 that linker will consider this section as a debug
-		 section.  */
-	      char *new_name = convert_zdebug_to_debug (abfd, name);
-	      if (new_name == NULL)
-		return false;
-	      bfd_rename_section (newsect, new_name);
+	      if (name[1] == 'z'
+		  && (action == decompress
+		      || (action == compress
+			  && (abfd->flags & BFD_COMPRESS_GABI) != 0)))
+		{
+		  /* Convert section name from .zdebug_* to .debug_* so
+		     that linker will consider this section as a debug
+		     section.  */
+		  char *new_name = convert_zdebug_to_debug (abfd, name);
+		  if (new_name == NULL)
+		    return false;
+		  bfd_rename_section (newsect, new_name);
+		}
 	    }
+	  else
+	    /* For objdump, don't rename the section.  For objcopy, delay
+	       section rename to elf_fake_sections.  */
+	    newsect->flags |= SEC_ELF_RENAME;
 	}
-      else
-	/* For objdump, don't rename the section.  For objcopy, delay
-	   section rename to elf_fake_sections.  */
-	newsect->flags |= SEC_ELF_RENAME;
     }
 
   /* GCC uses .gnu.lto_.lto.<some_hash> as a LTO bytecode information