Fix emit_group_store regression on big-endian

Message ID 1908900.PYKUYFuaPT@fomalhaut
State Accepted, archived
Headers
Series Fix emit_group_store regression on big-endian |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Eric Botcazou Oct. 11, 2022, 10:57 p.m. UTC
  Hi,

the recent optimization implemented for complex modes in:
  https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595865.html
contains an oversight for big-endian platforms in the "interesting corner 
case" mentioned in the message: it uses a lowpart SUBREG when the integer 
modes have different sizes, but this does not match the semantics of the 
PARALLELs which have a bundled byte offset; this offset is always zero in the 
code path and the lowpart is not at offset zero on big-endian platforms.

Calling validate_subreg with this zero offset would fix the regression by 
disabling the optimization on big-endian platforms, so instead the attached 
fix adds the appropriate right shift for them.

This fixes the following regressions in the C testsuite on SPARC64/Linux:
FAIL: gcc.c-torture/execute/20041124-1.c   -O0  execution test
FAIL: gcc.c-torture/execute/20041124-1.c   -O1  execution test
FAIL: gcc.c-torture/execute/20041124-1.c   -O2  execution test
FAIL: gcc.c-torture/execute/20041124-1.c   -O2 -flto -fno-use-linker-plugin -
flto-partition=none  execution test
FAIL: gcc.c-torture/execute/20041124-1.c   -O2 -flto -fuse-linker-plugin -fno-
fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/20041124-1.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/20041124-1.c   -Os  execution test
FAIL: gcc.dg/compat/struct-by-value-11 c_compat_x_tst.o-c_compat_y_tst.o 
execute 
FAIL: gcc.dg/compat/struct-by-value-12 c_compat_x_tst.o-c_compat_y_tst.o 
execute 
FAIL: tmpdir-gcc.dg-struct-layout-1/t027 c_compat_x_tst.o-c_compat_y_tst.o 
execute 

Tested on SPARC64/Linux, OK for the mainline?


2022-10-11  Eric Botcazou  <ebotcazou@adacore.com>

	* expr.cc (emit_group_stote): Fix handling of modes of different
	sizes for big-endian targets in latest change and add commentary.
  

Comments

Richard Biener Oct. 13, 2022, 11:15 a.m. UTC | #1
On Wed, Oct 12, 2022 at 1:01 AM Eric Botcazou via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> the recent optimization implemented for complex modes in:
>   https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595865.html
> contains an oversight for big-endian platforms in the "interesting corner
> case" mentioned in the message: it uses a lowpart SUBREG when the integer
> modes have different sizes, but this does not match the semantics of the
> PARALLELs which have a bundled byte offset; this offset is always zero in the
> code path and the lowpart is not at offset zero on big-endian platforms.
>
> Calling validate_subreg with this zero offset would fix the regression by
> disabling the optimization on big-endian platforms, so instead the attached
> fix adds the appropriate right shift for them.
>
> This fixes the following regressions in the C testsuite on SPARC64/Linux:
> FAIL: gcc.c-torture/execute/20041124-1.c   -O0  execution test
> FAIL: gcc.c-torture/execute/20041124-1.c   -O1  execution test
> FAIL: gcc.c-torture/execute/20041124-1.c   -O2  execution test
> FAIL: gcc.c-torture/execute/20041124-1.c   -O2 -flto -fno-use-linker-plugin -
> flto-partition=none  execution test
> FAIL: gcc.c-torture/execute/20041124-1.c   -O2 -flto -fuse-linker-plugin -fno-
> fat-lto-objects  execution test
> FAIL: gcc.c-torture/execute/20041124-1.c   -O3 -g  execution test
> FAIL: gcc.c-torture/execute/20041124-1.c   -Os  execution test
> FAIL: gcc.dg/compat/struct-by-value-11 c_compat_x_tst.o-c_compat_y_tst.o
> execute
> FAIL: gcc.dg/compat/struct-by-value-12 c_compat_x_tst.o-c_compat_y_tst.o
> execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t027 c_compat_x_tst.o-c_compat_y_tst.o
> execute
>
> Tested on SPARC64/Linux, OK for the mainline?

OK.

Thanks,
Richard.

>
> 2022-10-11  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * expr.cc (emit_group_stote): Fix handling of modes of different
>         sizes for big-endian targets in latest change and add commentary.
>
> --
> Eric Botcazou
  

Patch

diff --git a/gcc/expr.cc b/gcc/expr.cc
index ba627f176a7..b897b6dc385 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -2813,50 +2813,69 @@  emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED,
       else
 	adj_bytelen = bytelen;
 
+      /* Deal with destination CONCATs by either storing into one of the parts
+	 or doing a copy after storing into a register or stack temporary.  */
       if (GET_CODE (dst) == CONCAT)
 	{
 	  if (known_le (bytepos + adj_bytelen,
 			GET_MODE_SIZE (GET_MODE (XEXP (dst, 0)))))
 	    dest = XEXP (dst, 0);
+
 	  else if (known_ge (bytepos, GET_MODE_SIZE (GET_MODE (XEXP (dst, 0)))))
 	    {
 	      bytepos -= GET_MODE_SIZE (GET_MODE (XEXP (dst, 0)));
 	      dest = XEXP (dst, 1);
 	    }
+
 	  else
 	    {
 	      machine_mode dest_mode = GET_MODE (dest);
 	      machine_mode tmp_mode = GET_MODE (tmps[i]);
-	      scalar_int_mode imode;
+	      scalar_int_mode dest_imode;
 
 	      gcc_assert (known_eq (bytepos, 0) && XVECLEN (src, 0));
 
-	      if (finish == 1
+	      /* If the source is a single scalar integer register, and the
+		 destination has a complex mode for which a same-sized integer
+		 mode exists, then we can take the left-justified part of the
+		 source in the complex mode.  */
+	      if (finish == start + 1
 		  && REG_P (tmps[i])
-		  && COMPLEX_MODE_P (dest_mode)
 		  && SCALAR_INT_MODE_P (tmp_mode)
-		  && int_mode_for_mode (dest_mode).exists (&imode))
+		  && COMPLEX_MODE_P (dest_mode)
+		  && int_mode_for_mode (dest_mode).exists (&dest_imode))
 		{
-		  if (tmp_mode != imode)
+		  const scalar_int_mode tmp_imode
+		    = as_a <scalar_int_mode> (tmp_mode);
+
+		  if (GET_MODE_BITSIZE (dest_imode)
+		      < GET_MODE_BITSIZE (tmp_imode))
 		    {
-		      rtx tmp = gen_reg_rtx (imode);
-		      emit_move_insn (tmp, gen_lowpart (imode, tmps[i]));
-		      dst = gen_lowpart (dest_mode, tmp);
+		      dest = gen_reg_rtx (dest_imode);
+		      if (BYTES_BIG_ENDIAN)
+			tmps[i] = expand_shift (RSHIFT_EXPR, tmp_mode, tmps[i],
+						GET_MODE_BITSIZE (tmp_imode)
+						- GET_MODE_BITSIZE (dest_imode),
+						NULL_RTX, 1);
+		      emit_move_insn (dest, gen_lowpart (dest_imode, tmps[i]));
+		      dst = gen_lowpart (dest_mode, dest);
 		    }
 		  else
 		    dst = gen_lowpart (dest_mode, tmps[i]);
 		}
+
+	      /* Otherwise spill the source onto the stack using the more
+		 aligned of the two modes.  */
 	      else if (GET_MODE_ALIGNMENT (dest_mode)
-		  >= GET_MODE_ALIGNMENT (tmp_mode))
+		       >= GET_MODE_ALIGNMENT (tmp_mode))
 		{
 		  dest = assign_stack_temp (dest_mode,
 					    GET_MODE_SIZE (dest_mode));
-		  emit_move_insn (adjust_address (dest,
-						  tmp_mode,
-						  bytepos),
+		  emit_move_insn (adjust_address (dest, tmp_mode, bytepos),
 				  tmps[i]);
 		  dst = dest;
 		}
+
 	      else
 		{
 		  dest = assign_stack_temp (tmp_mode,
@@ -2864,6 +2883,7 @@  emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED,
 		  emit_move_insn (dest, tmps[i]);
 		  dst = adjust_address (dest, dest_mode, bytepos);
 		}
+
 	      break;
 	    }
 	}