expand: Fix asm goto expansion [PR113415]

Message ID ZcXwPOkA9eizigX1@tucnak
State Unresolved
Headers
Series expand: Fix asm goto expansion [PR113415] |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Jakub Jelinek Feb. 9, 2024, 9:28 a.m. UTC
  Hi!

The asm goto expansion ICEs on the following testcase (which normally
is rejected later), because expand_asm_stmt emits the code to copy
the large var out of the out operand to its memory location into
after_rtl_seq ... after_rtl_end sequence and because it is asm goto,
it duplicates the sequence on each successor edge of the asm goto.
The problem is that with -mstringop-strategy=byte_loop that sequence
contains loops, so CODE_LABELs, JUMP_INSNs, with other strategies
could contain CALL_INSNs etc.
But the copying is done using a loop doing
emit_insn (copy_insn (PATTERN (curr)));
which does the right thing solely for INSNs, it will do the wrong thing
for JUMP_INSNs, CALL_INSNs, CODE_LABELs (with RTL checking even ICE on
them), BARRIERs and the like.

The following patch partially fixes it (with the hope that such stuff only
occurs in asms that really can't be accepted; if one uses say "=rm" or
"=g" constraint then the operand uses the memory directly and nothing is
copied) by using the
duplicate_insn_chain function which is used e.g. in RTL loop unrolling and
which can handle JUMP_INSNs, CALL_INSNs, BARRIERs etc.
As it is meant to operate on sequences inside of basic blocks, it doesn't
handle CODE_LABELs (well, it skips them), so if we need a solution that
will be correct at runtime here for those cases, we'd need to do further
work (e.g. still use duplicate_insn_chain, but if we notice any CODE_LABELs,
walk the sequence again, add copies of the CODE_LABELs and then remap
references to the old CODE_LABELs in the copied sequence to the new ones).
Because as is now, if the code in one of the sequence copies (where the
CODE_LABELs have been left out) decides to jump to such a CODE_LABEL, it
will jump to the CODE_LABEL which has been in the original sequence (which
the code emits on the last edge, after all, duplicating the sequence
EDGE_COUNT times and throwing away the original was wasteful, compared to
doing that just EDGE_COUNT - 1 times and using the original.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Or do need to handle even CODE_LABELs?

2024-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/113415
	* cfgexpand.cc (expand_asm_stmt): For asm goto, use
	duplicate_insn_chain to duplicate after_rtl_seq sequence instead
	of hand written loop with emit_insn of copy_insn and emit original
	after_rtl_seq on the last edge.

	* gcc.target/i386/pr113415.c: New test.


	Jakub
  

Comments

Richard Biener Feb. 9, 2024, 9:34 a.m. UTC | #1
On Fri, 9 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> The asm goto expansion ICEs on the following testcase (which normally
> is rejected later), because expand_asm_stmt emits the code to copy
> the large var out of the out operand to its memory location into
> after_rtl_seq ... after_rtl_end sequence and because it is asm goto,
> it duplicates the sequence on each successor edge of the asm goto.
> The problem is that with -mstringop-strategy=byte_loop that sequence
> contains loops, so CODE_LABELs, JUMP_INSNs, with other strategies
> could contain CALL_INSNs etc.
> But the copying is done using a loop doing
> emit_insn (copy_insn (PATTERN (curr)));
> which does the right thing solely for INSNs, it will do the wrong thing
> for JUMP_INSNs, CALL_INSNs, CODE_LABELs (with RTL checking even ICE on
> them), BARRIERs and the like.
> 
> The following patch partially fixes it (with the hope that such stuff only
> occurs in asms that really can't be accepted; if one uses say "=rm" or
> "=g" constraint then the operand uses the memory directly and nothing is
> copied) by using the
> duplicate_insn_chain function which is used e.g. in RTL loop unrolling and
> which can handle JUMP_INSNs, CALL_INSNs, BARRIERs etc.
> As it is meant to operate on sequences inside of basic blocks, it doesn't
> handle CODE_LABELs (well, it skips them), so if we need a solution that
> will be correct at runtime here for those cases, we'd need to do further
> work (e.g. still use duplicate_insn_chain, but if we notice any CODE_LABELs,
> walk the sequence again, add copies of the CODE_LABELs and then remap
> references to the old CODE_LABELs in the copied sequence to the new ones).
> Because as is now, if the code in one of the sequence copies (where the
> CODE_LABELs have been left out) decides to jump to such a CODE_LABEL, it
> will jump to the CODE_LABEL which has been in the original sequence (which
> the code emits on the last edge, after all, duplicating the sequence
> EDGE_COUNT times and throwing away the original was wasteful, compared to
> doing that just EDGE_COUNT - 1 times and using the original.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> Or do need to handle even CODE_LABELs?

I guess we'll do when we run into it?  I'm not sure we can constrain
asm gotos in a way such complicated "after" sequences never happen,
right?  A sorry () might be a possibility for cases we don't handle.

Richard.

> 2024-02-09  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/113415
> 	* cfgexpand.cc (expand_asm_stmt): For asm goto, use
> 	duplicate_insn_chain to duplicate after_rtl_seq sequence instead
> 	of hand written loop with emit_insn of copy_insn and emit original
> 	after_rtl_seq on the last edge.
> 
> 	* gcc.target/i386/pr113415.c: New test.
> 
> --- gcc/cfgexpand.cc.jj	2024-01-24 13:11:21.011469855 +0100
> +++ gcc/cfgexpand.cc	2024-02-08 18:22:04.699621085 +0100
> @@ -3671,16 +3671,21 @@ expand_asm_stmt (gasm *stmt)
>  	{
>  	  edge e;
>  	  edge_iterator ei;
> -	  
> +	  unsigned int cnt = EDGE_COUNT (gimple_bb (stmt)->succs);
> +
>  	  FOR_EACH_EDGE (e, ei, gimple_bb (stmt)->succs)
>  	    {
> -	      start_sequence ();
> -	      for (rtx_insn *curr = after_rtl_seq;
> -		   curr != NULL_RTX;
> -		   curr = NEXT_INSN (curr))
> -		emit_insn (copy_insn (PATTERN (curr)));
> -	      rtx_insn *copy = get_insns ();
> -	      end_sequence ();
> +	      rtx_insn *copy;
> +	      if (--cnt == 0)
> +		copy = after_rtl_seq;
> +	      else
> +		{
> +		  start_sequence ();
> +		  duplicate_insn_chain (after_rtl_seq, after_rtl_end,
> +					NULL, NULL);
> +		  copy = get_insns ();
> +		  end_sequence ();
> +		}
>  	      insert_insn_on_edge (copy, e);
>  	    }
>  	}
> --- gcc/testsuite/gcc.target/i386/pr113415.c.jj	2024-02-08 18:26:27.622966847 +0100
> +++ gcc/testsuite/gcc.target/i386/pr113415.c	2024-02-08 18:26:11.336193222 +0100
> @@ -0,0 +1,11 @@
> +/* PR middle-end/113415 */
> +/* { dg-do compile } */
> +/* { dg-options "-mstringop-strategy=byte_loop" } */
> +
> +void
> +foo (void)
> +{
> +  unsigned long arr[64];
> +lab:
> +  __asm__ goto ("" : "=r" (arr) : : : lab);	/* { dg-error "impossible constraint in 'asm'" } */
> +}
> 
> 	Jakub
> 
>
  

Patch

--- gcc/cfgexpand.cc.jj	2024-01-24 13:11:21.011469855 +0100
+++ gcc/cfgexpand.cc	2024-02-08 18:22:04.699621085 +0100
@@ -3671,16 +3671,21 @@  expand_asm_stmt (gasm *stmt)
 	{
 	  edge e;
 	  edge_iterator ei;
-	  
+	  unsigned int cnt = EDGE_COUNT (gimple_bb (stmt)->succs);
+
 	  FOR_EACH_EDGE (e, ei, gimple_bb (stmt)->succs)
 	    {
-	      start_sequence ();
-	      for (rtx_insn *curr = after_rtl_seq;
-		   curr != NULL_RTX;
-		   curr = NEXT_INSN (curr))
-		emit_insn (copy_insn (PATTERN (curr)));
-	      rtx_insn *copy = get_insns ();
-	      end_sequence ();
+	      rtx_insn *copy;
+	      if (--cnt == 0)
+		copy = after_rtl_seq;
+	      else
+		{
+		  start_sequence ();
+		  duplicate_insn_chain (after_rtl_seq, after_rtl_end,
+					NULL, NULL);
+		  copy = get_insns ();
+		  end_sequence ();
+		}
 	      insert_insn_on_edge (copy, e);
 	    }
 	}
--- gcc/testsuite/gcc.target/i386/pr113415.c.jj	2024-02-08 18:26:27.622966847 +0100
+++ gcc/testsuite/gcc.target/i386/pr113415.c	2024-02-08 18:26:11.336193222 +0100
@@ -0,0 +1,11 @@ 
+/* PR middle-end/113415 */
+/* { dg-do compile } */
+/* { dg-options "-mstringop-strategy=byte_loop" } */
+
+void
+foo (void)
+{
+  unsigned long arr[64];
+lab:
+  __asm__ goto ("" : "=r" (arr) : : : lab);	/* { dg-error "impossible constraint in 'asm'" } */
+}