[committed] amdgcn: Fix vector TImode reload loop

Message ID f840b1c7-581e-4132-8c7c-bf60bf9e18b9@codesourcery.com
State Unresolved
Headers
Series [committed] amdgcn: Fix vector TImode reload loop |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Andrew Stubbs Nov. 22, 2023, 2:27 p.m. UTC
  This patch fixes a reload bug that's hard to reproduce reliably (so far 
I've only observed it on the OG13 branch, with testcase 
gcc.c-torture/compile/pr70355.c), but causes an infinite loop in reload 
when it fails.

For some reason it wants to save a value from AVGPRs to memory, this 
can't happen directly on CDNA1, so secondary reload moves the value to 
VGPRS, but instead of proceeding to memory, LRA just goes and moves the 
value right back into AVGPRs.  Disparaging this move (when a reload is 
needed) fixes the issue, but I don't know if this is the intended or 
optimal solution in these cases.

Andrew
amdgcn: Fix vector TImode reload loop

I've only observed the problem on the devel/omp/gcc-13 branch, but this
could theoretically affect mainline also.  The mov insns for the other modes
already have '$', so this completes the set.

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (*mov<mode>_4reg): Disparage AVGPR use when a
	reload is required.
  

Patch

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 23f2bbe454b..a928decd408 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -566,10 +566,10 @@  (define_insn "*mov<mode>_4reg"
 	(match_operand:V_4REG 1 "general_operand"))]
   ""
   {@ [cons: =0, 1; attrs: type, length, gcn_version]
-  [v,vDB;vmult,16,*    ]           v_mov_b32\t%L0, %L1\;          v_mov_b32\t%H0, %H1\;          v_mov_b32\t%J0, %J1\;          v_mov_b32\t%K0, %K1
-  [v,a  ;vmult,32,*    ]  v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
-  [a,v  ;vmult,32,*    ] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
-  [a,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  v_accvgpr_mov_b32\t%K0, %K1
+  [v ,vDB;vmult,16,*    ]           v_mov_b32\t%L0, %L1\;          v_mov_b32\t%H0, %H1\;          v_mov_b32\t%J0, %J1\;          v_mov_b32\t%K0, %K1
+  [v ,a  ;vmult,32,*    ]  v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
+  [$a,v  ;vmult,32,*    ] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
+  [a ,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  v_accvgpr_mov_b32\t%K0, %K1
   })
 
 (define_insn "mov<mode>_exec"