This patch fixes a reload bug that's hard to reproduce reliably (so far
I've only observed it on the OG13 branch, with testcase
gcc.c-torture/compile/pr70355.c), but causes an infinite loop in reload
when it fails.
For some reason it wants to save a value from AVGPRs to memory, this
can't happen directly on CDNA1, so secondary reload moves the value to
VGPRS, but instead of proceeding to memory, LRA just goes and moves the
value right back into AVGPRs. Disparaging this move (when a reload is
needed) fixes the issue, but I don't know if this is the intended or
optimal solution in these cases.
Andrew
amdgcn: Fix vector TImode reload loop
I've only observed the problem on the devel/omp/gcc-13 branch, but this
could theoretically affect mainline also. The mov insns for the other modes
already have '$', so this completes the set.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (*mov<mode>_4reg): Disparage AVGPR use when a
reload is required.
@@ -566,10 +566,10 @@ (define_insn "*mov<mode>_4reg"
(match_operand:V_4REG 1 "general_operand"))]
""
{@ [cons: =0, 1; attrs: type, length, gcn_version]
- [v,vDB;vmult,16,* ] v_mov_b32\t%L0, %L1\; v_mov_b32\t%H0, %H1\; v_mov_b32\t%J0, %J1\; v_mov_b32\t%K0, %K1
- [v,a ;vmult,32,* ] v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
- [a,v ;vmult,32,* ] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
- [a,a ;vmult,32,cdna2] v_accvgpr_mov_b32\t%L0, %L1\; v_accvgpr_mov_b32\t%H0, %H1\; v_accvgpr_mov_b32\t%J0, %J1\; v_accvgpr_mov_b32\t%K0, %K1
+ [v ,vDB;vmult,16,* ] v_mov_b32\t%L0, %L1\; v_mov_b32\t%H0, %H1\; v_mov_b32\t%J0, %J1\; v_mov_b32\t%K0, %K1
+ [v ,a ;vmult,32,* ] v_accvgpr_read_b32\t%L0, %L1\; v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; v_accvgpr_read_b32\t%K0, %K1
+ [$a,v ;vmult,32,* ] v_accvgpr_write_b32\t%L0, %L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, %J1\;v_accvgpr_write_b32\t%K0, %K1
+ [a ,a ;vmult,32,cdna2] v_accvgpr_mov_b32\t%L0, %L1\; v_accvgpr_mov_b32\t%H0, %H1\; v_accvgpr_mov_b32\t%J0, %J1\; v_accvgpr_mov_b32\t%K0, %K1
})
(define_insn "mov<mode>_exec"