RISC-V: Split VF iterators for Zvfh(min).
Checks
Commit Message
Hi,
when working on FP widening/narrowing I realized the Zvfhmin handling
is not ideal right now: We use the "enabled" insn attribute to disable
instructions not available with Zvfhmin (but only with Zvfh).
However, "enabled == 0" only disables insn alternatives, in our case all
of them when the mode is a HFmode. The insn itself remains available
(e.g. for combine to match) and we end up with an insn without alternatives
that reload cannot handle --> ICE.
The proper solution is to disable the instruction for the respective
mode altogether. This patch achieves this by splitting the VF as well
as VWEXTF iterators into variants with TARGET_ZVFH and
TARGET_VECTOR_ELEN_FP_16 (which is true when either TARGET_ZVFH or
TARGET_ZVFHMIN are true). Also, VWCONVERTI, VHF and VHF_LMUL1 need
adjustments.
Regards
Robin
gcc/ChangeLog:
* config/riscv/autovec.md: VF_AUTO -> VF.
* config/riscv/vector-iterators.md: Introduce VF_ZVFHMIN,
VWEXTF_ZVFHMIN and use TARGET_ZVFH in VWCONVERTI, VHF and
VHF_LMUL1.
* config/riscv/vector.md: Use new iterators.
---
gcc/config/riscv/autovec.md | 28 ++++++-------
gcc/config/riscv/vector-iterators.md | 63 ++++++++++++++++++----------
gcc/config/riscv/vector.md | 20 ++++-----
3 files changed, 64 insertions(+), 47 deletions(-)
Comments
You change "VF" constraint as "TARGET_ZVFH" which is incorrect since we
a lot of instructions are valid in "TARGET_ZVFHMIN" in vector.md but you disabled them in this patch.
You disabled them unexpectedly.
juzhe.zhong@rivai.ai
From: Robin Dapp
Date: 2023-06-22 21:03
To: gcc-patches; palmer; Kito Cheng; juzhe.zhong@rivai.ai; Li, Pan2; jeffreyalaw
CC: rdapp.gcc
Subject: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
Hi,
when working on FP widening/narrowing I realized the Zvfhmin handling
is not ideal right now: We use the "enabled" insn attribute to disable
instructions not available with Zvfhmin (but only with Zvfh).
However, "enabled == 0" only disables insn alternatives, in our case all
of them when the mode is a HFmode. The insn itself remains available
(e.g. for combine to match) and we end up with an insn without alternatives
that reload cannot handle --> ICE.
The proper solution is to disable the instruction for the respective
mode altogether. This patch achieves this by splitting the VF as well
as VWEXTF iterators into variants with TARGET_ZVFH and
TARGET_VECTOR_ELEN_FP_16 (which is true when either TARGET_ZVFH or
TARGET_ZVFHMIN are true). Also, VWCONVERTI, VHF and VHF_LMUL1 need
adjustments.
Regards
Robin
gcc/ChangeLog:
* config/riscv/autovec.md: VF_AUTO -> VF.
* config/riscv/vector-iterators.md: Introduce VF_ZVFHMIN,
VWEXTF_ZVFHMIN and use TARGET_ZVFH in VWCONVERTI, VHF and
VHF_LMUL1.
* config/riscv/vector.md: Use new iterators.
---
gcc/config/riscv/autovec.md | 28 ++++++-------
gcc/config/riscv/vector-iterators.md | 63 ++++++++++++++++++----------
gcc/config/riscv/vector.md | 20 ++++-----
3 files changed, 64 insertions(+), 47 deletions(-)
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index f1641d7e1ea..7f0b2befd6b 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -539,9 +539,9 @@ (define_expand "abs<mode>2"
;; - vfneg.v/vfabs.v
;; -------------------------------------------------------------------------------
(define_expand "<optab><mode>2"
- [(set (match_operand:VF_AUTO 0 "register_operand")
- (any_float_unop_nofrm:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")))]
+ [(set (match_operand:VF 0 "register_operand")
+ (any_float_unop_nofrm:VF
+ (match_operand:VF 1 "register_operand")))]
"TARGET_VECTOR"
{
insn_code icode = code_for_pred (<CODE>, <MODE>mode);
@@ -556,9 +556,9 @@ (define_expand "<optab><mode>2"
;; - vfsqrt.v
;; -------------------------------------------------------------------------------
(define_expand "<optab><mode>2"
- [(set (match_operand:VF_AUTO 0 "register_operand")
- (any_float_unop:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")))]
+ [(set (match_operand:VF 0 "register_operand")
+ (any_float_unop:VF
+ (match_operand:VF 1 "register_operand")))]
"TARGET_VECTOR"
{
insn_code icode = code_for_pred (<CODE>, <MODE>mode);
@@ -777,10 +777,10 @@ (define_expand "vec_extract<mode><vel>"
;; - vfadd.vf/vfsub.vf/...
;; -------------------------------------------------------------------------
(define_expand "<optab><mode>3"
- [(match_operand:VF_AUTO 0 "register_operand")
- (any_float_binop:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")
- (match_operand:VF_AUTO 2 "register_operand"))]
+ [(match_operand:VF 0 "register_operand")
+ (any_float_binop:VF
+ (match_operand:VF 1 "register_operand")
+ (match_operand:VF 2 "register_operand"))]
"TARGET_VECTOR"
{
riscv_vector::emit_vlmax_fp_insn (code_for_pred (<CODE>, <MODE>mode),
@@ -794,10 +794,10 @@ (define_expand "<optab><mode>3"
;; - vfmin.vf/vfmax.vf
;; -------------------------------------------------------------------------
(define_expand "<optab><mode>3"
- [(match_operand:VF_AUTO 0 "register_operand")
- (any_float_binop_nofrm:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")
- (match_operand:VF_AUTO 2 "register_operand"))]
+ [(match_operand:VF 0 "register_operand")
+ (any_float_binop_nofrm:VF
+ (match_operand:VF 1 "register_operand")
+ (match_operand:VF 2 "register_operand"))]
"TARGET_VECTOR"
{
riscv_vector::emit_vlmax_insn (code_for_pred (<CODE>, <MODE>mode),
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 6ca1c54c709..ae9f44b5f78 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -271,7 +271,7 @@ (define_mode_iterator VWI [
(VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
])
-(define_mode_iterator VF [
+(define_mode_iterator VF_ZVFHMIN [
(VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
(VNx2HF "TARGET_VECTOR_ELEN_FP_16")
(VNx4HF "TARGET_VECTOR_ELEN_FP_16")
@@ -295,11 +295,12 @@ (define_mode_iterator VF [
;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
;; changed to TARGET_ZVFH. TARGET_VECTOR_ELEN_FP_16 is also true for
-;; TARGET_ZVFHMIN while we actually disable all instructions apart from
-;; load, store and convert for it.
-;; Consequently the autovec expanders should also only be enabled with
-;; TARGET_ZVFH.
-(define_mode_iterator VF_AUTO [
+;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
+;; from load, store and convert for it.
+;; It is not enough to set the "enabled" attribute to false
+;; since this will only disable insn alternatives in reload but still
+;; allow the instruction and mode to be matched during combine et al.
+(define_mode_iterator VF [
(VNx1HF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
(VNx2HF "TARGET_ZVFH")
(VNx4HF "TARGET_ZVFH")
@@ -494,7 +495,8 @@ (define_mode_iterator VWEXTI [
(VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
])
-(define_mode_iterator VWEXTF [
+;; Same iterator split reason as VF_ZVFHMIN and VF.
+(define_mode_iterator VWEXTF_ZVFHMIN [
(VNx1SF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
(VNx2SF "TARGET_VECTOR_ELEN_FP_16")
(VNx4SF "TARGET_VECTOR_ELEN_FP_16")
@@ -509,13 +511,28 @@ (define_mode_iterator VWEXTF [
(VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
])
+(define_mode_iterator VWEXTF [
+ (VNx1SF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
+ (VNx2SF "TARGET_ZVFH")
+ (VNx4SF "TARGET_ZVFH")
+ (VNx8SF "TARGET_ZVFH")
+ (VNx16SF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
+ (VNx32SF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
+
+ (VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
+ (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
+ (VNx4DF "TARGET_VECTOR_ELEN_FP_64")
+ (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
+ (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+])
+
(define_mode_iterator VWCONVERTI [
- (VNx1SI "TARGET_MIN_VLEN < 128 && TARGET_VECTOR_ELEN_FP_16")
- (VNx2SI "TARGET_VECTOR_ELEN_FP_16")
- (VNx4SI "TARGET_VECTOR_ELEN_FP_16")
- (VNx8SI "TARGET_VECTOR_ELEN_FP_16")
- (VNx16SI "TARGET_MIN_VLEN > 32 && TARGET_VECTOR_ELEN_FP_16")
- (VNx32SI "TARGET_MIN_VLEN >= 128 && TARGET_VECTOR_ELEN_FP_16")
+ (VNx1SI "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
+ (VNx2SI "TARGET_ZVFH")
+ (VNx4SI "TARGET_ZVFH")
+ (VNx8SI "TARGET_ZVFH")
+ (VNx16SI "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
+ (VNx32SI "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
(VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
(VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
@@ -992,13 +1009,13 @@ (define_mode_iterator VDI [
])
(define_mode_iterator VHF [
- (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
- (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
- (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+ (VNx1HF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
+ (VNx2HF "TARGET_ZVFH")
+ (VNx4HF "TARGET_ZVFH")
+ (VNx8HF "TARGET_ZVFH")
+ (VNx16HF "TARGET_ZVFH")
+ (VNx32HF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
+ (VNx64HF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VSF [
@@ -1042,9 +1059,9 @@ (define_mode_iterator VDI_LMUL1 [
])
(define_mode_iterator VHF_LMUL1 [
- (VNx8HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
- (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
- (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
+ (VNx8HF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
+ (VNx4HF "TARGET_ZVFH && TARGET_MIN_VLEN == 64")
+ (VNx2HF "TARGET_ZVFH && TARGET_MIN_VLEN == 32")
])
(define_mode_iterator VSF_LMUL1 [
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 884e7435cc2..11ed851a564 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1351,8 +1351,8 @@ (define_insn_and_split "*pred_broadcast<mode>"
(set_attr "mode" "<MODE>")])
(define_insn "*pred_broadcast<mode>"
- [(set (match_operand:VF 0 "register_operand" "=vr, vr, vr, vr, vr, vr, vr, vr")
- (if_then_else:VF
+ [(set (match_operand:VF_ZVFHMIN 0 "register_operand" "=vr, vr, vr, vr, vr, vr, vr, vr")
+ (if_then_else:VF_ZVFHMIN
(unspec:<VM>
[(match_operand:<VM> 1 "vector_broadcast_mask_operand" "Wc1,Wc1, vm, vm,Wc1,Wc1,Wb1,Wb1")
(match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK")
@@ -1361,9 +1361,9 @@ (define_insn "*pred_broadcast<mode>"
(match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (vec_duplicate:VF
+ (vec_duplicate:VF_ZVFHMIN
(match_operand:<VEL> 3 "direct_broadcast_operand" " f, f,Wdm,Wdm,Wdm,Wdm, f, f"))
- (match_operand:VF 2 "vector_merge_operand" "vu, 0, vu, 0, vu, 0, vu, 0")))]
+ (match_operand:VF_ZVFHMIN 2 "vector_merge_operand" "vu, 0, vu, 0, vu, 0, vu, 0")))]
"TARGET_VECTOR"
"@
vfmv.v.f\t%0,%3
@@ -7106,8 +7106,8 @@ (define_insn "@pred_widen_<float_cvt><mode>"
(set_attr "mode" "<VNCONVERT>")])
(define_insn "@pred_extend<mode>"
- [(set (match_operand:VWEXTF 0 "register_operand" "=&vr, &vr")
- (if_then_else:VWEXTF
+ [(set (match_operand:VWEXTF_ZVFHMIN 0 "register_operand" "=&vr, &vr")
+ (if_then_else:VWEXTF_ZVFHMIN
(unspec:<VM>
[(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
(match_operand 4 "vector_length_operand" " rK, rK")
@@ -7116,9 +7116,9 @@ (define_insn "@pred_extend<mode>"
(match_operand 7 "const_int_operand" " i, i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (float_extend:VWEXTF
+ (float_extend:VWEXTF_ZVFHMIN
(match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" " vr, vr"))
- (match_operand:VWEXTF 2 "vector_merge_operand" " vu, 0")))]
+ (match_operand:VWEXTF_ZVFHMIN 2 "vector_merge_operand" " vu, 0")))]
"TARGET_VECTOR"
"vfwcvt.f.f.v\t%0,%3%p1"
[(set_attr "type" "vfwcvtftof")
@@ -7206,7 +7206,7 @@ (define_insn "@pred_trunc<mode>"
(reg:SI VTYPE_REGNUM)
(reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
(float_truncate:<V_DOUBLE_TRUNC>
- (match_operand:VWEXTF 3 "register_operand" " 0, 0, 0, 0, vr, vr"))
+ (match_operand:VWEXTF_ZVFHMIN 3 "register_operand" " 0, 0, 0, 0, vr, vr"))
(match_operand:<V_DOUBLE_TRUNC> 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))]
"TARGET_VECTOR"
"vfncvt.f.f.w\t%0,%3%p1"
@@ -7226,7 +7226,7 @@ (define_insn "@pred_rod_trunc<mode>"
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
(unspec:<V_DOUBLE_TRUNC>
[(float_truncate:<V_DOUBLE_TRUNC>
- (match_operand:VWEXTF 3 "register_operand" " 0, 0, 0, 0, vr, vr"))] UNSPEC_ROD)
+ (match_operand:VWEXTF_ZVFHMIN 3 "register_operand" " 0, 0, 0, 0, vr, vr"))] UNSPEC_ROD)
(match_operand:<V_DOUBLE_TRUNC> 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))]
"TARGET_VECTOR"
"vfncvt.rod.f.f.w\t%0,%3%p1"
--
2.40.1
> You change "VF" constraint as "TARGET_ZVFH" which is incorrect since
> we a lot of instructions are valid in "TARGET_ZVFHMIN" in vector.md
> but you disabled them in this patch. You disabled them unexpectedly.
Yes that was kind of the point :) IMHO all the :VF insns are actually
only valid in a TARGET_ZVFH setting with the exception of pred_broadcast
which I changed to VF_ZVFHMIN (vfmv.v.f and vfmv.s.f will be "masked out"
by the "enabled" attribute). Now I'm not saying I might have missed/mixed
up some insns in this patch but e.g. the binops/unops shouldn't be enabled
(and their alternatives are already disabled as of now).
What are the many instructions that are valid in TARGET_ZVFHMIN?
Regards
Robin
I don't understand why it is necessary to bother "VF".
"VF” should not be changed since intrinsic stuff is quite stable and any unreasonable changes
are unacceptable.
>> What are the many instructions that are valid in TARGET_ZVFHMIN?
vle/vse/vluxei/vloxei/vsuxei/vsoxei/vlse/vsse.
juzhe.zhong@rivai.ai
From: Robin Dapp
Date: 2023-06-22 21:22
To: 钟居哲; gcc-patches; palmer; kito.cheng; pan2.li; Jeff Law
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
> You change "VF" constraint as "TARGET_ZVFH" which is incorrect since
> we a lot of instructions are valid in "TARGET_ZVFHMIN" in vector.md
> but you disabled them in this patch. You disabled them unexpectedly.
Yes that was kind of the point :) IMHO all the :VF insns are actually
only valid in a TARGET_ZVFH setting with the exception of pred_broadcast
which I changed to VF_ZVFHMIN (vfmv.v.f and vfmv.s.f will be "masked out"
by the "enabled" attribute). Now I'm not saying I might have missed/mixed
up some insns in this patch but e.g. the binops/unops shouldn't be enabled
(and their alternatives are already disabled as of now).
What are the many instructions that are valid in TARGET_ZVFHMIN?
Regards
Robin
> I don't understand why it is necessary to bother "VF". "VF” should
> not be changed since intrinsic stuff is quite stable and any
> unreasonable changes are unacceptable.
Ok, I hear your concern. My argument is: Currently our mechanism
of disabling instructions is incorrect and if any of the VF instructions
were to be created by combine, fwprop or other passes we'd potentially
ICE in reload. The other option is to leave VF unchanged and duplicate
all patterns for VHF. Those can have a TARGET_ZVFH then.
> vle/vse/vluxei/vloxei/vsuxei/vsoxei/vlse/vsse.
These are all V/VT and not VF? (apart from vlse which I adjusted)
Regards
Robin
Oh. I see. I think I am wrong. Sorry for that :).
load/store are using 'V' iterators.
This patch looks reasonable to me now.
Thanks for catching this.
juzhe.zhong@rivai.ai
From: Robin Dapp
Date: 2023-06-22 21:32
To: 钟居哲; gcc-patches; palmer; kito.cheng; pan2.li; Jeff Law
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
> I don't understand why it is necessary to bother "VF". "VF” should
> not be changed since intrinsic stuff is quite stable and any
> unreasonable changes are unacceptable.
Ok, I hear your concern. My argument is: Currently our mechanism
of disabling instructions is incorrect and if any of the VF instructions
were to be created by combine, fwprop or other passes we'd potentially
ICE in reload. The other option is to leave VF unchanged and duplicate
all patterns for VHF. Those can have a TARGET_ZVFH then.
> vle/vse/vluxei/vloxei/vsuxei/vsoxei/vlse/vsse.
These are all V/VT and not VF? (apart from vlse which I adjusted)
Regards
Robin
Just curious about the combine pass you mentioned, not very sure my understand is correct but it looks like the combine pass totally ignore the iterator requirement?
It is sort of surprise to me as the combine pass may also need the information of iterators.
Pan
From: 钟居哲 <juzhe.zhong@rivai.ai>
Sent: Thursday, June 22, 2023 9:37 PM
To: rdapp.gcc <rdapp.gcc@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; palmer <palmer@dabbelt.com>; kito.cheng <kito.cheng@gmail.com>; Li, Pan2 <pan2.li@intel.com>; Jeff Law <jeffreyalaw@gmail.com>
Cc: rdapp.gcc <rdapp.gcc@gmail.com>
Subject: Re: Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
Oh. I see. I think I am wrong. Sorry for that :).
load/store are using 'V' iterators.
This patch looks reasonable to me now.
Thanks for catching this.
________________________________
juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>
From: Robin Dapp<mailto:rdapp.gcc@gmail.com>
Date: 2023-06-22 21:32
To: 钟居哲<mailto:juzhe.zhong@rivai.ai>; gcc-patches<mailto:gcc-patches@gcc.gnu.org>; palmer<mailto:palmer@dabbelt.com>; kito.cheng<mailto:kito.cheng@gmail.com>; pan2.li<mailto:pan2.li@intel.com>; Jeff Law<mailto:jeffreyalaw@gmail.com>
CC: rdapp.gcc<mailto:rdapp.gcc@gmail.com>
Subject: Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
> I don't understand why it is necessary to bother "VF". "VF” should
> not be changed since intrinsic stuff is quite stable and any
> unreasonable changes are unacceptable.
Ok, I hear your concern. My argument is: Currently our mechanism
of disabling instructions is incorrect and if any of the VF instructions
were to be created by combine, fwprop or other passes we'd potentially
ICE in reload. The other option is to leave VF unchanged and duplicate
all patterns for VHF. Those can have a TARGET_ZVFH then.
> vle/vse/vluxei/vloxei/vsuxei/vsoxei/vlse/vsse.
These are all V/VT and not VF? (apart from vlse which I adjusted)
Regards
Robin
> Just curious about the combine pass you mentioned, not very sure my
> understand is correct but it looks like the combine pass totally
> ignore the iterator requirement?
>
> It is sort of surprise to me as the combine pass may also need the
> information of iterators.
combine tries to match instructions (with fitting modes of course).
It does not look at the insn constraints that reload/lra later can
use to switch between alternatives depending on the register situation
and other factors.
We e.g. have an instruction
(define_insn "bla"
(set (match_operand:VF 1 "=vd")
(match_operand:VF 2 "vr"))
...
and implicitly
[(set_attr "enabled" "true")]
This instruction gets multiplexed via the VF iterator into (among others)
(define_insn "bla"
(set (match_operand:VNx4HF 1 "=vd")
(match_operand:VNx4HF 2 "vr"))
...
[(set_attr "enabled" "true")]
When we set "enabled" to "false" via "fp_vector_disabled", we have:
(define_insn "bla"
(set (match_operand:VNx4HF 1 "=vd")
(match_operand:VNx4HF 2 "vr"))
...
[(set_attr "enabled" "false")]
This means the only available alternative is disabled but the insn
itself is still there, particularly for combine which does not look
into the constraints.
So in our case the iterator "allowed" the instruction (leading combine
to think it is available) and we later masked it out with "enabled = false".
Now we could argue that combine's behavior should change here and an
insn without any alternatives is not actually available but that's not
a battle I'm willing to fight :D
Regards
Robin
Thanks Robine for the explanation, it is very clear to me. Totally agree below parts and I think we can leave it to the maintainers of the RTL/Machine Descriptions.
> Now we could argue that combine's behavior should change here and an
> insn without any alternatives is not actually available but that's not
> a battle I'm willing to fight 😃
Pan
-----Original Message-----
From: Robin Dapp <rdapp.gcc@gmail.com>
Sent: Thursday, June 22, 2023 10:31 PM
To: Li, Pan2 <pan2.li@intel.com>; 钟居哲 <juzhe.zhong@rivai.ai>; gcc-patches <gcc-patches@gcc.gnu.org>; palmer <palmer@dabbelt.com>; kito.cheng <kito.cheng@gmail.com>; Jeff Law <jeffreyalaw@gmail.com>
Cc: rdapp.gcc@gmail.com
Subject: Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
> Just curious about the combine pass you mentioned, not very sure my
> understand is correct but it looks like the combine pass totally
> ignore the iterator requirement?
>
> It is sort of surprise to me as the combine pass may also need the
> information of iterators.
combine tries to match instructions (with fitting modes of course).
It does not look at the insn constraints that reload/lra later can
use to switch between alternatives depending on the register situation
and other factors.
We e.g. have an instruction
(define_insn "bla"
(set (match_operand:VF 1 "=vd")
(match_operand:VF 2 "vr"))
...
and implicitly
[(set_attr "enabled" "true")]
This instruction gets multiplexed via the VF iterator into (among others)
(define_insn "bla"
(set (match_operand:VNx4HF 1 "=vd")
(match_operand:VNx4HF 2 "vr"))
...
[(set_attr "enabled" "true")]
When we set "enabled" to "false" via "fp_vector_disabled", we have:
(define_insn "bla"
(set (match_operand:VNx4HF 1 "=vd")
(match_operand:VNx4HF 2 "vr"))
...
[(set_attr "enabled" "false")]
This means the only available alternative is disabled but the insn
itself is still there, particularly for combine which does not look
into the constraints.
So in our case the iterator "allowed" the instruction (leading combine
to think it is available) and we later masked it out with "enabled = false".
Now we could argue that combine's behavior should change here and an
insn without any alternatives is not actually available but that's not
a battle I'm willing to fight :D
Regards
Robin
On 6/23/23 06:54, Li, Pan2 wrote:
> Thanks Robine for the explanation, it is very clear to me. Totally agree below parts and I think we can leave it to the maintainers of the RTL/Machine Descriptions.
>
>> Now we could argue that combine's behavior should change here and an
>> insn without any alternatives is not actually available but that's not
>> a battle I'm willing to fight 😃
>
> Pan
>
> -----Original Message-----
> From: Robin Dapp <rdapp.gcc@gmail.com>
> Sent: Thursday, June 22, 2023 10:31 PM
> To: Li, Pan2 <pan2.li@intel.com>; 钟居哲 <juzhe.zhong@rivai.ai>; gcc-patches <gcc-patches@gcc.gnu.org>; palmer <palmer@dabbelt.com>; kito.cheng <kito.cheng@gmail.com>; Jeff Law <jeffreyalaw@gmail.com>
> Cc: rdapp.gcc@gmail.com
> Subject: Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
>
>> Just curious about the combine pass you mentioned, not very sure my
>> understand is correct but it looks like the combine pass totally
>> ignore the iterator requirement?
>>
>> It is sort of surprise to me as the combine pass may also need the
>> information of iterators.
>
> combine tries to match instructions (with fitting modes of course).
> It does not look at the insn constraints that reload/lra later can
> use to switch between alternatives depending on the register situation
> and other factors.
>
> We e.g. have an instruction
> (define_insn "bla"
> (set (match_operand:VF 1 "=vd")
> (match_operand:VF 2 "vr"))
> ...
> and implicitly
> [(set_attr "enabled" "true")]
>
> This instruction gets multiplexed via the VF iterator into (among others)
> (define_insn "bla"
> (set (match_operand:VNx4HF 1 "=vd")
> (match_operand:VNx4HF 2 "vr"))
> ...
> [(set_attr "enabled" "true")]
>
> When we set "enabled" to "false" via "fp_vector_disabled", we have:
> (define_insn "bla"
> (set (match_operand:VNx4HF 1 "=vd")
> (match_operand:VNx4HF 2 "vr"))
> ...
> [(set_attr "enabled" "false")]
>
> This means the only available alternative is disabled but the insn
> itself is still there, particularly for combine which does not look
> into the constraints.
>
> So in our case the iterator "allowed" the instruction (leading combine
> to think it is available) and we later masked it out with "enabled = false".
> Now we could argue that combine's behavior should change here and an
> insn without any alternatives is not actually available but that's not
> a battle I'm willing to fight :D
More importantly, at combine time we don't know which alternative will
match. In fact, you can run into cases where no alternative matches
until register allocation -- this was fairly common in the past as it
allowed for simpler machine descriptions. It fell out of favor in the
90s as more targets started using scheduling and we wanted to expose as
much of the final code as we could for the first scheduling pass.
Jeff
On 6/22/23 07:03, Robin Dapp wrote:
> Hi,
>
> when working on FP widening/narrowing I realized the Zvfhmin handling
> is not ideal right now: We use the "enabled" insn attribute to disable
> instructions not available with Zvfhmin (but only with Zvfh).
>
> However, "enabled == 0" only disables insn alternatives, in our case all
> of them when the mode is a HFmode. The insn itself remains available
> (e.g. for combine to match) and we end up with an insn without alternatives
> that reload cannot handle --> ICE.
>
> The proper solution is to disable the instruction for the respective
> mode altogether. This patch achieves this by splitting the VF as well
> as VWEXTF iterators into variants with TARGET_ZVFH and
> TARGET_VECTOR_ELEN_FP_16 (which is true when either TARGET_ZVFH or
> TARGET_ZVFHMIN are true). Also, VWCONVERTI, VHF and VHF_LMUL1 need
> adjustments.
>
> Regards
> Robin
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md: VF_AUTO -> VF.
> * config/riscv/vector-iterators.md: Introduce VF_ZVFHMIN,
> VWEXTF_ZVFHMIN and use TARGET_ZVFH in VWCONVERTI, VHF and
> VHF_LMUL1.
> * config/riscv/vector.md: Use new iterators.
OK for the trunk. Thanks for walking everyone through the issues here.
jeff
@@ -539,9 +539,9 @@ (define_expand "abs<mode>2"
;; - vfneg.v/vfabs.v
;; -------------------------------------------------------------------------------
(define_expand "<optab><mode>2"
- [(set (match_operand:VF_AUTO 0 "register_operand")
- (any_float_unop_nofrm:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")))]
+ [(set (match_operand:VF 0 "register_operand")
+ (any_float_unop_nofrm:VF
+ (match_operand:VF 1 "register_operand")))]
"TARGET_VECTOR"
{
insn_code icode = code_for_pred (<CODE>, <MODE>mode);
@@ -556,9 +556,9 @@ (define_expand "<optab><mode>2"
;; - vfsqrt.v
;; -------------------------------------------------------------------------------
(define_expand "<optab><mode>2"
- [(set (match_operand:VF_AUTO 0 "register_operand")
- (any_float_unop:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")))]
+ [(set (match_operand:VF 0 "register_operand")
+ (any_float_unop:VF
+ (match_operand:VF 1 "register_operand")))]
"TARGET_VECTOR"
{
insn_code icode = code_for_pred (<CODE>, <MODE>mode);
@@ -777,10 +777,10 @@ (define_expand "vec_extract<mode><vel>"
;; - vfadd.vf/vfsub.vf/...
;; -------------------------------------------------------------------------
(define_expand "<optab><mode>3"
- [(match_operand:VF_AUTO 0 "register_operand")
- (any_float_binop:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")
- (match_operand:VF_AUTO 2 "register_operand"))]
+ [(match_operand:VF 0 "register_operand")
+ (any_float_binop:VF
+ (match_operand:VF 1 "register_operand")
+ (match_operand:VF 2 "register_operand"))]
"TARGET_VECTOR"
{
riscv_vector::emit_vlmax_fp_insn (code_for_pred (<CODE>, <MODE>mode),
@@ -794,10 +794,10 @@ (define_expand "<optab><mode>3"
;; - vfmin.vf/vfmax.vf
;; -------------------------------------------------------------------------
(define_expand "<optab><mode>3"
- [(match_operand:VF_AUTO 0 "register_operand")
- (any_float_binop_nofrm:VF_AUTO
- (match_operand:VF_AUTO 1 "register_operand")
- (match_operand:VF_AUTO 2 "register_operand"))]
+ [(match_operand:VF 0 "register_operand")
+ (any_float_binop_nofrm:VF
+ (match_operand:VF 1 "register_operand")
+ (match_operand:VF 2 "register_operand"))]
"TARGET_VECTOR"
{
riscv_vector::emit_vlmax_insn (code_for_pred (<CODE>, <MODE>mode),
@@ -271,7 +271,7 @@ (define_mode_iterator VWI [
(VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
])
-(define_mode_iterator VF [
+(define_mode_iterator VF_ZVFHMIN [
(VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
(VNx2HF "TARGET_VECTOR_ELEN_FP_16")
(VNx4HF "TARGET_VECTOR_ELEN_FP_16")
@@ -295,11 +295,12 @@ (define_mode_iterator VF [
;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
;; changed to TARGET_ZVFH. TARGET_VECTOR_ELEN_FP_16 is also true for
-;; TARGET_ZVFHMIN while we actually disable all instructions apart from
-;; load, store and convert for it.
-;; Consequently the autovec expanders should also only be enabled with
-;; TARGET_ZVFH.
-(define_mode_iterator VF_AUTO [
+;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
+;; from load, store and convert for it.
+;; It is not enough to set the "enabled" attribute to false
+;; since this will only disable insn alternatives in reload but still
+;; allow the instruction and mode to be matched during combine et al.
+(define_mode_iterator VF [
(VNx1HF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
(VNx2HF "TARGET_ZVFH")
(VNx4HF "TARGET_ZVFH")
@@ -494,7 +495,8 @@ (define_mode_iterator VWEXTI [
(VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
])
-(define_mode_iterator VWEXTF [
+;; Same iterator split reason as VF_ZVFHMIN and VF.
+(define_mode_iterator VWEXTF_ZVFHMIN [
(VNx1SF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
(VNx2SF "TARGET_VECTOR_ELEN_FP_16")
(VNx4SF "TARGET_VECTOR_ELEN_FP_16")
@@ -509,13 +511,28 @@ (define_mode_iterator VWEXTF [
(VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
])
+(define_mode_iterator VWEXTF [
+ (VNx1SF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
+ (VNx2SF "TARGET_ZVFH")
+ (VNx4SF "TARGET_ZVFH")
+ (VNx8SF "TARGET_ZVFH")
+ (VNx16SF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
+ (VNx32SF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
+
+ (VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
+ (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
+ (VNx4DF "TARGET_VECTOR_ELEN_FP_64")
+ (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
+ (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+])
+
(define_mode_iterator VWCONVERTI [
- (VNx1SI "TARGET_MIN_VLEN < 128 && TARGET_VECTOR_ELEN_FP_16")
- (VNx2SI "TARGET_VECTOR_ELEN_FP_16")
- (VNx4SI "TARGET_VECTOR_ELEN_FP_16")
- (VNx8SI "TARGET_VECTOR_ELEN_FP_16")
- (VNx16SI "TARGET_MIN_VLEN > 32 && TARGET_VECTOR_ELEN_FP_16")
- (VNx32SI "TARGET_MIN_VLEN >= 128 && TARGET_VECTOR_ELEN_FP_16")
+ (VNx1SI "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
+ (VNx2SI "TARGET_ZVFH")
+ (VNx4SI "TARGET_ZVFH")
+ (VNx8SI "TARGET_ZVFH")
+ (VNx16SI "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
+ (VNx32SI "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
(VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
(VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
@@ -992,13 +1009,13 @@ (define_mode_iterator VDI [
])
(define_mode_iterator VHF [
- (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
- (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
- (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
- (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+ (VNx1HF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
+ (VNx2HF "TARGET_ZVFH")
+ (VNx4HF "TARGET_ZVFH")
+ (VNx8HF "TARGET_ZVFH")
+ (VNx16HF "TARGET_ZVFH")
+ (VNx32HF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
+ (VNx64HF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VSF [
@@ -1042,9 +1059,9 @@ (define_mode_iterator VDI_LMUL1 [
])
(define_mode_iterator VHF_LMUL1 [
- (VNx8HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
- (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
- (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
+ (VNx8HF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
+ (VNx4HF "TARGET_ZVFH && TARGET_MIN_VLEN == 64")
+ (VNx2HF "TARGET_ZVFH && TARGET_MIN_VLEN == 32")
])
(define_mode_iterator VSF_LMUL1 [
@@ -1351,8 +1351,8 @@ (define_insn_and_split "*pred_broadcast<mode>"
(set_attr "mode" "<MODE>")])
(define_insn "*pred_broadcast<mode>"
- [(set (match_operand:VF 0 "register_operand" "=vr, vr, vr, vr, vr, vr, vr, vr")
- (if_then_else:VF
+ [(set (match_operand:VF_ZVFHMIN 0 "register_operand" "=vr, vr, vr, vr, vr, vr, vr, vr")
+ (if_then_else:VF_ZVFHMIN
(unspec:<VM>
[(match_operand:<VM> 1 "vector_broadcast_mask_operand" "Wc1,Wc1, vm, vm,Wc1,Wc1,Wb1,Wb1")
(match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK")
@@ -1361,9 +1361,9 @@ (define_insn "*pred_broadcast<mode>"
(match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (vec_duplicate:VF
+ (vec_duplicate:VF_ZVFHMIN
(match_operand:<VEL> 3 "direct_broadcast_operand" " f, f,Wdm,Wdm,Wdm,Wdm, f, f"))
- (match_operand:VF 2 "vector_merge_operand" "vu, 0, vu, 0, vu, 0, vu, 0")))]
+ (match_operand:VF_ZVFHMIN 2 "vector_merge_operand" "vu, 0, vu, 0, vu, 0, vu, 0")))]
"TARGET_VECTOR"
"@
vfmv.v.f\t%0,%3
@@ -7106,8 +7106,8 @@ (define_insn "@pred_widen_<float_cvt><mode>"
(set_attr "mode" "<VNCONVERT>")])
(define_insn "@pred_extend<mode>"
- [(set (match_operand:VWEXTF 0 "register_operand" "=&vr, &vr")
- (if_then_else:VWEXTF
+ [(set (match_operand:VWEXTF_ZVFHMIN 0 "register_operand" "=&vr, &vr")
+ (if_then_else:VWEXTF_ZVFHMIN
(unspec:<VM>
[(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
(match_operand 4 "vector_length_operand" " rK, rK")
@@ -7116,9 +7116,9 @@ (define_insn "@pred_extend<mode>"
(match_operand 7 "const_int_operand" " i, i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (float_extend:VWEXTF
+ (float_extend:VWEXTF_ZVFHMIN
(match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" " vr, vr"))
- (match_operand:VWEXTF 2 "vector_merge_operand" " vu, 0")))]
+ (match_operand:VWEXTF_ZVFHMIN 2 "vector_merge_operand" " vu, 0")))]
"TARGET_VECTOR"
"vfwcvt.f.f.v\t%0,%3%p1"
[(set_attr "type" "vfwcvtftof")
@@ -7206,7 +7206,7 @@ (define_insn "@pred_trunc<mode>"
(reg:SI VTYPE_REGNUM)
(reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
(float_truncate:<V_DOUBLE_TRUNC>
- (match_operand:VWEXTF 3 "register_operand" " 0, 0, 0, 0, vr, vr"))
+ (match_operand:VWEXTF_ZVFHMIN 3 "register_operand" " 0, 0, 0, 0, vr, vr"))
(match_operand:<V_DOUBLE_TRUNC> 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))]
"TARGET_VECTOR"
"vfncvt.f.f.w\t%0,%3%p1"
@@ -7226,7 +7226,7 @@ (define_insn "@pred_rod_trunc<mode>"
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
(unspec:<V_DOUBLE_TRUNC>
[(float_truncate:<V_DOUBLE_TRUNC>
- (match_operand:VWEXTF 3 "register_operand" " 0, 0, 0, 0, vr, vr"))] UNSPEC_ROD)
+ (match_operand:VWEXTF_ZVFHMIN 3 "register_operand" " 0, 0, 0, 0, vr, vr"))] UNSPEC_ROD)
(match_operand:<V_DOUBLE_TRUNC> 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))]
"TARGET_VECTOR"
"vfncvt.rod.f.f.w\t%0,%3%p1"