RISC-V: Add autovec FP unary operations.
Commit Message
Hi,
this patch adds floating-point autovec expanders for vfneg, vfabs as well as
vfsqrt and the accompanying tests. vfrsqrt7 will be added at a later time.
Similarly to the binop tests, there are flavors for zvfh now. Prerequisites
as before.
Regards
Robin
gcc/ChangeLog:
* config/riscv/autovec.md (<optab><mode>2): Add unop expanders.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/abs-run.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-template.h: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-run.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-template.h: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c: New test.
---
gcc/config/riscv/autovec.md | 36 ++++++++++++++++++-
.../riscv/rvv/autovec/unop/abs-run.c | 6 ++--
.../riscv/rvv/autovec/unop/abs-rv32gcv.c | 3 +-
.../riscv/rvv/autovec/unop/abs-rv64gcv.c | 3 +-
.../riscv/rvv/autovec/unop/abs-template.h | 14 +++++++-
.../riscv/rvv/autovec/unop/abs-zvfh-run.c | 35 ++++++++++++++++++
.../riscv/rvv/autovec/unop/vfsqrt-run.c | 29 +++++++++++++++
.../riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c | 10 ++++++
.../riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c | 10 ++++++
.../riscv/rvv/autovec/unop/vfsqrt-template.h | 31 ++++++++++++++++
.../riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c | 32 +++++++++++++++++
.../riscv/rvv/autovec/unop/vneg-run.c | 6 ++--
.../riscv/rvv/autovec/unop/vneg-rv32gcv.c | 3 +-
.../riscv/rvv/autovec/unop/vneg-rv64gcv.c | 3 +-
.../riscv/rvv/autovec/unop/vneg-template.h | 5 ++-
.../riscv/rvv/autovec/unop/vneg-zvfh-run.c | 26 ++++++++++++++
16 files changed, 241 insertions(+), 11 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
Comments
On 6/14/23 09:31, Robin Dapp wrote:
> Hi,
>
> this patch adds floating-point autovec expanders for vfneg, vfabs as well as
> vfsqrt and the accompanying tests. vfrsqrt7 will be added at a later time.
So with vfrsqrt7 I think the question becomes whether we will be able to
use it effectively. With its limited initial accuracy, we'll be stuck with
another round of Newton-Raphson or Goldschmidt, so we're not likely
going to beat the latency of a standard vfsqrt. We can use it to improve
throughput though, since it does pipeline (using the fmacs of course, so
there's a definite trade-off if the fmacs are already saturated).
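To make the refinement cost concrete, here is a minimal scalar sketch in C of the Newton-Raphson scheme mentioned above. It is an illustration only: the helper names are hypothetical, and the well-known integer bit-trick estimate stands in for the hardware estimate (it is somewhat less accurate than a 7-bit estimate, so it needs a little more refinement). Each refinement round costs a few multiplies and one multiply-subtract, which is where the fmac pressure comes from.

```c
#include <stdint.h>
#include <string.h>

/* Crude initial estimate of 1/sqrt(x) for x > 0.  This is the classic
   bit-trick estimate, used here only as a stand-in for a low-precision
   hardware estimate; it is NOT what vfrsqrt7.v computes.  */
static float
rsqrt_estimate (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  float y;
  memcpy (&y, &bits, sizeof y);
  return y;
}

/* One Newton-Raphson round for y ~= 1/sqrt(x):
   y' = y * (1.5 - 0.5 * x * y * y).  */
static float
rsqrt_refine (float x, float y)
{
  return y * (1.5f - 0.5f * x * y * y);
}

/* sqrt(x) computed as x * (1/sqrt(x)) with STEPS refinement rounds.
   Only meaningful for finite x > 0.  */
float
sqrt_via_rsqrt (float x, int steps)
{
  float y = rsqrt_estimate (x);
  for (int i = 0; i < steps; i++)
    y = rsqrt_refine (x, y);
  return x * y;
}
```

Calling sqrt_via_rsqrt (2.0f, 0) returns only a rough approximation of sqrt(2), while three rounds get close to single precision. The error roughly squares with every round, which is why a low-precision starting point cannot skip the refinement.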
>
> Similarly to the binop tests, there are flavors for zvfh now. Prerequisites
> as before.
>
> Regards
> Robin
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (<optab><mode>2): Add unop expanders.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/unop/abs-run.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-template.h: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-run.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-template.h: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c: New test.
LGTM. So if Juzhe is happy with it, then it's good to go once
dependencies are resolved.
jeff
Hi, Jeff. Thanks for the quick approval.
When I reviewed the patch, I looked at this expander:
(define_expand "<optab><mode>2"
  [(set (match_operand:VF 0 "register_operand")
     (any_float_unop_nofrm:VF
      (match_operand:VF 1 "register_operand")))]
  "TARGET_VECTOR"
{
  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
  DONE;
})
There could be an issue here with FP16 vectors.
Let's look at the VF iterator:
(define_mode_iterator VF [
(VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
(VNx2HF "TARGET_VECTOR_ELEN_FP_16")
(VNx4HF "TARGET_VECTOR_ELEN_FP_16")
(VNx8HF "TARGET_VECTOR_ELEN_FP_16")
(VNx16HF "TARGET_VECTOR_ELEN_FP_16")
(VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
(VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
....
You can see that all FP16 modes use the predicate "TARGET_VECTOR_ELEN_FP_16",
which is true when either TARGET_ZVFH or TARGET_ZVFHMIN is set.
We do that because most floating-point instruction patterns share the same iterators, so we can't simply add TARGET_ZVFHMIN or TARGET_ZVFH
to the iterator. Some instruction patterns that use VF, for example vle16.v, should be enabled as long as TARGET_ZVFHMIN is set, whereas
instructions like vfneg.v need TARGET_ZVFH.
So I did this experiment:
void
f (_Float16 *restrict a, _Float16 *restrict b)
{
  for (int i = 0; i < 100; ++i)
    {
      a[i] = -b[i];
    }
}
with these compile options:
-march=rv64gcv_zvfhmin --param=riscv-autovec-preference=fixed-vlmax -O3
An ICE happens:
auto.c:26:1: error: unable to generate reloads for:
(insn 8 7 9 2 (set (reg:VNx8HF 186 [ vect__6.7 ])
(if_then_else:VNx8HF (unspec:VNx8BI [
(const_vector:VNx8BI [
(const_int 1 [0x1]) repeated x8
])
(const_int 8 [0x8])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(neg:VNx8HF (reg:VNx8HF 134 [ vect__4.6 ]))
(unspec:VNx8HF [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) "auto.c":24:14 6631 {pred_negvnx8hf}
(expr_list:REG_DEAD (reg:VNx8HF 134 [ vect__4.6 ])
(nil)))
The reason for the ICE is that the VF iterator enables the auto-vectorization pattern for vfneg.v under TARGET_ZVFHMIN, but
the instruction pattern of vfneg.v is (correctly) disabled unless TARGET_ZVFH is set, because every
RVV instruction pattern carries this attribute:
(define_attr "fp_vector_disabled" "no,yes"
(cond [
(and (eq_attr "type" "vfmov,vfalu,vfmul,vfdiv,
vfwalu,vfwmul,vfmuladd,vfwmuladd,
vfsqrt,vfrecp,vfminmax,vfsgnj,vfcmp,
vfclass,vfmerge,
vfncvtitof,vfwcvtftoi,vfcvtftoi,vfcvtitof,
vfredo,vfredu,vfwredo,vfwredu,
vfslide1up,vfslide1down")
(and (eq_attr "mode" "VNx1HF,VNx2HF,VNx4HF,VNx8HF,VNx16HF,VNx32HF,VNx64HF")
(match_test "!TARGET_ZVFH")))
(const_string "yes")
;; The mode records as QI for the FP16 <=> INT8 instruction.
(and (eq_attr "type" "vfncvtftoi,vfwcvtitof")
(and (eq_attr "mode" "VNx1QI,VNx2QI,VNx4QI,VNx8QI,VNx16QI,VNx32QI,VNx64QI")
(match_test "!TARGET_ZVFH")))
(const_string "yes")
]
(const_string "no")))
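The mismatch can be modelled in a few lines of C. This is only a sketch of the logic described above, not GCC code: the struct and function names are hypothetical, and the two predicates simply mirror the VF iterator condition and the fp_vector_disabled attribute for an FP16 vfneg.v-like instruction.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target flags discussed above.  */
struct target_flags
{
  bool zvfh;     /* full FP16 vector arithmetic */
  bool zvfhmin;  /* FP16 vector conversions only, no arithmetic */
};

/* Models TARGET_VECTOR_ELEN_FP_16: true for either extension.  */
static bool
elen_fp_16 (struct target_flags t)
{
  return t.zvfh || t.zvfhmin;
}

/* Models the expander being available for an HF mode taken from VF.  */
static bool
expander_enabled_hf (struct target_flags t)
{
  return elen_fp_16 (t);
}

/* Models fp_vector_disabled being "no" for an HF arithmetic insn.  */
static bool
insn_enabled_hf (struct target_flags t)
{
  return t.zvfh;
}

/* The problematic case: the expander emits an insn whose pattern the
   backend has disabled, so later passes cannot handle it.  */
bool
emits_disabled_insn (struct target_flags t)
{
  return expander_enabled_hf (t) && !insn_enabled_hf (t);
}
```

In this model only the zvfhmin-without-zvfh configuration returns true, which matches the configuration that triggered the ICE.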
I then changed the pattern slightly, as follows:
(define_expand "<optab><mode>2"
  [(set (match_operand:VF 0 "register_operand")
     (any_float_unop_nofrm:VF
      (match_operand:VF 1 "register_operand")))]
  "TARGET_VECTOR && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)"
{
  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
  DONE;
})
That is, I added && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
to the condition.
With that, it works for both TARGET_ZVFH and TARGET_ZVFHMIN:
-march=rv64gcv_zvfhmin:
f:
li a4,2147450880
li a5,-2147450880
addi a4,a4,-1
addi a5,a5,1
slli a3,a5,32
slli a2,a4,32
mv a5,a4
li a4,-2147450880
addi a6,a1,200
add a3,a3,a4
add a2,a2,a5
.L2:
ld a5,0(a1)
addi a0,a0,8
addi a1,a1,8
not a4,a5
and a5,a5,a2
and a4,a4,a3
sub a5,a3,a5
xor a5,a4,a5
sd a5,-8(a0)
bne a1,a6,.L2
ret
-march=rv64gcv_zvfh:
f:
vsetivli zero,8,e16,m1,ta,ma
addi a4,a1,16
addi a5,a0,16
vle16.v v1,0(a1)
vfneg.v v1,v1
vse16.v v1,0(a0)
addi a2,a1,32
addi a3,a0,32
vle16.v v1,0(a4)
vfneg.v v1,v1
vse16.v v1,0(a5)
addi a4,a1,48
addi a5,a0,48
vle16.v v1,0(a2)
vfneg.v v1,v1
vse16.v v1,0(a3)
addi a2,a1,64
addi a3,a0,64
vle16.v v1,0(a4)
vfneg.v v1,v1
vse16.v v1,0(a5)
addi a4,a1,80
addi a5,a0,80
vle16.v v1,0(a2)
vfneg.v v1,v1
vse16.v v1,0(a3)
....
This is what we expect: TARGET_ZVFH enables auto-vectorization, whereas there is no auto-vectorization with TARGET_ZVFHMIN, since
vfneg.v is not available under TARGET_ZVFHMIN.
However, I think adding !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
is an ugly implementation and hard to maintain, since we would need to add this condition to every floating-point pattern.
So, give me some time to figure out a more elegant way to support auto-vectorization.
Thanks.
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-06-15 03:43
To: Robin Dapp; gcc-patches; palmer; Kito Cheng; juzhe.zhong@rivai.ai
Subject: Re: [PATCH] RISC-V: Add autovec FP unary operations.
After several tries, I came up with this iterator:
(define_mode_iterator VF_AUTO [
(VNx1HF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
(VNx2HF "TARGET_ZVFH")
(VNx4HF "TARGET_ZVFH")
(VNx8HF "TARGET_ZVFH")
(VNx16HF "TARGET_ZVFH")
(VNx32HF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
(VNx64HF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")
(VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
(VNx2SF "TARGET_VECTOR_ELEN_FP_32")
(VNx4SF "TARGET_VECTOR_ELEN_FP_32")
(VNx8SF "TARGET_VECTOR_ELEN_FP_32")
(VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
(VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
(VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
(VNx2DF "TARGET_VECTOR_ELEN_FP_64")
(VNx4DF "TARGET_VECTOR_ELEN_FP_64")
(VNx8DF "TARGET_VECTOR_ELEN_FP_64")
(VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
])
I think we should add a VF_AUTO iterator like this, which uses TARGET_ZVFH for the FP16 modes, and switch the autovec patterns over to it.
With that it also works: with zvfhmin in -march there is no auto-vectorization, and with zvfh there is.
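The FP16 part of the VF_AUTO conditions can be written out as a small C predicate. This is a sketch for illustration only: the function name and parameters are hypothetical, with nunits standing for the N in VNx<N>HF and min_vlen standing for TARGET_MIN_VLEN.

```c
#include <stdbool.h>

/* Returns whether mode VNx<nunits>HF would be enabled by the VF_AUTO
   iterator shown above.  ZVFH models TARGET_ZVFH and MIN_VLEN models
   TARGET_MIN_VLEN; both are plain parameters in this sketch.  */
bool
vf_auto_hf_enabled (int nunits, bool zvfh, int min_vlen)
{
  /* Every HF entry requires full FP16 arithmetic support.  */
  if (!zvfh)
    return false;

  switch (nunits)
    {
    case 1:
      return min_vlen < 128;
    case 2:
    case 4:
    case 8:
    case 16:
      return true;
    case 32:
      return min_vlen > 32;
    case 64:
      return min_vlen >= 128;
    default:
      return false;
    }
}
```

With zvfh false, as in a zvfhmin-only configuration, no HF mode is enabled, so the vectorizer never sees an FP16 expander and falls back to scalar code.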
Feel free to suggest other solutions.
Thanks.
juzhe.zhong@rivai.ai
From: 钟居哲
Date: 2023-06-15 05:15
To: Jeff Law; rdapp.gcc; gcc-patches; palmer; kito.cheng
Subject: Re: Re: [PATCH] RISC-V: Add autovec FP unary operations.
After some consideration, I think we need to add a VF_AUTO iterator (with the predicate TARGET_ZVFH for the vector HF modes) for FP autovec.
Please also add test cases for these unary operations with -march=rv64gc_zvfhmin to make sure they neither
cause an ICE nor get vectorized,
as in this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621322.html
Thanks.
juzhe.zhong@rivai.ai
From: Robin Dapp
Date: 2023-06-14 23:31
To: gcc-patches; palmer; Kito Cheng; juzhe.zhong@rivai.ai; jeffreyalaw
CC: rdapp.gcc
Subject: [PATCH] RISC-V: Add autovec FP unary operations.
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 1c6d793cae0..72154400f1f 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -498,7 +498,7 @@ (define_expand "<optab><mode>2"
})
;; -------------------------------------------------------------------------------
-;; - ABS expansion to vmslt and vneg
+;; - [INT] ABS expansion to vmslt and vneg.
;; -------------------------------------------------------------------------------
(define_expand "abs<mode>2"
@@ -517,6 +517,40 @@ (define_expand "abs<mode>2"
DONE;
})
+;; -------------------------------------------------------------------------------
+;; ---- [FP] Unary operations
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - vfneg.v/vfabs.v
+;; -------------------------------------------------------------------------------
+(define_expand "<optab><mode>2"
+ [(set (match_operand:VF 0 "register_operand")
+ (any_float_unop_nofrm:VF
+ (match_operand:VF 1 "register_operand")))]
+ "TARGET_VECTOR"
+{
+ insn_code icode = code_for_pred (<CODE>, <MODE>mode);
+ riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
+ DONE;
+})
+
+;; -------------------------------------------------------------------------------
+;; - [FP] Square root
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - vfsqrt.v
+;; -------------------------------------------------------------------------------
+(define_expand "<optab><mode>2"
+ [(set (match_operand:VF 0 "register_operand")
+ (any_float_unop:VF
+ (match_operand:VF 1 "register_operand")))]
+ "TARGET_VECTOR"
+{
+ insn_code icode = code_for_pred (<CODE>, <MODE>mode);
+ riscv_vector::emit_vlmax_fp_insn (icode, riscv_vector::RVV_UNOP, operands);
+ DONE;
+})
+
;; =========================================================================
;; == Ternary arithmetic
;; =========================================================================
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
index d93a7c768d2..18c7a55e23d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
@@ -1,5 +1,5 @@
/* { dg-do run { target { riscv_vector_hw } } } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "abs-template.h"
@@ -30,7 +30,9 @@
RUN(int8_t) \
RUN(int16_t) \
RUN(int32_t) \
- RUN(int64_t)
+ RUN(int64_t) \
+ RUN(float) \
+ RUN(double) \
int main ()
{
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
index a8b92c9450f..dea790ccc2d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
@@ -1,8 +1,9 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "abs-template.h"
/* { dg-final { scan-assembler-times {\tvseti?vli\s+[a-z0-9,]+,ta,mu} 4 } } */
/* { dg-final { scan-assembler-times {\tvmslt\.vi} 4 } } */
/* { dg-final { scan-assembler-times {\tvneg.v\sv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
index 2e7f0864ee7..b58f1aa3496 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
@@ -1,8 +1,9 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "abs-template.h"
/* { dg-final { scan-assembler-times {\tvseti?vli\s+[a-z0-9,]+,ta,mu} 4 } } */
/* { dg-final { scan-assembler-times {\tvmslt\.vi} 4 } } */
/* { dg-final { scan-assembler-times {\tvneg.v\sv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
index 882de9f4efb..b86d04bfbc8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
@@ -1,5 +1,6 @@
#include <stdlib.h>
#include <stdint-gcc.h>
+#include <math.h>
#define TEST_TYPE(TYPE) \
__attribute__((noipa)) \
@@ -17,10 +18,21 @@
dst[i] = llabs (a[i]); \
}
+#define TEST_TYPE3(TYPE) \
+ __attribute__((noipa)) \
+ void vabs_##TYPE (TYPE *dst, TYPE *a, int n) \
+ { \
+ for (int i = 0; i < n; i++) \
+ dst[i] = fabs (a[i]); \
+ }
+
#define TEST_ALL() \
TEST_TYPE(int8_t) \
TEST_TYPE(int16_t) \
TEST_TYPE(int32_t) \
- TEST_TYPE2(int64_t)
+ TEST_TYPE2(int64_t) \
+ TEST_TYPE3(_Float16) \
+ TEST_TYPE3(float) \
+ TEST_TYPE3(double) \
TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
new file mode 100644
index 00000000000..9b1c26381d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
@@ -0,0 +1,35 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "abs-template.h"
+
+#include <assert.h>
+
+#define SZ 128
+
+#define RUN(TYPE) \
+ TYPE a##TYPE[SZ]; \
+ for (int i = 0; i < SZ; i++) \
+ { \
+ if (i & 1) \
+ a##TYPE[i] = i - 64; \
+ else \
+ a##TYPE[i] = i; \
+ } \
+ vabs_##TYPE (a##TYPE, a##TYPE, SZ); \
+ for (int i = 0; i < SZ; i++) \
+ { \
+ if (i & 1) \
+ assert (a##TYPE[i] == abs (i - 64)); \
+ else \
+ assert (a##TYPE[i] == i); \
+ }
+
+
+#define RUN_ALL() \
+ RUN(_Float16) \
+
+int main ()
+{
+ RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
new file mode 100644
index 00000000000..8038a87bdc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
@@ -0,0 +1,29 @@
+/* { dg-do run { target { riscv_vector_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+#include <assert.h>
+
+#define SZ 255
+
+#define EPS 1e-5
+
+#define RUN(TYPE) \
+ TYPE a##TYPE[SZ]; \
+ for (int i = 0; i < SZ; i++) \
+ { \
+ a##TYPE[i] = (TYPE)i; \
+ } \
+ vsqrt_##TYPE (a##TYPE, a##TYPE, SZ); \
+ for (int i = 0; i < SZ; i++) \
+ assert (__builtin_fabs (a##TYPE[i] - __builtin_sqrtf((TYPE)i)) < EPS);
+
+#define RUN_ALL() \
+ RUN(float) \
+ RUN(double) \
+
+int main ()
+{
+ RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
new file mode 100644
index 00000000000..96c6f959925
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+ it here instead of in the template directly. */
+TEST_TYPE3(_Float16)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
new file mode 100644
index 00000000000..ea724e9548f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+ it here instead of in the template directly. */
+TEST_TYPE3(_Float16)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
new file mode 100644
index 00000000000..314ea646bec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
@@ -0,0 +1,31 @@
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE) \
+ __attribute__((noipa)) \
+ void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n) \
+ { \
+ for (int i = 0; i < n; i++) \
+ dst[i] = __builtin_sqrtf (a[i]); \
+ }
+
+#define TEST_TYPE2(TYPE) \
+ __attribute__((noipa)) \
+ void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n) \
+ { \
+ for (int i = 0; i < n; i++) \
+ dst[i] = __builtin_sqrt (a[i]); \
+ }
+
+#define TEST_TYPE3(TYPE) \
+ __attribute__((noipa)) \
+ void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n) \
+ { \
+ for (int i = 0; i < n; i++) \
+ dst[i] = __builtin_sqrtf16 (a[i]); \
+ }
+
+#define TEST_ALL() \
+ TEST_TYPE(float) \
+ TEST_TYPE2(double) \
+
+TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
new file mode 100644
index 00000000000..655bc1c42dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
@@ -0,0 +1,32 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+ it here instead of in the template directly. */
+TEST_TYPE3(_Float16)
+
+#include <assert.h>
+
+#define SZ 255
+
+#define EPS 1e-5
+
+#define RUN(TYPE) \
+ TYPE a##TYPE[SZ]; \
+ for (int i = 0; i < SZ; i++) \
+ { \
+ a##TYPE[i] = (TYPE)i; \
+ } \
+ vsqrt_##TYPE (a##TYPE, a##TYPE, SZ); \
+ for (int i = 0; i < SZ; i++) \
+ assert (__builtin_fabs (a##TYPE[i] - __builtin_sqrtf((TYPE)i)) < EPS);
+
+#define RUN_ALL() \
+ RUN(_Float16) \
+
+int main ()
+{
+ RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
index 98c7f30ec56..4805538f252 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
@@ -1,5 +1,5 @@
/* { dg-do run { target { riscv_vector_hw } } } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "vneg-template.h"
@@ -21,7 +21,9 @@
RUN(int8_t) \
RUN(int16_t) \
RUN(int32_t) \
- RUN(int64_t)
+ RUN(int64_t) \
+ RUN(float) \
+ RUN(double) \
int main ()
{
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
index 69d9ebb0953..4a9ceb5faf2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "vneg-template.h"
/* { dg-final { scan-assembler-times {\tvneg\.v} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
index d2c2e17c13e..2c5e2bd2a0b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "vneg-template.h"
/* { dg-final { scan-assembler-times {\tvneg\.v} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
index 93e690f3cec..892d9d72c38 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
@@ -13,6 +13,9 @@
TEST_TYPE(int8_t) \
TEST_TYPE(int16_t) \
TEST_TYPE(int32_t) \
- TEST_TYPE(int64_t)
+ TEST_TYPE(int64_t) \
+ TEST_TYPE(_Float16) \
+ TEST_TYPE(float) \
+ TEST_TYPE(double) \
TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
new file mode 100644
index 00000000000..e9de7a003c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
@@ -0,0 +1,26 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vneg-template.h"
+
+#include <assert.h>
+
+#define SZ 255
+
+#define RUN(TYPE) \
+ TYPE a##TYPE[SZ]; \
+ for (int i = 0; i < SZ; i++) \
+ { \
+ a##TYPE[i] = i - 127; \
+ } \
+ vneg_##TYPE (a##TYPE, a##TYPE, SZ); \
+ for (int i = 0; i < SZ; i++) \
+ assert (a##TYPE[i] == -(i - 127));
+
+#define RUN_ALL() \
+ RUN(_Float16) \
+
+int main ()
+{
+ RUN_ALL()
+}
--
2.40.1
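A side note on the EPS value used by vfsqrt-zvfh-run.c above. The _Float16 results are compared against __builtin_sqrtf with EPS set to 1e-5, but binary16 values in [1, 2) are spaced 2^-10 (about 9.8e-4) apart. The following host-side sketch is not part of the patch, and its helper names are made up for illustration. It counts how many binary16 values in [1, 2) lie within a given tolerance of sqrt(n), using squaring only so that no libm call is needed. For n = 2 it finds none within 1e-5, which suggests the tolerance may be too tight for the half-precision run when the reference is computed in single precision.

```c
#include <assert.h>
#include <stdbool.h>

/* True when |h - sqrt(n)| < eps.  Checked by squaring the interval
   bounds instead of calling sqrt; assumes h > eps > 0 and n > 0.  */
bool
close_to_sqrt (double h, double n, double eps)
{
  double lo = h - eps;
  double hi = h + eps;
  return lo * lo < n && n < hi * hi;
}

/* Count the binary16 values in [1, 2) that lie within eps of sqrt(n).
   binary16 has a 10-bit fraction, so these values are 1 + k/1024.  */
int
count_close_binary16 (double n, double eps)
{
  int count = 0;
  for (int k = 0; k < 1024; k++)
    if (close_to_sqrt (1.0 + k / 1024.0, n, eps))
      count++;
  return count;
}

/* Illustrative checks: sqrt(2) has no binary16 neighbour within 1e-5,
   while sqrt(2.25) = 1.5 is exactly representable.  */
void
check_eps (void)
{
  assert (count_close_binary16 (2.0, 1e-5) == 0);
  assert (count_close_binary16 (2.25, 1e-5) == 1);
}
```

Calling check_eps () from a test main exercises both cases. A tolerance closer to the binary16 spacing, or a half-precision reference value, would be one way to address this; which one fits the testsuite better is left open here.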
Hi Juzhe,
I like the iterator solution better, I added it to the
binops V2 patch with a comment and will post it in a while.
Also realized there is already a testcase and the "enabled"
attribute is set properly now but I hadn't rebased to the
current master branch in a while...
Btw. I'm currently running the testsuite with rv64gcv_zfhmin
default march and see some additional FAILs. Will report back.
Regards
Robin
> Btw. I'm currently running the testsuite with rv64gcv_zfhmin
> default march and see some additional FAILs. Will report back.
Reporting back - the FAILs are a combination of an older qemu
version and not fully comprehensive target selectors. I'm going
to send a V2 for the testsuite patch as well.
Regards
Robin
On 6/14/23 15:15, 钟居哲 wrote:
> Hi, Jeff. Thanks for quick approval.
>
> When I reviewed the patch:
> (define_expand "<optab><mode>2"
> [(set (match_operand:VF 0 "register_operand")
> (any_float_unop_nofrm:VF
> (match_operand:VF 1 "register_operand")))]
> "TARGET_VECTOR"
> {
> insn_code icode = code_for_pred (<CODE>, <MODE>mode);
> riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
> DONE;
> })
>
> There could be an issue here with FP16 vectors.
> Let's look at the VF iterator:
> (define_mode_iterator VF [
> (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
> (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
> (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
> (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
> (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
> (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
> (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
> ....
>
> You can see that for all FP16 modes we use the predicate "TARGET_VECTOR_ELEN_FP_16",
> which is true when either TARGET_ZVFH or TARGET_ZVFHMIN is set.
> We do that because most floating-point instruction patterns share the
> same iterators, so we cannot simply add TARGET_ZVFHMIN or TARGET_ZVFH
> to the iterator. Some instruction patterns that use VF, for example vle16.v,
> should be enabled as long as TARGET_ZVFHMIN holds, whereas
> instructions like vfneg.v need TARGET_ZVFH.
>
> So I ran an experiment:
> void
> f (_Float16 *restrict a, _Float16 *restrict b)
> {
> for (int i = 0; i < 100; ++i)
> {
> a[i] = -b[i];
> }
> }
>
> with compile options:
> -march=rv64gcv_zvfhmin --param=riscv-autovec-preference=fixed-vlmax -O3
>
> An ICE happens:
> auto.c:26:1: error: unable to generate reloads for:
> (insn 8 7 9 2 (set (reg:VNx8HF 186 [ vect__6.7 ])
> (if_then_else:VNx8HF (unspec:VNx8BI [
> (const_vector:VNx8BI [
> (const_int 1 [0x1]) repeated x8
> ])
> (const_int 8 [0x8])
> (const_int 2 [0x2]) repeated x2
> (const_int 0 [0])
> (reg:SI 66 vl)
> (reg:SI 67 vtype)
> ] UNSPEC_VPREDICATE)
> (neg:VNx8HF (reg:VNx8HF 134 [ vect__4.6 ]))
> (unspec:VNx8HF [
> (reg:SI 0 zero)
> ] UNSPEC_VUNDEF))) "auto.c":24:14 6631 {pred_negvnx8hf}
> (expr_list:REG_DEAD (reg:VNx8HF 134 [ vect__4.6 ])
> (nil)))
>
> The reason for the ICE is that the VF iterator enables the auto-vectorization
> pattern for vfneg.v under TARGET_ZVFHMIN, but the instruction pattern for
> vfneg.v is correctly disabled unless TARGET_ZVFH is set, because each
> RVV instruction pattern carries this attribute:
> (define_attr "fp_vector_disabled" "no,yes"
> (cond [
> (and (eq_attr "type" "vfmov,vfalu,vfmul,vfdiv,
> vfwalu,vfwmul,vfmuladd,vfwmuladd,
> vfsqrt,vfrecp,vfminmax,vfsgnj,vfcmp,
> vfclass,vfmerge,
> vfncvtitof,vfwcvtftoi,vfcvtftoi,vfcvtitof,
> vfredo,vfredu,vfwredo,vfwredu,
> vfslide1up,vfslide1down")
> (and (eq_attr "mode"
> "VNx1HF,VNx2HF,VNx4HF,VNx8HF,VNx16HF,VNx32HF,VNx64HF")
> (match_test "!TARGET_ZVFH")))
> (const_string "yes")
>
> ;; The mode records as QI for the FP16 <=> INT8 instruction.
> (and (eq_attr "type" "vfncvtftoi,vfwcvtitof")
> (and (eq_attr "mode"
> "VNx1QI,VNx2QI,VNx4QI,VNx8QI,VNx16QI,VNx32QI,VNx64QI")
> (match_test "!TARGET_ZVFH")))
> (const_string "yes")
> ]
> (const_string "no")))
>
> When I slightly change the pattern as follows:
> (define_expand "<optab><mode>2"
> [(set (match_operand:VF 0 "register_operand")
> (any_float_unop_nofrm:VF
> (match_operand:VF 1 "register_operand")))]
> "TARGET_VECTOR && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)"
> {
> insn_code icode = code_for_pred (<CODE>, <MODE>mode);
> riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
> DONE;
> })
>
> That is, add && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
> to the condition.
>
> This works for both TARGET_ZVFH and TARGET_ZVFHMIN.
> -march=rv64gcv_zvfhmin:
> f:
> li a4,2147450880
> li a5,-2147450880
> addi a4,a4,-1
> addi a5,a5,1
> slli a3,a5,32
> slli a2,a4,32
> mv a5,a4
> li a4,-2147450880
> addi a6,a1,200
> add a3,a3,a4
> add a2,a2,a5
> .L2:
> ld a5,0(a1)
> addi a0,a0,8
> addi a1,a1,8
> not a4,a5
> and a5,a5,a2
> and a4,a4,a3
> sub a5,a3,a5
> xor a5,a4,a5
> sd a5,-8(a0)
> bne a1,a6,.L2
> ret
>
> -march=rv64gcv_zvfh:
> f:
> vsetivli zero,8,e16,m1,ta,ma
> addi a4,a1,16
> addi a5,a0,16
> vle16.v v1,0(a1)
> vfneg.v v1,v1
> vse16.v v1,0(a0)
> addi a2,a1,32
> addi a3,a0,32
> vle16.v v1,0(a4)
> vfneg.v v1,v1
> vse16.v v1,0(a5)
> addi a4,a1,48
> addi a5,a0,48
> vle16.v v1,0(a2)
> vfneg.v v1,v1
> vse16.v v1,0(a3)
> addi a2,a1,64
> addi a3,a0,64
> vle16.v v1,0(a4)
> vfneg.v v1,v1
> vse16.v v1,0(a5)
> addi a4,a1,80
> addi a5,a0,80
> vle16.v v1,0(a2)
> vfneg.v v1,v1
> vse16.v v1,0(a3)
> ....
>
>
> This is what we expect: TARGET_ZVFH enables auto-vectorization, whereas
> TARGET_ZVFHMIN does not, since vfneg.v is not available
> under TARGET_ZVFHMIN.
>
> However, I think adding !(GET_MODE_INNER (<MODE>mode) == HFmode &&
> !TARGET_ZVFH)
> is an ugly implementation that is hard to maintain, since we would need
> to add this condition to every floating-point pattern.
>
> So, give me some time to figure out an elegant way to support
> auto-vectorization.
Sigh. There are days when I look at how the ISA is managed and I don't
know whether to cry or scream.
Thanks for the detailed explanation, you're absolutely correct that we
need to be cognizant of the pitfalls of how the iterators interact with ZVFH
and ZVFHMIN.
jeff
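To make the mismatch described in the review concrete, here is a small host-side model in plain C. It is only a sketch: the struct, enum and function names are hypothetical stand-ins, not GCC's real macros or internals, and it assumes SF/DF modes are always available. It models the expander condition as posted ("TARGET_VECTOR" plus the VF iterator predicate), the instruction-side gating described by fp_vector_disabled, and the expander condition with the extra HFmode clause from the review.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target flags; not GCC's real macros.  */
struct target { bool vector, zvfh, zvfhmin; };

enum inner_mode { INNER_HF, INNER_SF, INNER_DF };

/* Models TARGET_VECTOR_ELEN_FP_16 as described in the review:
   true for either Zvfh or Zvfhmin.  */
bool
elen_fp_16 (struct target t)
{
  return t.zvfh || t.zvfhmin;
}

/* Expander condition as posted: the VF iterator offers the mode and
   TARGET_VECTOR holds.  Assumption: SF/DF modes are always offered.  */
bool
expander_as_posted (struct target t, enum inner_mode m)
{
  return t.vector && (m != INNER_HF || elen_fp_16 (t));
}

/* Instruction-side gating, modelled after fp_vector_disabled:
   FP16 arithmetic patterns are disabled without Zvfh.  */
bool
insn_enabled (struct target t, enum inner_mode m)
{
  return t.vector && !(m == INNER_HF && !t.zvfh);
}

/* Expander condition with the extra clause suggested in the review.  */
bool
expander_with_guard (struct target t, enum inner_mode m)
{
  return expander_as_posted (t, m) && !(m == INNER_HF && !t.zvfh);
}

/* The problematic state: the expander fires but no insn can match.  */
bool
mismatch (bool expander_on, bool insn_on)
{
  return expander_on && !insn_on;
}

/* Illustrative check of the rv64gcv_zvfhmin case from the review.  */
void
check_zvfhmin_case (void)
{
  struct target zvfhmin = { true, false, true };
  assert (mismatch (expander_as_posted (zvfhmin, INNER_HF),
                    insn_enabled (zvfhmin, INNER_HF)));
  assert (!mismatch (expander_with_guard (zvfhmin, INNER_HF),
                     insn_enabled (zvfhmin, INNER_HF)));
}
```

In this model the posted condition leaves the HF expander enabled under Zvfhmin while the instruction is disabled, which corresponds to the reload failure shown above. The guarded condition keeps the two in agreement, at the cost of repeating the clause per pattern, which is the maintenance concern raised in the review.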