[v2] RISC-V: Implement vec_set and vec_extract.
Commit Message
Hi,
with the recent changes that make us also pass the return value via
the stack, this can go forward now.
Changes in V2:
- Remove redundant force_reg.
- Change target selectors to those introduced in the binop patch.
Regards
Robin
This implements the vec_set and vec_extract patterns for integer and
floating-point data types.  For vec_set we broadcast the value to be
inserted into a vector register and then perform a vslideup to the
requested index with an effective length of 1, i.e. VL = index + 1.
vec_extract is done by sliding the requested element down to index 0
and moving it to a scalar register via v(f)mv.[xf].s.
The patch does not include vector-vector extraction, which
will be done at a later time.
gcc/ChangeLog:
* config/riscv/autovec.md (vec_set<mode>): Implement.
(vec_extract<mode><vel>): Implement.
* config/riscv/riscv-protos.h (enum insn_type): Add slide insn.
(emit_vlmax_slide_insn): Declare.
(emit_nonvlmax_slide_tu_insn): Declare.
(emit_scalar_move_insn): Export.
(emit_nonvlmax_integer_move_insn): Export.
* config/riscv/riscv-v.cc (emit_vlmax_slide_insn): New function.
(emit_nonvlmax_slide_tu_insn): New function.
(emit_scalar_move_insn): Remove static.
(emit_nonvlmax_integer_move_insn): Remove static.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c:
New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c:
New test.
---
gcc/config/riscv/autovec.md | 79 ++++++
gcc/config/riscv/riscv-protos.h | 5 +
gcc/config/riscv/riscv-v.cc | 50 +++-
.../rvv/autovec/vls-vlmax/vec_extract-1.c | 57 +++++
.../rvv/autovec/vls-vlmax/vec_extract-2.c | 68 +++++
.../rvv/autovec/vls-vlmax/vec_extract-3.c | 69 +++++
.../rvv/autovec/vls-vlmax/vec_extract-4.c | 72 ++++++
.../rvv/autovec/vls-vlmax/vec_extract-run.c | 239 +++++++++++++++++
.../autovec/vls-vlmax/vec_extract-zvfh-run.c | 77 ++++++
.../riscv/rvv/autovec/vls-vlmax/vec_set-1.c | 62 +++++
.../riscv/rvv/autovec/vls-vlmax/vec_set-2.c | 74 ++++++
.../riscv/rvv/autovec/vls-vlmax/vec_set-3.c | 76 ++++++
.../riscv/rvv/autovec/vls-vlmax/vec_set-4.c | 79 ++++++
.../riscv/rvv/autovec/vls-vlmax/vec_set-run.c | 240 ++++++++++++++++++
.../rvv/autovec/vls-vlmax/vec_set-zvfh-run.c | 78 ++++++
15 files changed, 1323 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c
Comments
LGTM
juzhe.zhong@rivai.ai
From: Robin Dapp
Date: 2023-06-16 21:41
To: Jeff Law; gcc-patches; palmer; Kito Cheng; juzhe.zhong@rivai.ai
CC: rdapp.gcc
Subject: [PATCH v2] RISC-V: Implement vec_set and vec_extract.
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index c23a625afe1..9569b420d45 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -655,3 +655,82 @@ (define_expand "select_vl<mode>"
riscv_vector::expand_select_vl (operands);
DONE;
})
+
+;; -------------------------------------------------------------------------
+;; ---- [INT,FP] Insert a vector element.
+;; -------------------------------------------------------------------------
+
+(define_expand "vec_set<mode>"
+ [(match_operand:V 0 "register_operand")
+ (match_operand:<VEL> 1 "register_operand")
+ (match_operand 2 "immediate_operand")]
+ "TARGET_VECTOR"
+{
+ /* If we set the first element, emit a v(f)mv.s.[xf]. */
+ if (operands[2] == const0_rtx)
+ {
+ rtx ops[] = {operands[0], riscv_vector::gen_scalar_move_mask (<VM>mode),
+ RVV_VUNDEF (<MODE>mode), operands[1]};
+ riscv_vector::emit_scalar_move_insn
+ (code_for_pred_broadcast (<MODE>mode), ops);
+ }
+ else
+ {
+ /* Move the desired value into a vector register and insert
+ it at the proper position using vslideup with an
+ "effective length" of 1 i.e. a VL 1 past the offset. */
+
+ /* Slide offset = element index. */
+ int offset = INTVAL (operands[2]);
+
+ /* Only insert one element, i.e. VL = offset + 1. */
+ rtx length = gen_reg_rtx (Pmode);
+ emit_move_insn (length, GEN_INT (offset + 1));
+
+ /* Move operands[1] into a vector register via vmv.v.x using the same
+ VL we need for the slide. */
+ rtx tmp = gen_reg_rtx (<MODE>mode);
+ rtx ops1[] = {tmp, operands[1]};
+ riscv_vector::emit_nonvlmax_integer_move_insn
+ (code_for_pred_broadcast (<MODE>mode), ops1, length);
+
+ /* Slide exactly one element up leaving the tail elements
+ unchanged. */
+ rtx ops2[] = {operands[0], operands[0], tmp, operands[2]};
+ riscv_vector::emit_nonvlmax_slide_tu_insn
+ (code_for_pred_slide (UNSPEC_VSLIDEUP, <MODE>mode), ops2, length);
+ }
+ DONE;
+})
+
+;; -------------------------------------------------------------------------
+;; ---- [INT,FP] Extract a vector element.
+;; -------------------------------------------------------------------------
+(define_expand "vec_extract<mode><vel>"
+ [(set (match_operand:<VEL> 0 "register_operand")
+ (vec_select:<VEL>
+ (match_operand:V 1 "register_operand")
+ (parallel
+ [(match_operand 2 "nonmemory_operand")])))]
+ "TARGET_VECTOR"
+{
+ /* Element extraction can be done by sliding down the requested element
+ to index 0 and then v(f)mv.[xf].s it to a scalar register. */
+
+ /* When extracting any other than the first element we need to slide
+ it down. */
+ rtx tmp = NULL_RTX;
+ if (operands[2] != const0_rtx)
+ {
+ /* Emit the slide down to index 0 in a new vector. */
+ tmp = gen_reg_rtx (<MODE>mode);
+ rtx ops[] = {tmp, RVV_VUNDEF (<MODE>mode), operands[1], operands[2]};
+ riscv_vector::emit_vlmax_slide_insn
+ (code_for_pred_slide (UNSPEC_VSLIDEDOWN, <MODE>mode), ops);
+ }
+
+ /* Emit v(f)mv.[xf].s. */
+ emit_insn (gen_pred_extract_first (<MODE>mode, operands[0],
+ tmp ? tmp : operands[1]));
+ DONE;
+})
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b23a9c12465..f422adf8521 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -146,6 +146,7 @@ enum insn_type
RVV_TERNOP = 5,
RVV_WIDEN_TERNOP = 4,
RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md. */
+ RVV_SLIDE_OP = 4, /* Dest, VUNDEF, source and offset. */
};
enum vlmul_type
{
@@ -186,10 +187,14 @@ void emit_hard_vlmax_vsetvl (machine_mode, rtx);
void emit_vlmax_insn (unsigned, int, rtx *, rtx = 0);
void emit_vlmax_ternary_insn (unsigned, int, rtx *, rtx = 0);
void emit_nonvlmax_insn (unsigned, int, rtx *, rtx);
+void emit_vlmax_slide_insn (unsigned, rtx *);
+void emit_nonvlmax_slide_tu_insn (unsigned, rtx *, rtx);
void emit_vlmax_merge_insn (unsigned, int, rtx *);
void emit_vlmax_cmp_insn (unsigned, rtx *);
void emit_vlmax_cmp_mu_insn (unsigned, rtx *);
void emit_vlmax_masked_mu_insn (unsigned, int, rtx *);
+void emit_scalar_move_insn (unsigned, rtx *);
+void emit_nonvlmax_integer_move_insn (unsigned, rtx *, rtx);
enum vlmul_type get_vlmul (machine_mode);
unsigned int get_ratio (machine_mode);
unsigned int get_nf (machine_mode);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index f9dded6e8c0..1c86cfbdcee 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -695,6 +695,52 @@ emit_nonvlmax_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
e.emit_insn ((enum insn_code) icode, ops);
}
+/* This function emits a {VLMAX, TAIL_ANY, MASK_ANY} vsetvli
+ followed by a vslide insn (with real merge operand). */
+void
+emit_vlmax_slide_insn (unsigned icode, rtx *ops)
+{
+ machine_mode dest_mode = GET_MODE (ops[0]);
+ machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+ insn_expander<RVV_INSN_OPERANDS_MAX> e (RVV_SLIDE_OP,
+ /* HAS_DEST_P */ true,
+ /* FULLY_UNMASKED_P */ true,
+ /* USE_REAL_MERGE_P */ true,
+ /* HAS_AVL_P */ true,
+ /* VLMAX_P */ true,
+ dest_mode,
+ mask_mode);
+
+ e.set_policy (TAIL_ANY);
+ e.set_policy (MASK_ANY);
+
+ e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a {NONVLMAX, TAIL_UNDISTURBED, MASK_ANY} vsetvli
+ followed by a vslide insn (with real merge operand). */
+void
+emit_nonvlmax_slide_tu_insn (unsigned icode, rtx *ops, rtx avl)
+{
+ machine_mode dest_mode = GET_MODE (ops[0]);
+ machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+ insn_expander<RVV_INSN_OPERANDS_MAX> e (RVV_SLIDE_OP,
+ /* HAS_DEST_P */ true,
+ /* FULLY_UNMASKED_P */ true,
+ /* USE_REAL_MERGE_P */ true,
+ /* HAS_AVL_P */ true,
+ /* VLMAX_P */ false,
+ dest_mode,
+ mask_mode);
+
+ e.set_policy (TAIL_UNDISTURBED);
+ e.set_policy (MASK_ANY);
+ e.set_vl (avl);
+
+ e.emit_insn ((enum insn_code) icode, ops);
+}
+
+
/* This function emits merge instruction. */
void
emit_vlmax_merge_insn (unsigned icode, int op_num, rtx *ops)
@@ -768,7 +814,7 @@ emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
/* Emit vmv.s.x instruction. */
-static void
+void
emit_scalar_move_insn (unsigned icode, rtx *ops)
{
machine_mode dest_mode = GET_MODE (ops[0]);
@@ -798,7 +844,7 @@ emit_vlmax_integer_move_insn (unsigned icode, rtx *ops, rtx vl)
/* Emit vmv.v.x instruction with nonvlmax. */
-static void
+void
emit_nonvlmax_integer_move_insn (unsigned icode, rtx *ops, rtx avl)
{
emit_nonvlmax_insn (icode, riscv_vector::RVV_UNOP, ops, avl);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c
new file mode 100644
index 00000000000..bda5843e8e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c
@@ -0,0 +1,57 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx2di __attribute__((vector_size (16)));
+typedef int32_t vnx4si __attribute__((vector_size (16)));
+typedef int16_t vnx8hi __attribute__((vector_size (16)));
+typedef int8_t vnx16qi __attribute__((vector_size (16)));
+typedef _Float16 vnx8hf __attribute__((vector_size (16)));
+typedef float vnx4sf __attribute__((vector_size (16)));
+typedef double vnx2df __attribute__((vector_size (16)));
+
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL1(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+
+TEST_ALL1 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*ta,\s*ma} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 14 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 0 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 8 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 13 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c
new file mode 100644
index 00000000000..43aa15c7ddb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c
@@ -0,0 +1,68 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx4di __attribute__((vector_size (32)));
+typedef int32_t vnx8si __attribute__((vector_size (32)));
+typedef int16_t vnx16hi __attribute__((vector_size (32)));
+typedef int8_t vnx32qi __attribute__((vector_size (32)));
+typedef _Float16 vnx16hf __attribute__((vector_size (32)));
+typedef float vnx8sf __attribute__((vector_size (32)));
+typedef double vnx4df __attribute__((vector_size (32)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL2(T) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+
+TEST_ALL2 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*ta,\s*ma} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 26 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 0 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 14 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 19 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c
new file mode 100644
index 00000000000..da26ed9715f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx8di __attribute__((vector_size (64)));
+typedef int32_t vnx16si __attribute__((vector_size (64)));
+typedef int16_t vnx32hi __attribute__((vector_size (64)));
+typedef int8_t vnx64qi __attribute__((vector_size (64)));
+typedef _Float16 vnx32hf __attribute__((vector_size (64)));
+typedef float vnx16sf __attribute__((vector_size (64)));
+typedef double vnx8df __attribute__((vector_size (64)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL3(T) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+
+TEST_ALL3 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*ta,\s*ma} 11 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*ta,\s*ma} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 25 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 15 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 19 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c
new file mode 100644
index 00000000000..0d7c0e16586
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c
@@ -0,0 +1,72 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx16di __attribute__((vector_size (128)));
+typedef int32_t vnx32si __attribute__((vector_size (128)));
+typedef int16_t vnx64hi __attribute__((vector_size (128)));
+typedef int8_t vnx128qi __attribute__((vector_size (128)));
+typedef _Float16 vnx64hf __attribute__((vector_size (128)));
+typedef float vnx32sf __attribute__((vector_size (128)));
+typedef double vnx16df __attribute__((vector_size (128)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL4(T) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+TEST_ALL4 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*ta,\s*ma} 13 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*ta,\s*ma} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 23 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 7 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 17 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 20 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c
new file mode 100644
index 00000000000..82bf6d674ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c
@@ -0,0 +1,239 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -march=rv64gcv -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_extract-1.c"
+#include "vec_extract-2.c"
+#include "vec_extract-3.c"
+#include "vec_extract-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ S res = vec_extract_##V##_##IDX (v); \
+ assert (res == v[IDX]); \
+ }
+
+#define CHECK_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c
new file mode 100644
index 00000000000..a0b2cf97afe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c
@@ -0,0 +1,77 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_extract-1.c"
+#include "vec_extract-2.c"
+#include "vec_extract-3.c"
+#include "vec_extract-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ S res = vec_extract_##V##_##IDX (v); \
+ assert (res == v[IDX]); \
+ }
+
+#define CHECK_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c
new file mode 100644
index 00000000000..4fb4e822b93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx2di __attribute__((vector_size (16)));
+typedef int32_t vnx4si __attribute__((vector_size (16)));
+typedef int16_t vnx8hi __attribute__((vector_size (16)));
+typedef int8_t vnx16qi __attribute__((vector_size (16)));
+typedef _Float16 vnx8hf __attribute__((vector_size (16)));
+typedef float vnx4sf __attribute__((vector_size (16)));
+typedef double vnx2df __attribute__((vector_size (16)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL1(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+
+TEST_ALL1 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*tu,\s*ma} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 9 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 5 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 14 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c
new file mode 100644
index 00000000000..379e92f30bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c
@@ -0,0 +1,74 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx4di __attribute__((vector_size (32)));
+typedef int32_t vnx8si __attribute__((vector_size (32)));
+typedef int16_t vnx16hi __attribute__((vector_size (32)));
+typedef int8_t vnx32qi __attribute__((vector_size (32)));
+typedef _Float16 vnx16hf __attribute__((vector_size (32)));
+typedef float vnx8sf __attribute__((vector_size (32)));
+typedef double vnx4df __attribute__((vector_size (32)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL2(T) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+
+TEST_ALL2 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*tu,\s*ma} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 15 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 11 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 26 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c
new file mode 100644
index 00000000000..b1e78150b30
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c
@@ -0,0 +1,76 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx8di __attribute__((vector_size (64)));
+typedef int32_t vnx16si __attribute__((vector_size (64)));
+typedef int16_t vnx32hi __attribute__((vector_size (64)));
+typedef int8_t vnx64qi __attribute__((vector_size (64)));
+typedef _Float16 vnx32hf __attribute__((vector_size (64)));
+typedef float vnx16sf __attribute__((vector_size (64)));
+typedef double vnx8df __attribute__((vector_size (64)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL3(T) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+
+TEST_ALL3 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*tu,\s*ma} 9 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*tu,\s*ma} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 15 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 12 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 25 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vx} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c
new file mode 100644
index 00000000000..0b7f53d1cf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx16di __attribute__((vector_size (128)));
+typedef int32_t vnx32si __attribute__((vector_size (128)));
+typedef int16_t vnx64hi __attribute__((vector_size (128)));
+typedef int8_t vnx128qi __attribute__((vector_size (128)));
+typedef _Float16 vnx64hf __attribute__((vector_size (128)));
+typedef float vnx32sf __attribute__((vector_size (128)));
+typedef double vnx16df __attribute__((vector_size (128)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL4(T) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+TEST_ALL4 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*tu,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*tu,\s*ma} 11 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*tu,\s*ma} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 16 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 14 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 23 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vx} 7 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c
new file mode 100644
index 00000000000..7e5e0e69d51
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c
@@ -0,0 +1,240 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -march=rv64gcv -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_set-1.c"
+#include "vec_set-2.c"
+#include "vec_set-3.c"
+#include "vec_set-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ V res = vec_set_##V##_##IDX (v, 77); \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ assert (res[i] == (i == IDX ? 77 : i)); \
+ }
+
+#define CHECK_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c
new file mode 100644
index 00000000000..bf514f9426b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c
@@ -0,0 +1,78 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_set-1.c"
+#include "vec_set-2.c"
+#include "vec_set-3.c"
+#include "vec_set-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ V res = vec_set_##V##_##IDX (v, 77); \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ assert (res[i] == (i == IDX ? 77 : i)); \
+ }
+
+#define CHECK_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}
--
2.40.1
On 6/16/23 07:55, 钟居哲 wrote:
> LGTM
OK for the trunk. Sorry for the delays.
jeff
@@ -655,3 +655,82 @@ (define_expand "select_vl<mode>"
riscv_vector::expand_select_vl (operands);
DONE;
})
+
+;; -------------------------------------------------------------------------
+;; ---- [INT,FP] Insert a vector element.
+;; -------------------------------------------------------------------------
+
+(define_expand "vec_set<mode>"
+ [(match_operand:V 0 "register_operand")
+ (match_operand:<VEL> 1 "register_operand")
+ (match_operand 2 "immediate_operand")]
+ "TARGET_VECTOR"
+{
+ /* If we set the first element, emit a v(f)mv.s.[xf]. */
+ if (operands[2] == const0_rtx)
+ {
+ rtx ops[] = {operands[0], riscv_vector::gen_scalar_move_mask (<VM>mode),
+ RVV_VUNDEF (<MODE>mode), operands[1]};
+ riscv_vector::emit_scalar_move_insn
+ (code_for_pred_broadcast (<MODE>mode), ops);
+ }
+ else
+ {
+ /* Move the desired value into a vector register and insert
+ it at the proper position using vslideup with an
+ "effective length" of 1, i.e. a VL one past the index. */
+
+ /* Slide offset = element index. */
+ int offset = INTVAL (operands[2]);
+
+ /* Only insert one element, i.e. VL = offset + 1. */
+ rtx length = gen_reg_rtx (Pmode);
+ emit_move_insn (length, GEN_INT (offset + 1));
+
+ /* Move operands[1] into a vector register via vmv.v.x using the same
+ VL we need for the slide. */
+ rtx tmp = gen_reg_rtx (<MODE>mode);
+ rtx ops1[] = {tmp, operands[1]};
+ riscv_vector::emit_nonvlmax_integer_move_insn
+ (code_for_pred_broadcast (<MODE>mode), ops1, length);
+
+ /* Slide exactly one element up leaving the tail elements
+ unchanged. */
+ rtx ops2[] = {operands[0], operands[0], tmp, operands[2]};
+ riscv_vector::emit_nonvlmax_slide_tu_insn
+ (code_for_pred_slide (UNSPEC_VSLIDEUP, <MODE>mode), ops2, length);
+ }
+ DONE;
+})
+
+;; -------------------------------------------------------------------------
+;; ---- [INT,FP] Extract a vector element.
+;; -------------------------------------------------------------------------
+(define_expand "vec_extract<mode><vel>"
+ [(set (match_operand:<VEL> 0 "register_operand")
+ (vec_select:<VEL>
+ (match_operand:V 1 "register_operand")
+ (parallel
+ [(match_operand 2 "nonmemory_operand")])))]
+ "TARGET_VECTOR"
+{
+ /* Element extraction can be done by sliding down the requested element
+ to index 0 and using v(f)mv.[xf].s to move it to a scalar register. */
+
+ /* When extracting any other than the first element we need to slide
+ it down. */
+ rtx tmp = NULL_RTX;
+ if (operands[2] != const0_rtx)
+ {
+ /* Emit the slide down to index 0 in a new vector. */
+ tmp = gen_reg_rtx (<MODE>mode);
+ rtx ops[] = {tmp, RVV_VUNDEF (<MODE>mode), operands[1], operands[2]};
+ riscv_vector::emit_vlmax_slide_insn
+ (code_for_pred_slide (UNSPEC_VSLIDEDOWN, <MODE>mode), ops);
+ }
+
+ /* Emit v(f)mv.[xf].s. */
+ emit_insn (gen_pred_extract_first (<MODE>mode, operands[0],
+ tmp ? tmp : operands[1]));
+ DONE;
+})
@@ -146,6 +146,7 @@ enum insn_type
RVV_TERNOP = 5,
RVV_WIDEN_TERNOP = 4,
RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md. */
+ RVV_SLIDE_OP = 4, /* Dest, VUNDEF, source and offset. */
};
enum vlmul_type
{
@@ -186,10 +187,14 @@ void emit_hard_vlmax_vsetvl (machine_mode, rtx);
void emit_vlmax_insn (unsigned, int, rtx *, rtx = 0);
void emit_vlmax_ternary_insn (unsigned, int, rtx *, rtx = 0);
void emit_nonvlmax_insn (unsigned, int, rtx *, rtx);
+void emit_vlmax_slide_insn (unsigned, rtx *);
+void emit_nonvlmax_slide_tu_insn (unsigned, rtx *, rtx);
void emit_vlmax_merge_insn (unsigned, int, rtx *);
void emit_vlmax_cmp_insn (unsigned, rtx *);
void emit_vlmax_cmp_mu_insn (unsigned, rtx *);
void emit_vlmax_masked_mu_insn (unsigned, int, rtx *);
+void emit_scalar_move_insn (unsigned, rtx *);
+void emit_nonvlmax_integer_move_insn (unsigned, rtx *, rtx);
enum vlmul_type get_vlmul (machine_mode);
unsigned int get_ratio (machine_mode);
unsigned int get_nf (machine_mode);
@@ -695,6 +695,52 @@ emit_nonvlmax_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
e.emit_insn ((enum insn_code) icode, ops);
}
+/* This function emits a {VLMAX, TAIL_ANY, MASK_ANY} vsetvli
+ followed by a vslide insn (with real merge operand). */
+void
+emit_vlmax_slide_insn (unsigned icode, rtx *ops)
+{
+ machine_mode dest_mode = GET_MODE (ops[0]);
+ machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+ insn_expander<RVV_INSN_OPERANDS_MAX> e (RVV_SLIDE_OP,
+ /* HAS_DEST_P */ true,
+ /* FULLY_UNMASKED_P */ true,
+ /* USE_REAL_MERGE_P */ true,
+ /* HAS_AVL_P */ true,
+ /* VLMAX_P */ true,
+ dest_mode,
+ mask_mode);
+
+ e.set_policy (TAIL_ANY);
+ e.set_policy (MASK_ANY);
+
+ e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a {NONVLMAX, TAIL_UNDISTURBED, MASK_ANY} vsetvli
+ followed by a vslide insn (with real merge operand). */
+void
+emit_nonvlmax_slide_tu_insn (unsigned icode, rtx *ops, rtx avl)
+{
+ machine_mode dest_mode = GET_MODE (ops[0]);
+ machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+ insn_expander<RVV_INSN_OPERANDS_MAX> e (RVV_SLIDE_OP,
+ /* HAS_DEST_P */ true,
+ /* FULLY_UNMASKED_P */ true,
+ /* USE_REAL_MERGE_P */ true,
+ /* HAS_AVL_P */ true,
+ /* VLMAX_P */ false,
+ dest_mode,
+ mask_mode);
+
+ e.set_policy (TAIL_UNDISTURBED);
+ e.set_policy (MASK_ANY);
+ e.set_vl (avl);
+
+ e.emit_insn ((enum insn_code) icode, ops);
+}
+
+
/* This function emits merge instruction. */
void
emit_vlmax_merge_insn (unsigned icode, int op_num, rtx *ops)
@@ -768,7 +814,7 @@ emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
/* Emit vmv.s.x instruction. */
-static void
+void
emit_scalar_move_insn (unsigned icode, rtx *ops)
{
machine_mode dest_mode = GET_MODE (ops[0]);
@@ -798,7 +844,7 @@ emit_vlmax_integer_move_insn (unsigned icode, rtx *ops, rtx vl)
/* Emit vmv.v.x instruction with nonvlmax. */
-static void
+void
emit_nonvlmax_integer_move_insn (unsigned icode, rtx *ops, rtx avl)
{
emit_nonvlmax_insn (icode, riscv_vector::RVV_UNOP, ops, avl);
new file mode 100644
@@ -0,0 +1,57 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx2di __attribute__((vector_size (16)));
+typedef int32_t vnx4si __attribute__((vector_size (16)));
+typedef int16_t vnx8hi __attribute__((vector_size (16)));
+typedef int8_t vnx16qi __attribute__((vector_size (16)));
+typedef _Float16 vnx8hf __attribute__((vector_size (16)));
+typedef float vnx4sf __attribute__((vector_size (16)));
+typedef double vnx2df __attribute__((vector_size (16)));
+
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL1(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+
+TEST_ALL1 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*ta,\s*ma} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 14 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 0 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 8 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 13 } } */
new file mode 100644
@@ -0,0 +1,68 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx4di __attribute__((vector_size (32)));
+typedef int32_t vnx8si __attribute__((vector_size (32)));
+typedef int16_t vnx16hi __attribute__((vector_size (32)));
+typedef int8_t vnx32qi __attribute__((vector_size (32)));
+typedef _Float16 vnx16hf __attribute__((vector_size (32)));
+typedef float vnx8sf __attribute__((vector_size (32)));
+typedef double vnx4df __attribute__((vector_size (32)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL2(T) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+
+TEST_ALL2 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*ta,\s*ma} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 26 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 0 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 14 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 19 } } */
new file mode 100644
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx8di __attribute__((vector_size (64)));
+typedef int32_t vnx16si __attribute__((vector_size (64)));
+typedef int16_t vnx32hi __attribute__((vector_size (64)));
+typedef int8_t vnx64qi __attribute__((vector_size (64)));
+typedef _Float16 vnx32hf __attribute__((vector_size (64)));
+typedef float vnx16sf __attribute__((vector_size (64)));
+typedef double vnx8df __attribute__((vector_size (64)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL3(T) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+
+TEST_ALL3 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*ta,\s*ma} 11 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*ta,\s*ma} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 25 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 15 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 19 } } */
new file mode 100644
@@ -0,0 +1,72 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx16di __attribute__((vector_size (128)));
+typedef int32_t vnx32si __attribute__((vector_size (128)));
+typedef int16_t vnx64hi __attribute__((vector_size (128)));
+typedef int8_t vnx128qi __attribute__((vector_size (128)));
+typedef _Float16 vnx64hf __attribute__((vector_size (128)));
+typedef float vnx32sf __attribute__((vector_size (128)));
+typedef double vnx16df __attribute__((vector_size (128)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define TEST_ALL4(T) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+TEST_ALL4 (VEC_EXTRACT)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*ta,\s*ma} 13 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*ta,\s*ma} 10 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*ta,\s*ma} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 23 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 7 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.f.s} 17 } } */
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 20 } } */
new file mode 100644
@@ -0,0 +1,239 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -march=rv64gcv -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_extract-1.c"
+#include "vec_extract-2.c"
+#include "vec_extract-3.c"
+#include "vec_extract-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ S res = vec_extract_##V##_##IDX (v); \
+ assert (res == v[IDX]); \
+ }
+
+#define CHECK_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}
new file mode 100644
@@ -0,0 +1,77 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_extract-1.c"
+#include "vec_extract-2.c"
+#include "vec_extract-3.c"
+#include "vec_extract-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ S res = vec_extract_##V##_##IDX (v); \
+ assert (res == v[IDX]); \
+ }
+
+#define CHECK_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}
new file mode 100644
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx2di __attribute__((vector_size (16)));
+typedef int32_t vnx4si __attribute__((vector_size (16)));
+typedef int16_t vnx8hi __attribute__((vector_size (16)));
+typedef int8_t vnx16qi __attribute__((vector_size (16)));
+typedef _Float16 vnx8hf __attribute__((vector_size (16)));
+typedef float vnx4sf __attribute__((vector_size (16)));
+typedef double vnx2df __attribute__((vector_size (16)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL1(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+
+TEST_ALL1 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*tu,\s*ma} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 9 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 5 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 14 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
new file mode 100644
@@ -0,0 +1,74 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx4di __attribute__((vector_size (32)));
+typedef int32_t vnx8si __attribute__((vector_size (32)));
+typedef int16_t vnx16hi __attribute__((vector_size (32)));
+typedef int8_t vnx32qi __attribute__((vector_size (32)));
+typedef _Float16 vnx16hf __attribute__((vector_size (32)));
+typedef float vnx8sf __attribute__((vector_size (32)));
+typedef double vnx4df __attribute__((vector_size (32)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL2(T) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+
+TEST_ALL2 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*tu,\s*ma} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 15 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 11 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 26 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
new file mode 100644
@@ -0,0 +1,76 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx8di __attribute__((vector_size (64)));
+typedef int32_t vnx16si __attribute__((vector_size (64)));
+typedef int16_t vnx32hi __attribute__((vector_size (64)));
+typedef int8_t vnx64qi __attribute__((vector_size (64)));
+typedef _Float16 vnx32hf __attribute__((vector_size (64)));
+typedef float vnx16sf __attribute__((vector_size (64)));
+typedef double vnx8df __attribute__((vector_size (64)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL3(T) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+
+TEST_ALL3 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*tu,\s*ma} 9 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*tu,\s*ma} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 15 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 12 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 25 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vx} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
new file mode 100644
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <stdint-gcc.h>
+
+typedef int64_t vnx16di __attribute__((vector_size (128)));
+typedef int32_t vnx32si __attribute__((vector_size (128)));
+typedef int16_t vnx64hi __attribute__((vector_size (128)));
+typedef int8_t vnx128qi __attribute__((vector_size (128)));
+typedef _Float16 vnx64hf __attribute__((vector_size (128)));
+typedef float vnx32sf __attribute__((vector_size (128)));
+typedef double vnx16df __attribute__((vector_size (128)));
+
+#define VEC_SET(S,V,IDX) \
+ V \
+ __attribute__((noipa)) \
+ vec_set_##V##_##IDX (V v, S s) \
+ { \
+ v[IDX] = s; \
+ return v; \
+ }
+
+#define TEST_ALL4(T) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+TEST_ALL4 (VEC_SET)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*tu,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*tu,\s*ma} 11 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*ta,\s*ma} 2 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*tu,\s*ma} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.v.x} 16 } } */
+/* { dg-final { scan-assembler-times {\tvfmv.v.f} 14 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vi} 23 } } */
+/* { dg-final { scan-assembler-times {\tvslideup.vx} 7 } } */
+
+/* { dg-final { scan-assembler-times {\tvfmv.s.f} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmv.s.x} 4 } } */
new file mode 100644
@@ -0,0 +1,240 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -march=rv64gcv -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_set-1.c"
+#include "vec_set-2.c"
+#include "vec_set-3.c"
+#include "vec_set-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ V res = vec_set_##V##_##IDX (v, 77); \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ assert (res[i] == (i == IDX ? 77 : i)); \
+ }
+
+#define CHECK_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (float, vnx4sf, 0) \
+ T (float, vnx4sf, 1) \
+ T (float, vnx4sf, 3) \
+ T (double, vnx2df, 0) \
+ T (double, vnx2df, 1) \
+ T (int64_t, vnx2di, 0) \
+ T (int64_t, vnx2di, 1) \
+ T (int32_t, vnx4si, 0) \
+ T (int32_t, vnx4si, 1) \
+ T (int32_t, vnx4si, 3) \
+ T (int16_t, vnx8hi, 0) \
+ T (int16_t, vnx8hi, 2) \
+ T (int16_t, vnx8hi, 6) \
+ T (int8_t, vnx16qi, 0) \
+ T (int8_t, vnx16qi, 1) \
+ T (int8_t, vnx16qi, 7) \
+ T (int8_t, vnx16qi, 11) \
+ T (int8_t, vnx16qi, 15) \
+ T (float, vnx8sf, 0) \
+ T (float, vnx8sf, 1) \
+ T (float, vnx8sf, 3) \
+ T (float, vnx8sf, 4) \
+ T (float, vnx8sf, 7) \
+ T (double, vnx4df, 0) \
+ T (double, vnx4df, 1) \
+ T (double, vnx4df, 2) \
+ T (double, vnx4df, 3) \
+ T (int64_t, vnx4di, 0) \
+ T (int64_t, vnx4di, 1) \
+ T (int64_t, vnx4di, 2) \
+ T (int64_t, vnx4di, 3) \
+ T (int32_t, vnx8si, 0) \
+ T (int32_t, vnx8si, 1) \
+ T (int32_t, vnx8si, 3) \
+ T (int32_t, vnx8si, 4) \
+ T (int32_t, vnx8si, 7) \
+ T (int16_t, vnx16hi, 0) \
+ T (int16_t, vnx16hi, 1) \
+ T (int16_t, vnx16hi, 7) \
+ T (int16_t, vnx16hi, 8) \
+ T (int16_t, vnx16hi, 15) \
+ T (int8_t, vnx32qi, 0) \
+ T (int8_t, vnx32qi, 1) \
+ T (int8_t, vnx32qi, 15) \
+ T (int8_t, vnx32qi, 16) \
+ T (int8_t, vnx32qi, 31) \
+ T (float, vnx16sf, 0) \
+ T (float, vnx16sf, 2) \
+ T (float, vnx16sf, 6) \
+ T (float, vnx16sf, 8) \
+ T (float, vnx16sf, 14) \
+ T (double, vnx8df, 0) \
+ T (double, vnx8df, 2) \
+ T (double, vnx8df, 4) \
+ T (double, vnx8df, 6) \
+ T (int64_t, vnx8di, 0) \
+ T (int64_t, vnx8di, 2) \
+ T (int64_t, vnx8di, 4) \
+ T (int64_t, vnx8di, 6) \
+ T (int32_t, vnx16si, 0) \
+ T (int32_t, vnx16si, 2) \
+ T (int32_t, vnx16si, 6) \
+ T (int32_t, vnx16si, 8) \
+ T (int32_t, vnx16si, 14) \
+ T (int16_t, vnx32hi, 0) \
+ T (int16_t, vnx32hi, 2) \
+ T (int16_t, vnx32hi, 14) \
+ T (int16_t, vnx32hi, 16) \
+ T (int16_t, vnx32hi, 30) \
+ T (int8_t, vnx64qi, 0) \
+ T (int8_t, vnx64qi, 2) \
+ T (int8_t, vnx64qi, 30) \
+ T (int8_t, vnx64qi, 32) \
+ T (int8_t, vnx64qi, 63) \
+ T (float, vnx32sf, 0) \
+ T (float, vnx32sf, 3) \
+ T (float, vnx32sf, 12) \
+ T (float, vnx32sf, 17) \
+ T (float, vnx32sf, 14) \
+ T (double, vnx16df, 0) \
+ T (double, vnx16df, 4) \
+ T (double, vnx16df, 8) \
+ T (double, vnx16df, 12) \
+ T (int64_t, vnx16di, 0) \
+ T (int64_t, vnx16di, 4) \
+ T (int64_t, vnx16di, 8) \
+ T (int64_t, vnx16di, 12) \
+ T (int32_t, vnx32si, 0) \
+ T (int32_t, vnx32si, 4) \
+ T (int32_t, vnx32si, 12) \
+ T (int32_t, vnx32si, 16) \
+ T (int32_t, vnx32si, 28) \
+ T (int16_t, vnx64hi, 0) \
+ T (int16_t, vnx64hi, 4) \
+ T (int16_t, vnx64hi, 28) \
+ T (int16_t, vnx64hi, 32) \
+ T (int16_t, vnx64hi, 60) \
+ T (int8_t, vnx128qi, 0) \
+ T (int8_t, vnx128qi, 4) \
+ T (int8_t, vnx128qi, 30) \
+ T (int8_t, vnx128qi, 60) \
+ T (int8_t, vnx128qi, 64) \
+ T (int8_t, vnx128qi, 127) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}
new file mode 100644
@@ -0,0 +1,78 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -Wno-pedantic" } */
+
+#include <assert.h>
+
+#include "vec_set-1.c"
+#include "vec_set-2.c"
+#include "vec_set-3.c"
+#include "vec_set-4.c"
+
+#define CHECK(S, V, IDX) \
+void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = i; \
+ V res = vec_set_##V##_##IDX (v, 77); \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ assert (res[i] == (i == IDX ? 77 : i)); \
+ }
+
+#define CHECK_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+CHECK_ALL (CHECK)
+
+#define RUN(S, V, IDX) \
+ check_##V##_##IDX ();
+
+#define RUN_ALL(T) \
+ T (_Float16, vnx8hf, 0) \
+ T (_Float16, vnx8hf, 3) \
+ T (_Float16, vnx8hf, 7) \
+ T (_Float16, vnx16hf, 0) \
+ T (_Float16, vnx16hf, 3) \
+ T (_Float16, vnx16hf, 7) \
+ T (_Float16, vnx16hf, 8) \
+ T (_Float16, vnx16hf, 15) \
+ T (_Float16, vnx32hf, 0) \
+ T (_Float16, vnx32hf, 3) \
+ T (_Float16, vnx32hf, 7) \
+ T (_Float16, vnx32hf, 8) \
+ T (_Float16, vnx32hf, 16) \
+ T (_Float16, vnx32hf, 31) \
+ T (_Float16, vnx64hf, 0) \
+ T (_Float16, vnx64hf, 3) \
+ T (_Float16, vnx64hf, 7) \
+ T (_Float16, vnx64hf, 8) \
+ T (_Float16, vnx64hf, 16) \
+ T (_Float16, vnx64hf, 31) \
+ T (_Float16, vnx64hf, 42) \
+ T (_Float16, vnx64hf, 63) \
+
+int main ()
+{
+ RUN_ALL (RUN);
+}