gimple-isel: Recognize vec_extract pattern.
Checks
Commit Message
Hi,
In gimple-isel we already deduce a vec_set pattern from an
ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a
vec_extract.
The code is largely similar to the vec_set one including
the addition of a can_vec_extract_var_idx_p function
in optabs.cc to check if the backend can handle a register
operand as index. We already have can_vec_extract in
optabs-query but that one checks whether we can extract
specific modes.
With the introduction of an internal function for vec_extract
the expander must not FAIL. For vec_set this has already been
the case so adjust the documentation accordingly.
Additionally, clarify the wording of the vector-vector case for
vec_extract.
During testing I noticed that the aarch64 simd vec_extract
expander is the only one that FAILs. Richard is currently
testing a patch that tries to remove this. Bootstrap and
testsuite was unchanged on x86 and power.
I was a bit torn whether to add a separate function to recognize
vec_extract or not and ended up doing it inline with several
is_extract checks.
Regards
Robin
gcc/ChangeLog:
* doc/md.texi: Document that vec_set and vec_extract must not
fail.
* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
(gimple_expand_vec_set_extract_expr): ...to this.
(gimple_expand_vec_exprs): Call renamed function.
* internal-fn.cc (vec_extract_direct): Add.
(expand_vec_extract_optab_fn): New function to expand
vec_extract optab.
(direct_vec_extract_optab_supported_p): Add.
* internal-fn.def (VEC_EXTRACT): Add.
* optabs.cc (can_vec_extract_var_idx_p): New function.
* optabs.h (can_vec_extract_var_idx_p): Declare.
---
gcc/doc/md.texi | 7 +++-
gcc/gimple-isel.cc | 85 +++++++++++++++++++++++++++++++++++++--------
gcc/internal-fn.cc | 39 +++++++++++++++++++++
gcc/internal-fn.def | 1 +
gcc/optabs.cc | 24 +++++++++++++
gcc/optabs.h | 1 +
6 files changed, 141 insertions(+), 16 deletions(-)
Comments
On Mon, 3 Jul 2023, Robin Dapp wrote:
> Hi,
>
> In gimple-isel we already deduce a vec_set pattern from an
> ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a
> vec_extract.
>
> The code is largely similar to the vec_set one including
> the addition of a can_vec_extract_var_idx_p function
> in optabs.cc to check if the backend can handle a register
> operand as index. We already have can_vec_extract in
> optabs-query but that one checks whether we can extract
> specific modes.
>
> With the introduction of an internal function for vec_extract
> the expander must not FAIL. For vec_set this has already been
> the case so adjust the documentation accordingly.
>
> Additionally, clarify the wording of the vector-vector case for
> vec_extract.
>
> During testing I noticed that the aarch64 simd vec_extract
> expander is the only one that FAILs. Richard is currently
> testing a patch that tries to remove this. Bootstrap and
> testsuite was unchanged on x86 and power.
>
> I was a bit torn whether to add a separate function to recognize
> vec_extract or not and ended up doing it inline with several
> is_extract checks.
>
> Regards
> Robin
>
> gcc/ChangeLog:
>
> * doc/md.texi: Document that vec_set and vec_extract must not
> fail.
> * gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
> (gimple_expand_vec_set_extract_expr): ...to this.
> (gimple_expand_vec_exprs): Call renamed function.
> * internal-fn.cc (vec_extract_direct): Add.
> (expand_vec_extract_optab_fn): New function to expand
> vec_extract optab.
> (direct_vec_extract_optab_supported_p): Add.
> * internal-fn.def (VEC_EXTRACT): Add.
> * optabs.cc (can_vec_extract_var_idx_p): New function.
> * optabs.h (can_vec_extract_var_idx_p): Declare.
> ---
> gcc/doc/md.texi | 7 +++-
> gcc/gimple-isel.cc | 85 +++++++++++++++++++++++++++++++++++++--------
> gcc/internal-fn.cc | 39 +++++++++++++++++++++
> gcc/internal-fn.def | 1 +
> gcc/optabs.cc | 24 +++++++++++++
> gcc/optabs.h | 1 +
> 6 files changed, 141 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 9648fdc846a..c61602fb04d 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5074,6 +5074,8 @@ of the result should be stored to memory.
> Set given field in the vector value. Operand 0 is the vector to modify,
> operand 1 is new value of field and operand 2 specify the field index.
>
> +This pattern is not allowed to @code{FAIL}.
> +
> @cindex @code{vec_extract@var{m}@var{n}} instruction pattern
> @item @samp{vec_extract@var{m}@var{n}}
> Extract given field from the vector value. Operand 1 is the vector, operand 2
> @@ -5081,7 +5083,10 @@ specify field index and operand 0 place to store value into. The
> @var{n} mode is the mode of the field or vector of fields that should be
> extracted, should be either element mode of the vector mode @var{m}, or
> a vector mode with the same element mode and smaller number of elements.
> -If @var{n} is a vector mode, the index is counted in units of that mode.
> +If @var{n} is a vector mode the index is counted in multiples of
> +mode @var{n}.
> +
> +This pattern is not allowed to @code{FAIL}.
>
> @cindex @code{vec_init@var{m}@var{n}} instruction pattern
> @item @samp{vec_init@var{m}@var{n}}
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index ef688ddb57f..d6c560b63dd 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -42,15 +42,26 @@ along with GCC; see the file COPYING3. If not see
>
> /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
> internal function based on vector type of selected expansion.
> - i.e.:
> +
> + For vec_set:
> +
> VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
> =>
> _7 = u;
> _8 = .VEC_SET (_7, i_4(D), _1);
> - u = _8; */
> + u = _8;
> +
> + For vec_extract:
> +
> + _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
> + =>
> + _4 = vD.2208;
> + _5 = .VEC_EXTRACT (_4, idx_2(D));
> + _3 = _5; */
>
> static bool
> -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
> +gimple_expand_vec_set_extract_expr (struct function *fun,
> + gimple_stmt_iterator *gsi)
> {
> enum tree_code code;
> gcall *new_stmt = NULL;
> @@ -62,45 +73,88 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
> if (!stmt)
> return false;
>
> + bool is_extract = false;
> +
> tree lhs = gimple_assign_lhs (stmt);
> + tree rhs = gimple_assign_rhs1 (stmt);
> + tree val, op0;
> code = TREE_CODE (lhs);
> - if (code != ARRAY_REF)
> - return false;
> + if (code == ARRAY_REF)
> + {
> + /* Assume it is a vec_set. */
> + val = rhs;
> + op0 = TREE_OPERAND (lhs, 0);
> + }
> + else
> + {
> + /* It can still be a vec_extract. */
> + code = TREE_CODE (rhs);
> + if (code != ARRAY_REF)
> + return false;
> +
can you make this
if (TREE_CODE (lhs) == ARRAY_REF)
..
else if (TREE_CODE (rhs) == ARRAY_REF)
..
else
return false;
that's way easier to read.
> + /* Sides are swapped for vec_extract. */
> + is_extract = true;
> + val = lhs;
> + op0 = TREE_OPERAND (rhs, 0);
> + }
>
> - tree val = gimple_assign_rhs1 (stmt);
> - tree op0 = TREE_OPERAND (lhs, 0);
> if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
> && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
> && TYPE_MODE (TREE_TYPE (lhs))
this 'lhs' would need to be 'rhs' for the extract? So in addition
to 'val' add 'ref' and initialize that?
> == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
> {
> - tree pos = TREE_OPERAND (lhs, 1);
> + tree pos;
> + if (!is_extract)
> + pos = TREE_OPERAND (lhs, 1);
> + else
> + pos = TREE_OPERAND (rhs, 1);
> +
then makes TREE_OPERAND (ref, 1);
> tree view_op0 = TREE_OPERAND (op0, 0);
> machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> + machine_mode extract_mode;
> + if (is_extract)
> + extract_mode = TYPE_MODE (TREE_TYPE (lhs));
> +
and this would be 'val', though looking at 'ref' looks better here.
> if (auto_var_in_fn_p (view_op0, fun->decl)
> - && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
> + && !TREE_ADDRESSABLE (view_op0)
> + && ((!is_extract && can_vec_set_var_idx_p (outermode))
> + || (is_extract &&
> + can_vec_extract_var_idx_p (outermode, extract_mode))))
> {
> location_t loc = gimple_location (stmt);
> tree var_src = make_ssa_name (TREE_TYPE (view_op0));
> - tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
> + tree var_dst;
> + if (!is_extract)
> + var_dst = make_ssa_name (TREE_TYPE (view_op0));
> + else
> + var_dst = make_ssa_name (TREE_TYPE (lhs));
>
> ass_stmt = gimple_build_assign (var_src, view_op0);
> gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
> gimple_set_location (ass_stmt, loc);
> gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>
> - new_stmt
> - = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
> + if (!is_extract)
> + new_stmt
> + = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
> + else
> + new_stmt
> + = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
> +
> gimple_call_set_lhs (new_stmt, var_dst);
> gimple_set_location (new_stmt, loc);
> gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>
> - ass_stmt = gimple_build_assign (view_op0, var_dst);
> + if (!is_extract)
> + ass_stmt = gimple_build_assign (view_op0, var_dst);
> + else
> + ass_stmt = gimple_build_assign (lhs, var_dst);
as you preserve the LHS for the original testcase you should
use gsi_replace () for it. I think that making a bigger if (is_extract)
group covering all stmt generation would make this easier to read.
Thanks,
Richard.
> gimple_set_location (ass_stmt, loc);
> + if (!is_extract)
> + gimple_move_vops (ass_stmt, stmt);
> gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>
> basic_block bb = gimple_bb (stmt);
> - gimple_move_vops (ass_stmt, stmt);
> if (gsi_remove (gsi, true)
> && gimple_purge_dead_eh_edges (bb))
> cfg_changed = true;
> @@ -317,7 +371,8 @@ gimple_expand_vec_exprs (struct function *fun)
> gsi_replace (&gsi, g, false);
> }
>
> - cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
> + cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
> +
> if (gsi_end_p (gsi))
> break;
> }
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 9017176dc7a..cbdb7e64d7c 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -175,6 +175,7 @@ init_internal_fns ()
> #define len_store_direct { 3, 3, false }
> #define len_maskstore_direct { 4, 3, false }
> #define vec_set_direct { 3, 3, false }
> +#define vec_extract_direct { 3, 3, false }
> #define unary_direct { 0, 0, true }
> #define unary_convert_direct { -1, 0, true }
> #define binary_direct { 0, 0, true }
> @@ -3126,6 +3127,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> gcc_unreachable ();
> }
>
> +/* Expand VEC_EXTRACT optab internal function. */
> +
> +static void
> +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> +{
> + tree lhs = gimple_call_lhs (stmt);
> + tree op0 = gimple_call_arg (stmt, 0);
> + tree op1 = gimple_call_arg (stmt, 1);
> +
> + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +
> + machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
> + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
> +
> + rtx src = expand_normal (op0);
> + rtx pos = expand_normal (op1);
> +
> + class expand_operand ops[3];
> + enum insn_code icode = convert_optab_handler (optab, outermode,
> + extract_mode);
> +
> + if (icode != CODE_FOR_nothing)
> + {
> + create_output_operand (&ops[0], target, extract_mode);
> + create_input_operand (&ops[1], src, outermode);
> + create_convert_operand_from (&ops[2], pos,
> + TYPE_MODE (TREE_TYPE (op1)), true);
> + if (maybe_expand_insn (icode, 3, ops))
> + {
> + if (!rtx_equal_p (target, ops[0].value))
> + emit_move_insn (target, ops[0].value);
> + return;
> + }
> + }
> + gcc_unreachable ();
> +}
> +
> static void
> expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
> {
> @@ -3976,6 +4014,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
> #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
> #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
> #define direct_vec_set_optab_supported_p direct_optab_supported_p
> +#define direct_vec_extract_optab_supported_p direct_optab_supported_p
>
> /* Return the optab used by internal function FN. */
>
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index bc947c0fde7..032624b1547 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -176,6 +176,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
> DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
>
> DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
> +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
>
> DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
> DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 89ab55b72ae..4440bc505b1 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode)
> && insn_operand_matches (icode, 2, reg3);
> }
>
> +/* Return whether the backend can emit a vec_extract instruction with
> + a non-constant index. */
> +bool
> +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
> +{
> + if (!VECTOR_MODE_P (vec_mode))
> + return false;
> +
> + rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
> + rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
> +
> + enum insn_code icode = convert_optab_handler (vec_extract_optab,
> + vec_mode, extr_mode);
> +
> + const struct insn_data_d *data = &insn_data[icode];
> + machine_mode idx_mode = data->operand[2].mode;
> +
> + rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
> +
> + return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
> + && insn_operand_matches (icode, 1, reg2)
> + && insn_operand_matches (icode, 2, reg3);
> +}
> +
> /* This function is called when we are going to emit a compare instruction that
> compares the values found in X and Y, using the rtl operator COMPARISON.
>
> diff --git a/gcc/optabs.h b/gcc/optabs.h
> index 29ccbe9235e..f14d012d92f 100644
> --- a/gcc/optabs.h
> +++ b/gcc/optabs.h
> @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
> /* Return whether the backend can emit vector set instructions for inserting
> element into vector at variable index position. */
> extern bool can_vec_set_var_idx_p (machine_mode);
> +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
>
> extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
> machine_mode, int);
>
Hi Richard,
changed the patch according to your comments and I agree that
it is more readable that way. I hope using lhs as target for
the extract directly is possible the way I did it. Richard's
patch for aarch64 is already, therefore testsuites on aarch64 and
i386 are unchanged.
Regards
Robin
Subject: [PATCH v2] gimple-isel: Recognize vec_extract pattern.
In gimple-isel we already deduce a vec_set pattern from an
ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a
vec_extract.
The code is largely similar to the vec_set one
including the addition of a can_vec_extract_var_idx_p function
in optabs.cc to check if the backend can handle a register
operand as index. We already have can_vec_extract in
optabs-query but that one checks whether we can extract
specific modes.
With the introduction of an internal function for vec_extract
the expander must not FAIL. For vec_set this has already been
the case so adjust the documentation accordingly.
Additionally, clarify the wording of the vector-vector case for
vec_extract.
gcc/ChangeLog:
* doc/md.texi: Document that vec_set and vec_extract must not
fail.
* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
(gimple_expand_vec_set_extract_expr): ...to this.
(gimple_expand_vec_exprs): Call renamed function.
* internal-fn.cc (vec_extract_direct): Add.
(expand_vec_extract_optab_fn): New function to expand
vec_extract optab.
(direct_vec_extract_optab_supported_p): Add.
* internal-fn.def (VEC_EXTRACT): Add.
* optabs.cc (can_vec_extract_var_idx_p): New function.
* optabs.h (can_vec_extract_var_idx_p): Declare.
---
gcc/doc/md.texi | 7 +++-
gcc/gimple-isel.cc | 100 ++++++++++++++++++++++++++++++++------------
gcc/internal-fn.cc | 39 +++++++++++++++++
gcc/internal-fn.def | 1 +
gcc/optabs.cc | 24 +++++++++++
gcc/optabs.h | 1 +
6 files changed, 144 insertions(+), 28 deletions(-)
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index f14dd32b2dc..b30a824488b 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5091,6 +5091,8 @@ Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored.
Set given field in the vector value. Operand 0 is the vector to modify,
operand 1 is new value of field and operand 2 specify the field index.
+This pattern is not allowed to @code{FAIL}.
+
@cindex @code{vec_extract@var{m}@var{n}} instruction pattern
@item @samp{vec_extract@var{m}@var{n}}
Extract given field from the vector value. Operand 1 is the vector, operand 2
@@ -5098,7 +5100,10 @@ specify field index and operand 0 place to store value into. The
@var{n} mode is the mode of the field or vector of fields that should be
extracted, should be either element mode of the vector mode @var{m}, or
a vector mode with the same element mode and smaller number of elements.
-If @var{n} is a vector mode, the index is counted in units of that mode.
+If @var{n} is a vector mode the index is counted in multiples of
+mode @var{n}.
+
+This pattern is not allowed to @code{FAIL}.
@cindex @code{vec_init@var{m}@var{n}} instruction pattern
@item @samp{vec_init@var{m}@var{n}}
diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index ef688ddb57f..a18b26dec7b 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -42,17 +42,27 @@ along with GCC; see the file COPYING3. If not see
/* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
internal function based on vector type of selected expansion.
- i.e.:
+
+ For vec_set:
+
VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
=>
_7 = u;
_8 = .VEC_SET (_7, i_4(D), _1);
- u = _8; */
+ u = _8;
+
+ For vec_extract:
+
+ _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
+ =>
+ _4 = vD.2208;
+ _5 = .VEC_EXTRACT (_4, idx_2(D));
+ _3 = _5; */
static bool
-gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
+gimple_expand_vec_set_extract_expr (struct function *fun,
+ gimple_stmt_iterator *gsi)
{
- enum tree_code code;
gcall *new_stmt = NULL;
gassign *ass_stmt = NULL;
bool cfg_changed = false;
@@ -62,49 +72,84 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
if (!stmt)
return false;
+ bool is_extract = false;
+
tree lhs = gimple_assign_lhs (stmt);
- code = TREE_CODE (lhs);
- if (code != ARRAY_REF)
+ tree rhs = gimple_assign_rhs1 (stmt);
+ tree val, ref;
+ if (TREE_CODE (lhs) == ARRAY_REF)
+ {
+ /* Assume it is a vec_set. */
+ val = rhs;
+ ref = lhs;
+ }
+ else if (TREE_CODE (rhs) == ARRAY_REF)
+ {
+ /* vec_extract. */
+ is_extract = true;
+ val = lhs;
+ ref = rhs;
+ }
+ else
return false;
- tree val = gimple_assign_rhs1 (stmt);
- tree op0 = TREE_OPERAND (lhs, 0);
+ tree op0 = TREE_OPERAND (ref, 0);
if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
&& VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
- && TYPE_MODE (TREE_TYPE (lhs))
+ && TYPE_MODE (TREE_TYPE (ref))
== TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
{
- tree pos = TREE_OPERAND (lhs, 1);
+ tree pos = TREE_OPERAND (ref, 1);
+
tree view_op0 = TREE_OPERAND (op0, 0);
machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
+ machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
+
if (auto_var_in_fn_p (view_op0, fun->decl)
- && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
+ && !TREE_ADDRESSABLE (view_op0)
+ && ((!is_extract && can_vec_set_var_idx_p (outermode))
+ || (is_extract
+ && can_vec_extract_var_idx_p (outermode, extract_mode))))
{
location_t loc = gimple_location (stmt);
tree var_src = make_ssa_name (TREE_TYPE (view_op0));
- tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
ass_stmt = gimple_build_assign (var_src, view_op0);
gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
gimple_set_location (ass_stmt, loc);
gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
- new_stmt
- = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
- gimple_call_set_lhs (new_stmt, var_dst);
- gimple_set_location (new_stmt, loc);
- gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+ if (!is_extract)
+ {
+ tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
- ass_stmt = gimple_build_assign (view_op0, var_dst);
- gimple_set_location (ass_stmt, loc);
- gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
+ new_stmt = gimple_build_call_internal (IFN_VEC_SET, 3, var_src,
+ val, pos);
+
+ gimple_call_set_lhs (new_stmt, var_dst);
+ gimple_set_location (new_stmt, loc);
+ gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+
+ ass_stmt = gimple_build_assign (view_op0, var_dst);
+ gimple_set_location (ass_stmt, loc);
+ gimple_move_vops (ass_stmt, stmt);
+ gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
- basic_block bb = gimple_bb (stmt);
- gimple_move_vops (ass_stmt, stmt);
- if (gsi_remove (gsi, true)
- && gimple_purge_dead_eh_edges (bb))
- cfg_changed = true;
- *gsi = gsi_for_stmt (ass_stmt);
+ basic_block bb = gimple_bb (stmt);
+ if (gsi_remove (gsi, true)
+ && gimple_purge_dead_eh_edges (bb))
+ cfg_changed = true;
+ *gsi = gsi_for_stmt (ass_stmt);
+ }
+ else
+ {
+ new_stmt
+ = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
+ gimple_call_set_lhs (new_stmt, lhs);
+
+ gsi_replace (gsi, new_stmt, true);
+ cfg_changed = true;
+ }
}
}
@@ -317,7 +362,8 @@ gimple_expand_vec_exprs (struct function *fun)
gsi_replace (&gsi, g, false);
}
- cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
+ cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
+
if (gsi_end_p (gsi))
break;
}
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 303df102d81..23f9309572b 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -175,6 +175,7 @@ init_internal_fns ()
#define len_store_direct { 3, 3, false }
#define len_maskstore_direct { 4, 5, false }
#define vec_set_direct { 3, 3, false }
+#define vec_extract_direct { 3, 3, false }
#define unary_direct { 0, 0, true }
#define unary_convert_direct { -1, 0, true }
#define binary_direct { 0, 0, true }
@@ -3107,6 +3108,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
gcc_unreachable ();
}
+/* Expand VEC_EXTRACT optab internal function. */
+
+static void
+expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+ tree lhs = gimple_call_lhs (stmt);
+ tree op0 = gimple_call_arg (stmt, 0);
+ tree op1 = gimple_call_arg (stmt, 1);
+
+ rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+ machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
+ machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
+
+ rtx src = expand_normal (op0);
+ rtx pos = expand_normal (op1);
+
+ class expand_operand ops[3];
+ enum insn_code icode = convert_optab_handler (optab, outermode,
+ extract_mode);
+
+ if (icode != CODE_FOR_nothing)
+ {
+ create_output_operand (&ops[0], target, extract_mode);
+ create_input_operand (&ops[1], src, outermode);
+ create_convert_operand_from (&ops[2], pos,
+ TYPE_MODE (TREE_TYPE (op1)), true);
+ if (maybe_expand_insn (icode, 3, ops))
+ {
+ if (!rtx_equal_p (target, ops[0].value))
+ emit_move_insn (target, ops[0].value);
+ return;
+ }
+ }
+ gcc_unreachable ();
+}
+
static void
expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
{
@@ -3946,6 +3984,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
#define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
#define direct_check_ptrs_optab_supported_p direct_optab_supported_p
#define direct_vec_set_optab_supported_p direct_optab_supported_p
+#define direct_vec_extract_optab_supported_p direct_optab_supported_p
/* Return the optab used by internal function FN. */
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 9b73e540d55..238b7ee0bc9 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -180,6 +180,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
+DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 0a2bcf3620c..4e9f58f8060 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode)
&& insn_operand_matches (icode, 2, reg3);
}
+/* Return whether the backend can emit a vec_extract instruction with
+ a non-constant index. */
+bool
+can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
+{
+ if (!VECTOR_MODE_P (vec_mode))
+ return false;
+
+ rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
+ rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
+
+ enum insn_code icode = convert_optab_handler (vec_extract_optab,
+ vec_mode, extr_mode);
+
+ const struct insn_data_d *data = &insn_data[icode];
+ machine_mode idx_mode = data->operand[2].mode;
+
+ rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
+
+ return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
+ && insn_operand_matches (icode, 1, reg2)
+ && insn_operand_matches (icode, 2, reg3);
+}
+
/* This function is called when we are going to emit a compare instruction that
compares the values found in X and Y, using the rtl operator COMPARISON.
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 781d5548be9..c80b7f4dc1b 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
/* Return whether the backend can emit vector set instructions for inserting
element into vector at variable index position. */
extern bool can_vec_set_var_idx_p (machine_mode);
+extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
machine_mode, int);
On Tue, 4 Jul 2023, Robin Dapp wrote:
> Hi Richard,
>
> changed the patch according to your comments and I agree that
> it is more readable that way. I hope using lhs as target for
> the extract directly is possible the way I did it. Richard's
> patch for aarch64 is already, therefore testsuites on aarch64 and
> i386 are unchanged.
>
> Regards
> Robin
>
> Subject: [PATCH v2] gimple-isel: Recognize vec_extract pattern.
>
> In gimple-isel we already deduce a vec_set pattern from an
> ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a
> vec_extract.
>
> The code is largely similar to the vec_set one
> including the addition of a can_vec_extract_var_idx_p function
> in optabs.cc to check if the backend can handle a register
> operand as index. We already have can_vec_extract in
> optabs-query but that one checks whether we can extract
> specific modes.
>
> With the introduction of an internal function for vec_extract
> the expander must not FAIL. For vec_set this has already been
> the case so adjust the documentation accordingly.
>
> Additionally, clarify the wording of the vector-vector case for
> vec_extract.
>
> gcc/ChangeLog:
>
> * doc/md.texi: Document that vec_set and vec_extract must not
> fail.
> * gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
> (gimple_expand_vec_set_extract_expr): ...to this.
> (gimple_expand_vec_exprs): Call renamed function.
> * internal-fn.cc (vec_extract_direct): Add.
> (expand_vec_extract_optab_fn): New function to expand
> vec_extract optab.
> (direct_vec_extract_optab_supported_p): Add.
> * internal-fn.def (VEC_EXTRACT): Add.
> * optabs.cc (can_vec_extract_var_idx_p): New function.
> * optabs.h (can_vec_extract_var_idx_p): Declare.
> ---
> gcc/doc/md.texi | 7 +++-
> gcc/gimple-isel.cc | 100 ++++++++++++++++++++++++++++++++------------
> gcc/internal-fn.cc | 39 +++++++++++++++++
> gcc/internal-fn.def | 1 +
> gcc/optabs.cc | 24 +++++++++++
> gcc/optabs.h | 1 +
> 6 files changed, 144 insertions(+), 28 deletions(-)
>
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index f14dd32b2dc..b30a824488b 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5091,6 +5091,8 @@ Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored.
> Set given field in the vector value. Operand 0 is the vector to modify,
> operand 1 is new value of field and operand 2 specify the field index.
>
> +This pattern is not allowed to @code{FAIL}.
> +
> @cindex @code{vec_extract@var{m}@var{n}} instruction pattern
> @item @samp{vec_extract@var{m}@var{n}}
> Extract given field from the vector value. Operand 1 is the vector, operand 2
> @@ -5098,7 +5100,10 @@ specify field index and operand 0 place to store value into. The
> @var{n} mode is the mode of the field or vector of fields that should be
> extracted, should be either element mode of the vector mode @var{m}, or
> a vector mode with the same element mode and smaller number of elements.
> -If @var{n} is a vector mode, the index is counted in units of that mode.
> +If @var{n} is a vector mode the index is counted in multiples of
> +mode @var{n}.
> +
> +This pattern is not allowed to @code{FAIL}.
>
> @cindex @code{vec_init@var{m}@var{n}} instruction pattern
> @item @samp{vec_init@var{m}@var{n}}
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index ef688ddb57f..a18b26dec7b 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -42,17 +42,27 @@ along with GCC; see the file COPYING3. If not see
>
> /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
> internal function based on vector type of selected expansion.
> - i.e.:
> +
> + For vec_set:
> +
> VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
> =>
> _7 = u;
> _8 = .VEC_SET (_7, i_4(D), _1);
> - u = _8; */
> + u = _8;
> +
> + For vec_extract:
> +
> + _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
> + =>
> + _4 = vD.2208;
> + _5 = .VEC_EXTRACT (_4, idx_2(D));
> + _3 = _5; */
I think you are doing
_3 = .VEC_EXTRACT (_4, idx_2(D));
and avoiding the SSA name copy correctly. Can you double-check?
OK with the comment adjusted.
Thanks,
Richard.
>
> static bool
> -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
> +gimple_expand_vec_set_extract_expr (struct function *fun,
> + gimple_stmt_iterator *gsi)
> {
> - enum tree_code code;
> gcall *new_stmt = NULL;
> gassign *ass_stmt = NULL;
> bool cfg_changed = false;
> @@ -62,49 +72,84 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
> if (!stmt)
> return false;
>
> + bool is_extract = false;
> +
> tree lhs = gimple_assign_lhs (stmt);
> - code = TREE_CODE (lhs);
> - if (code != ARRAY_REF)
> + tree rhs = gimple_assign_rhs1 (stmt);
> + tree val, ref;
> + if (TREE_CODE (lhs) == ARRAY_REF)
> + {
> + /* Assume it is a vec_set. */
> + val = rhs;
> + ref = lhs;
> + }
> + else if (TREE_CODE (rhs) == ARRAY_REF)
> + {
> + /* vec_extract. */
> + is_extract = true;
> + val = lhs;
> + ref = rhs;
> + }
> + else
> return false;
>
> - tree val = gimple_assign_rhs1 (stmt);
> - tree op0 = TREE_OPERAND (lhs, 0);
> + tree op0 = TREE_OPERAND (ref, 0);
> if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
> && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
> - && TYPE_MODE (TREE_TYPE (lhs))
> + && TYPE_MODE (TREE_TYPE (ref))
> == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
> {
> - tree pos = TREE_OPERAND (lhs, 1);
> + tree pos = TREE_OPERAND (ref, 1);
> +
> tree view_op0 = TREE_OPERAND (op0, 0);
> machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
> +
> if (auto_var_in_fn_p (view_op0, fun->decl)
> - && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
> + && !TREE_ADDRESSABLE (view_op0)
> + && ((!is_extract && can_vec_set_var_idx_p (outermode))
> + || (is_extract
> + && can_vec_extract_var_idx_p (outermode, extract_mode))))
> {
> location_t loc = gimple_location (stmt);
> tree var_src = make_ssa_name (TREE_TYPE (view_op0));
> - tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
>
> ass_stmt = gimple_build_assign (var_src, view_op0);
> gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
> gimple_set_location (ass_stmt, loc);
> gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>
> - new_stmt
> - = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
> - gimple_call_set_lhs (new_stmt, var_dst);
> - gimple_set_location (new_stmt, loc);
> - gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
> + if (!is_extract)
> + {
> + tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
>
> - ass_stmt = gimple_build_assign (view_op0, var_dst);
> - gimple_set_location (ass_stmt, loc);
> - gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
> + new_stmt = gimple_build_call_internal (IFN_VEC_SET, 3, var_src,
> + val, pos);
> +
> + gimple_call_set_lhs (new_stmt, var_dst);
> + gimple_set_location (new_stmt, loc);
> + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
> +
> + ass_stmt = gimple_build_assign (view_op0, var_dst);
> + gimple_set_location (ass_stmt, loc);
> + gimple_move_vops (ass_stmt, stmt);
> + gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>
> - basic_block bb = gimple_bb (stmt);
> - gimple_move_vops (ass_stmt, stmt);
> - if (gsi_remove (gsi, true)
> - && gimple_purge_dead_eh_edges (bb))
> - cfg_changed = true;
> - *gsi = gsi_for_stmt (ass_stmt);
> + basic_block bb = gimple_bb (stmt);
> + if (gsi_remove (gsi, true)
> + && gimple_purge_dead_eh_edges (bb))
> + cfg_changed = true;
> + *gsi = gsi_for_stmt (ass_stmt);
> + }
> + else
> + {
> + new_stmt
> + = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
> + gimple_call_set_lhs (new_stmt, lhs);
> +
> + gsi_replace (gsi, new_stmt, true);
> + cfg_changed = true;
> + }
> }
> }
>
> @@ -317,7 +362,8 @@ gimple_expand_vec_exprs (struct function *fun)
> gsi_replace (&gsi, g, false);
> }
>
> - cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
> + cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
> +
> if (gsi_end_p (gsi))
> break;
> }
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 303df102d81..23f9309572b 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -175,6 +175,7 @@ init_internal_fns ()
> #define len_store_direct { 3, 3, false }
> #define len_maskstore_direct { 4, 5, false }
> #define vec_set_direct { 3, 3, false }
> +#define vec_extract_direct { 3, 3, false }
> #define unary_direct { 0, 0, true }
> #define unary_convert_direct { -1, 0, true }
> #define binary_direct { 0, 0, true }
> @@ -3107,6 +3108,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> gcc_unreachable ();
> }
>
> +/* Expand VEC_EXTRACT optab internal function. */
> +
> +static void
> +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> +{
> + tree lhs = gimple_call_lhs (stmt);
> + tree op0 = gimple_call_arg (stmt, 0);
> + tree op1 = gimple_call_arg (stmt, 1);
> +
> + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +
> + machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
> + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
> +
> + rtx src = expand_normal (op0);
> + rtx pos = expand_normal (op1);
> +
> + class expand_operand ops[3];
> + enum insn_code icode = convert_optab_handler (optab, outermode,
> + extract_mode);
> +
> + if (icode != CODE_FOR_nothing)
> + {
> + create_output_operand (&ops[0], target, extract_mode);
> + create_input_operand (&ops[1], src, outermode);
> + create_convert_operand_from (&ops[2], pos,
> + TYPE_MODE (TREE_TYPE (op1)), true);
> + if (maybe_expand_insn (icode, 3, ops))
> + {
> + if (!rtx_equal_p (target, ops[0].value))
> + emit_move_insn (target, ops[0].value);
> + return;
> + }
> + }
> + gcc_unreachable ();
> +}
> +
> static void
> expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
> {
> @@ -3946,6 +3984,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
> #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
> #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
> #define direct_vec_set_optab_supported_p direct_optab_supported_p
> +#define direct_vec_extract_optab_supported_p direct_optab_supported_p
>
> /* Return the optab used by internal function FN. */
>
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 9b73e540d55..238b7ee0bc9 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -180,6 +180,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
> DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
>
> DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
> +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
>
> DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
> DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 0a2bcf3620c..4e9f58f8060 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode)
> && insn_operand_matches (icode, 2, reg3);
> }
>
> +/* Return whether the backend can emit a vec_extract instruction with
> + a non-constant index. */
> +bool
> +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
> +{
> + if (!VECTOR_MODE_P (vec_mode))
> + return false;
> +
> + rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
> + rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
> +
> + enum insn_code icode = convert_optab_handler (vec_extract_optab,
> + vec_mode, extr_mode);
> +
> + const struct insn_data_d *data = &insn_data[icode];
> + machine_mode idx_mode = data->operand[2].mode;
> +
> + rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
> +
> + return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
> + && insn_operand_matches (icode, 1, reg2)
> + && insn_operand_matches (icode, 2, reg3);
> +}
> +
> /* This function is called when we are going to emit a compare instruction that
> compares the values found in X and Y, using the rtl operator COMPARISON.
>
> diff --git a/gcc/optabs.h b/gcc/optabs.h
> index 781d5548be9..c80b7f4dc1b 100644
> --- a/gcc/optabs.h
> +++ b/gcc/optabs.h
> @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
> /* Return whether the backend can emit vector set instructions for inserting
> element into vector at variable index position. */
> extern bool can_vec_set_var_idx_p (machine_mode);
> +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
>
> extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
> machine_mode, int);
>
>> + _4 = vD.2208;
>> + _5 = .VEC_EXTRACT (_4, idx_2(D));
>> + _3 = _5; */
>
> I think you are doing
>
> _3 = .VEC_EXTRACT (_4, idx_2(D));
>
> and avoiding the SSA name copy correctly. Can you double-check?
>
> OK with the comment adjusted.
Argh, yes, thanks.
Regards
Robin
@@ -5074,6 +5074,8 @@ of the result should be stored to memory.
Set given field in the vector value. Operand 0 is the vector to modify,
operand 1 is new value of field and operand 2 specify the field index.
+This pattern is not allowed to @code{FAIL}.
+
@cindex @code{vec_extract@var{m}@var{n}} instruction pattern
@item @samp{vec_extract@var{m}@var{n}}
Extract given field from the vector value. Operand 1 is the vector, operand 2
@@ -5081,7 +5083,10 @@ specify field index and operand 0 place to store value into. The
@var{n} mode is the mode of the field or vector of fields that should be
extracted, should be either element mode of the vector mode @var{m}, or
a vector mode with the same element mode and smaller number of elements.
-If @var{n} is a vector mode, the index is counted in units of that mode.
+If @var{n} is a vector mode the index is counted in multiples of
+mode @var{n}.
+
+This pattern is not allowed to @code{FAIL}.
@cindex @code{vec_init@var{m}@var{n}} instruction pattern
@item @samp{vec_init@var{m}@var{n}}
@@ -42,15 +42,26 @@ along with GCC; see the file COPYING3. If not see
/* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
internal function based on vector type of selected expansion.
- i.e.:
+
+ For vec_set:
+
VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
=>
_7 = u;
_8 = .VEC_SET (_7, i_4(D), _1);
- u = _8; */
+ u = _8;
+
+ For vec_extract:
+
+ _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
+ =>
+ _4 = vD.2208;
+ _5 = .VEC_EXTRACT (_4, idx_2(D));
+ _3 = _5; */
static bool
-gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
+gimple_expand_vec_set_extract_expr (struct function *fun,
+ gimple_stmt_iterator *gsi)
{
enum tree_code code;
gcall *new_stmt = NULL;
@@ -62,45 +73,88 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
if (!stmt)
return false;
+ bool is_extract = false;
+
tree lhs = gimple_assign_lhs (stmt);
+ tree rhs = gimple_assign_rhs1 (stmt);
+ tree val, op0;
code = TREE_CODE (lhs);
- if (code != ARRAY_REF)
- return false;
+ if (code == ARRAY_REF)
+ {
+ /* Assume it is a vec_set. */
+ val = rhs;
+ op0 = TREE_OPERAND (lhs, 0);
+ }
+ else
+ {
+ /* It can still be a vec_extract. */
+ code = TREE_CODE (rhs);
+ if (code != ARRAY_REF)
+ return false;
+
+ /* Sides are swapped for vec_extract. */
+ is_extract = true;
+ val = lhs;
+ op0 = TREE_OPERAND (rhs, 0);
+ }
- tree val = gimple_assign_rhs1 (stmt);
- tree op0 = TREE_OPERAND (lhs, 0);
if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
&& VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
&& TYPE_MODE (TREE_TYPE (lhs))
== TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
{
- tree pos = TREE_OPERAND (lhs, 1);
+ tree pos;
+ if (!is_extract)
+ pos = TREE_OPERAND (lhs, 1);
+ else
+ pos = TREE_OPERAND (rhs, 1);
+
tree view_op0 = TREE_OPERAND (op0, 0);
machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
+ machine_mode extract_mode;
+ if (is_extract)
+ extract_mode = TYPE_MODE (TREE_TYPE (lhs));
+
if (auto_var_in_fn_p (view_op0, fun->decl)
- && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
+ && !TREE_ADDRESSABLE (view_op0)
+ && ((!is_extract && can_vec_set_var_idx_p (outermode))
+ || (is_extract &&
+ can_vec_extract_var_idx_p (outermode, extract_mode))))
{
location_t loc = gimple_location (stmt);
tree var_src = make_ssa_name (TREE_TYPE (view_op0));
- tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
+ tree var_dst;
+ if (!is_extract)
+ var_dst = make_ssa_name (TREE_TYPE (view_op0));
+ else
+ var_dst = make_ssa_name (TREE_TYPE (lhs));
ass_stmt = gimple_build_assign (var_src, view_op0);
gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
gimple_set_location (ass_stmt, loc);
gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
- new_stmt
- = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
+ if (!is_extract)
+ new_stmt
+ = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
+ else
+ new_stmt
+ = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
+
gimple_call_set_lhs (new_stmt, var_dst);
gimple_set_location (new_stmt, loc);
gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
- ass_stmt = gimple_build_assign (view_op0, var_dst);
+ if (!is_extract)
+ ass_stmt = gimple_build_assign (view_op0, var_dst);
+ else
+ ass_stmt = gimple_build_assign (lhs, var_dst);
gimple_set_location (ass_stmt, loc);
+ if (!is_extract)
+ gimple_move_vops (ass_stmt, stmt);
gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
basic_block bb = gimple_bb (stmt);
- gimple_move_vops (ass_stmt, stmt);
if (gsi_remove (gsi, true)
&& gimple_purge_dead_eh_edges (bb))
cfg_changed = true;
@@ -317,7 +371,8 @@ gimple_expand_vec_exprs (struct function *fun)
gsi_replace (&gsi, g, false);
}
- cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
+ cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
+
if (gsi_end_p (gsi))
break;
}
@@ -175,6 +175,7 @@ init_internal_fns ()
#define len_store_direct { 3, 3, false }
#define len_maskstore_direct { 4, 3, false }
#define vec_set_direct { 3, 3, false }
+#define vec_extract_direct { 3, 3, false }
#define unary_direct { 0, 0, true }
#define unary_convert_direct { -1, 0, true }
#define binary_direct { 0, 0, true }
@@ -3126,6 +3127,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
gcc_unreachable ();
}
+/* Expand VEC_EXTRACT optab internal function. */
+
+static void
+expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+ tree lhs = gimple_call_lhs (stmt);
+ tree op0 = gimple_call_arg (stmt, 0);
+ tree op1 = gimple_call_arg (stmt, 1);
+
+ rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+ machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
+ machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
+
+ rtx src = expand_normal (op0);
+ rtx pos = expand_normal (op1);
+
+ class expand_operand ops[3];
+ enum insn_code icode = convert_optab_handler (optab, outermode,
+ extract_mode);
+
+ if (icode != CODE_FOR_nothing)
+ {
+ create_output_operand (&ops[0], target, extract_mode);
+ create_input_operand (&ops[1], src, outermode);
+ create_convert_operand_from (&ops[2], pos,
+ TYPE_MODE (TREE_TYPE (op1)), true);
+ if (maybe_expand_insn (icode, 3, ops))
+ {
+ if (!rtx_equal_p (target, ops[0].value))
+ emit_move_insn (target, ops[0].value);
+ return;
+ }
+ }
+ gcc_unreachable ();
+}
+
static void
expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
{
@@ -3976,6 +4014,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
#define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
#define direct_check_ptrs_optab_supported_p direct_optab_supported_p
#define direct_vec_set_optab_supported_p direct_optab_supported_p
+#define direct_vec_extract_optab_supported_p direct_optab_supported_p
/* Return the optab used by internal function FN. */
@@ -176,6 +176,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
+DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
@@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode)
&& insn_operand_matches (icode, 2, reg3);
}
+/* Return whether the backend can emit a vec_extract instruction with
+ a non-constant index. */
+bool
+can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
+{
+ if (!VECTOR_MODE_P (vec_mode))
+ return false;
+
+ rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
+ rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
+
+ enum insn_code icode = convert_optab_handler (vec_extract_optab,
+ vec_mode, extr_mode);
+
+ const struct insn_data_d *data = &insn_data[icode];
+ machine_mode idx_mode = data->operand[2].mode;
+
+ rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
+
+ return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
+ && insn_operand_matches (icode, 1, reg2)
+ && insn_operand_matches (icode, 2, reg3);
+}
+
/* This function is called when we are going to emit a compare instruction that
compares the values found in X and Y, using the rtl operator COMPARISON.
@@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
/* Return whether the backend can emit vector set instructions for inserting
element into vector at variable index position. */
extern bool can_vec_set_var_idx_p (machine_mode);
+extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
machine_mode, int);