Implement Bit-field lowering
Commit Message
From: Naveen H S <naveenh@marvell.com>
This patch adds a pass that lowers bit-field and opposite-endian accesses.
The patch addresses many of the issues in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19466
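As a rough illustration (not part of the patch itself) of the kind of access
the pass rewrites, with the intended lowered shape sketched in comments:

struct S
{
  unsigned a : 3;
  unsigned b : 5;
  unsigned c : 8;
};

unsigned
read_b (struct S *p)
{
  /* Before lowering, the GIMPLE load is a bit-field COMPONENT_REF (p->b).
     After lowering, the pass loads the unit containing the field (chosen
     via get_best_mode) and extracts the bits explicitly, roughly:
       _1 = <load of the containing unit>;
       _2 = BIT_FIELD_REF <_1, 5, 3>;
     A store goes the other way: load the unit, BIT_INSERT_EXPR the new
     value, then store the whole unit back.  The container size and bit
     positions shown here are illustrative only.  */
  return p->b;
}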
2023-07-14 Andrew Pinski <apinski@marvell.com>
Co-authored-by: Naveen H S <naveenh@marvell.com>
gcc/ChangeLog:
* Makefile.in (OBJS): Add gimple-lower-accesses.o.
* gimple-lower-accesses.cc: New file.
* passes.def (pass_lower_accesses): Add.
* tree-pass.h (make_pass_lower_accesses): Define.
gcc/testsuite/ChangeLog:
* gcc.dg/store_merging_14.c: Modify the pattern found as per new code.
* gcc.dg/store_merging_16.c: Likewise.
* gcc.dg/store_merging_20.c: Likewise.
* gcc.dg/store_merging_21.c: Likewise.
* gcc.dg/store_merging_24.c: Likewise.
* gcc.dg/store_merging_25.c: Likewise.
* gcc.dg/store_merging_6.c: Likewise.
* gcc.dg/tree-ssa/20030729-1.c: Likewise.
* gcc.dg/tree-ssa/20030814-6.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-14.c: Likewise.
* gcc.dg/tree-ssa/20030714-1.c: Remove.
---
gcc/Makefile.in | 1 +
gcc/gimple-lower-accesses.cc | 463 ++++++++++++++++++
gcc/passes.def | 4 +
gcc/testsuite/gcc.dg/store_merging_10.c | 2 +-
gcc/testsuite/gcc.dg/store_merging_14.c | 2 +-
gcc/testsuite/gcc.dg/store_merging_16.c | 4 +-
gcc/testsuite/gcc.dg/store_merging_20.c | 2 +-
gcc/testsuite/gcc.dg/store_merging_21.c | 2 +-
gcc/testsuite/gcc.dg/store_merging_24.c | 4 +-
gcc/testsuite/gcc.dg/store_merging_25.c | 4 +-
gcc/testsuite/gcc.dg/store_merging_6.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/20030714-1.c | 45 --
gcc/testsuite/gcc.dg/tree-ssa/20030729-1.c | 5 +-
gcc/testsuite/gcc.dg/tree-ssa/20030814-6.c | 5 +-
.../gcc.dg/tree-ssa/loop-interchange-14.c | 2 +-
gcc/tree-pass.h | 1 +
16 files changed, 485 insertions(+), 63 deletions(-)
create mode 100644 gcc/gimple-lower-accesses.cc
delete mode 100644 gcc/testsuite/gcc.dg/tree-ssa/20030714-1.c
Comments
On Fri, Jul 14, 2023 at 7:49 AM naveenh--- via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> From: Naveen H S <naveenh@marvell.com>
>
> This patch adds a pass that lowers bit-field and opposite-endian accesses.
> The patch addresses many of the issues in:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19466
A little bit more description would be nice ...
> 2023-07-14 Andrew Pinski <apinski@marvell.com>
> Co-authored-by: Naveen H S <naveenh@marvell.com>
>
> gcc/ChangeLog:
And some bug references for bugzilla. In fact you don't add any testcases
that are presumably fixed?
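For illustration, a testcase of the kind asked for could look roughly like
this; the dump switch and the pattern scanned for are assumptions, not
something taken from the patch:

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-lower_accesses-details" } */

struct S { unsigned a : 3; unsigned b : 5; };

int f (struct S *s) { return s->a; }

/* The lowered load should expose an explicit bit-field extraction.  */
/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "lower_accesses" } } */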
> * Makefile.in (OBJS): Add gimple-lower-accesses.o.
> * gimple-lower-accesses.cc: New file.
> * passes.def (pass_lower_accesses): Add.
> * tree-pass.h (make_pass_lower_accesses): Define.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/store_merging_14.c: Modify the pattern found as per new code.
> * gcc.dg/store_merging_16.c: Likewise.
> * gcc.dg/store_merging_20.c: Likewise.
> * gcc.dg/store_merging_21.c: Likewise.
> * gcc.dg/store_merging_24.c: Likewise.
> * gcc.dg/store_merging_25.c: Likewise.
> * gcc.dg/store_merging_6.c: Likewise.
> * gcc.dg/tree-ssa/20030729-1.c: Likewise.
> * gcc.dg/tree-ssa/20030814-6.c: Likewise.
> * gcc.dg/tree-ssa/loop-interchange-14.c: Likewise.
> * gcc.dg/tree-ssa/20030714-1.c: Remove.
> ---
> gcc/Makefile.in | 1 +
> gcc/gimple-lower-accesses.cc | 463 ++++++++++++++++++
> gcc/passes.def | 4 +
> gcc/testsuite/gcc.dg/store_merging_10.c | 2 +-
> gcc/testsuite/gcc.dg/store_merging_14.c | 2 +-
> gcc/testsuite/gcc.dg/store_merging_16.c | 4 +-
> gcc/testsuite/gcc.dg/store_merging_20.c | 2 +-
> gcc/testsuite/gcc.dg/store_merging_21.c | 2 +-
> gcc/testsuite/gcc.dg/store_merging_24.c | 4 +-
> gcc/testsuite/gcc.dg/store_merging_25.c | 4 +-
> gcc/testsuite/gcc.dg/store_merging_6.c | 2 +-
> gcc/testsuite/gcc.dg/tree-ssa/20030714-1.c | 45 --
> gcc/testsuite/gcc.dg/tree-ssa/20030729-1.c | 5 +-
> gcc/testsuite/gcc.dg/tree-ssa/20030814-6.c | 5 +-
> .../gcc.dg/tree-ssa/loop-interchange-14.c | 2 +-
> gcc/tree-pass.h | 1 +
> 16 files changed, 485 insertions(+), 63 deletions(-)
> create mode 100644 gcc/gimple-lower-accesses.cc
> delete mode 100644 gcc/testsuite/gcc.dg/tree-ssa/20030714-1.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index c478ec85201..50bd28f4d04 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1338,6 +1338,7 @@ OBJS = \
> $(GIMPLE_MATCH_PD_SEQ_O) \
> gimple-match-exports.o \
> $(GENERIC_MATCH_PD_SEQ_O) \
> + gimple-lower-accesses.o \
> insn-attrtab.o \
> insn-automata.o \
> insn-dfatab.o \
> diff --git a/gcc/gimple-lower-accesses.cc b/gcc/gimple-lower-accesses.cc
> new file mode 100644
> index 00000000000..9d87acfba56
> --- /dev/null
> +++ b/gcc/gimple-lower-accesses.cc
> @@ -0,0 +1,463 @@
> +/* GIMPLE pass to lower bit-field and opposite endian accesses.
> +
> + Copyright (C) 2017-2023 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "backend.h"
> +#include "rtl.h"
> +#include "tree.h"
> +#include "gimple.h"
> +#include "cfghooks.h"
> +#include "tree-pass.h"
> +#include "ssa.h"
> +#include "fold-const.h"
> +#include "stor-layout.h"
> +#include "tree-eh.h"
> +#include "gimplify.h"
> +#include "gimple-iterator.h"
> +#include "gimplify-me.h"
> +#include "tree-cfg.h"
> +#include "tree-dfa.h"
> +#include "tree-ssa.h"
> +#include "tree-ssa-propagate.h"
> +#include "tree-hasher.h"
> +#include "cfgloop.h"
> +#include "cfganal.h"
> +#include "alias.h"
> +#include "expr.h"
> +#include "tree-pretty-print.h"
> +
> +namespace {
> +
> +class lower_accesses
> +{
> + function *fn;
> +public:
> + lower_accesses (function *f) : fn(f) {}
> + unsigned int execute (void);
> +};
please merge this with the gimple_opt_pass class
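A sketch of the merged form, reusing the pass_data and the maybe_expand_*
helpers the file already defines (not the submitted code):

class pass_lower_accesses : public gimple_opt_pass
{
public:
  pass_lower_accesses (gcc::context *ctxt)
    : gimple_opt_pass (pass_data_lower_accesses, ctxt)
  {}

  /* opt_pass methods: */
  opt_pass * clone () { return new pass_lower_accesses (m_ctxt); }
  virtual bool gate (function *) { return true; }

  /* Walk the IL directly here instead of in a separate helper class.  */
  virtual unsigned int execute (function *fun)
  {
    basic_block bb;
    bool updated = false;

    FOR_EACH_BB_FN (bb, fun)
      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
	   gsi_next (&gsi))
	{
	  gimple *stmt = gsi_stmt (gsi);
	  /* Only plain assignments that cannot throw are lowered.  */
	  if (!is_gimple_assign (stmt) || stmt_could_throw_p (fun, stmt))
	    continue;
	  if (gimple_store_p (stmt))
	    updated |= maybe_expand_lhs (&gsi, stmt, gimple_assign_lhs (stmt));
	  else if (gimple_assign_load_p (stmt))
	    updated |= maybe_expand_rhs (&gsi, stmt,
					 gimple_assign_rhs1 (stmt));
	}

    return updated ? TODO_update_ssa | TODO_rebuild_alias : 0;
  }
}; // class pass_lower_accesses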
> +
> +
> +/* Handle reference to a bitfield EXPR.
> + If BITPOS_P is non-null assume that reference is LHS and set *BITPOS_P
> + to bit position of the field.
> + If REF_P is non-null set it to the memory reference for the encompassing
> + allocation unit.
there is no REF_P argument.
BITSIZE_P and PREVERSEP are undocumented.
> + Note *BITPOS_P is suitable only for BIT_FIELD_REF/BIT_FIELD_INSERT. */
> +
> +tree
> +extract_bitfield (tree expr, tree *bitpos_p, tree *bitsize_p,
> + bool *preversep)
> +{
> + tree base, addr, ref;
> + tree base_type;
> + tree bytepos;
> + machine_mode mode1;
> + int unsignedp = false, volatilep = false, reversep = false;
> +
> + poly_int64 bitsize, bitpos;
> + HOST_WIDE_INT ibitsize = 0, ibitpos = 0;
> +
> + poly_uint64 bitregion_start = 0;
> + poly_uint64 bitregion_end = 0;
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> + fprintf (dump_file, "Trying to expand bitfield reference: \n");
> + print_generic_expr (dump_file, expr);
> + fprintf (dump_file, "\n");
> + }
> +
> + base = get_inner_reference (expr, &bitsize, &bitpos, &bytepos, &mode1,
> + &unsignedp, &reversep, &volatilep);
You are talking about BIT_FIELD_INSERT above but are using get_inner_reference,
which doesn't work on that. I wonder if you want to just look at the
outermost ref instead.
> +
> + if (!bitsize.is_constant (&ibitsize)
> + || !bitpos.is_constant (&ibitpos))
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "failed, bitsize or bitnum are non-constants.\n");
> + return NULL;
> + }
> +
> + if (volatilep)
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "failed, volatile.\n\n");
> + return NULL;
> + }
> +
> + /* Make sure bitpos is not negative, it can wreak havoc later. */
> + if (ibitpos < 0)
> + {
> + gcc_assert (bytepos == NULL_TREE);
> + bytepos = size_int (ibitpos >> (BITS_PER_UNIT == 8
> + ? 3 : exact_log2 (BITS_PER_UNIT)));
> + ibitpos &= BITS_PER_UNIT - 1;
> + bitpos = ibitpos;
> + }
> +
> + if (!bytepos)
> + bytepos = size_int (0);
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> + fprintf (dump_file, "base: ");
> + print_generic_expr (dump_file, base);
> + fprintf (dump_file, " orig bitpos: ");
> + print_dec (bitpos, dump_file);
> + fprintf (dump_file, " bytepos: ");
> + print_generic_expr (dump_file, bytepos);
> + fprintf (dump_file, "\n");
> + }
> +
> + get_bit_range (&bitregion_start, &bitregion_end, expr, &bitpos, &bytepos);
> +
> + if (!bitpos.is_constant (&ibitpos))
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "failed, bitpos after get_bit_range is non-constant.\n");
> + return NULL;
> + }
> +
> + int align = TYPE_ALIGN (TREE_TYPE (base));
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> + fprintf (dump_file, "after bit_range bitpos: ");
> + print_dec (bitpos, dump_file);
> + fprintf (dump_file, " bytepos: ");
> + print_generic_expr (dump_file, bytepos);
> + fprintf (dump_file, " bitregion_start: ");
> + print_dec (bitregion_start, dump_file);
> + fprintf (dump_file, " bitregion_end: ");
> + print_dec (bitregion_end, dump_file);
> + fprintf (dump_file, " align: %d word_size: %d.\n", align,
> + BITS_PER_WORD);
> + }
> +
> + scalar_int_mode typemode;
> + int modebitsize;
> +
> + /* If the base is a non-addressable, non-global decl with a scalar integer
> + mode, then use the base instead of finding the best mode. This allows
> + structs that live in registers to avoid a double insert. */
> + if (DECL_P (base) && !TREE_ADDRESSABLE (base)
> + && !is_global_var (base)
> + && TREE_CODE (bytepos) == INTEGER_CST
> + && compare_tree_int (bytepos, 0) == 0
> + && DECL_MODE (base) != BLKmode
> + && is_a <scalar_int_mode> (DECL_MODE (base), &typemode))
> + {
> + base_type = build_nonstandard_integer_type (GET_MODE_PRECISION (typemode),
> + UNSIGNED);
> + if (reversep)
> + ibitpos = TYPE_PRECISION (base_type) - ibitpos - ibitsize;
> +
> + *bitpos_p = bitsize_int (ibitpos);
> + *bitsize_p = bitsize_int (ibitsize);
> + *preversep = reversep;
> + ref = build1 (VIEW_CONVERT_EXPR, base_type, base);
> + return ref;
> + }
> +
> + if (!get_best_mode (ibitsize, ibitpos, bitregion_start, bitregion_end,
> + align,
> + BITS_PER_WORD, false, &typemode))
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "failed, get_best_mode return false.\n");
> + return NULL;
> + }
> +
> + if (typemode == VOIDmode)
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "failed, best mode was VOIDmode.\n");
> + return NULL;
> + }
> +
> + modebitsize = GET_MODE_BITSIZE (typemode);
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "best mode: %smode.\n", GET_MODE_NAME (typemode));
> +
> + addr = build_fold_addr_expr (unshare_expr (base));
Note we have entities we cannot take the address of. You also need to mark
things addressable after this. It would be preferable to keep the original
reference here; you can for example replace a bitfield FIELD_DECL with its
representative.
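A possible shape for that (hypothetical helper; name and details are
illustrative only):

/* Build a reference to the representative that covers the bitfield
   COMPONENT_REF EXPR, keeping the original base reference so nothing
   needs its address taken.  */

static tree
build_representative_ref (tree expr)
{
  tree field = TREE_OPERAND (expr, 1);
  tree repr = DECL_BIT_FIELD_REPRESENTATIVE (field);
  if (!repr)
    return NULL_TREE;
  /* Same object, accessed through the representative FIELD_DECL.  */
  return build3 (COMPONENT_REF, TREE_TYPE (repr),
		 unshare_expr (TREE_OPERAND (expr, 0)), repr, NULL_TREE);
}

The bit position of the field inside that unit then follows from the
DECL_FIELD_OFFSET/DECL_FIELD_BIT_OFFSET difference between the field and its
representative.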
> + addr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (addr),
> + addr,
> + bytepos);
> +
> + base_type = build_nonstandard_integer_type (GET_MODE_PRECISION (typemode),
> + UNSIGNED);
> + bytepos = size_int ((ibitpos / modebitsize) * GET_MODE_SIZE (typemode));
> + ibitpos = ibitpos % modebitsize;
> +
> + /* Handle the case where the field spans two units. */
> + /* FIXME: Handle this case. */
> + if ((ibitpos / modebitsize)
> + != ((ibitpos + ibitsize - 1)) / modebitsize)
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "failed, Spans two units.\n\n");
> + return NULL;
> + }
> +
> + if (reversep)
> + ibitpos = TYPE_PRECISION (base_type) - ibitpos - ibitsize;
> +
> + /* Fetch full unit containing the bitfield. */
> + addr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (addr),
> + addr,
> + bytepos);
> + ref = build2 (MEM_REF, base_type, addr,
> + build_int_cst (reference_alias_ptr_type (expr), 0));
> +
> + /* Tell upstream where the bitfield is located. */
> + *bitpos_p = bitsize_int (ibitpos);
> + *bitsize_p = bitsize_int (ibitsize);
> + *preversep = reversep;
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> + fprintf (dump_file, "base memory unit : ");
> + print_generic_expr (dump_file, ref);
> + fprintf (dump_file, " %s endian ", reversep ? "opposite" : "native" );
> + fprintf (dump_file, " bit pos: ");
> + print_generic_expr (dump_file, *bitpos_p);
> + fprintf (dump_file, " bit size: ");
> + print_generic_expr (dump_file, *bitsize_p);
> + fprintf (dump_file, "\n");
> + }
> + return ref;
> +}
> +
> +/* Create a function call to byteswap with RHS as the argument if possible.
> + Returns NULL if it is not possible. */
What's the reason to lower non-native endian accesses together with the
bitfield lowering change? I think this should be applied separately. What's
the motivation for this?
> +tree
> +create_bswap (tree rhs)
> +{
> + tree fndecl;
> + switch (TYPE_PRECISION (TREE_TYPE (rhs)))
> + {
> + case 8:
> + return rhs;
> + case 16:
> + fndecl = builtin_decl_explicit (BUILT_IN_BSWAP16);
> + break;
> + case 32:
> + fndecl = builtin_decl_explicit (BUILT_IN_BSWAP32);
> + break;
> + case 64:
> + fndecl = builtin_decl_explicit (BUILT_IN_BSWAP64);
> + break;
> + default:
> + return NULL;
> + }
> +
> + if (fndecl == NULL)
> + return NULL;
> +
> + tree inner_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
> + rhs = fold_convert (inner_type, rhs);
> + return build_call_expr (fndecl, 1, rhs);
> +}
> +
> +/* Expand the left hand side (LHS) of STMT, placing the new statements
> + at GSI. */
> +bool
> +maybe_expand_lhs (gimple_stmt_iterator *gsi, gimple *stmt, tree lhs)
> +{
> + if (TREE_CODE (lhs) != COMPONENT_REF
> + || !DECL_BIT_FIELD_TYPE (TREE_OPERAND (lhs, 1)))
> + return false;
> +
> + tree var, bitpos, bitsize;
> + tree rhs, newrhs, rhs1;
> + bool reversep = false;
> + gimple_seq seq = NULL;
> +
> + var = extract_bitfield (lhs, &bitpos, &bitsize, &reversep);
> + if (!var)
> + return false;
> +
> + rhs = gimple_assign_rhs1 (stmt);
> +
> + push_gimplify_context (true);
> +
> + if (reversep)
> + {
> + rhs1 = create_bswap (unshare_expr (var));
You are punning things to bigger modes; how can you be sure this is the
correct entity to byteswap?
> + /* We cannot create the bswap so don't do the lowering. */
> + if (!rhs1)
> + return false;
> + }
> + else
> + rhs1 = var;
> +
> + newrhs = make_ssa_name (TREE_TYPE (rhs1));
> + gimplify_assign (newrhs, unshare_expr (rhs1), &seq);
>
> + newrhs = fold_build3 (BIT_INSERT_EXPR, TREE_TYPE (newrhs), newrhs, rhs, bitpos);
The BIT_INSERT_EXPR should have the type of the whole object, not that of the
inserted value. 'var' probably?
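That is, roughly (a sketch of the suggested fix, reusing the patch's own
variables):

  newrhs = fold_build3 (BIT_INSERT_EXPR, TREE_TYPE (var), newrhs, rhs, bitpos);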
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> + fprintf (dump_file, "Created bit_insert: ");
> + print_generic_expr (dump_file, newrhs);
> + fprintf (dump_file, "\n");
> + }
> +
> + if (reversep)
> + {
> + newrhs = create_bswap (newrhs);
> + gcc_assert (newrhs);
> + }
> +
> + gimplify_assign (unshare_expr (var), newrhs, &seq);
If you need to gimplify anything you are doing it wrong. Since this is a
lowering pass we want to see exactly what we generate. It would also be nice
to state the goal of this lowering. Given you handle reverse storage order it
could be
1) mimic RTL expansion, to work towards the goal of simplifying it, or
2) expose code closer to RTL at the GIMPLE level to allow more GIMPLE
optimization (what most PRs are about).
I think that 1) is possibly "easier" and the goal should be that all lowered
ops go through RTL expansion without running into {store,extract}_bit_field.
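For the store side that could look roughly like the following, building the
statements directly with the patch's variable names (a sketch, not the
submitted code):

  /* Load the containing unit.  */
  tree unit = make_ssa_name (TREE_TYPE (var));
  gsi_insert_before (gsi, gimple_build_assign (unit, unshare_expr (var)),
		     GSI_SAME_STMT);

  /* Insert the new field value into the unit.  */
  tree ins = make_ssa_name (TREE_TYPE (var));
  gsi_insert_before (gsi,
		     gimple_build_assign (ins, BIT_INSERT_EXPR,
					  unit, rhs, bitpos),
		     GSI_SAME_STMT);

  /* Store the unit back in place of the original bit-field store.  */
  gsi_insert_before (gsi, gimple_build_assign (unshare_expr (var), ins),
		     GSI_SAME_STMT);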
> +
> + pop_gimplify_context (NULL);
> + gsi_insert_seq_after (gsi, seq, GSI_SAME_STMT);
> +
> + unlink_stmt_vdef (stmt);
> + gsi_remove (gsi, true);
> +
> + return true;
> +}
> +
> +/* Expand the right hand side (RHS) of STMT, placing the new statements
> + at GSI. */
> +bool
> +maybe_expand_rhs (gimple_stmt_iterator *gsi, gimple *stmt, tree rhs)
> +{
> + if (TREE_CODE (rhs) != COMPONENT_REF
> + || !DECL_BIT_FIELD_TYPE (TREE_OPERAND (rhs, 1)))
> + return false;
> +
> + tree var, bitpos, bitsize;
> + tree type = TREE_TYPE (rhs);
> + bool reversep = false;
> + gimple_seq seq = NULL;
> +
> + tree lhs = gimple_assign_lhs (stmt);
> +
> + var = extract_bitfield (rhs, &bitpos, &bitsize, &reversep);
> +
> + if (!var)
> + return false;
> +
> + if (reversep)
> + {
> + var = create_bswap (var);
> + /* We cannot create the bswap so don't do the lowering. */
> + if (!var)
> + return false;
> + }
> +
> + push_gimplify_context (true);
> +
> + tree newrhs = make_ssa_name (TREE_TYPE (var));
> + gimplify_assign (newrhs, var, &seq);
> + var = newrhs;
> +
> + var = fold_build3 (BIT_FIELD_REF, type, var, bitsize, bitpos);
> +
> + gimplify_assign (lhs, var, &seq);
> + pop_gimplify_context (NULL);
> + gsi_insert_seq_after (gsi, seq, GSI_SAME_STMT);
> +
> + unlink_stmt_vdef (stmt);
> + gsi_remove (gsi, true);
> +
> + return true;
> +}
> +
> +/* Execute the lowering; lower all bit-field accesses that can be lowered.
> + FIXME: packed bit-fields are not handled.
> + FIXME: Reversed bit-fields that have more than 64bit underlying types are
> + not handled. */
> +unsigned int
> +lower_accesses::execute (void)
> +{
> + basic_block bb;
> + bool updated = false;
> +
> + FOR_EACH_BB_FN (bb, cfun)
> + {
> + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> + gsi_next (&gsi))
> + {
> + gimple *stmt = gsi_stmt (gsi);
> +
> + /* Don't lower accesses which can throw. */
> + if (stmt_could_throw_p (cfun, stmt))
> + continue;
> +
> + if (gimple_code (stmt) != GIMPLE_ASSIGN)
> + continue;
> +
> + tree lhs = gimple_assign_lhs (stmt);
> + if (gimple_store_p (stmt))
> + updated |= maybe_expand_lhs (&gsi, stmt, lhs);
> + else if (gimple_assign_load_p (stmt))
> + updated |= maybe_expand_rhs (&gsi, stmt, gimple_assign_rhs1 (stmt));
> + }
> + }
> +
> +
> + return updated ? TODO_update_ssa | TODO_rebuild_alias : 0;
> +}
> +
> +const pass_data pass_data_lower_accesses =
> +{
> + GIMPLE_PASS, /* type */
> + "lower_accesses", /* name */
> + OPTGROUP_NONE, /* optinfo_flags */
> + TV_NONE, /* tv_id */
> + ( PROP_cfg | PROP_ssa ), /* properties_required */
> + 0, /* properties_provided */
> + 0, /* properties_destroyed */
> + 0, /* todo_flags_start */
> + 0, /* todo_flags_finish */
> +};
> +
> +class pass_lower_accesses : public gimple_opt_pass
> +{
> +public:
> + pass_lower_accesses (gcc::context *ctxt)
> + : gimple_opt_pass (pass_data_lower_accesses, ctxt)
> + {}
> +
> + /* opt_pass methods: */
> + opt_pass * clone () { return new pass_lower_accesses (m_ctxt); }
> + virtual bool gate (function *) { return true; }
> + virtual unsigned int execute (function *);
> +
> +}; // class pass_lower_accesses
> +
> +unsigned int
> +pass_lower_accesses::execute (function *fn)
> +{
> + return lower_accesses (fn).execute ();
> +}
> +
> +} // anonymous namespace
> +
> +gimple_opt_pass *
> +make_pass_lower_accesses (gcc::context *ctxt)
> +{
> + return new pass_lower_accesses (ctxt);
> +}
> diff --git a/gcc/passes.def b/gcc/passes.def
> index faa5208b26b..fb1f17b3d9a 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -234,6 +234,10 @@ along with GCC; see the file COPYING3. If not see
> NEXT_PASS (pass_ch);
> NEXT_PASS (pass_lower_complex);
> NEXT_PASS (pass_sra);
> + /* Lower accesses including bit-field access. Note this should be after
> + the last SRA as SRA has better information with bit-field accesses
> + not lowered. */
> + NEXT_PASS (pass_lower_accesses);
In a first incarnation I would do this even later, before pass_store_merging.
If the goal is to eventually clean up RTL expansion then we'd also need to run
this at -O0.
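In passes.def terms the placement would be roughly (sketch only):

	  /* Lower bit-field and opposite endian accesses.  */
	  NEXT_PASS (pass_lower_accesses);
	  NEXT_PASS (pass_store_merging);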
> /* The dom pass will also resolve all __builtin_constant_p calls
> that are still there to 0. This has to be done after some
> propagations have already run, but before some more dead code
> diff --git a/gcc/testsuite/gcc.dg/store_merging_10.c b/gcc/testsuite/gcc.dg/store_merging_10.c
> index 53a6a74c56f..350acf361c9 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_10.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_10.c
> @@ -53,4 +53,4 @@ main ()
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 0 "store-merging" } } */
> diff --git a/gcc/testsuite/gcc.dg/store_merging_14.c b/gcc/testsuite/gcc.dg/store_merging_14.c
> index 9310aaf3489..a930c3b7057 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_14.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_14.c
> @@ -214,4 +214,4 @@ main ()
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "Merging successful" 9 "store-merging" } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 0 "store-merging" } } */
> diff --git a/gcc/testsuite/gcc.dg/store_merging_16.c b/gcc/testsuite/gcc.dg/store_merging_16.c
> index cd83f1c0fe5..4c80c61893e 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_16.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_16.c
> @@ -152,6 +152,6 @@ main ()
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "Merging successful" 7 "store-merging" } } */
> -/* { dg-final { scan-tree-dump-times "__builtin_bswap64" 2 "store-merging" } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 6 "store-merging" } } */
> +/* { dg-final { scan-tree-dump-times "__builtin_bswap64" 1 "store-merging" } } */
> /* { dg-final { scan-tree-dump-times "__builtin_bswap32" 4 "store-merging" } } */
> diff --git a/gcc/testsuite/gcc.dg/store_merging_20.c b/gcc/testsuite/gcc.dg/store_merging_20.c
> index b15582aa162..5beda6d975e 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_20.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_20.c
> @@ -65,4 +65,4 @@ int main (void)
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "Merging successful" 3 "store-merging" } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 0 "store-merging" } } */
So why do we fail to merge things here? If we now merge earlier we should
instead adjust the dump scanning to look for the number of stores. The same
applies to the other adjusted tests.
Btw, I have a hard time believing this is the only fallout - how did you test
the patch?
Thanks,
Richard.
> diff --git a/gcc/testsuite/gcc.dg/store_merging_21.c b/gcc/testsuite/gcc.dg/store_merging_21.c
> index ec0c8e240b7..302dc244056 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_21.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_21.c
> @@ -38,4 +38,4 @@ void bar2 (struct S2 *s, struct S2 *m, _Bool flag)
> s->size = m->size;
> }
>
> -/* { dg-final { scan-tree-dump-times "Merging successful" 4 "store-merging" } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 0 "store-merging" } } */
> diff --git a/gcc/testsuite/gcc.dg/store_merging_24.c b/gcc/testsuite/gcc.dg/store_merging_24.c
> index 5291bb9e986..16b9804da57 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_24.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_24.c
> @@ -1,8 +1,8 @@
> /* PR tree-optimization/87859 */
> /* { dg-do run } */
> /* { dg-options "-O2 -fno-tree-vectorize -fdump-tree-store-merging-details" } */
> -/* { dg-final { scan-tree-dump "New sequence of \[23] stores to replace old one of 19 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
> -/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 6 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump "New sequence of \[23] stores to replace old one of 7 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 2 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
>
> struct S {
> union F {
> diff --git a/gcc/testsuite/gcc.dg/store_merging_25.c b/gcc/testsuite/gcc.dg/store_merging_25.c
> index 96611b5e57b..196af0e2bcd 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_25.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_25.c
> @@ -1,8 +1,8 @@
> /* PR tree-optimization/87859 */
> /* { dg-do run } */
> /* { dg-options "-O2 -fno-tree-vectorize -fdump-tree-store-merging-details" } */
> -/* { dg-final { scan-tree-dump "New sequence of \[23] stores to replace old one of 14 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
> -/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 6 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump "New sequence of \[23] stores to replace old one of 3 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 2 stores" "store-merging" { target i?86-*-* x86_64-*-* } } } */
>
> struct S {
> union F {
> diff --git a/gcc/testsuite/gcc.dg/store_merging_6.c b/gcc/testsuite/gcc.dg/store_merging_6.c
> index 314829da6d3..2a4e03124bd 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_6.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_6.c
> @@ -50,4 +50,4 @@ main (void)
> }
>
>
> -/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 0 "store-merging" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030714-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20030714-1.c
> deleted file mode 100644
> index 9f0daac22dc..00000000000
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030714-1.c
> +++ /dev/null
> @@ -1,45 +0,0 @@
> -/* { dg-do compile } */
> -/* { dg-options "-O1 -fdump-tree-dom2" } */
> -
> -struct rtx_def;
> -typedef struct rtx_def *rtx;
> -enum rtx_code
> -{
> - REG,
> - LAST_AND_UNUSED_RTX_CODE = 256
> -};
> -typedef union rtunion_def rtunion;
> -struct rtx_def
> -{
> - enum rtx_code code:16;
> - unsigned frame_related:1;
> -};
> -
> -static rtx
> -find_base_value (src)
> - rtx src;
> -{
> - rtx temp;
> - rtx src_0, src_2;
> - rtx src_1, src_3;
> -
> - if ((src_0->code == REG) && (({src_2;})->frame_related))
> - return find_base_value (src_0);
> - if ((src_1->code == REG) && (({ src_3;})->frame_related))
> - return find_base_value (src_1);
> - if (src_0->code == REG)
> - find_base_value (src_0);
> - if (src_1->code == REG)
> - find_base_value (src_1);
> -}
> -
> -rtx
> -find_base_value_wrapper (src)
> - rtx src;
> -{
> - return find_base_value (src);
> -}
> -
> -/* There should be no casts to short unsigned int. */
> -/* { dg-final { scan-tree-dump-times "\\(short unsigned int\\)" 0 "dom2"} } */
> -
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030729-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20030729-1.c
> index 59d3a1baffc..0d9d4853124 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030729-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030729-1.c
> @@ -47,7 +47,6 @@ readonly_fields_p (type)
> always produce the same result. */
> /* { dg-final { scan-tree-dump-times "\\(unsigned int\\)" 0 "dom2"} } */
>
> -/* There should be one load of ->common.code. We currently fail this
> - because we load from ->common.code using different types. */
> -/* { dg-final { scan-tree-dump-times "common\.code" 1 "dom2"} } */
> +/* There should be no load of ->common.code. */
> +/* { dg-final { scan-tree-dump-times "common\.code" 0 "dom2"} } */
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030814-6.c b/gcc/testsuite/gcc.dg/tree-ssa/20030814-6.c
> index e4b8d43d299..cca6bbda4de 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030814-6.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030814-6.c
> @@ -38,6 +38,5 @@ foo (t, set)
> else
> return 0;
> }
> -/* There should be precisely one load of common.code. If there is
> - more than one, then the dominator optimizations failed. */
> -/* { dg-final { scan-tree-dump-times "common.code" 1 "dom2" } } */
> +/* There should be no load of common.code. */
> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "dom2" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-14.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-14.c
> index 64c761ca55a..b2e8745d61f 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-14.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-14.c
> @@ -58,4 +58,4 @@ main (void)
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "Loop_pair<outer:., inner:.> is interchanged" 1 "linterchange"} } */
> +/* { dg-final { scan-tree-dump-times "Loop_pair<outer:., inner:.> is interchanged" 0 "linterchange"} } */
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 6cdaed7d4b2..6972763d9f3 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -422,6 +422,7 @@ extern gimple_opt_pass *make_pass_lower_complex_O0 (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_complex (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_switch (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_switch_O0 (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_lower_accesses (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_vector (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_lower_vector_ssa (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_omp_oacc_kernels_decompose (gcc::context *ctxt);
> --
> 2.30.2
>