[0/5] Add support for operand-specific alignment requirements

Message ID 20231112145229.2924713-1-richard.sandiford@arm.com

Message

Richard Sandiford Nov. 12, 2023, 2:52 p.m. UTC
SME has various instructions that require aligned register tuples.
However, the associated tuple modes are already widely used and do
not need to be aligned in other contexts.  It therefore isn't
appropriate to force alignment in TARGET_HARD_REGNO_MODE_OK.

There are also strided loads and stores that require:

- (regno & 0x8) == 0 for 2-register tuples
- (regno & 0xc) == 0 for 4-register tuples

Although the requirements for strided loads and stores could be
enforced by C++ conditions on the insn, it's convenient to handle
them in the same way as alignment.
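
As an illustration only, the two strided conditions above can be written
as a small predicate.  This is a sketch, not code from the series: the
names strided_start_ok and valid_starts are hypothetical, and regno is
assumed to be numbered relative to the first vector register in a bank
of 32.

```cpp
#include <vector>

// Hypothetical helper: true if REGNO may be the first register of a
// strided tuple of NREGS registers.  REGNO is assumed to be relative
// to the first vector register (so the first register is 0).
bool strided_start_ok(unsigned regno, unsigned nregs)
{
  switch (nregs)
    {
    case 2:
      return (regno & 0x8) == 0;   // condition for 2-register tuples
    case 4:
      return (regno & 0xc) == 0;   // condition for 4-register tuples
    default:
      return false;                // other sizes are not strided forms
    }
}

// Enumerate every valid start register in an assumed bank of BANK_SIZE.
std::vector<unsigned> valid_starts(unsigned nregs, unsigned bank_size = 32)
{
  std::vector<unsigned> starts;
  for (unsigned regno = 0; regno < bank_size; ++regno)
    if (strided_start_ok(regno, nregs))
      starts.push_back(regno);
  return starts;
}
```

Under those assumptions, valid_starts (2) yields registers 0-7 and 16-23,
and valid_starts (4) yields registers 0-3 and 16-19.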

This series of patches therefore adds a way for register constraints
to specify which start registers are valid and which aren't.  Most of
the details are in the covering note to the first patch.
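
To make the idea concrete, here is a minimal sketch of how a filter on
the start register could combine with a register class.  It is not the
implementation in the patches: RegSet, kNumRegs, filter_to_set and
allowed_starts are hypothetical names, the bank size of 32 is assumed,
and "aligned" is assumed to mean that the start register is a multiple
of the tuple size.

```cpp
#include <bitset>
#include <functional>

constexpr unsigned kNumRegs = 32;        // assumed size of the register bank
using RegSet = std::bitset<kNumRegs>;    // one bit per candidate start register

// Hypothetical: evaluate FILTER once per register and record the result,
// so that later checks are a single bit test.
RegSet filter_to_set(const std::function<bool(unsigned)> &filter)
{
  RegSet set;
  for (unsigned regno = 0; regno < kNumRegs; ++regno)
    if (filter(regno))
      set.set(regno);
  return set;
}

// A start register is usable for an operand if it belongs to the
// constraint's register class and is accepted by the constraint's filter.
RegSet allowed_starts(const RegSet &reg_class, const RegSet &filter_set)
{
  return reg_class & filter_set;
}
```

For example, filter_to_set ([] (unsigned regno) { return regno % 2 == 0; })
describes an assumed 2-register alignment requirement.  Intersecting it
with a class leaves only the even-numbered members of that class, while
the tuple mode itself stays unrestricted everywhere else.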

This clearly changes a performance-sensitive part of the compiler.
I've tried to keep the overhead small for targets that use the new
feature.  Almost all of the new code gets optimised away on targets
that don't use the feature.

Richard Sandiford (5):
  Add register filter operand to define_register_constraint
  recog: Handle register filters
  lra: Handle register filters
  ira: Handle register filters
  Add an aligned_register_operand predicate

 gcc/common.md          |  28 ++++++++
 gcc/doc/md.texi        |  41 +++++++++++-
 gcc/doc/tm.texi        |   3 +-
 gcc/doc/tm.texi.in     |   3 +-
 gcc/genconfig.cc       |   2 +
 gcc/genpreds.cc        | 146 ++++++++++++++++++++++++++++++++++++++++-
 gcc/gensupport.cc      |  48 +++++++++++++-
 gcc/gensupport.h       |   3 +
 gcc/ira-build.cc       |   8 +++
 gcc/ira-color.cc       |  10 +++
 gcc/ira-int.h          |  14 ++++
 gcc/ira-lives.cc       |  61 +++++++++++++++++
 gcc/lra-constraints.cc |  13 +++-
 gcc/recog.cc           |  14 +++-
 gcc/recog.h            |  24 ++++++-
 gcc/reginfo.cc         |   5 ++
 gcc/rtl.def            |   6 +-
 gcc/target-globals.cc  |   6 +-
 gcc/target-globals.h   |   3 +
 19 files changed, 421 insertions(+), 17 deletions(-)
  

Comments

Vladimir Makarov Nov. 14, 2023, 12:01 a.m. UTC | #1
Collecting the constraints of all occurrences for IRA might result in
worse allocation (when a pseudo is spilled because of this) compared
with using a wider hard reg set and generating reload insns for the
pseudo occurrences that require stricter constraints.  Regional RA
mitigates this issue.  In any case, the IRA changes are an improvement
over using only hard_regno_mode_ok.  Using smaller constraints in
certain cases for pseudos spilled after using the biggest constraint
is just an idea for further RA improvement for targets using the
filters.  The only question is whether it is worth implementing.
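
A rough sketch of the effect described above, under stated assumptions:
each occurrence of a pseudo contributes a set of acceptable start
registers, and the allocator intersects them all.  The names RegSet,
make_set and intersect_all are hypothetical, the bank size of 32 is
assumed, and the pairing of the two filters on one pseudo is only an
illustration.

```cpp
#include <bitset>
#include <vector>

constexpr unsigned kNumRegs = 32;        // assumed size of the register bank
using RegSet = std::bitset<kNumRegs>;

// Hypothetical: the set of start registers accepted by FILTER.
RegSet make_set(bool (*filter)(unsigned))
{
  RegSet set;
  for (unsigned regno = 0; regno < kNumRegs; ++regno)
    if (filter(regno))
      set.set(regno);
  return set;
}

// Intersect the sets required by every occurrence of one pseudo.
// The result is what remains if a single hard register must satisfy
// all occurrences without any reloads.
RegSet intersect_all(const std::vector<RegSet> &occurrences)
{
  RegSet result;
  result.set();                          // start from "every register"
  for (const RegSet &set : occurrences)
    result &= set;
  return result;
}
```

With one occurrence that needs regno % 4 == 0 and another that needs
(regno & 0xc) == 0, each occurrence alone accepts 8 start registers, but
the intersection accepts only registers 0 and 16.  A smaller set of this
kind is what makes a spill more likely than it would be with a wider set
plus reloads at the stricter occurrences.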

All IRA/LRA/reginfo patches are OK for me.  IMHO the other changes are
straightforward enough not to need someone else to review them.

Thank you, Richard.