[pushed] aarch64: Vector move fixes for +nosimd

This patch fixes various issues around the handling of vectors
and (particularly) vector structures with +nosimd.  Previously,
passing and returning structures would trigger an ICE, since:

* we didn't allow the structure modes to be stored in FPRs

* we didn't provide +nosimd move patterns

* splitting the moves into word-sized pieces (the default
  strategy without move patterns) doesn't work because the
  registers are doubleword sized.
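
For illustration, here is a minimal sketch of the kind of source that
passes and returns a structure of vectors.  It is not one of the new
tests: the type and function names are hypothetical, and it uses the
GCC vector extension instead of arm_neon.h so that it builds on any
target.  To exercise the +nosimd paths it would need to be compiled
for AArch64 with something like -march=armv8-a+nosimd; whether this
exact form reproduces the old ICE has not been checked.

```c
#include <assert.h>
#include <string.h>

/* 128-bit vector of bytes, written with the GCC/Clang vector extension
   rather than arm_neon.h so that the sketch builds on any target.  */
typedef unsigned char v16qi __attribute__ ((vector_size (16)));

/* Hypothetical structure of two 128-bit vectors.  */
struct v16qi_pair
{
  v16qi val[2];
};

/* Pass two structures by value and return the second.  */
struct v16qi_pair
return_second (struct v16qi_pair a, struct v16qi_pair b)
{
  (void) a;
  return b;
}

/* Return a copy of P with its two vectors exchanged.  */
struct v16qi_pair
swap_pair (struct v16qi_pair p)
{
  struct v16qi_pair res;
  res.val[0] = p.val[1];
  res.val[1] = p.val[0];
  return res;
}

/* Target-independent sanity check of the two functions above.  */
void
check_pair_moves (void)
{
  struct v16qi_pair a, b, r;
  memset (&a, 1, sizeof (a));
  memset (&b, 2, sizeof (b));

  r = return_second (a, b);
  assert (memcmp (&r, &b, sizeof (r)) == 0);

  a.val[1] = b.val[0];
  r = swap_pair (a);
  assert (memcmp (&r.val[0], &b.val[0], sizeof (v16qi)) == 0);
}
```

Calling check_pair_moves () only confirms that the values survive the
moves; the interesting part on AArch64 is which instructions the
compiler picks for them.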

The patch is a bit of a hodge-podge since a lot of the handling of
moves, register costs, and register legitimacy is so interconnected.
It didn't seem feasible to split things further.

Some notes:

* The patch recognises vector and tuple modes based on TARGET_FLOAT
  rather than TARGET_SIMD, and instead adds TARGET_SIMD to places
  that really do need the vector ISA.  This is necessary for the
  modes to be handled correctly in register arguments and returns.

* The 64-bit (DREG) STP peephole required TARGET_SIMD but the
  LDP peephole didn't.  I think the LDP one is right, since
  DREG moves could involve GPRs as well as FPRs.  The patch
  therefore removes the condition from the STP peephole.

* The patch keeps the existing choices of instructions for
  TARGET_SIMD, just in case they happen to be better than FMOV
  on some uarches.

* Before the patch, +nosimd Q<->Q moves of 128-bit scalars went via
  a GPR, thanks to a secondary reload pattern.  This approach might
  not be ideal, but there's no reason that 128-bit vectors should
  behave differently from 128-bit scalars.  The patch therefore
  extends the current scalar approach to vectors.

* Multi-vector LD1 and ST1 require TARGET_SIMD, so the TARGET_FLOAT
  structure moves need to use LDP/STP and LDR/STR combinations
  instead.  That's also what we do for big-endian even with
  TARGET_SIMD, so most of the code was already there.  The patterns
  for structures of 64-bit vectors are identical, but the patterns
  for structures of 128-bit vectors need to cope with the lack of
  128-bit Q<->Q moves.

  It isn't feasible to move multi-vector tuples via GPRs, so the
  patch moves them via memory instead.  This contaminates the port
  with its first secondary memory reload.
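
As a rough summary of the FPR-to-FPR cases in the notes above, the
following sketch models which route a move takes.  It is only an
illustrative model under stated assumptions, not the code in
aarch64.cc: the enum, the function name and its parameters are all
hypothetical, and "direct" simply means that no GPR or memory
temporary is needed.

```c
#include <stdbool.h>

/* Hypothetical labels for how an FPR-to-FPR move is carried out.  */
enum move_strategy
{
  MOVE_DIRECT,     /* FPR-to-FPR instructions only.  */
  MOVE_VIA_GPR,    /* Secondary reload through general registers.  */
  MOVE_VIA_MEMORY  /* Secondary memory reload.  */
};

/* Model of the choices described in the notes.  HAVE_SIMD stands for
   TARGET_SIMD, VECTOR_BITS is the size of each vector (64 or 128) and
   NUM_VECTORS is 1 for a single vector or 2-4 for a tuple.  */
enum move_strategy
fpr_to_fpr_strategy (bool have_simd, unsigned int vector_bits,
                     unsigned int num_vectors)
{
  /* With the vector ISA, the existing instruction choices are kept.  */
  if (have_simd)
    return MOVE_DIRECT;

  /* 64-bit vectors, and tuples of them, can be moved one D register
     at a time without the vector ISA.  */
  if (vector_bits == 64)
    return MOVE_DIRECT;

  /* A single 128-bit vector follows the 128-bit scalar approach.  */
  if (num_vectors == 1)
    return MOVE_VIA_GPR;

  /* Tuples of 128-bit vectors are too big to send through GPRs.  */
  return MOVE_VIA_MEMORY;
}
```

For example, fpr_to_fpr_strategy (false, 128, 3) gives MOVE_VIA_MEMORY
in this model, while fpr_to_fpr_strategy (false, 64, 3) gives
MOVE_DIRECT.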

Tested on aarch64-linux-gnu & pushed.

Richard

gcc/

	* config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Use
	TARGET_FLOAT instead of TARGET_SIMD.
	(aarch64_vectorize_related_mode): Restrict ADVSIMD handling to
	TARGET_SIMD.
	(aarch64_hard_regno_mode_ok): Don't allow tuples of 2 64-bit vectors
	in GPRs.
	(aarch64_classify_address): Treat little-endian structure moves
	like big-endian for TARGET_FLOAT && !TARGET_SIMD.
	(aarch64_secondary_memory_needed): New function.
	(aarch64_secondary_reload): Handle 128-bit Advanced SIMD vectors
	in the same way as TF, TI and TD.
	(aarch64_rtx_mult_cost): Restrict ADVSIMD handling to TARGET_SIMD.
	(aarch64_rtx_costs): Likewise.
	(aarch64_register_move_cost): Treat a pair of 64-bit vectors
	separately from a single 128-bit vector.  Handle the cost implied
	by aarch64_secondary_memory_needed.
	(aarch64_simd_valid_immediate): Restrict ADVSIMD handling to
	TARGET_SIMD.
	(aarch64_expand_vec_perm_const_1): Likewise.
	(TARGET_SECONDARY_MEMORY_NEEDED): New macro.
	* config/aarch64/iterators.md (VTX): New iterator.
	* config/aarch64/aarch64.md (arches): Add fp_q as a synonym of simd.
	(arch_enabled): Adjust accordingly.
	(@aarch64_reload_mov<TX:mode>): Extend to...
	(@aarch64_reload_mov<VTX:mode>): ...this.
	* config/aarch64/aarch64-simd.md (mov<mode>): Require TARGET_FLOAT
	rather than TARGET_SIMD.
	(movmisalign<mode>): Likewise.
	(load_pair<DREG:mode><DREG2:mode>): Likewise.
	(vec_store_pair<DREG:mode><DREG2:mode>): Likewise.
	(load_pair<VQ:mode><VQ2:mode>): Likewise.
	(vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
	(@aarch64_split_simd_mov<mode>): Likewise.
	(aarch64_get_low<mode>): Likewise.
	(aarch64_get_high<mode>): Likewise.
	(aarch64_get_half<mode>): Likewise.  Canonicalize to a move for
	lowpart extracts.
	(*aarch64_simd_mov<VDMOV:mode>): Require TARGET_FLOAT rather than
	TARGET_SIMD.  Use different w<-w and r<-w instructions for
	!TARGET_SIMD.  Disable immediate moves for !TARGET_SIMD but
	add an alternative specifically for w<-Z.
	(*aarch64_simd_mov<VQMOV:mode>): Require TARGET_FLOAT rather than
	TARGET_SIMD.  Likewise for the associated define_splits.  Disable
	FPR moves and immediate moves for !TARGET_SIMD but add an alternative
	specifically for w<-Z.
	(aarch64_simd_mov_from_<mode>high): Require TARGET_FLOAT rather than
	TARGET_SIMD.  Restrict the existing alternatives to TARGET_SIMD
	but add a new r<-w one for !TARGET_SIMD.
	(*aarch64_get_high<mode>): New pattern.
	(load_pair_lanes<mode>): Require TARGET_FLOAT rather than TARGET_SIMD.
	(store_pair_lanes<mode>): Likewise.
	(*aarch64_combine_internal<mode>): Likewise.  Restrict existing
	w<-w, w<-r and w<-m alternatives to TARGET_SIMD but add a new w<-r
	alternative for !TARGET_SIMD.
	(*aarch64_combine_internal_be<mode>): Likewise.
	(aarch64_combinez<mode>): Require TARGET_FLOAT rather than TARGET_SIMD.
	Remove bogus arch attribute.
	(*aarch64_combinez_be<mode>): Likewise.
	(@aarch64_vec_concat<mode>): Require TARGET_FLOAT rather than
	TARGET_SIMD.
	(aarch64_combine<mode>): Likewise.
	(aarch64_rev_reglist<mode>): Likewise.
	(mov<mode>): Likewise.
	(*aarch64_be_mov<VSTRUCT_2D:mode>): Extend to TARGET_FLOAT &&
	!TARGET_SIMD, regardless of endianness.  Extend associated
	define_splits in the same way, both for this pattern and the
	ones below.
	(*aarch64_be_mov<VSTRUCT_2Q:mode>): Likewise.  Restrict w<-w
	alternative to TARGET_SIMD.
	(*aarch64_be_movoi): Likewise.
	(*aarch64_be_movci): Likewise.
	(*aarch64_be_movxi): Likewise.
	(*aarch64_be_mov<VSTRUCT_3QD:mode>): Extend to TARGET_FLOAT
	&& !TARGET_SIMD, regardless of endianness.  Restrict w<-w alternative
	to TARGET_SIMD for tuples of 128-bit vectors.
	(*aarch64_be_mov<VSTRUCT_4QD:mode>): Likewise.
	* config/aarch64/aarch64-ldpstp.md: Remove TARGET_SIMD condition
	from DREG STP peephole.  Change TARGET_SIMD to TARGET_FLOAT in
	the VQ and VP_2E LDP and STP peepholes.

gcc/testsuite/
	* gcc.target/aarch64/ldp_stp_20.c: New test.
	* gcc.target/aarch64/ldp_stp_21.c: Likewise.
	* gcc.target/aarch64/ldp_stp_22.c: Likewise.
	* gcc.target/aarch64/ldp_stp_23.c: Likewise.
	* gcc.target/aarch64/ldp_stp_24.c: Likewise.
	* gcc.target/aarch64/movv16qi_1.c (gpr_to_gpr): New function.
	* gcc.target/aarch64/movv8qi_1.c (gpr_to_gpr): Likewise.
	* gcc.target/aarch64/movv16qi_2.c: New test.
	* gcc.target/aarch64/movv16qi_3.c: Likewise.
	* gcc.target/aarch64/movv2di_1.c: Likewise.
	* gcc.target/aarch64/movv2x16qi_1.c: Likewise.
	* gcc.target/aarch64/movv2x8qi_1.c: Likewise.
	* gcc.target/aarch64/movv3x16qi_1.c: Likewise.
	* gcc.target/aarch64/movv3x8qi_1.c: Likewise.
	* gcc.target/aarch64/movv4x16qi_1.c: Likewise.
	* gcc.target/aarch64/movv4x8qi_1.c: Likewise.
	* gcc.target/aarch64/movv8qi_2.c: Likewise.
	* gcc.target/aarch64/movv8qi_3.c: Likewise.
	* gcc.target/aarch64/vect_unary_2.c: Likewise.
---
 gcc/config/aarch64/aarch64-ldpstp.md          |  11 +-
 gcc/config/aarch64/aarch64-simd.md            | 199 +++++++++++-------
 gcc/config/aarch64/aarch64.cc                 |  94 ++++++---
 gcc/config/aarch64/aarch64.md                 |  11 +-
 gcc/config/aarch64/iterators.md               |   2 +
 gcc/testsuite/gcc.target/aarch64/ldp_stp_20.c |   7 +
 gcc/testsuite/gcc.target/aarch64/ldp_stp_21.c |   7 +
 gcc/testsuite/gcc.target/aarch64/ldp_stp_22.c |  13 ++
 gcc/testsuite/gcc.target/aarch64/ldp_stp_23.c |  16 ++
 gcc/testsuite/gcc.target/aarch64/ldp_stp_24.c |  16 ++
 gcc/testsuite/gcc.target/aarch64/movv16qi_1.c |  21 ++
 gcc/testsuite/gcc.target/aarch64/movv16qi_2.c |  27 +++
 gcc/testsuite/gcc.target/aarch64/movv16qi_3.c |  30 +++
 gcc/testsuite/gcc.target/aarch64/movv2di_1.c  | 103 +++++++++
 .../gcc.target/aarch64/movv2x16qi_1.c         |  40 ++++
 .../gcc.target/aarch64/movv2x8qi_1.c          |  38 ++++
 .../gcc.target/aarch64/movv3x16qi_1.c         |  44 ++++
 .../gcc.target/aarch64/movv3x8qi_1.c          |  41 ++++
 .../gcc.target/aarch64/movv4x16qi_1.c         |  44 ++++
 .../gcc.target/aarch64/movv4x8qi_1.c          |  42 ++++
 gcc/testsuite/gcc.target/aarch64/movv8qi_1.c  |  15 ++
 gcc/testsuite/gcc.target/aarch64/movv8qi_2.c  |  27 +++
 gcc/testsuite/gcc.target/aarch64/movv8qi_3.c  |  30 +++
 .../gcc.target/aarch64/vect_unary_2.c         |   5 +
 24 files changed, 774 insertions(+), 109 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_stp_20.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_stp_21.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_stp_22.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_stp_23.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_stp_24.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv16qi_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv16qi_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv2di_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv2x16qi_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv2x8qi_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv3x16qi_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv3x8qi_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv4x16qi_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv4x8qi_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv8qi_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/movv8qi_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect_unary_2.c

Message ID	mptbkrj65hr.fsf@arm.com
State	New, archived
From	Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org>
Reply-To	Richard Sandiford <richard.sandiford@arm.com>
To	gcc-patches@gcc.gnu.org
Date	Tue, 13 Sep 2022 09:30:40 +0100
Series	[pushed] aarch64: Vector move fixes for +nosimd
