From patchwork Fri Sep 22 10:56:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 143351
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling, Hongtao Liu
Subject: [PATCH 01/13] [APX EGPR] middle-end: Add insn argument to base_reg_class
Date: Fri, 22 Sep 2023 18:56:19 +0800
Message-Id: <20230922105631.2298849-2-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com>
References: <20230922105631.2298849-1-hongyu.wang@intel.com>

From: Kong Lingling

The current reload infrastructure cannot select a base register class
per instruction. Add an insn argument to base_reg_class, together with
new target macros that take the insn, for use by LRA and reload.

gcc/ChangeLog:

	* addresses.h (base_reg_class): Add insn argument and new macro
	INSN_BASE_REG_CLASS.
	(ok_for_base_p_1): Add insn argument and new macro
	REGNO_OK_FOR_INSN_BASE_P.
	(regno_ok_for_base_p): Add insn argument and pass it to
	ok_for_base_p_1.
	* doc/tm.texi: Document INSN_BASE_REG_CLASS and
	REGNO_OK_FOR_INSN_BASE_P.
	* doc/tm.texi.in: Ditto.
	* lra-constraints.cc (process_address_1): Pass insn to
	base_reg_class.
	(curr_insn_transform): Ditto.
	* reload.cc (find_reloads): Ditto.
	(find_reloads_address): Ditto.
	(find_reloads_address_1): Ditto.
	(find_reloads_subreg_address): Ditto.
	* reload1.cc (maybe_fix_stack_asms): Ditto.
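
A note for readers, not part of the patch: the fallback order that
base_reg_class gains here can be modelled outside GCC. The sketch below
is a stand-alone toy, assuming a hypothetical toy_insn type and a
hypothetical target definition of INSN_BASE_REG_CLASS. Only the macro
names INSN_BASE_REG_CLASS and BASE_REG_CLASS come from the patch.

```cpp
// Stand-alone model of the dispatch added to base_reg_class (); NOT GCC code.
// Everything except the macro names INSN_BASE_REG_CLASS and BASE_REG_CLASS
// is hypothetical and exists only for illustration.
#include <cassert>

enum reg_class { NO_REGS, LEGACY_REGS, GENERAL_REGS };

// Hypothetical stand-in for rtx_insn: records only whether the
// instruction's encoding can address the extended registers.
struct toy_insn { bool allows_extended_regs; };

// What a target without per-insn restrictions would already define.
#define BASE_REG_CLASS GENERAL_REGS

// Hypothetical target definition.  A null insn means "no context",
// so it falls back to the unrestricted class.
#define INSN_BASE_REG_CLASS(INSN) \
  ((INSN) != nullptr && !(INSN)->allows_extended_regs \
   ? LEGACY_REGS : BASE_REG_CLASS)

// Mirrors the shape of the patched wrapper: the per-insn macro wins
// when the target defines it, otherwise the old macro is used.
inline reg_class
toy_base_reg_class (const toy_insn *insn = nullptr)
{
#ifdef INSN_BASE_REG_CLASS
  return INSN_BASE_REG_CLASS (insn);
#else
  return BASE_REG_CLASS;
#endif
}
```

Note that when INSN_BASE_REG_CLASS is defined, the patched wrapper
returns before looking at mode, as, outer_code or index_code, so a
target defining it presumably has to fold any mode- or code-dependent
choice into the macro itself.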
Co-authored-by: Hongyu Wang
Co-authored-by: Hongtao Liu
---
 gcc/addresses.h        | 19 +++++++++++++++----
 gcc/doc/tm.texi        | 14 ++++++++++++++
 gcc/doc/tm.texi.in     | 14 ++++++++++++++
 gcc/lra-constraints.cc | 15 +++++++++------
 gcc/reload.cc          | 30 ++++++++++++++++++------------
 gcc/reload1.cc         |  2 +-
 6 files changed, 71 insertions(+), 23 deletions(-)

diff --git a/gcc/addresses.h b/gcc/addresses.h
index 3519c241c6d..2c92927bd51 100644
--- a/gcc/addresses.h
+++ b/gcc/addresses.h
@@ -28,8 +28,12 @@ inline enum reg_class
 base_reg_class (machine_mode mode ATTRIBUTE_UNUSED,
                 addr_space_t as ATTRIBUTE_UNUSED,
                 enum rtx_code outer_code ATTRIBUTE_UNUSED,
-                enum rtx_code index_code ATTRIBUTE_UNUSED)
+                enum rtx_code index_code ATTRIBUTE_UNUSED,
+                rtx_insn *insn ATTRIBUTE_UNUSED = NULL)
 {
+#ifdef INSN_BASE_REG_CLASS
+  return INSN_BASE_REG_CLASS (insn);
+#else
 #ifdef MODE_CODE_BASE_REG_CLASS
   return MODE_CODE_BASE_REG_CLASS (MACRO_MODE (mode), as, outer_code,
                                    index_code);
@@ -44,6 +48,7 @@ base_reg_class (machine_mode mode ATTRIBUTE_UNUSED,
   return BASE_REG_CLASS;
 #endif
 #endif
+#endif
 }
 
 /* Wrapper function to unify target macros REGNO_MODE_CODE_OK_FOR_BASE_P,
@@ -56,8 +61,12 @@ ok_for_base_p_1 (unsigned regno ATTRIBUTE_UNUSED,
                  machine_mode mode ATTRIBUTE_UNUSED,
                  addr_space_t as ATTRIBUTE_UNUSED,
                  enum rtx_code outer_code ATTRIBUTE_UNUSED,
-                 enum rtx_code index_code ATTRIBUTE_UNUSED)
+                 enum rtx_code index_code ATTRIBUTE_UNUSED,
+                 rtx_insn* insn ATTRIBUTE_UNUSED = NULL)
 {
+#ifdef REGNO_OK_FOR_INSN_BASE_P
+  return REGNO_OK_FOR_INSN_BASE_P (regno, insn);
+#else
 #ifdef REGNO_MODE_CODE_OK_FOR_BASE_P
   return REGNO_MODE_CODE_OK_FOR_BASE_P (regno, MACRO_MODE (mode), as,
                                         outer_code, index_code);
@@ -72,6 +81,7 @@ ok_for_base_p_1 (unsigned regno ATTRIBUTE_UNUSED,
   return REGNO_OK_FOR_BASE_P (regno);
 #endif
 #endif
+#endif
 }
 
 /* Wrapper around ok_for_base_p_1, for use after register allocation is
@@ -79,12 +89,13 @@ ok_for_base_p_1 (unsigned regno ATTRIBUTE_UNUSED,
 
 inline bool
 regno_ok_for_base_p (unsigned regno, machine_mode mode, addr_space_t as,
-                     enum rtx_code outer_code, enum rtx_code index_code)
+                     enum rtx_code outer_code, enum rtx_code index_code,
+                     rtx_insn *insn = NULL)
 {
   if (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0)
     regno = reg_renumber[regno];
 
-  return ok_for_base_p_1 (regno, mode, as, outer_code, index_code);
+  return ok_for_base_p_1 (regno, mode, as, outer_code, index_code, insn);
 }
 
 #endif /* GCC_ADDRESSES_H */
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index b0779724d30..5b1e2a11f89 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2568,6 +2568,13 @@ of an address, @code{ADDRESS} for something that occurs in an
 index expression if @var{outer_code} is @code{PLUS}; @code{SCRATCH} otherwise.
 @end defmac
 
+@defmac INSN_BASE_REG_CLASS (@var{insn})
+A C expression whose value is the register class to which a valid
+base register for a specified @var{insn} must belong. This macro is
+used when some backend insns may have limited usage of base register
+compared with other insns.
+@end defmac
+
 @defmac INDEX_REG_CLASS
 A macro whose definition is the name of the class to which a valid
 index register must belong. An index register is one used in an
@@ -2618,6 +2625,13 @@ corresponding index expression if @var{outer_code} is @code{PLUS};
 that appear outside a @code{MEM}, i.e., as an @code{address_operand}.
 @end defmac
 
+@defmac REGNO_OK_FOR_INSN_BASE_P (@var{num}, @var{insn})
+A C expression which is nonzero if register number @var{num} is
+suitable for use as a base register in operand addresses for a specified
+@var{insn}. This macro is used when some backend insn may have limited
+usage of base register compared with other insns.
+@end defmac
+
 @defmac REGNO_OK_FOR_INDEX_P (@var{num})
 A C expression which is nonzero if register number @var{num} is
 suitable for use as an index register in operand addresses. It may be
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index d3e18955628..f6e63ad8871 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2150,6 +2150,13 @@ of an address, @code{ADDRESS} for something that occurs in an
 index expression if @var{outer_code} is @code{PLUS}; @code{SCRATCH} otherwise.
 @end defmac
 
+@defmac INSN_BASE_REG_CLASS (@var{insn})
+A C expression whose value is the register class to which a valid
+base register for a specified @var{insn} must belong. This macro is
+used when some backend insns may have limited usage of base register
+compared with other insns.
+@end defmac
+
 @defmac INDEX_REG_CLASS
 A macro whose definition is the name of the class to which a valid
 index register must belong. An index register is one used in an
@@ -2200,6 +2207,13 @@ corresponding index expression if @var{outer_code} is @code{PLUS};
 that appear outside a @code{MEM}, i.e., as an @code{address_operand}.
 @end defmac
 
+@defmac REGNO_OK_FOR_INSN_BASE_P (@var{num}, @var{insn})
+A C expression which is nonzero if register number @var{num} is
+suitable for use as a base register in operand addresses for a specified
+@var{insn}. This macro is used when some backend insn may have limited
+usage of base register compared with other insns.
+@end defmac
+
 @defmac REGNO_OK_FOR_INDEX_P (@var{num})
 A C expression which is nonzero if register number @var{num} is
 suitable for use as an index register in operand addresses. It may be
diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 3aaa4906999..6dc77af86cd 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -3681,7 +3681,7 @@ process_address_1 (int nop, bool check_only_p,
                                      REGNO (*ad.base_term)) != NULL_RTX)
             ? after : NULL),
            base_reg_class (ad.mode, ad.as, ad.base_outer_code,
-                           get_index_code (&ad)))))
+                           get_index_code (&ad), curr_insn))))
     {
       change_p = true;
       if (ad.base_term2 != NULL)
@@ -3731,7 +3731,8 @@ process_address_1 (int nop, bool check_only_p,
           rtx_insn *last = get_last_insn ();
           int code = -1;
           enum reg_class cl = base_reg_class (ad.mode, ad.as,
-                                              SCRATCH, SCRATCH);
+                                              SCRATCH, SCRATCH,
+                                              curr_insn);
           rtx addr = *ad.inner;
 
           new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "addr");
@@ -3794,7 +3795,8 @@ process_address_1 (int nop, bool check_only_p,
           /* index * scale + disp => new base + index * scale,
              case (1) above. */
           enum reg_class cl = base_reg_class (ad.mode, ad.as, PLUS,
-                                              GET_CODE (*ad.index));
+                                              GET_CODE (*ad.index),
+                                              curr_insn);
 
           lra_assert (INDEX_REG_CLASS != NO_REGS);
           new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "disp");
@@ -3855,7 +3857,7 @@ process_address_1 (int nop, bool check_only_p,
               *ad.base_term = XEXP (SET_SRC (set), 0);
               *ad.disp_term = XEXP (SET_SRC (set), 1);
               cl = base_reg_class (ad.mode, ad.as, ad.base_outer_code,
-                                   get_index_code (&ad));
+                                   get_index_code (&ad), curr_insn);
               regno = REGNO (*ad.base_term);
               if (regno >= FIRST_PSEUDO_REGISTER
                   && cl != lra_get_allocno_class (regno))
@@ -3899,7 +3901,8 @@ process_address_1 (int nop, bool check_only_p,
   else
     {
       enum reg_class cl = base_reg_class (ad.mode, ad.as,
-                                          SCRATCH, SCRATCH);
+                                          SCRATCH, SCRATCH,
+                                          curr_insn);
       rtx addr = *ad.inner;
 
       new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "addr");
@@ -4649,7 +4652,7 @@ curr_insn_transform (bool check_only_p)
 
           push_to_sequence (before);
           rclass = base_reg_class (GET_MODE (op), MEM_ADDR_SPACE (op),
-                                   MEM, SCRATCH);
+                                   MEM, SCRATCH, curr_insn);
           if (GET_RTX_CLASS (code) == RTX_AUTOINC)
             new_reg = emit_inc (rclass, *loc, *loc,
                                 /* This value does not matter for MODIFY. */
diff --git a/gcc/reload.cc b/gcc/reload.cc
index 2126bdd117c..72f7e27af15 100644
--- a/gcc/reload.cc
+++ b/gcc/reload.cc
@@ -3321,7 +3321,7 @@ find_reloads (rtx_insn *insn, int replace, int ind_levels, int live_known,
                        were handled in find_reloads_address. */
                     this_alternative[i]
                       = base_reg_class (VOIDmode, ADDR_SPACE_GENERIC,
-                                        ADDRESS, SCRATCH);
+                                        ADDRESS, SCRATCH, insn);
                     win = 1;
                     badop = 0;
                     break;
@@ -3508,7 +3508,7 @@ find_reloads (rtx_insn *insn, int replace, int ind_levels, int live_known,
                        the address into a base register. */
                     this_alternative[i]
                       = base_reg_class (VOIDmode, ADDR_SPACE_GENERIC,
-                                        ADDRESS, SCRATCH);
+                                        ADDRESS, SCRATCH, insn);
                     badop = 0;
                     break;
 
@@ -4018,7 +4018,7 @@ find_reloads (rtx_insn *insn, int replace, int ind_levels, int live_known,
             operand_reloadnum[i]
               = push_reload (XEXP (recog_data.operand[i], 0), NULL_RTX,
                              &XEXP (recog_data.operand[i], 0), (rtx*) 0,
-                             base_reg_class (VOIDmode, as, MEM, SCRATCH),
+                             base_reg_class (VOIDmode, as, MEM, SCRATCH, insn),
                              address_mode,
                              VOIDmode, 0, 0, i, RELOAD_OTHER);
             rld[operand_reloadnum[i]].inc
@@ -4897,7 +4897,8 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad,
       if (reg_equiv_constant (regno) != 0)
         {
           find_reloads_address_part (reg_equiv_constant (regno), loc,
-                                     base_reg_class (mode, as, MEM, SCRATCH),
+                                     base_reg_class (mode, as, MEM,
+                                                     SCRATCH, insn),
                                      GET_MODE (ad), opnum, type, ind_levels);
           return 1;
         }
@@ -4966,7 +4967,7 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad,
 
       /* If we do not have one of the cases above, we must do the reload. */
       push_reload (ad, NULL_RTX, loc, (rtx*) 0,
-                   base_reg_class (mode, as, MEM, SCRATCH),
+                   base_reg_class (mode, as, MEM, SCRATCH, insn),
                    GET_MODE (ad), VOIDmode, 0, 0, opnum, type);
       return 1;
     }
@@ -5123,7 +5124,8 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad,
              reload the sum into a base reg.
              That will at least work. */
           find_reloads_address_part (ad, loc,
-                                     base_reg_class (mode, as, MEM, SCRATCH),
+                                     base_reg_class (mode, as, MEM,
+                                                     SCRATCH, insn),
                                      GET_MODE (ad), opnum, type, ind_levels);
         }
       return ! removed_and;
@@ -5203,7 +5205,7 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad,
                                  op_index == 0 ? addend : offset_reg);
           *loc = ad;
 
-          cls = base_reg_class (mode, as, MEM, GET_CODE (addend));
+          cls = base_reg_class (mode, as, MEM, GET_CODE (addend), insn);
           find_reloads_address_part (XEXP (ad, op_index),
                                      &XEXP (ad, op_index), cls,
                                      GET_MODE (ad), opnum, type, ind_levels);
@@ -5261,7 +5263,8 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad,
     }
 
   find_reloads_address_part (ad, loc,
-                             base_reg_class (mode, as, MEM, SCRATCH),
+                             base_reg_class (mode, as, MEM,
+                                             SCRATCH, insn),
                              address_mode, opnum, type, ind_levels);
   return ! removed_and;
 }
@@ -5513,7 +5516,8 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as,
   if (context == 1)
     context_reg_class = INDEX_REG_CLASS;
   else
-    context_reg_class = base_reg_class (mode, as, outer_code, index_code);
+    context_reg_class = base_reg_class (mode, as, outer_code, index_code,
+                                        insn);
 
   switch (code)
     {
@@ -5738,7 +5742,8 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as,
                   reloadnum = push_reload (tem, tem, &XEXP (x, 0),
                                            &XEXP (op1, 0),
                                            base_reg_class (mode, as,
-                                                           code, index_code),
+                                                           code, index_code,
+                                                           insn),
                                            GET_MODE (x), GET_MODE (x), 0,
                                            0, opnum, RELOAD_OTHER);
 
@@ -5756,7 +5761,8 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as,
                 reloadnum = push_reload (XEXP (op1, 0), XEXP (x, 0),
                                          &XEXP (op1, 0), &XEXP (x, 0),
                                          base_reg_class (mode, as,
-                                                         code, index_code),
+                                                         code, index_code,
+                                                         insn),
                                          GET_MODE (x), GET_MODE (x), 0, 0,
                                          opnum, RELOAD_OTHER);
 
@@ -6216,7 +6222,7 @@ find_reloads_subreg_address (rtx x, int opnum, enum reload_type type,
     {
       push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0,
                    base_reg_class (GET_MODE (tem), MEM_ADDR_SPACE (tem),
-                                   MEM, SCRATCH),
+                                   MEM, SCRATCH, insn),
                    GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0, opnum, type);
       reloaded = 1;
     }
diff --git a/gcc/reload1.cc b/gcc/reload1.cc
index 9ba822d1ff7..f41f4a4de22 100644
--- a/gcc/reload1.cc
+++ b/gcc/reload1.cc
@@ -1382,7 +1382,7 @@ maybe_fix_stack_asms (void)
               if (insn_extra_address_constraint (cn))
                 cls = (int) reg_class_subunion[cls]
                   [(int) base_reg_class (VOIDmode, ADDR_SPACE_GENERIC,
-                                         ADDRESS, SCRATCH)];
+                                         ADDRESS, SCRATCH, chain->insn)];
               else
                 cls = (int) reg_class_subunion[cls]
                   [reg_class_for_constraint (cn)];

From patchwork Fri Sep 22 10:56:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 143347
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling, Hongtao Liu
Subject: [PATCH 02/13] [APX EGPR] middle-end: Add index_reg_class with insn argument.
Date: Fri, 22 Sep 2023 18:56:20 +0800
Message-Id: <20230922105631.2298849-3-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com>
References: <20230922105631.2298849-1-hongyu.wang@intel.com>

As with base_reg_class, INDEX_REG_CLASS cannot be selected per
instruction. Add an index_reg_class wrapper that takes an insn
argument, for use by LRA and reload.

gcc/ChangeLog:

	* addresses.h (index_reg_class): New wrapper function like
	base_reg_class.
	* doc/tm.texi: Document INSN_INDEX_REG_CLASS.
	* doc/tm.texi.in: Ditto.
	* lra-constraints.cc (index_part_to_reg): Pass index_class.
	(process_address_1): Call index_reg_class with curr_insn and
	replace INDEX_REG_CLASS with its return value, index_cl.
	* reload.cc (find_reloads_address): Likewise.
	(find_reloads_address_1): Likewise.
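
A note for readers, not part of the patch: the swap check in
process_address_1, which this patch switches from INDEX_REG_CLASS to
the per-insn index_cl, can be modelled stand-alone. The sketch below
assumes hypothetical toy_reg and toy_address types and approximates
in_class_p by a plain class comparison; it is not GCC code.

```cpp
// Stand-alone model of the base/index swap check; NOT GCC code.
// toy_reg, toy_address, class_hard_regs_num and
// maybe_swap_base_and_index are hypothetical names for illustration.
#include <cassert>
#include <utility>

enum reg_class { NO_REGS, INDEX_REGS, GENERAL_REGS, N_CLASSES };

// Number of hard registers in each class (a stand-in for
// ira_class_hard_regs_num): INDEX_REGS holds exactly one register.
static const int class_hard_regs_num[N_CLASSES] = { 0, 1, 16 };

// Class currently assigned to a register term.
struct toy_reg { reg_class cl; };

struct toy_address { toy_reg *base_term; toy_reg *index_term; };

// Swap base and index when the index class is a single-register class
// that the base term already occupies and the index term does not.
// Returns true if a swap happened.
inline bool
maybe_swap_base_and_index (toy_address &ad, reg_class index_cl)
{
  if (ad.base_term != nullptr
      && ad.index_term != nullptr
      && class_hard_regs_num[index_cl] == 1
      && ad.base_term->cl == index_cl
      && ad.index_term->cl != index_cl)
    {
      std::swap (ad.base_term, ad.index_term);
      return true;
    }
  return false;
}
```

Because index_cl is a parameter here, the same check gives different
answers for different index classes, which is the point of computing
it from curr_insn rather than using the fixed INDEX_REG_CLASS.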
Co-authored-by: Kong Lingling
Co-authored-by: Hongtao Liu
---
 gcc/addresses.h        | 10 ++++++++++
 gcc/doc/tm.texi        |  7 +++++++
 gcc/doc/tm.texi.in     |  7 +++++++
 gcc/lra-constraints.cc | 17 +++++++++--------
 gcc/reload.cc          |  4 ++--
 5 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/gcc/addresses.h b/gcc/addresses.h
index 2c92927bd51..08bf39cd56c 100644
--- a/gcc/addresses.h
+++ b/gcc/addresses.h
@@ -51,6 +51,16 @@ base_reg_class (machine_mode mode ATTRIBUTE_UNUSED,
 #endif
 }
 
+inline enum reg_class
+index_reg_class (rtx_insn *insn ATTRIBUTE_UNUSED = NULL)
+{
+#ifdef INSN_INDEX_REG_CLASS
+  return INSN_INDEX_REG_CLASS (insn);
+#else
+  return INDEX_REG_CLASS;
+#endif
+}
+
 /* Wrapper function to unify target macros REGNO_MODE_CODE_OK_FOR_BASE_P,
    REGNO_MODE_OK_FOR_REG_BASE_P, REGNO_MODE_OK_FOR_BASE_P and
    REGNO_OK_FOR_BASE_P.
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 5b1e2a11f89..c566f7a1105 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2582,6 +2582,13 @@ address where its value is either multiplied by a scale factor or
 added to another register (as well as added to a displacement).
 @end defmac
 
+@defmac INSN_INDEX_REG_CLASS (@var{insn})
+A C expression whose value is the register class to which a valid
+index register for a specified @var{insn} must belong. This macro is
+used when some backend insns may have limited usage of index register
+compared with other insns.
+@end defmac
+
 @defmac REGNO_OK_FOR_BASE_P (@var{num})
 A C expression which is nonzero if register number @var{num} is
 suitable for use as a base register in operand addresses.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f6e63ad8871..3182d0d7c75 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2164,6 +2164,13 @@ address where its value is either multiplied by a scale factor or
 added to another register (as well as added to a displacement).
 @end defmac
 
+@defmac INSN_INDEX_REG_CLASS (@var{insn})
+A C expression whose value is the register class to which a valid
+index register for a specified @var{insn} must belong. This macro is
+used when some backend insns may have limited usage of index register
+compared with other insns.
+@end defmac
+
 @defmac REGNO_OK_FOR_BASE_P (@var{num})
 A C expression which is nonzero if register number @var{num} is
 suitable for use as a base register in operand addresses.
diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 6dc77af86cd..0c8e28e0194 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -3399,12 +3399,12 @@ base_plus_disp_to_reg (struct address_info *ad, rtx disp)
 
 /* Make reload of index part of address AD. Return the new pseudo. */
 static rtx
-index_part_to_reg (struct address_info *ad)
+index_part_to_reg (struct address_info *ad, enum reg_class index_class)
 {
   rtx new_reg;
 
   new_reg = lra_create_new_reg (GET_MODE (*ad->index), NULL_RTX,
-                                INDEX_REG_CLASS, NULL, "index term");
+                                index_class, NULL, "index term");
   expand_mult (GET_MODE (*ad->index), *ad->index_term,
                GEN_INT (get_index_scale (ad)), new_reg, 1);
   return new_reg;
@@ -3659,13 +3659,14 @@ process_address_1 (int nop, bool check_only_p,
   /* If INDEX_REG_CLASS is assigned to base_term already and isn't to
      index_term, swap them so to avoid assigning INDEX_REG_CLASS to both
      when INDEX_REG_CLASS is a single register class. */
+  enum reg_class index_cl = index_reg_class (curr_insn);
   if (ad.base_term != NULL
       && ad.index_term != NULL
-      && ira_class_hard_regs_num[INDEX_REG_CLASS] == 1
+      && ira_class_hard_regs_num[index_cl] == 1
       && REG_P (*ad.base_term)
       && REG_P (*ad.index_term)
-      && in_class_p (*ad.base_term, INDEX_REG_CLASS, NULL)
-      && ! in_class_p (*ad.index_term, INDEX_REG_CLASS, NULL))
+      && in_class_p (*ad.base_term, index_cl, NULL)
+      && ! in_class_p (*ad.index_term, index_cl, NULL))
     {
       std::swap (ad.base, ad.index);
       std::swap (ad.base_term, ad.index_term);
@@ -3689,7 +3690,7 @@ process_address_1 (int nop, bool check_only_p,
     }
   if (ad.index_term != NULL
       && process_addr_reg (ad.index_term, check_only_p,
-                           before, NULL, INDEX_REG_CLASS))
+                           before, NULL, index_cl))
     change_p = true;
 
   /* Target hooks sometimes don't treat extra-constraint addresses as
@@ -3798,7 +3799,7 @@ process_address_1 (int nop, bool check_only_p,
                                               GET_CODE (*ad.index),
                                               curr_insn);
 
-          lra_assert (INDEX_REG_CLASS != NO_REGS);
+          lra_assert (index_cl != NO_REGS);
           new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "disp");
           lra_emit_move (new_reg, *ad.disp);
           *ad.inner = simplify_gen_binary (PLUS, GET_MODE (new_reg),
@@ -3894,7 +3895,7 @@ process_address_1 (int nop, bool check_only_p,
          changed pseudo on the equivalent memory and a subreg of the
          pseudo onto the memory of different mode for which the scale is
          prohibitted. */
-      new_reg = index_part_to_reg (&ad);
+      new_reg = index_part_to_reg (&ad, index_cl);
       *ad.inner = simplify_gen_binary (PLUS, GET_MODE (new_reg),
                                        *ad.base_term, new_reg);
     }
diff --git a/gcc/reload.cc b/gcc/reload.cc
index 72f7e27af15..66b484b12fa 100644
--- a/gcc/reload.cc
+++ b/gcc/reload.cc
@@ -5114,7 +5114,7 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad,
           /* Reload the displacement into an index reg.
              We assume the frame pointer or arg pointer is a base reg. */
           find_reloads_address_part (XEXP (ad, 1), &XEXP (ad, 1),
-                                     INDEX_REG_CLASS, GET_MODE (ad), opnum,
+                                     index_reg_class (insn), GET_MODE (ad), opnum,
                                      type, ind_levels);
           return 0;
         }
@@ -5514,7 +5514,7 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as,
   bool reloaded_inner_of_autoinc = false;
 
   if (context == 1)
-    context_reg_class = INDEX_REG_CLASS;
+    context_reg_class = index_reg_class (insn);
   else
     context_reg_class = base_reg_class (mode, as, outer_code, index_code,
                                         insn);

From patchwork Fri Sep 22 10:56:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 143350
b=iysGWKpTUACGG2hG0cmzegUpOZrv+As1tY2BtEj8LamHzUwSZRgik/tJ8jEZtm2sTv J34KbMV2tzD5mHsAsiaCQGid/EVIduOEVjDnieANpsYSEtbyh9Q1y09VjKcIaXpAAKxN klzueBpVtuAnfbJ6xC2G/x7mM50yRu7F7WRYulIf/B3NSpjbgqcLIZsqSym0CdmxYIJy jgYdR6Z2T4d0YDrHBQ5icf963/JM0pSrkanjKPPqDslA+1kN2hBBSO+WwHpr1HwzjF86 Xs27HT8OUzFW6mUXt9C5hV1pa/zxxChn1hHn5+HoWHBtF6cvWUWrDCaBLW+W1vl0U1jY FNCQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SI8gU9z2; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id lj4-20020a170906f9c400b0099329b35b84si3067571ejb.425.2023.09.22.04.00.04 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 04:00:04 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SI8gU9z2; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 40CA63882674 for ; Fri, 22 Sep 2023 10:57:39 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 1E60B3858CD1 for ; Fri, 22 Sep 
2023 10:56:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1E60B3858CD1 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695380199; x=1726916199; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YpxYOcx5a0AdJs6n9ES2iujMKK2dF5klzguhrQFWTt0=; b=SI8gU9z26b15CLMWNcKyMmE0uLip7arhlcH0PQ5+rXqX0ufxEw5gHSnY 06L5j/VVs4JljewML2+BTyfc7p5h9eUjgP0lOHS1Codv4y1pB1iCgvtxX utnG7IDX+srDqQJoSrNWWs1IiSjQLWza+x3X2Go5/7F/1Feeh1aaiZjjX D64xrPQ4ox/2pNkEG3ypbG/Tfw9c86VsfNVXDerp5a+SlN//OwQHv+lL1 tC8fQ+J/pBXSmGItgXuC52pXuuPwcfq0psllE1ozM/N5UKAvOK0MzMlXG QzU/ku8sQDZFev6KJH31PCLR1oEFtMCKda5mfnWBajGOFn7w38PUcsfqJ g==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680786" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680786" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615885" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615885" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:32 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id A638C1005138; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 03/13] [APX_EGPR] Initial support for APX_F Date: Fri, 22 Sep 2023 18:56:21 +0800 Message-Id: <20230922105631.2298849-4-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: 
<20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735202976905291 X-GMAIL-MSGID: 1777735202976905291 From: Kong Lingling Add the -mapx-features= enumeration option to select individual subfeatures of APX_F. -mapxf is handled like the existing ISA flags and additionally sets -mapx-features=apx_all, which enables all subfeatures. gcc/ChangeLog: * common/config/i386/cpuinfo.h (XSTATE_APX_F): New macro. (XCR_APX_F_ENABLED_MASK): Likewise. (get_available_features): Detect APX_F under * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_APX_F_SET): New. (OPTION_MASK_ISA2_APX_F_UNSET): Likewise. (ix86_handle_option): Handle -mapxf. * common/config/i386/i386-cpuinfo.h (FEATURE_APX_F): New. * common/config/i386/i386-isas.h: Add entry for APX_F. * config/i386/cpuid.h (bit_APX_F): New. * config/i386/i386.h (bit_APX_F): (TARGET_APX_EGPR, TARGET_APX_PUSH2POP2, TARGET_APX_NDD): New define. * config/i386/i386-opts.h (enum apx_features): New enum. * config/i386/i386-isa.def (APX_F): New DEF_PTA. * config/i386/i386-options.cc (ix86_function_specific_save): Save ix86_apx_features. (ix86_function_specific_restore): Restore it. (ix86_valid_target_attribute_inner_p): Add mapxf.
(ix86_option_override_internal): Set ix86_apx_features for PTA and TARGET_APX_F. Also report an error when APX_F is set without TARGET_64BIT. * config/i386/i386.opt: (-mapxf): New ISA flag option. (-mapx-features=): New enumeration option. (apx_features): New enum type. (apx_none): New enum value. (apx_egpr): Likewise. (apx_push2pop2): Likewise. (apx_ndd): Likewise. (apx_all): Likewise. * doc/invoke.texi: Document -mapxf. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-1.c: New test. Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/common/config/i386/cpuinfo.h | 12 +++++++++++- gcc/common/config/i386/i386-common.cc | 17 +++++++++++++++++ gcc/common/config/i386/i386-cpuinfo.h | 1 + gcc/common/config/i386/i386-isas.h | 1 + gcc/config/i386/cpuid.h | 1 + gcc/config/i386/i386-isa.def | 1 + gcc/config/i386/i386-options.cc | 18 ++++++++++++++++++ gcc/config/i386/i386-opts.h | 8 ++++++++ gcc/config/i386/i386.h | 4 ++++ gcc/config/i386/i386.opt | 25 +++++++++++++++++++++++++ gcc/doc/invoke.texi | 11 +++++++---- gcc/testsuite/gcc.target/i386/apx-1.c | 8 ++++++++ 12 files changed, 102 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-1.c diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h index 24ae0dbf0ac..141d3743316 100644 --- a/gcc/common/config/i386/cpuinfo.h +++ b/gcc/common/config/i386/cpuinfo.h @@ -678,6 +678,7 @@ get_available_features (struct __processor_model *cpu_model, #define XSTATE_HI_ZMM 0x80 #define XSTATE_TILECFG 0x20000 #define XSTATE_TILEDATA 0x40000 +#define XSTATE_APX_F 0x80000 #define XCR_AVX_ENABLED_MASK \ (XSTATE_SSE | XSTATE_YMM) @@ -685,11 +686,13 @@ get_available_features (struct __processor_model *cpu_model, (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM) #define XCR_AMX_ENABLED_MASK \ (XSTATE_TILECFG | XSTATE_TILEDATA) +#define XCR_APX_F_ENABLED_MASK XSTATE_APX_F - /* Check if AVX and AVX512 are usable. */ + /* Check if AVX, AVX512 and APX are usable.
*/ int avx_usable = 0; int avx512_usable = 0; int amx_usable = 0; + int apx_usable = 0; /* Check if KL is usable. */ int has_kl = 0; if ((ecx & bit_OSXSAVE)) @@ -709,6 +712,8 @@ get_available_features (struct __processor_model *cpu_model, } amx_usable = ((xcrlow & XCR_AMX_ENABLED_MASK) == XCR_AMX_ENABLED_MASK); + apx_usable = ((xcrlow & XCR_APX_F_ENABLED_MASK) + == XCR_APX_F_ENABLED_MASK); } #define set_feature(f) \ @@ -922,6 +927,11 @@ get_available_features (struct __processor_model *cpu_model, if (edx & bit_AMX_COMPLEX) set_feature (FEATURE_AMX_COMPLEX); } + if (apx_usable) + { + if (edx & bit_APX_F) + set_feature (FEATURE_APX_F); + } } } diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc index 95468b7c405..86596e96ad1 100644 --- a/gcc/common/config/i386/i386-common.cc +++ b/gcc/common/config/i386/i386-common.cc @@ -123,6 +123,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_SM3_SET OPTION_MASK_ISA2_SM3 #define OPTION_MASK_ISA2_SHA512_SET OPTION_MASK_ISA2_SHA512 #define OPTION_MASK_ISA2_SM4_SET OPTION_MASK_ISA2_SM4 +#define OPTION_MASK_ISA2_APX_F_SET OPTION_MASK_ISA2_APX_F /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same as -msse4.2. */ @@ -309,6 +310,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_SM3_UNSET OPTION_MASK_ISA2_SM3 #define OPTION_MASK_ISA2_SHA512_UNSET OPTION_MASK_ISA2_SHA512 #define OPTION_MASK_ISA2_SM4_UNSET OPTION_MASK_ISA2_SM4 +#define OPTION_MASK_ISA2_APX_F_UNSET OPTION_MASK_ISA2_APX_F /* SSE4 includes both SSE4.1 and SSE4.2. -mno-sse4 should the same as -mno-sse4.1. 
*/ @@ -1341,6 +1343,21 @@ ix86_handle_option (struct gcc_options *opts, } return true; + case OPT_mapxf: + if (value) + { + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_APX_F_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_APX_F_SET; + opts->x_ix86_apx_features = apx_all; + } + else + { + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_APX_F_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_APX_F_UNSET; + opts->x_ix86_apx_features = apx_none; + } + return true; + case OPT_mfma: if (value) { diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h index 9153b4d0a54..8bf592191ab 100644 --- a/gcc/common/config/i386/i386-cpuinfo.h +++ b/gcc/common/config/i386/i386-cpuinfo.h @@ -261,6 +261,7 @@ enum processor_features FEATURE_SM3, FEATURE_SHA512, FEATURE_SM4, + FEATURE_APX_F, CPU_FEATURE_MAX }; diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h index 2297903a45e..47e0cbd6f5b 100644 --- a/gcc/common/config/i386/i386-isas.h +++ b/gcc/common/config/i386/i386-isas.h @@ -191,4 +191,5 @@ ISA_NAMES_TABLE_START ISA_NAMES_TABLE_ENTRY("sm3", FEATURE_SM3, P_NONE, "-msm3") ISA_NAMES_TABLE_ENTRY("sha512", FEATURE_SHA512, P_NONE, "-msha512") ISA_NAMES_TABLE_ENTRY("sm4", FEATURE_SM4, P_NONE, "-msm4") + ISA_NAMES_TABLE_ENTRY("apxf", FEATURE_APX_F, P_NONE, "-mapxf") ISA_NAMES_TABLE_END diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h index 73c15480350..f3d3a2a1c22 100644 --- a/gcc/config/i386/cpuid.h +++ b/gcc/config/i386/cpuid.h @@ -149,6 +149,7 @@ #define bit_AVXNECONVERT (1 << 5) #define bit_AVXVNNIINT16 (1 << 10) #define bit_PREFETCHI (1 << 14) +#define bit_APX_F (1 << 21) /* Extended State Enumeration Sub-leaf (%eax == 0xd, %ecx == 1) */ #define bit_XSAVEOPT (1 << 0) diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def index aeafcf870ac..c581f343339 100644 --- a/gcc/config/i386/i386-isa.def +++ b/gcc/config/i386/i386-isa.def @@ -121,3 +121,4 @@ 
DEF_PTA(AVXVNNIINT16) DEF_PTA(SM3) DEF_PTA(SHA512) DEF_PTA(SM4) +DEF_PTA(APX_F) diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index e47f9ed5d5f..b9727d5f1f3 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -694,6 +694,7 @@ ix86_function_specific_save (struct cl_target_option *ptr, ptr->branch_cost = ix86_branch_cost; ptr->tune_defaulted = ix86_tune_defaulted; ptr->arch_specified = ix86_arch_specified; + ptr->x_ix86_apx_features = opts->x_ix86_apx_features; ptr->x_ix86_isa_flags_explicit = opts->x_ix86_isa_flags_explicit; ptr->x_ix86_isa_flags2_explicit = opts->x_ix86_isa_flags2_explicit; ptr->x_recip_mask_explicit = opts->x_recip_mask_explicit; @@ -832,6 +833,7 @@ ix86_function_specific_restore (struct gcc_options *opts, ix86_prefetch_sse = ptr->prefetch_sse; ix86_tune_defaulted = ptr->tune_defaulted; ix86_arch_specified = ptr->arch_specified; + opts->x_ix86_apx_features = ptr->x_ix86_apx_features; opts->x_ix86_isa_flags_explicit = ptr->x_ix86_isa_flags_explicit; opts->x_ix86_isa_flags2_explicit = ptr->x_ix86_isa_flags2_explicit; opts->x_recip_mask_explicit = ptr->x_recip_mask_explicit; @@ -1109,6 +1111,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], IX86_ATTR_ISA ("sm3", OPT_msm3), IX86_ATTR_ISA ("sha512", OPT_msha512), IX86_ATTR_ISA ("sm4", OPT_msm4), + IX86_ATTR_ISA ("apxf", OPT_mapxf), /* enum options */ IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_), @@ -2080,6 +2083,9 @@ ix86_option_override_internal (bool main_args_p, opts->x_ix86_stringop_alg = no_stringop; } + if (TARGET_APX_F && !TARGET_64BIT) + error ("%<-mapxf%> is not supported for 32-bit code"); + if (TARGET_UINTR && !TARGET_64BIT) error ("%<-muintr%> not supported for 32-bit code"); @@ -2293,6 +2299,14 @@ ix86_option_override_internal (bool main_args_p, SET_TARGET_POPCNT (opts); } + /* Enable apx if apxf or apx_features are not + explicitly set for -march. 
*/ + if (TARGET_64BIT_P (opts->x_ix86_isa_flags) + && ((processor_alias_table[i].flags & PTA_APX_F) != 0) + && !TARGET_EXPLICIT_APX_F_P (opts) + && !OPTION_SET_P (ix86_apx_features)) + opts->x_ix86_apx_features = apx_all; + if ((processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE)) != 0) ix86_prefetch_sse = true; @@ -2444,6 +2458,10 @@ ix86_option_override_internal (bool main_args_p, /* Arrange to set up i386_stack_locals for all functions. */ init_machine_status = ix86_init_machine_status; + /* Override APX flag here if ISA bit is set. */ + if (TARGET_APX_F && !OPTION_SET_P (ix86_apx_features)) + opts->x_ix86_apx_features = apx_all; + /* Validate -mregparm= value. */ if (opts_set->x_ix86_regparm) { diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h index be359f3e3d5..2ec76a16bce 100644 --- a/gcc/config/i386/i386-opts.h +++ b/gcc/config/i386/i386-opts.h @@ -134,4 +134,12 @@ enum lam_type { lam_u57 }; +enum apx_features { + apx_none = 0, + apx_egpr = 1 << 0, + apx_push2pop2 = 1 << 1, + apx_ndd = 1 << 2, + apx_all = apx_egpr | apx_push2pop2 | apx_ndd, +}; + #endif diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 3e8488f2ae8..8c7ed541a8f 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -51,6 +51,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2) +#define TARGET_APX_EGPR (ix86_apx_features & apx_egpr) +#define TARGET_APX_PUSH2POP2 (ix86_apx_features & apx_push2pop2) +#define TARGET_APX_NDD (ix86_apx_features & apx_ndd) + #include "config/vxworks-dummy.h" #include "config/i386/i386-opts.h" diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index 78b499304a4..d89b5bbc5e8 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1310,3 +1310,28 @@ Enable vectorization for gather instruction. mscatter Target Alias(mtune-ctrl=, use_scatter, ^use_scatter) Enable vectorization for scatter instruction. 
+ +mapxf +Target Mask(ISA2_APX_F) Var(ix86_isa_flags2) Save +Support APX code generation. + +mapx-features= +Target Undocumented Joined Enum(apx_features) EnumSet Var(ix86_apx_features) Init(apx_none) Save + +Enum +Name(apx_features) Type(int) + +EnumValue +Enum(apx_features) String(none) Value(apx_none) Set(1) + +EnumValue +Enum(apx_features) String(egpr) Value(apx_egpr) Set(2) + +EnumValue +Enum(apx_features) String(push2pop2) Value(apx_push2pop2) Set(3) + +EnumValue +Enum(apx_features) String(ndd) Value(apx_ndd) Set(4) + +EnumValue +Enum(apx_features) String(all) Value(apx_all) Set(1) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index ba7984bcb7e..afe6b321b14 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1443,7 +1443,7 @@ See RS/6000 and PowerPC Options. -mrdseed -msgx -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -muintr -mhreset -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 --mprefetchi -mraoint -mamx-complex -mavxvnniint16 -msm3 -msha512 -msm4 +-mprefetchi -mraoint -mamx-complex -mavxvnniint16 -msm3 -msha512 -msm4 -mapxf -mcldemote -mms-bitfields -mno-align-stringops -minline-all-stringops -minline-stringops-dynamically -mstringop-strategy=@var{alg} -mkl -mwidekl @@ -33802,6 +33802,9 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}. 
@need 200 @opindex msm4 @itemx -msm4 +@need 200 +@opindex mapxf +@itemx -mapxf These switches enable the use of instructions in the MMX, SSE, AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA, AES, PCLMUL, CLFLUSHOPT, CLWB, FSGSBASE, PTWRITE, RDRND, F16C, FMA, PCONFIG, @@ -33812,9 +33815,9 @@ GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16, ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE, UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512-FP16, AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16, PREFETCHI, RAOINT, -AMX-COMPLEX, AVXVNNIINT16, SM3, SHA512, SM4 or CLDEMOTE extended instruction -sets. Each has a corresponding @option{-mno-} option to disable use of these -instructions. +AMX-COMPLEX, AVXVNNIINT16, SM3, SHA512, SM4, APX_F or CLDEMOTE extended +instruction sets. Each has a corresponding @option{-mno-} option to disable +use of these instructions. These extensions are also available as built-in functions: see @ref{x86 Built-in Functions}, for details of the functions enabled and diff --git a/gcc/testsuite/gcc.target/i386/apx-1.c b/gcc/testsuite/gcc.target/i386/apx-1.c new file mode 100644 index 00000000000..4e580ecdf37 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-1.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mapxf" } */ +/* { dg-error "'-mapxf' is not supported for 32-bit code" "" { target ia32 } 0 } */ + +void +apx_hanlder () +{ +} From patchwork Fri Sep 22 10:56:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143355 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:a55:b0:13f:353d:d1ed with SMTP id 21csp4568364rwb; Fri, 22 Sep 2023 04:02:25 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGxVNT4ugx0rCIU/v/ufDT+dnsxFk37szD/4qg3MP1hThv0qKC7u8XmUHqZjvaILB4qYqOh X-Received: by 
2002:a17:906:9b8c:b0:9ad:8a96:ad55 with SMTP id dd12-20020a1709069b8c00b009ad8a96ad55mr3499385ejc.14.1695380544905; Fri, 22 Sep 2023 04:02:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695380544; cv=none; d=google.com; s=arc-20160816; b=YZ+FyoJl+VaD+1TIFraFCY4/UyaciTzxuqUf9zaFoZXu44fiiqYwx7suB2hZ8kligG AxF700Lf+nc7qKjeVQ1aMXFzHWQqdII9m1U4iVjEDiG90F8xewC89hfxt2ACWxk1R8r8 5DXIlnZZv8wIZtcC4x837XeDHR/YgzK/lByjusqNNMXL8pJEsDQp9H+pzLucRRfylbGT lsXgpyFZyCVM77ReNtnwcDy3Fh+SCMlV/+r0SLZN3lrT9ERqonmYUWkIOx66cw8kgufc l24SFyQji079KnPq0r5a/Nidfm9BKJ/Mvjbfb6bsNzEtD6/EdUs7164ezZQQzitmizcL 9kGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=yrbPeIA+2Q112ExODuTQtEszUyJZa3OwHvKD1dHSHVc=; fh=ohmInM8pkhFahNid/tIrxUIOBXWhoriwAcUbKKgAH4Q=; b=G3dFqurw4sTVk16cK5Jnv0DcNGmlVT8YrBtfK2VQcJxIdwvydPnQHzsB2CxlcFe04e z0SNLL42OoSMZcj8lSB0j5T6ehaaNlaus1RcCSpEg27w1cb+RS+GtfCtljneNpgnmJ4B ttAB3zUtqH38edwDOZzAA3zT1skMXKy9n5hL0PnptlETUl1qXb9U8oo0Zjlby1S0NLWE jom5jPOhqS85MAICZB9P3nT2EgeAqyuggI2sH345IoG8m7BVCuq+xjB0RYOYFjRRoMrP TfN7WtQodVr5Roj3vOM3hLA9iqwjK2PQPZvhG/pW79I6T6FgTwIG7vraG3r9t9uHwUl4 V2GQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=UEGQBC6p; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id au28-20020a170907093c00b009a633e2fae9si2920014ejc.127.2023.09.22.04.02.24 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 04:02:24 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=UEGQBC6p; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3B9F0388267A for ; Fri, 22 Sep 2023 10:58:17 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 12BF63858C53 for ; Fri, 22 Sep 2023 10:56:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 12BF63858C53 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695380200; x=1726916200; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5LrD9YlXx+5EWb3+Em7MWpKW3QyFGp1/msbrhlfHvYk=; b=UEGQBC6phqpwoQfbemFCGGQD6sESS4u5bAQVyW6AvGfsolx2BPOHTPh+ 6XEi1Kwh1eSKoSxuc23lr4CM6Rm8OXhkvg6IRC0FQX5q21qsj8Hzm9Bxg BOy/dASdt6lprgtI60sfkG6cfX0CjHm6pN+u9gvmnN+gvavko1Z1KizXN 08KvqjJABOR/m730C6QDj59weTtKH9pFAVvlIczSnXFdm0QIc+a5Y9q/g wYMr/qaZMF7hw3memtSp0sMqGMrFo6noamcSf/zEGinO8piVOW/SSKmnC 
fttCgnPJ3TBggIUmeS9Kk4iE3Emr9od5NM4u79+n4BsA6v/uPySb6YGTw g==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680792" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680792" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615896" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615896" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:32 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id A9D1C1005139; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 04/13] [APX EGPR] Add 16 new integer general purpose registers Date: Fri, 22 Sep 2023 18:56:22 +0800 Message-Id: <20230922105631.2298849-5-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-9.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org 
X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735350194898198 X-GMAIL-MSGID: 1777735350194898198 From: Kong Lingling Extend GENERAL_REGS with 16 extra registers, r16-r31, handled like the REX registers and named REX2 registers. They are enabled only under TARGET_APX_EGPR. gcc/ChangeLog: * config/i386/i386-protos.h (x86_extended_rex2reg_mentioned_p): New function prototype. * config/i386/i386.cc (regclass_map): Add mapping for 16 new general registers. (debugger64_register_map): Likewise. (ix86_conditional_register_usage): Clear REX2 registers when APX is disabled. (ix86_code_end): Add handling for REX2 reg. (print_reg): Likewise. (ix86_output_jmp_thunk_or_indirect): Likewise. (ix86_output_indirect_branch_via_reg): Likewise. (ix86_attr_length_vex_default): Likewise. (ix86_emit_save_regs): Adjust to allow saving r31. (ix86_register_priority): Set REX2 reg priority the same as REX. (x86_extended_reg_mentioned_p): Add check for REX2 regs. (x86_extended_rex2reg_mentioned_p): New function. * config/i386/i386.h (CALL_USED_REGISTERS): Add new extended registers. (REG_ALLOC_ORDER): Likewise. (FIRST_REX2_INT_REG): Define. (LAST_REX2_INT_REG): Ditto. (GENERAL_REGS): Add 16 new registers. (INT_SSE_REGS): Likewise. (FLOAT_INT_REGS): Likewise. (FLOAT_INT_SSE_REGS): Likewise. (INT_MASK_REGS): Likewise. (ALL_REGS): Likewise. (REX2_INT_REG_P): Define. (REX2_INT_REGNO_P): Ditto. (GENERAL_REGNO_P): Add REX2_INT_REGNO_P. (REGNO_OK_FOR_INDEX_P): Ditto. (REG_OK_FOR_INDEX_NONSTRICT_P): Add new extended registers. * config/i386/i386.md: Add 16 new integer general registers. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-egprs-names.c: New test. * gcc.target/i386/apx-spill_to_egprs-1.c: Likewise. * gcc.target/i386/apx-interrupt-1.c: Likewise.
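For reviewers, the gating described above can be sketched outside of GCC. The following is a minimal, standalone illustration and not the patch's implementation: the `sketch_*` names are hypothetical, and the numbering of r16-r31 as hard registers 76-91 is an assumption inferred from REG_ALLOC_ORDER growing from 0..75 to 0..91 and from the third word of the GENERAL_REGS mask (0xffff000) in the i386.h hunk. The enum values are copied from patch 03/13.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the apx_features enum added in patch 03/13.  */
enum apx_features {
  apx_none = 0,
  apx_egpr = 1 << 0,
  apx_push2pop2 = 1 << 1,
  apx_ndd = 1 << 2,
  apx_all = apx_egpr | apx_push2pop2 | apx_ndd
};

/* ASSUMPTION: r16..r31 are hard registers 76..91.  This is inferred from
   the REG_ALLOC_ORDER and GENERAL_REGS changes, not taken from i386.md.  */
enum { SKETCH_FIRST_REX2_INT_REG = 76, SKETCH_LAST_REX2_INT_REG = 91 };

/* GENERAL_REGS contents after this patch, as three 32-bit words.  */
static const uint32_t sketch_general_regs[3] = { 0x900ff, 0xff0, 0xffff000 };

/* Hypothetical helper: test whether REGNO is a member of the mask above.  */
static bool
sketch_in_general_regs (int regno)
{
  if (regno < 0 || regno >= 96)
    return false;
  return ((sketch_general_regs[regno / 32] >> (regno % 32)) & 1u) != 0;
}

/* Hypothetical stand-in for REX2_INT_REGNO_P.  */
static bool
sketch_rex2_int_regno_p (int regno)
{
  return regno >= SKETCH_FIRST_REX2_INT_REG
	 && regno <= SKETCH_LAST_REX2_INT_REG;
}

/* Mirrors the condition in ix86_conditional_register_usage: a REX2
   register stays accessible only with the EGPR subfeature in 64-bit mode.  */
static bool
sketch_rex2_accessible_p (int regno, int features, bool target_64bit)
{
  return sketch_rex2_int_regno_p (regno)
	 && (features & apx_egpr) != 0
	 && target_64bit;
}
```

Under these assumptions, bits 12..27 of the third mask word select exactly registers 76..91, and enabling only the NDD or PUSH2POP2 subfeatures leaves r16-r31 inaccessible.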
Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/i386-protos.h | 1 + gcc/config/i386/i386.cc | 67 ++++++++++-- gcc/config/i386/i386.h | 46 +++++--- gcc/config/i386/i386.md | 18 +++- .../gcc.target/i386/apx-egprs-names.c | 17 +++ .../gcc.target/i386/apx-interrupt-1.c | 102 ++++++++++++++++++ .../gcc.target/i386/apx-spill_to_egprs-1.c | 25 +++++ 7 files changed, 252 insertions(+), 24 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-egprs-names.c create mode 100644 gcc/testsuite/gcc.target/i386/apx-interrupt-1.c create mode 100644 gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 9ffb125fc2b..bd4782800c4 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -64,6 +64,7 @@ extern bool symbolic_reference_mentioned_p (rtx); extern bool extended_reg_mentioned_p (rtx); extern bool x86_extended_QIreg_mentioned_p (rtx_insn *); extern bool x86_extended_reg_mentioned_p (rtx); +extern bool x86_extended_rex2reg_mentioned_p (rtx); extern bool x86_maybe_negate_const_int (rtx *, machine_mode); extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 477e6cecc38..fb1672f0b3d 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -169,7 +169,12 @@ enum reg_class const regclass_map[FIRST_PSEUDO_REGISTER] = ALL_SSE_REGS, ALL_SSE_REGS, ALL_SSE_REGS, ALL_SSE_REGS, /* Mask registers. */ ALL_MASK_REGS, MASK_REGS, MASK_REGS, MASK_REGS, - MASK_REGS, MASK_REGS, MASK_REGS, MASK_REGS + MASK_REGS, MASK_REGS, MASK_REGS, MASK_REGS, + /* REX2 registers */ + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, }; /* The "default" register map used in 32bit mode. 
*/ @@ -227,7 +232,10 @@ int const debugger64_register_map[FIRST_PSEUDO_REGISTER] = /* AVX-512 registers 24-31 */ 75, 76, 77, 78, 79, 80, 81, 82, /* Mask registers */ - 118, 119, 120, 121, 122, 123, 124, 125 + 118, 119, 120, 121, 122, 123, 124, 125, + /* REX2 extended integer registers */ + 130, 131, 132, 133, 134, 135, 136, 137, + 138, 139, 140, 141, 142, 143, 144, 145 }; /* Define the register numbers to be used in Dwarf debugging information. @@ -521,6 +529,13 @@ ix86_conditional_register_usage (void) accessible_reg_set &= ~reg_class_contents[ALL_MASK_REGS]; } + + /* If APX is disabled, disable the registers. */ + if (! (TARGET_APX_EGPR && TARGET_64BIT)) + { + for (i = FIRST_REX2_INT_REG; i <= LAST_REX2_INT_REG; i++) + CLEAR_HARD_REG_BIT (accessible_reg_set, i); + } } /* Canonicalize a comparison from one we don't have to one we do have. */ @@ -6188,6 +6203,13 @@ ix86_code_end (void) regno, false); } + for (regno = FIRST_REX2_INT_REG; regno <= LAST_REX2_INT_REG; regno++) + { + if (TEST_HARD_REG_BIT (indirect_thunks_used, regno)) + output_indirect_thunk_function (indirect_thunk_prefix_none, + regno, false); + } + for (regno = FIRST_INT_REG; regno <= LAST_INT_REG; regno++) { char name[32]; @@ -7199,10 +7221,10 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align, static void ix86_emit_save_regs (void) { - unsigned int regno; + int regno; rtx_insn *insn; - for (regno = FIRST_PSEUDO_REGISTER - 1; regno-- > 0; ) + for (regno = FIRST_PSEUDO_REGISTER - 1; regno >= 0; regno--) if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true, true)) { insn = emit_insn (gen_push (gen_rtx_REG (word_mode, regno))); @@ -13046,7 +13068,7 @@ print_reg (rtx x, int code, FILE *file) /* Irritatingly, AMD extended registers use different naming convention: "r%d[bwd]" */ - if (REX_INT_REGNO_P (regno)) + if (REX_INT_REGNO_P (regno) || REX2_INT_REGNO_P (regno)) { gcc_assert (TARGET_64BIT); switch (msize) @@ -16260,7 +16282,7 @@ ix86_output_jmp_thunk_or_indirect (const char
*thunk_name, const int regno) { if (thunk_name != NULL) { - if (REX_INT_REGNO_P (regno) + if ((REX_INT_REGNO_P (regno) || REX2_INT_REGNO_P (regno)) && ix86_indirect_branch_cs_prefix) fprintf (asm_out_file, "\tcs\n"); fprintf (asm_out_file, "\tjmp\t"); @@ -16312,7 +16334,7 @@ ix86_output_indirect_branch_via_reg (rtx call_op, bool sibcall_p) { if (thunk_name != NULL) { - if (REX_INT_REGNO_P (regno) + if ((REX_INT_REGNO_P (regno) || REX2_INT_REGNO_P (regno)) && ix86_indirect_branch_cs_prefix) fprintf (asm_out_file, "\tcs\n"); fprintf (asm_out_file, "\tcall\t"); @@ -17069,19 +17091,26 @@ ix86_attr_length_vex_default (rtx_insn *insn, bool has_0f_opcode, for (i = recog_data.n_operands - 1; i >= 0; --i) if (REG_P (recog_data.operand[i])) { - /* REX.W bit uses 3 byte VEX prefix. */ + /* REX.W bit uses 3 byte VEX prefix. + REX2 with VEX uses the extended EVEX prefix, whose length is 4 bytes. */ if (GET_MODE (recog_data.operand[i]) == DImode && GENERAL_REG_P (recog_data.operand[i])) return 3 + 1; /* REX.B bit requires 3-byte VEX. Right here we don't know which - operand will be encoded using VEX.B, so be conservative. */ + operand will be encoded using VEX.B, so be conservative. + REX2 with VEX uses the extended EVEX prefix, whose length is 4 bytes. */ if (REX_INT_REGNO_P (recog_data.operand[i]) + || REX2_INT_REGNO_P (recog_data.operand[i]) || REX_SSE_REGNO_P (recog_data.operand[i])) reg_only = 3 + 1; } else if (MEM_P (recog_data.operand[i])) { + /* REX2.X or REX2.B bits use 3 byte VEX prefix. */ + if (x86_extended_rex2reg_mentioned_p (recog_data.operand[i])) + return 4; + /* REX.X or REX.B bits use 3 byte VEX prefix. */ if (x86_extended_reg_mentioned_p (recog_data.operand[i])) return 3 + 1; @@ -19518,6 +19547,8 @@ ix86_register_priority (int hard_regno) /* New x86-64 int registers result in bigger code size. Discourage them. */ if (REX_INT_REGNO_P (hard_regno)) return 2; + if (REX2_INT_REGNO_P (hard_regno)) + return 2; /* New x86-64 SSE registers result in bigger code size. Discourage them.
*/ if (REX_SSE_REGNO_P (hard_regno)) return 2; @@ -22764,7 +22795,23 @@ x86_extended_reg_mentioned_p (rtx insn) { const_rtx x = *iter; if (REG_P (x) - && (REX_INT_REGNO_P (REGNO (x)) || REX_SSE_REGNO_P (REGNO (x)))) + && (REX_INT_REGNO_P (REGNO (x)) || REX_SSE_REGNO_P (REGNO (x)) + || REX2_INT_REGNO_P (REGNO (x)))) + return true; + } + return false; +} + +/* Return true when INSN mentions register that must be encoded using REX2 + prefix. */ +bool +x86_extended_rex2reg_mentioned_p (rtx insn) +{ + subrtx_iterator::array_type array; + FOR_EACH_SUBRTX (iter, array, INSN_P (insn) ? PATTERN (insn) : insn, NONCONST) + { + const_rtx x = *iter; + if (REG_P (x) && REX2_INT_REGNO_P (REGNO (x))) return true; } return false; diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 8c7ed541a8f..215f6b8db55 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -948,7 +948,11 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); /*xmm24,xmm25,xmm26,xmm27,xmm28,xmm29,xmm30,xmm31*/ \ 0, 0, 0, 0, 0, 0, 0, 0, \ /* k0, k1, k2, k3, k4, k5, k6, k7*/ \ - 0, 0, 0, 0, 0, 0, 0, 0 } + 0, 0, 0, 0, 0, 0, 0, 0, \ +/* r16, r17, r18, r19, r20, r21, r22, r23*/ \ + 0, 0, 0, 0, 0, 0, 0, 0, \ +/* r24, r25, r26, r27, r28, r29, r30, r31*/ \ + 0, 0, 0, 0, 0, 0, 0, 0} \ /* 1 for registers not available across function calls. These must include the FIXED_REGISTERS and also any @@ -985,7 +989,11 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); /*xmm24,xmm25,xmm26,xmm27,xmm28,xmm29,xmm30,xmm31*/ \ 1, 1, 1, 1, 1, 1, 1, 1, \ /* k0, k1, k2, k3, k4, k5, k6, k7*/ \ - 1, 1, 1, 1, 1, 1, 1, 1 } + 1, 1, 1, 1, 1, 1, 1, 1, \ +/* r16, r17, r18, r19, r20, r21, r22, r23*/ \ + 1, 1, 1, 1, 1, 1, 1, 1, \ +/* r24, r25, r26, r27, r28, r29, r30, r31*/ \ + 1, 1, 1, 1, 1, 1, 1, 1} \ /* Order in which to allocate registers. Each register must be listed once, even those in FIXED_REGISTERS. 
List frame pointer @@ -1001,7 +1009,8 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, \ 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, \ 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, \ - 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 } + 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, \ + 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91} /* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order to be rearranged based on a particular function. When using sse math, @@ -1203,6 +1212,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); #define FIRST_MASK_REG MASK0_REG #define LAST_MASK_REG MASK7_REG +#define FIRST_REX2_INT_REG R16_REG +#define LAST_REX2_INT_REG R31_REG + /* Override this in other tm.h files to cope with various OS lossage requiring a frame pointer. */ #ifndef SUBTARGET_FRAME_POINTER_REQUIRED @@ -1280,7 +1292,9 @@ enum reg_class INDEX_REGS, /* %eax %ebx %ecx %edx %esi %edi %ebp */ LEGACY_REGS, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp */ GENERAL_REGS, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp - %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ + %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 + %r16 %r17 %r18 %r19 %r20 %r21 %r22 %r23 + %r24 %r25 %r26 %r27 %r28 %r29 %r30 %r31 */ FP_TOP_REG, FP_SECOND_REG, /* %st(0) %st(1) */ FLOAT_REGS, SSE_FIRST_REG, @@ -1380,7 +1394,7 @@ enum reg_class { 0x7e, 0xff0, 0x0 }, /* TLS_GOTBASE_REGS */ \ { 0x7f, 0xff0, 0x0 }, /* INDEX_REGS */ \ { 0x900ff, 0x0, 0x0 }, /* LEGACY_REGS */ \ - { 0x900ff, 0xff0, 0x0 }, /* GENERAL_REGS */ \ + { 0x900ff, 0xff0, 0xffff000 }, /* GENERAL_REGS */ \ { 0x100, 0x0, 0x0 }, /* FP_TOP_REG */ \ { 0x200, 0x0, 0x0 }, /* FP_SECOND_REG */ \ { 0xff00, 0x0, 0x0 }, /* FLOAT_REGS */ \ @@ -1390,13 +1404,13 @@ enum reg_class { 0xff00000, 0xfffff000, 0xf }, /* ALL_SSE_REGS */ \ { 0xf0000000, 0xf, 0x0 }, /* MMX_REGS */ \ { 0xff0ff00, 0xfffff000, 0xf }, 
/* FLOAT_SSE_REGS */ \ - { 0x9ffff, 0xff0, 0x0 }, /* FLOAT_INT_REGS */ \ - { 0xff900ff, 0xfffffff0, 0xf }, /* INT_SSE_REGS */ \ - { 0xff9ffff, 0xfffffff0, 0xf }, /* FLOAT_INT_SSE_REGS */ \ + { 0x9ffff, 0xff0, 0xffff000 }, /* FLOAT_INT_REGS */ \ + { 0xff900ff, 0xfffffff0, 0xffff00f }, /* INT_SSE_REGS */ \ + { 0xff9ffff, 0xfffffff0, 0xffff00f }, /* FLOAT_INT_SSE_REGS */ \ { 0x0, 0x0, 0xfe0 }, /* MASK_REGS */ \ { 0x0, 0x0, 0xff0 }, /* ALL_MASK_REGS */ \ - { 0x900ff, 0xff0, 0xff0 }, /* INT_MASK_REGS */ \ -{ 0xffffffff, 0xffffffff, 0xfff } /* ALL_REGS */ \ + { 0x900ff, 0xff0, 0xffffff0 }, /* INT_MASK_REGS */ \ +{ 0xffffffff, 0xffffffff, 0xfffffff } /* ALL_REGS */ \ } /* The same information, inverted: @@ -1426,13 +1440,17 @@ enum reg_class #define REX_INT_REGNO_P(N) \ IN_RANGE ((N), FIRST_REX_INT_REG, LAST_REX_INT_REG) +#define REX2_INT_REG_P(X) (REG_P (X) && REX2_INT_REGNO_P (REGNO (X))) +#define REX2_INT_REGNO_P(N) \ + IN_RANGE ((N), FIRST_REX2_INT_REG, LAST_REX2_INT_REG) + #define GENERAL_REG_P(X) (REG_P (X) && GENERAL_REGNO_P (REGNO (X))) #define GENERAL_REGNO_P(N) \ - (LEGACY_INT_REGNO_P (N) || REX_INT_REGNO_P (N)) + (LEGACY_INT_REGNO_P (N) || REX_INT_REGNO_P (N) || REX2_INT_REGNO_P (N)) #define INDEX_REG_P(X) (REG_P (X) && INDEX_REGNO_P (REGNO (X))) #define INDEX_REGNO_P(N) \ - (LEGACY_INDEX_REGNO_P (N) || REX_INT_REGNO_P (N)) + (LEGACY_INDEX_REGNO_P (N) || REX_INT_REGNO_P (N) || REX2_INT_REGNO_P (N)) #define ANY_QI_REG_P(X) (REG_P (X) && ANY_QI_REGNO_P (REGNO (X))) #define ANY_QI_REGNO_P(N) \ @@ -1990,7 +2008,9 @@ do { \ "xmm20", "xmm21", "xmm22", "xmm23", \ "xmm24", "xmm25", "xmm26", "xmm27", \ "xmm28", "xmm29", "xmm30", "xmm31", \ - "k0", "k1", "k2", "k3", "k4", "k5", "k6", "k7" } + "k0", "k1", "k2", "k3", "k4", "k5", "k6", "k7", \ + "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", \ + "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31" } #define REGISTER_NAMES HI_REGISTER_NAMES diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 
eef8a0e01eb..e3270658cb7 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -464,7 +464,23 @@ (define_constants (MASK5_REG 73) (MASK6_REG 74) (MASK7_REG 75) - (FIRST_PSEUDO_REG 76) + (R16_REG 76) + (R17_REG 77) + (R18_REG 78) + (R19_REG 79) + (R20_REG 80) + (R21_REG 81) + (R22_REG 82) + (R23_REG 83) + (R24_REG 84) + (R25_REG 85) + (R26_REG 86) + (R27_REG 87) + (R28_REG 88) + (R29_REG 89) + (R30_REG 90) + (R31_REG 91) + (FIRST_PSEUDO_REG 92) ]) ;; Insn callee abi index. diff --git a/gcc/testsuite/gcc.target/i386/apx-egprs-names.c b/gcc/testsuite/gcc.target/i386/apx-egprs-names.c new file mode 100644 index 00000000000..445bcf2c250 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-egprs-names.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-mapxf -m64" } */ +/* { dg-final { scan-assembler "r31" } } */ +/* { dg-final { scan-assembler "r30" } } */ +/* { dg-final { scan-assembler "r29" } } */ +/* { dg-final { scan-assembler "r28" } } */ +void foo () +{ + register long a __asm ("r31"); + register int b __asm ("r30"); + register short c __asm ("r29"); + register char d __asm ("r28"); + __asm__ __volatile__ ("mov %0, %%rax" : : "r" (a) : "rax"); + __asm__ __volatile__ ("mov %0, %%eax" : : "r" (b) : "eax"); + __asm__ __volatile__ ("mov %0, %%eax" : : "r" (c) : "eax"); + __asm__ __volatile__ ("mov %0, %%eax" : : "r" (d) : "eax"); +} diff --git a/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c b/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c new file mode 100644 index 00000000000..441dbf04bf2 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c @@ -0,0 +1,102 @@ +/* { dg-do compile } */ +/* { dg-options "-mapxf -m64 -O2 -mgeneral-regs-only -mno-cld -mno-push-args -maccumulate-outgoing-args" } */ + +extern void foo (void *) __attribute__ ((interrupt)); +extern int bar (int); + +void foo (void *frame) +{ + int a,b,c,d,e,f,i; + a = bar (5); + b = bar (a); + c = bar (b); + d = bar (c); + e = bar (d); + f = bar (e); + for (i = 1; 
i < 10; i++) + { + a += bar (a + i) + bar (b + i) + + bar (c + i) + bar (d + i) + + bar (e + i) + bar (f + i); + } +} +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)ax" 1 } } */ +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)bx" 1 } } */ +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)cx" 1 } } */ +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)dx" 1 } } */ +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)si" 1 } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%rdi" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r8" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r9" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r10" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r11" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r12" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r13" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r14" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r15" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r16" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r17" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r18" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r19" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r20" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r21" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r22" 1 { target { ! 
ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r23" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r24" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r25" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r26" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r27" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r28" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r29" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r30" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r31" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 145, -16} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 144, -24} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 143, -32} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 142, -40} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 141, -48} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 140, -56} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 139, -64} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 138, -72} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 137, -80} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 136, -88} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 135, -96} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 134, -104} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 133, -112} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 132, -120} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 131, -128} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 130, -136} 1 } } */ 
+/* { dg-final { scan-assembler-times ".cfi_restore" 15} } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)ax" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)bx" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)cx" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)dx" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)si" 1 } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%rdi" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r8" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r9" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r10" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r11" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r12" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r13" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r14" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r15" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r16" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r17" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r18" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r19" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r20" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r21" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r22" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r23" 1 { target { ! 
ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r24" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r25" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r26" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r27" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r28" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r29" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r30" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r31" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "iret" 1 { target ia32 } } } */ +/* { dg-final { scan-assembler-times "iretq" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "\tcld" 1 } } */ diff --git a/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c new file mode 100644 index 00000000000..290863d63a7 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c @@ -0,0 +1,25 @@ +/* { dg-do compile { target { ! 
ia32 } } } */ +/* { dg-options "-O2 -march=skylake-avx512 -mapxf -DDTYPE32" } */ + +#include "spill_to_mask-1.c" + +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r16d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r17d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r18d" } } */ +/* { dg-final { scan-assembler "movq\[ \t]+\[^\\n\\r\]*, %r19" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r20d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r21d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r22d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r23d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r24d" } } */ +/* { dg-final { scan-assembler "addl\[ \t]+\[^\\n\\r\]*, %r25d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r26d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r27d" } } */ +/* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r28d" } } */ +/* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r29d" } } */ +/* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r30d" } } */ +/* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r31d" } } */ +/* { dg-final { scan-assembler-not "knot" } } */ +/* { dg-final { scan-assembler-not "kxor" } } */ +/* { dg-final { scan-assembler-not "kor" } } */ +/* { dg-final { scan-assembler-not "kandn" } } */ From patchwork Fri Sep 22 10:56:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143348 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:a55:b0:13f:353d:d1ed with SMTP id 21csp4565928rwb; Fri, 22 Sep 2023 03:58:47 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHGtD56pbwuHST1XAqWKMS/B6IC5Iz920If2vk4HBRAnSu3iIBMBqJ0khqtXGOzP/C1Qde4 X-Received: by 2002:a17:906:10cf:b0:9ae:767f:8675 with SMTP id 
v15-20020a17090610cf00b009ae767f8675mr1194878ejv.40.1695380327074; Fri, 22 Sep 2023 03:58:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695380327; cv=none; d=google.com; s=arc-20160816; b=xKalIYfY1m0NZhZpVdw1ri3caOZI58rb+fiKIoxYFge5khudL7RiRBEXoRrhYq9j5m SUaW8zD3n18FfE3CeBWgR8GQ2YYzkqTOf05U34r2lRIXn2CTfoaGVoy4SzgjNvvWltbh CmnZFb2i+27tFQJr7nSkn2XCzb0+vq1AtS190NlS0uL+neUtmGLmyU+L4IoNe+wCon7a cLOvTO8s1etN3YzuwA1PbU0kukXXZc7tUejrMY56KMQlokKTU8Q4NPOeUyLdAJ4+OvOC cCsM2DXivavp1H8v46XcJZoKcFtEkVZPRhfl9atPof2wmFmiJtHr505JQMK4Vd/Xvj54 dt6w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=5Bv0xDGhTisXYrKGxRbfJEYDbkH7R4s6WPhCh6UJeco=; fh=ohmInM8pkhFahNid/tIrxUIOBXWhoriwAcUbKKgAH4Q=; b=w5VvsaoCVMRQ/dC2FbJPEPw+Icq2DSR+LQz9DCo0JhDdNGnkKtqEdQBn3yeDvCT/yF 26NzXXW4GcIp5zNs/tY6dbrR2mHbwDhNZU19qTT+vHaQdsR3e8wz9xrqnVNAhKRiXSvJ uO91yhPI3Nrmkd6zyG+wMDP2sZG5hIsHDmjaUh68cNlTj8kZ/muBTuEUuu6NPg7NLH/S YpW+QFwmDAVv8Za3CDUMr+Hvbi5Xh/8MiwsdsOUeZVfpjViZYNRSjwJ0YKIOlC2oKOj4 mgSKkG0+R1jtDmIqBSeW/8WUouSShyNDtXF5UT/PzomEnr5orMItROffkI9B/Fe9JU0K Zpgw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="hLH0Z/+C"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id kb10-20020a1709070f8a00b0099ced4e20b2si3397431ejc.457.2023.09.22.03.58.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 03:58:47 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="hLH0Z/+C"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9C1F9385CCBF for ; Fri, 22 Sep 2023 10:57:23 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 5F5A83858281 for ; Fri, 22 Sep 2023 10:56:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5F5A83858281 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695380202; x=1726916202; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=OaAPJbqPqCoT5U8cWrAAKHXsej5m+V+tThp5anh1l6A=; b=hLH0Z/+CSn4wo1byZiRe1TDSmjf8xLbXm2xSWo5cLkyAaRDmRHrX7hua axoS1Q4ojasdzQ42bohzZdvgT64M7tsTQHKwGtGH/SKgy5bA1rmEFL51n LTf9ge7+uSsH+ZcLbHHkMrOm0mt/i8lE5ufUjrYh5oPUYPZAsNLnPQGpy p/KCI5A4eu9uSRtoYp5qRR+6QlODoqQf2DJYy5HCzGLcsy6e0QoE332pa 
SUSti/eIA5y/KXqa+g7Tyn+zYmP2rf69LPHczdIfOXzy3Q39g0lRFplgk 1dtjTQTh5YJ8+BheH294yVqonK66GsFXrlKb143mHqW5/jzkNzYukB54o w==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680797" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680797" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615904" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615904" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:35 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id AD2B3100513A; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 05/13] [APX EGPR] Add register and memory constraints that disallow EGPR Date: Fri, 22 Sep 2023 18:56:23 +0800 Message-Id: <20230922105631.2298849-6-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org 
X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735121830623690 X-GMAIL-MSGID: 1777735121830623690 From: Kong Lingling For APX, because GENERAL_REGS has been extended with EGPR, new constraints are needed for insns that cannot use EGPR in either their register or memory operands. This patch adds counterparts of the general and backend constraints that involve GPR usage. All of them are prefixed with "j"; apart from jR, they indicate that the constraint does not allow EGPR. gcc/ChangeLog: * config/i386/constraints.md (jr): New register constraint that prohibits EGPR. (jR): New constraint that forces usage of EGPR. (jm): New memory constraint that prohibits EGPR. (ja): Likewise for Bm constraint. (jb): Likewise for Tv constraint. (j<): New auto-dec memory constraint that prohibits EGPR. (j>): Likewise for ">" constraint. (jo): Likewise for "o" constraint. (jV): Likewise for "V" constraint. (jp): Likewise for "p" constraint. * config/i386/i386.h (enum reg_class): Add new reg class GENERAL_GPR16. Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/constraints.md | 59 +++++++++++++++++++++++++++++++++- gcc/config/i386/i386.h | 4 +++ 2 files changed, 62 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index fd490f39110..36c268d7f9b 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -19,7 +19,7 @@ ;;; Unused letters: ;;; H -;;; h j z +;;; j z ;; Integer register constraints. ;; It is not necessary to define 'r' here. @@ -371,3 +371,60 @@ (define_address_constraint "Tv" (define_address_constraint "Ts" "Address operand without segment register" (match_operand 0 "address_no_seg_operand")) + +;; Constraint that forces the use of EGPR; it can only apply to a register class. +(define_register_constraint "jR" "GENERAL_REGS") + +(define_register_constraint "jr" + "TARGET_APX_EGPR ? GENERAL_GPR16 : GENERAL_REGS") + +(define_memory_constraint "jm" + "@internal memory operand without GPR32." 
+ (and (match_operand 0 "memory_operand") + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + +(define_constraint "j<" + "@internal auto-dec memory operand without GPR32." + (and (and (match_code "mem") + (ior (match_test "GET_CODE (XEXP (op, 0)) == PRE_DEC") + (match_test "GET_CODE (XEXP (op, 0)) == POST_DEC"))) + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + +(define_constraint "j>" + "@internal auto-inc memory operand without GPR32." + (and (and (match_code "mem") + (ior (match_test "GET_CODE (XEXP (op, 0)) == PRE_INC") + (match_test "GET_CODE (XEXP (op, 0)) == POST_INC"))) + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + +(define_memory_constraint "jo" + "@internal offsettable memory operand without GPR32." + (and (and (match_code "mem") + (match_test "offsettable_nonstrict_memref_p (op)")) + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + +(define_constraint "jV" + "@internal non-offsettable memory operand without GPR32." + (and (and (match_code "mem") + (match_test "memory_address_addr_space_p (GET_MODE (op), + XEXP (op, 0), + MEM_ADDR_SPACE (op))") + (not (match_test "offsettable_nonstrict_memref_p (op)"))) + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + +(define_address_constraint "jp" + "@internal general address operand without GPR32." + (and (match_test "address_operand (op, VOIDmode)") + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + +(define_special_memory_constraint "ja" + "@internal vector memory operand without GPR32." 
+ (and (match_operand 0 "vector_memory_operand") + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 215f6b8db55..66b8764e82b 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1295,6 +1295,8 @@ enum reg_class %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 %r16 %r17 %r18 %r19 %r20 %r21 %r22 %r23 %r24 %r25 %r26 %r27 %r28 %r29 %r30 %r31 */ + GENERAL_GPR16, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp + %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ FP_TOP_REG, FP_SECOND_REG, /* %st(0) %st(1) */ FLOAT_REGS, SSE_FIRST_REG, @@ -1357,6 +1359,7 @@ enum reg_class "INDEX_REGS", \ "LEGACY_REGS", \ "GENERAL_REGS", \ + "GENERAL_GPR16", \ "FP_TOP_REG", "FP_SECOND_REG", \ "FLOAT_REGS", \ "SSE_FIRST_REG", \ @@ -1395,6 +1398,7 @@ enum reg_class { 0x7f, 0xff0, 0x0 }, /* INDEX_REGS */ \ { 0x900ff, 0x0, 0x0 }, /* LEGACY_REGS */ \ { 0x900ff, 0xff0, 0xffff000 }, /* GENERAL_REGS */ \ + { 0x900ff, 0xff0, 0x0 }, /* GENERAL_GPR16 */ \ { 0x100, 0x0, 0x0 }, /* FP_TOP_REG */ \ { 0x200, 0x0, 0x0 }, /* FP_SECOND_REG */ \ { 0xff00, 0x0, 0x0 }, /* FLOAT_REGS */ \ From patchwork Fri Sep 22 10:56:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143356 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:a55:b0:13f:353d:d1ed with SMTP id 21csp4568802rwb; Fri, 22 Sep 2023 04:02:59 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE1jXUEEOeg3x3sdu4bYfKi7Ms+wDrblpnQiGGVqD+8RariHJiJltrHq7D+Ac6IHPNtHBY+ X-Received: by 2002:ac2:4642:0:b0:4fb:780d:2a49 with SMTP id s2-20020ac24642000000b004fb780d2a49mr6863287lfo.5.1695380579511; Fri, 22 Sep 2023 04:02:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695380579; cv=none; d=google.com; s=arc-20160816; b=QsFpSeZml6zFprEZS4V2/hXLYIl5ovI+6WZFOzbfIVnQgDBoLYjpuB+h8q3qn95kxX 
94VSibPo0jkg+C7vuZo47kyYgq6BFNXfJMk7xQ/lbo1v6z3VI/staAhOq9kv0vc4R3Pe WcSUbbAACXXft0RZxSvinDp8nlEm//0eEG8jiLbfIHcDjdEbgAqHHyLYS0F84oCtPTUt 0FoCPDFiaKoNUNZPxygm1LhZySmpLn6RhMnMKe21us7+l/FlaX9JxNhyn61swR3XuuD1 lHUK9W50ue+gMEOD1OE3ktJHtKY5Xy9+Sb6QKRmzAcX7MbpgWhcguOg0rDSU8FJUeNMy eqyQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=IghLCoqHKrp6L1dktmTLLVpR4uRo6CFHG7PsQrEfgXE=; fh=ohmInM8pkhFahNid/tIrxUIOBXWhoriwAcUbKKgAH4Q=; b=H/9tBi0euLzGmunGCZPGlSdR4FvFLbdVbQ48DZq3Ol16euCyGT7+KWmmAIb01kuEU4 HgCWNLX1oC2LJnkWIX9auohGJHlLMIc7YH4SmmoWwEz/Muj/SZb7NGZvenqroZGItRcJ Rc8uS3SKOzOfawjbtSLxtK89vFVbUt+BYcpZg316nBAIcqDgO1d1xe2Yl8n/23ezpKI6 0SfQ2eDEI/A095jH94hYUqkVmUe+HJYywnozKZkF1CWl4n6sS8hj1NVjQIJS+VBqmXoL lCLH1ONetGyEP30MxlrVBdelSV6qF7HfH8w6jOuPiK2UM/Fo4s3z4XrjSXurvHmTdxD2 Soog== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=jGcGGByx; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id t1-20020aa7d701000000b0052ff1de03efsi2966392edq.119.2023.09.22.04.02.59 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 04:02:59 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=jGcGGByx; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8184D3889800 for ; Fri, 22 Sep 2023 10:58:26 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 25DBC3857359 for ; Fri, 22 Sep 2023 10:56:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 25DBC3857359 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695380202; x=1726916202; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=U5ZLvJ4SwCIjPl+J/Skh9hWeQmCYGasr82BA37Hz7Yk=; b=jGcGGByxLcWwpR5WXWPlns81iZEfKVc/HIogf55ekEUdb11MExpi0BTx De8m1kbV2+pZPs2CiVuSghKqS6OROOsC2pWdTE8XzURPqKUnZ9vCM2lR0 8v3fqTEFBPqWhqbeMwimkdqRoxIVKS03+SFvil2BqiAYnmbbl1T4usQIH l7hD71PL6t4tjRhJozge5EC0fhNKAnTLzCXdlwksShdGJdtnEKsvrc4nf 
/cNKpRkRWLknxnXOf+WWRGqkkCRR9Xoq6rns0plzlCkhDMnUHiCAUcgY1 VqKkgEbln2FZJYQCDgKVw2eQfe5B3xDQ/f3N8YnhASA9eBbv1YXaL2zdG Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680795" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680795" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615901" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615901" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:35 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id AF738100513B; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 06/13] [APX EGPR] Add backend hook for base_reg_class/index_reg_class. 
Date: Fri, 22 Sep 2023 18:56:24 +0800 Message-Id: <20230922105631.2298849-7-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735386604880929 X-GMAIL-MSGID: 1777735386604880929 From: Kong Lingling Add backend helper functions to verify whether an rtx_insn can use EGPR for the base/index register of its memory operand. The verification rules are: 1. For an asm insn, enable/disable EGPR according to ix86_apx_inline_asm_use_gpr32. 2. Disable EGPR for an unrecognized insn. 3. If which_alternative is not decided, loop through the enabled alternatives and check the gpr32 attribute of each. Only enable EGPR when all enabled alternatives have gpr32 = 1. 4. If which_alternative is decided, enable/disable EGPR according to the gpr32 attribute of that alternative. gcc/ChangeLog: * config/i386/i386-protos.h (ix86_insn_base_reg_class): New prototype. (ix86_regno_ok_for_insn_base_p): Likewise. (ix86_insn_index_reg_class): Likewise. * config/i386/i386.cc (ix86_memory_address_use_extended_reg_class_p): New helper function to scan the insn. (ix86_insn_base_reg_class): New function to choose BASE_REG_CLASS. (ix86_regno_ok_for_insn_base_p): Likewise for base regno. (ix86_insn_index_reg_class): Likewise for INDEX_REG_CLASS.
* config/i386/i386.h (INSN_BASE_REG_CLASS): Define. (REGNO_OK_FOR_INSN_BASE_P): Likewise. (INSN_INDEX_REG_CLASS): Likewise. (enum reg_class): Add INDEX_GPR16. (GENERAL_GPR16_REGNO_P): Define. * config/i386/i386.md (gpr32): New attribute. Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/i386-protos.h | 3 ++ gcc/config/i386/i386.cc | 89 +++++++++++++++++++++++++++++++++++ gcc/config/i386/i386.h | 17 ++++++- gcc/config/i386/i386.md | 3 ++ 4 files changed, 111 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index bd4782800c4..a54e3f6b1dc 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -79,6 +79,9 @@ extern bool ix86_expand_set_or_cpymem (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, bool); extern bool ix86_expand_cmpstrn_or_cmpmem (rtx, rtx, rtx, rtx, rtx, bool); +extern enum reg_class ix86_insn_base_reg_class (rtx_insn *); +extern bool ix86_regno_ok_for_insn_base_p (int, rtx_insn *); +extern enum reg_class ix86_insn_index_reg_class (rtx_insn *); extern bool constant_address_p (rtx); extern bool legitimate_pic_operand_p (rtx); extern bool legitimate_pic_address_disp_p (rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index fb1672f0b3d..5af0de4dae7 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -11062,6 +11062,95 @@ ix86_validate_address_register (rtx op) return NULL_RTX; } +/* Return true if insn memory address can use any available reg + in BASE_REG_CLASS or INDEX_REG_CLASS, otherwise false. + For APX, some instruction can't be encoded with gpr32 + which is BASE_REG_CLASS or INDEX_REG_CLASS, for that case + returns false. */ +static bool +ix86_memory_address_use_extended_reg_class_p (rtx_insn* insn) +{ + /* LRA will do some initialization with insn == NULL, + return the maximum reg class for that. + For other cases, real insn will be passed and checked. 
*/ + bool ret = true; + if (TARGET_APX_EGPR && insn) + { + if (asm_noperands (PATTERN (insn)) >= 0 + || GET_CODE (PATTERN (insn)) == ASM_INPUT) + return ix86_apx_inline_asm_use_gpr32; + + if (INSN_CODE (insn) < 0) + return false; + + /* Try recog the insn before calling get_attr_gpr32. Save + the current recog_data first. */ + /* Also save which_alternative for current recog. */ + + struct recog_data_d recog_data_save = recog_data; + int which_alternative_saved = which_alternative; + + /* Update the recog_data for alternative check. */ + if (recog_data.insn != insn) + extract_insn_cached (insn); + + /* If alternative is not set, loop through each alternative + of insn and get gpr32 attr for all enabled alternatives. + If any enabled alternative has a 0 value for gpr32, disallow + gpr32 for addressing. */ + if (which_alternative_saved == -1) + { + alternative_mask enabled = get_enabled_alternatives (insn); + bool curr_insn_gpr32 = false; + for (int i = 0; i < recog_data.n_alternatives; i++) + { + if (!TEST_BIT (enabled, i)) + continue; + which_alternative = i; + curr_insn_gpr32 = get_attr_gpr32 (insn); + if (!curr_insn_gpr32) + ret = false; + } + } + else + { + which_alternative = which_alternative_saved; + ret = get_attr_gpr32 (insn); + } + + recog_data = recog_data_save; + which_alternative = which_alternative_saved; + } + + return ret; +} + +/* For APX, some instructions can't be encoded with gpr32.
*/ +enum reg_class +ix86_insn_base_reg_class (rtx_insn* insn) +{ + if (ix86_memory_address_use_extended_reg_class_p (insn)) + return BASE_REG_CLASS; + return GENERAL_GPR16; +} + +bool +ix86_regno_ok_for_insn_base_p (int regno, rtx_insn* insn) +{ + + if (ix86_memory_address_use_extended_reg_class_p (insn)) + return GENERAL_REGNO_P (regno); + return GENERAL_GPR16_REGNO_P (regno); +} + +enum reg_class +ix86_insn_index_reg_class (rtx_insn* insn) +{ + if (ix86_memory_address_use_extended_reg_class_p (insn)) + return INDEX_REG_CLASS; + return INDEX_GPR16; +} + /* Recognizes RTL expressions that are valid memory addresses for an instruction. The MODE argument is the machine mode for the MEM expression that wants to use this address. diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 66b8764e82b..7fa7585e058 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1018,6 +1018,14 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); #define ADJUST_REG_ALLOC_ORDER x86_order_regs_for_local_alloc () +#define INSN_BASE_REG_CLASS(INSN) \ + ix86_insn_base_reg_class (INSN) + +#define REGNO_OK_FOR_INSN_BASE_P(NUM, INSN) \ + ix86_regno_ok_for_insn_base_p (NUM, INSN) + +#define INSN_INDEX_REG_CLASS(INSN) \ + ix86_insn_index_reg_class (INSN) #define OVERRIDE_ABI_FORMAT(FNDECL) ix86_call_abi_override (FNDECL) @@ -1297,6 +1305,8 @@ enum reg_class %r24 %r25 %r26 %r27 %r28 %r29 %r30 %r31 */ GENERAL_GPR16, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ + INDEX_GPR16, /* %eax %ebx %ecx %edx %esi %edi %ebp + %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ FP_TOP_REG, FP_SECOND_REG, /* %st(0) %st(1) */ FLOAT_REGS, SSE_FIRST_REG, @@ -1360,6 +1370,7 @@ enum reg_class "LEGACY_REGS", \ "GENERAL_REGS", \ "GENERAL_GPR16", \ + "INDEX_GPR16", \ "FP_TOP_REG", "FP_SECOND_REG", \ "FLOAT_REGS", \ "SSE_FIRST_REG", \ @@ -1395,10 +1406,11 @@ enum reg_class { 0x0f, 0x0, 0x0 }, /* Q_REGS */ \ { 0x900f0, 0x0, 0x0 }, /* NON_Q_REGS 
*/ \ { 0x7e, 0xff0, 0x0 }, /* TLS_GOTBASE_REGS */ \ - { 0x7f, 0xff0, 0x0 }, /* INDEX_REGS */ \ + { 0x7f, 0xff0, 0xffff000 }, /* INDEX_REGS */ \ { 0x900ff, 0x0, 0x0 }, /* LEGACY_REGS */ \ { 0x900ff, 0xff0, 0xffff000 }, /* GENERAL_REGS */ \ { 0x900ff, 0xff0, 0x0 }, /* GENERAL_GPR16 */ \ + { 0x0007f, 0xff0, 0x0 }, /* INDEX_GPR16 */ \ { 0x100, 0x0, 0x0 }, /* FP_TOP_REG */ \ { 0x200, 0x0, 0x0 }, /* FP_SECOND_REG */ \ { 0xff00, 0x0, 0x0 }, /* FLOAT_REGS */ \ @@ -1456,6 +1468,9 @@ enum reg_class #define INDEX_REGNO_P(N) \ (LEGACY_INDEX_REGNO_P (N) || REX_INT_REGNO_P (N) || REX2_INT_REGNO_P (N)) +#define GENERAL_GPR16_REGNO_P(N) \ + (LEGACY_INT_REGNO_P (N) || REX_INT_REGNO_P (N)) + #define ANY_QI_REG_P(X) (REG_P (X) && ANY_QI_REGNO_P (REGNO (X))) #define ANY_QI_REGNO_P(N) \ (TARGET_64BIT ? GENERAL_REGNO_P (N) : QI_REGNO_P (N)) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index e3270658cb7..b9eaea78f00 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -873,6 +873,9 @@ (define_attr "use_carry" "0,1" (const_string "0")) ;; Define attribute to indicate unaligned ssemov insns (define_attr "movu" "0,1" (const_string "0")) +;; Define attribute to indicate gpr32 insns. 
+(define_attr "gpr32" "0, 1" (const_string "1")) + ;; Define instruction set of MMX instructions (define_attr "mmx_isa" "base,native,sse,sse_noavx,avx" (const_string "base")) From patchwork Fri Sep 22 10:56:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143349 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:a55:b0:13f:353d:d1ed with SMTP id 21csp4566186rwb; Fri, 22 Sep 2023 03:59:28 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFffKHt+JGsIG+xqZiHCKNfwcrIYpcz5JGhrkAA8go6eGbgS6ygBc5vdkq5yA0Ccmz0bR5f X-Received: by 2002:a17:906:114:b0:9ad:b046:bc50 with SMTP id 20-20020a170906011400b009adb046bc50mr3499643eje.10.1695380368762; Fri, 22 Sep 2023 03:59:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695380368; cv=none; d=google.com; s=arc-20160816; b=F1aytqn3phKXQ4lkJVEofnC/RQO1MdVdmRDZXzg0AYX99cDy2ods+3gfjvNhA7ePhb 1vDOrFakG2hhb/hxmGiu8bfu9nI/vDM3FD1BN8tviE8pNtb4V3V3rPWkmOqXK85fcl4u 5+l1qnPiMgMPZF25RgmBc0DDpRWUmW9IPwWIUyHTBj/WHVimr8KRzyqp69I13jEQaX0M Dlya3nobCZBB003X5GvNLgd7xsSD/vrPAvVieTlfrQ5DPIdFr9nlHoZQX9sDxQll4uIb sWtcQlUr/pzrzcdptH60XCjdLiPHZ8BC946v/Jh7q89UZZoeaHhW/I2ErbSbJ1s7D60e WplQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=m3Ux2YOFJ1ROoiafhwLCvMzt8lsXlYr9Oi7ideIzD9U=; fh=ohmInM8pkhFahNid/tIrxUIOBXWhoriwAcUbKKgAH4Q=; b=Rc2I1XeXg3CQl1zu8rwj2rZPIW4QAUOZ3qI0ZTd5WLMusYFqg47gdthpOgW0FA1c+z bY012Z2Ko9AD/HmgB68HtJ7lYC+0H7+wYc8/eZf6Bk0oC/FonZXZ5vtu+U40M9BfBFxW sjHlYF4QHMKR6kXv7OtAvjsuoQOY2B3liUgUuI1INPj5+9d46WhauoN8HWrYEt8CzzuJ ITSvi1Kp9/OkItXeYuexMnacKOfQJ4z2aNxP6Qb/yHdW4bR+3B1rz15GUIDZS1NzzVna ErxfOa5xpfgCay2glLeFZjyNYspLk+3L8tXncfNvWmKyEmgrAZxwy0zKY44Gx8LRvFRJ 
17hA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=f1HNxjvk; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id kq1-20020a170906abc100b009ae3f7f47c2si3278538ejb.890.2023.09.22.03.59.28 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 03:59:28 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=f1HNxjvk; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6FA0B38323E7 for ; Fri, 22 Sep 2023 10:57:31 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 4055C385735A for ; Fri, 22 Sep 2023 10:56:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4055C385735A Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695380204; x=1726916204; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4fpWAWZHdkSkbHAD051o8kRpkV0uT8k0w/m2arwtn/k=; b=f1HNxjvkeVIIOGpuvA5Jx6ly8HHlcO6PN9f17rLlCG11lUxTHfdrKRSW R9P/OtnHm53D7YRbrWI9JJlMtCpXqmXZzl6LHKrdCEAzf4ODFGYoDc3jT hKLTbKyprQa5x7VgqWNZP/TVdTKhYr5gQq+/bQ8DaJX8FOnvk9jB6WfSR z94Jw1oyhufsyRUOfRygKuEIj6L1E8ayRH3lolw5LQ3vO0wZ58wpSCGna 9du9Kv4RCDMDydZQ3YQg31/C/iUXk1lSdyCKre4KZhYZAZtmwuKk/hYNd Qhxt8DBY3zBmnr47x6c4xCN97Xc2wqwyp+OcDRITemQQVyJwAlnxrE7UD w==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680804" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680804" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615908" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615908" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:36 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B25B1100513C; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 07/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint. 
Date: Fri, 22 Sep 2023 18:56:25 +0800 Message-Id: <20230922105631.2298849-8-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735165371287251 X-GMAIL-MSGID: 1777735165371287251 From: Kong Lingling In inline asm, we do not know whether the insn can use EGPR, so disable EGPR usage by default by mapping the common reg/mem constraints to non-EGPR constraints. The full list of mappings is "g" -> "jrjmi" "r" -> "jr" "m" -> "jm" "<" -> "j<" ">" -> "j>" "o" -> "jo" "V" -> "jV" "p" -> "jp" "Bm" -> "ja" For memory constraints, we add an option -mapx-inline-asm-use-gpr32 to allow/disallow gpr32 usage in any memory-related constraint, as base_reg_class/index_reg_class cannot be aware of whether the asm insn supports gpr32 or not. gcc/ChangeLog: * config/i386/i386.cc (map_egpr_constraints): New function to map common constraints to EGPR prohibited constraints. (ix86_md_asm_adjust): Call map_egpr_constraints. * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/i386.cc | 92 +++++++++++++++++++ gcc/config/i386/i386.opt | 5 + .../gcc.target/i386/apx-inline-gpr-norex2.c | 25 +++++ 3 files changed, 122 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 5af0de4dae7..ea94663eb68 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see . */ +#define INCLUDE_STRING #define IN_TARGET_CODE 1 #include "config.h" @@ -23161,6 +23162,93 @@ ix86_c_mode_for_suffix (char suffix) return VOIDmode; } +/* Helper function to map common constraints to non-EGPR ones. + All related constraints have j prefix, and j plus upper-case letter + means the constraint is strictly EGPR enabled, while j plus + lower-case letter indicates the constraint is strictly gpr16 only. + + Specifically for the "g" constraint, split it into rmi as there is + no corresponding general constraint defined for the backend. + + Here is the full list to map constraints that may involve + gpr to j prefixed.
+ + "g" -> "jrjmi" + "r" -> "jr" + "m" -> "jm" + "<" -> "j<" + ">" -> "j>" + "o" -> "jo" + "V" -> "jV" + "p" -> "jp" + "Bm" -> "ja" +*/ + +static void map_egpr_constraints (vec<const char *> &constraints) +{ + for (size_t i = 0; i < constraints.length(); i++) + { + const char *cur = constraints[i]; + + if (startswith (cur, "=@cc")) + continue; + + int len = strlen (cur); + auto_vec<char> buf; + + for (int j = 0; j < len; j++) + { + switch (cur[j]) + { + case 'g': + buf.safe_push ('j'); + buf.safe_push ('r'); + buf.safe_push ('j'); + buf.safe_push ('m'); + buf.safe_push ('i'); + break; + case 'r': + case 'm': + case '<': + case '>': + case 'o': + case 'V': + case 'p': + buf.safe_push ('j'); + buf.safe_push (cur[j]); + break; + case 'B': + if (cur[j + 1] == 'm') + { + buf.safe_push ('j'); + buf.safe_push ('a'); + j++; + } + else + { + buf.safe_push (cur[j]); + buf.safe_push (cur[j + 1]); + j++; + } + break; + case 'T': + case 'Y': + case 'W': + case 'j': + buf.safe_push (cur[j]); + buf.safe_push (cur[j + 1]); + j++; + break; + default: + buf.safe_push (cur[j]); + break; + } + } + buf.safe_push ('\0'); + constraints[i] = xstrdup (buf.address ()); + } +} + /* Worker function for TARGET_MD_ASM_ADJUST.
We implement asm flag outputs, and maintain source compatibility @@ -23175,6 +23263,10 @@ ix86_md_asm_adjust (vec &outputs, vec & /*inputs*/, bool saw_asm_flag = false; start_sequence (); + + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) + map_egpr_constraints (constraints); + for (unsigned i = 0, n = outputs.length (); i < n; ++i) { const char *con = constraints[i]; diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index d89b5bbc5e8..d4a7b7ec839 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1335,3 +1335,8 @@ Enum(apx_features) String(ndd) Value(apx_ndd) Set(4) EnumValue Enum(apx_features) String(all) Value(apx_all) Set(1) + +mapx-inline-asm-use-gpr32 +Target Var(ix86_apx_inline_asm_use_gpr32) Init(0) +Enable GPR32 in inline asm when APX_EGPR enabled, do not +hook reg or mem constraint in inline asm to GPR16. diff --git a/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c new file mode 100644 index 00000000000..ffd8f954500 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mapxf -m64" } */ + +typedef unsigned int u32; +typedef unsigned long long u64; + +void constraint_test () +{ + register u64 *r16 __asm__("%r16"); + register u64 r17 __asm__("%r17"); + u64 *addr = r16; + + __asm__ __volatile__ ("test_mapping_g_m %0, %%rax" : : "g" (r16) : "rax"); + __asm__ __volatile__ ("test_mapping_g_r %0, %%rax" : : "g" (r17) : "rax"); + __asm__ __volatile__ ("test_mapping_m %0, %%rax" : : "m" (addr) : "rax"); + __asm__ __volatile__ ("test_mapping_r %0, %%rax" : : "r" (r17) : "rax"); + __asm__ __volatile__ ("test_mapping_rm %0, %%rax" : "=r,m" (r16) : : "rax"); +} + +/* { dg-final { scan-assembler-not "test_mapping_g_m %r16, %rax" } } */ +/* { dg-final { scan-assembler-not "test_mapping_g_r %r17, %rax" } } */ +/* { dg-final { scan-assembler-not "test_mapping_m %r16, %rax" } } */ +/* 
{ dg-final { scan-assembler-not "test_mapping_r %r17, %rax" } } */ +/* { dg-final { scan-assembler-not "test_mapping_rm %r16, %rax" } } */ + From patchwork Fri Sep 22 10:56:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143361 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:a55:b0:13f:353d:d1ed with SMTP id 21csp4569851rwb; Fri, 22 Sep 2023 04:04:29 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEoA/mqIQ/99ZAZabq0eEw1lleC7VIlZokGUr2AZRgD8peU77gPkc+XvJ5jJIhvLTxnk0je X-Received: by 2002:aa7:d414:0:b0:525:680a:6b89 with SMTP id z20-20020aa7d414000000b00525680a6b89mr7609383edq.12.1695380669531; Fri, 22 Sep 2023 04:04:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695380669; cv=none; d=google.com; s=arc-20160816; b=0/qXPDqW0xP6PNfgtXDn3QaZahVQ/T7SIh+XakZ1ZWkUDs6AjLBCPV1ej/Wz3NHVdv QALIhsHFJLHDknofU5Lf4mM3HajKIS2pIS/HIJLaIuocT8G+ZGijbLl4NA2cnP56UftJ ZC6x/ZChv8kYaVPFdC/tvSbVpAAsBD6fk1IVFE6wFnw1inX8QcOo0SNjaTvv/FZEupwD yiBjIH8jPBCseL1ek/jWSZYe3qhNVd92g3/Rjp4snpUmGAqKLNmAwWZjK5P95xqcVw3T oyYNJ9xaP1hSE4RFf1dBc5aQAYypqSyZFajAvNaNHLO2awR1dzFe8+2HhA9t5A4wHo26 Wy7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=hPSzoc0Z2eyhigRhEQgY4+sBUMFsN/0YNSJOzN52am8=; fh=ohmInM8pkhFahNid/tIrxUIOBXWhoriwAcUbKKgAH4Q=; b=v0FjmbpjD3w3UZQ3gR8cQsFrXYhK5/Hr888zzVaCGEOTsfKEIoHm0CfV/hfl2SqG7F JGegnM+opBfSiMgLgGAh5JY2WCSif7m9NUiXIRPpkosoIlIxFSiDHcxhzxVCtZapn7G9 l6PKz8ko7Ci6lRZ+mzJyy5wenG6gQkNMMYZOw69xp7JFZIprhsGxrD00Sd5WvLlJSfO+ gKpxMqFmXPwxoCpm/+Oy31EkRuCyhFiXdzKAXxrycqk4YErnuJoHhRhoSFnlppI9FA80 oLW2OwpoTUdBjUiOM9dAyeTCdkugaa/P9oz9xAHHaP3QxTI5XEr//047S8s2/60MOmpo aR+w== 
ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=egREeiOF; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id e16-20020a056402089000b00522394947c9si3083278edy.632.2023.09.22.04.04.29 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 04:04:29 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=egREeiOF; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 011C4385DC02 for ; Fri, 22 Sep 2023 10:59:00 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id EFBC1385772B for ; Fri, 22 Sep 2023 10:56:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EFBC1385772B Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; 
t=1695380204; x=1726916204; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4kWzVLgFitkxqMl1GuJOV5I4j6isBnpKDiYoVer948U=; b=egREeiOF1m9rGLF3230j5qcapc55Cp2fvmq/I2Sev5Z2zcMyyFLsF7Mh NQLw4nw+52nIHkThgRtpVeWffH1kVvRwxms2M7Qi/t8gbLdVzJ3x1fu2h hBwrIyycQ5LFJBeDmVcWsFf+eSVYtrTuMltQyWgaCcxiU1Up9mb8g6x9m tiaruSW8ElVVAMbFdPMEim+kHrnPMBtc6agOSgvkVO14i2d4NIvHXaky7 h8+m44Gr/EOnMiROwmDICe2Ca+QuQjUXnuSS0p0gC+9a/P6MkdlnNqaLN 2yaPLRkL03s787N6VC1fWzcuM/TWow5f7C7kv+Ug987YgPJsy/wVWE7ni g==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680800" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680800" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615906" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615906" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:36 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B4BF4100513D; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 08/13] [APX EGPR] Handle GPR16 only vector move insns Date: Fri, 22 Sep 2023 18:56:26 +0800 Message-Id: <20230922105631.2298849-9-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_NONE, SPF_SOFTFAIL, 
TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735480869765009 X-GMAIL-MSGID: 1777735480869765009 For vector move insns like vmovdqa/vmovdqu, their EVEX counterparts require an explicit 64/32/16/8 suffix. The usage of these instructions is prohibited under AVX10_1 or AVX512F, so we select vmovaps/vmovups for vector load/store insns that contain EGPR if there is no AVX512VL, and keep the original move insn selection otherwise. gcc/ChangeLog: * config/i386/i386.cc (ix86_get_ssemov): Check if egpr is used, adjust mnemonic for vmovdqu/vmovdqa. * config/i386/sse.md (*_vinsert_0): Check if egpr is used, adjust mnemonic for vmovdqu/vmovdqa. (avx_vec_concat): Likewise, and separate alternative 0 to avx_noavx512f.
Co-authored-by: Kong Lingling Co-authored-by: Hongtao Liu --- gcc/config/i386/i386.cc | 42 +++++++++++++++++++++++++++++++++++------ gcc/config/i386/sse.md | 34 +++++++++++++++++++++++---------- 2 files changed, 60 insertions(+), 16 deletions(-) diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index ea94663eb68..5d47c2af25e 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -5478,6 +5478,12 @@ ix86_get_ssemov (rtx *operands, unsigned size, bool evex_reg_p = (size == 64 || EXT_REX_SSE_REG_P (operands[0]) || EXT_REX_SSE_REG_P (operands[1])); + + bool egpr_p = (TARGET_APX_EGPR + && (x86_extended_rex2reg_mentioned_p (operands[0]) + || x86_extended_rex2reg_mentioned_p (operands[1]))); + bool egpr_vl = egpr_p && TARGET_AVX512VL; + machine_mode scalar_mode; const char *opcode = NULL; @@ -5550,12 +5556,18 @@ ix86_get_ssemov (rtx *operands, unsigned size, { case E_HFmode: case E_BFmode: - if (evex_reg_p) + if (evex_reg_p || egpr_vl) opcode = (misaligned_p ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64") : "vmovdqa64"); + else if (egpr_p) + opcode = (misaligned_p + ? (TARGET_AVX512BW + ? "vmovdqu16" + : "%vmovups") + : "%vmovaps"); else opcode = (misaligned_p ? (TARGET_AVX512BW @@ -5570,8 +5582,10 @@ ix86_get_ssemov (rtx *operands, unsigned size, opcode = misaligned_p ? "%vmovupd" : "%vmovapd"; break; case E_TFmode: - if (evex_reg_p) + if (evex_reg_p || egpr_vl) opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64"; + else if (egpr_p) + opcode = misaligned_p ? "%vmovups" : "%vmovaps"; else opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa"; break; @@ -5584,12 +5598,18 @@ ix86_get_ssemov (rtx *operands, unsigned size, switch (scalar_mode) { case E_QImode: - if (evex_reg_p) + if (evex_reg_p || egpr_vl) opcode = (misaligned_p ? (TARGET_AVX512BW ? "vmovdqu8" : "vmovdqu64") : "vmovdqa64"); + else if (egpr_p) + opcode = (misaligned_p + ? (TARGET_AVX512BW + ? "vmovdqu8" + : "%vmovups") + : "%vmovaps"); else opcode = (misaligned_p ? 
(TARGET_AVX512BW @@ -5598,12 +5618,18 @@ ix86_get_ssemov (rtx *operands, unsigned size, : "%vmovdqa"); break; case E_HImode: - if (evex_reg_p) + if (evex_reg_p || egpr_vl) opcode = (misaligned_p ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64") : "vmovdqa64"); + else if (egpr_p) + opcode = (misaligned_p + ? (TARGET_AVX512BW + ? "vmovdqu16" + : "%vmovups") + : "%vmovaps"); else opcode = (misaligned_p ? (TARGET_AVX512BW @@ -5612,16 +5638,20 @@ ix86_get_ssemov (rtx *operands, unsigned size, : "%vmovdqa"); break; case E_SImode: - if (evex_reg_p) + if (evex_reg_p || egpr_vl) opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32"; + else if (egpr_p) + opcode = misaligned_p ? "%vmovups" : "%vmovaps"; else opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa"; break; case E_DImode: case E_TImode: case E_OImode: - if (evex_reg_p) + if (evex_reg_p || egpr_vl) opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64"; + else if (egpr_p) + opcode = misaligned_p ? "%vmovups" : "%vmovaps"; else opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa"; break; diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 80b43fd7db7..256b0eedbbb 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -18912,6 +18912,12 @@ (define_insn "*_vinsert_0" { if (which_alternative == 0) return "vinsert\t{$0, %2, %1, %0|%0, %1, %2, 0}"; + bool egpr_used = (TARGET_APX_EGPR + && x86_extended_rex2reg_mentioned_p (operands[2])); + const char *align_templ = egpr_used ? "vmovaps\t{%2, %x0|%x0, %2}" + : "vmovdqa\t{%2, %x0|%x0, %2}"; + const char *unalign_templ = egpr_used ? "vmovups\t{%2, %x0|%x0, %2}" + : "vmovdqu\t{%2, %x0|%x0, %2}"; switch (mode) { case E_V8DFmode: @@ -18927,17 +18933,17 @@ (define_insn "*_vinsert_0" case E_V8DImode: if (misaligned_operand (operands[2], mode)) return which_alternative == 2 ? "vmovdqu64\t{%2, %x0|%x0, %2}" - : "vmovdqu\t{%2, %x0|%x0, %2}"; + : unalign_templ; else return which_alternative == 2 ? 
"vmovdqa64\t{%2, %x0|%x0, %2}" - : "vmovdqa\t{%2, %x0|%x0, %2}"; + : align_templ; case E_V16SImode: if (misaligned_operand (operands[2], mode)) return which_alternative == 2 ? "vmovdqu32\t{%2, %x0|%x0, %2}" - : "vmovdqu\t{%2, %x0|%x0, %2}"; + : unalign_templ; else return which_alternative == 2 ? "vmovdqa32\t{%2, %x0|%x0, %2}" - : "vmovdqa\t{%2, %x0|%x0, %2}"; + : align_templ; default: gcc_unreachable (); } @@ -27661,11 +27667,13 @@ (define_insn "avx_vec_concat" [(set (match_operand:V_256_512 0 "register_operand" "=x,v,x,Yv") (vec_concat:V_256_512 (match_operand: 1 "nonimmediate_operand" "x,v,xm,vm") - (match_operand: 2 "nonimm_or_0_operand" "xm,vm,C,C")))] + (match_operand: 2 "nonimm_or_0_operand" "xBt,vm,C,C")))] "TARGET_AVX && (operands[2] == CONST0_RTX (mode) || !MEM_P (operands[1]))" { + bool egpr_used = (TARGET_APX_EGPR + && x86_extended_rex2reg_mentioned_p (operands[1])); switch (which_alternative) { case 0: @@ -27713,7 +27721,8 @@ (define_insn "avx_vec_concat" if (misaligned_operand (operands[1], mode)) { if (which_alternative == 2) - return "vmovdqu\t{%1, %t0|%t0, %1}"; + return egpr_used ? "vmovups\t{%1, %t0|%t0, %1}" + : "vmovdqu\t{%1, %t0|%t0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqu64\t{%1, %t0|%t0, %1}"; else @@ -27722,7 +27731,8 @@ (define_insn "avx_vec_concat" else { if (which_alternative == 2) - return "vmovdqa\t{%1, %t0|%t0, %1}"; + return egpr_used ? "vmovaps\t{%1, %t0|%t0, %1}" + : "vmovdqa\t{%1, %t0|%t0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqa64\t{%1, %t0|%t0, %1}"; else @@ -27732,7 +27742,8 @@ (define_insn "avx_vec_concat" if (misaligned_operand (operands[1], mode)) { if (which_alternative == 2) - return "vmovdqu\t{%1, %x0|%x0, %1}"; + return egpr_used ? 
"vmovups\t{%1, %x0|%x0, %1}" + : "vmovdqu\t{%1, %x0|%x0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqu64\t{%1, %x0|%x0, %1}"; else @@ -27741,7 +27752,8 @@ (define_insn "avx_vec_concat" else { if (which_alternative == 2) - return "vmovdqa\t{%1, %x0|%x0, %1}"; + return egpr_used ? "vmovaps\t{%1, %x0|%x0, %1}" + : "vmovdqa\t{%1, %x0|%x0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqa64\t{%1, %x0|%x0, %1}"; else @@ -27754,7 +27766,9 @@ (define_insn "avx_vec_concat" gcc_unreachable (); } } - [(set_attr "type" "sselog,sselog,ssemov,ssemov") + [(set_attr "isa" "noavx512f,avx512f,*,*") + (set_attr "gpr32" "0,1,1,1") + (set_attr "type" "sselog,sselog,ssemov,ssemov") (set_attr "prefix_extra" "1,1,*,*") (set_attr "length_immediate" "1,1,*,*") (set_attr "prefix" "maybe_evex") From patchwork Fri Sep 22 10:56:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143363 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:a55:b0:13f:353d:d1ed with SMTP id 21csp4570864rwb; Fri, 22 Sep 2023 04:05:59 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGBU74GLKg0zKH2xeZI7KDn+5bNUIgficj3wcpx4X7M1VMH1NekQ9613c/sjMOpzWtrII/v X-Received: by 2002:a05:6512:10d1:b0:4ff:9a75:211e with SMTP id k17-20020a05651210d100b004ff9a75211emr9148601lfg.42.1695380759327; Fri, 22 Sep 2023 04:05:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695380759; cv=none; d=google.com; s=arc-20160816; b=C27d1WuS+dkW5L0Bjm7FyReJDGL6v/5gILUaSGrODbOXbMyeQySqGKtYUZxx1RFYGF mJ7/t3xfzzUTzkIbI+jdQB62/acZS8QzOByO5v+ca4AiRexx5dJZRZ5Ymhx4oBBNGeb7 fsGCyXW5jOIdXapKJtg5R08SUih/aEt5zxLVEjtAD4D1jWT6C61epWqBFu8fNGaVox6M 9/HxHwY8o0gjqo9A9XSHZvDghxeguzOSxfq6NX1iL6JlxzmS47+bBiaDAq4oryyjO1ar +qHzv9qKAPx+CGGzJHi3eJhF28xpzEXFDNUtx+0g3C5j6igtqvUUwbbMbihv8MfYW/H/ Dp+A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=I8N0OA57I5pVHKubJP2kTIlXsynbnHRgTeepVGBeAnE=; fh=ohmInM8pkhFahNid/tIrxUIOBXWhoriwAcUbKKgAH4Q=; b=pe4P01E5UZMRmEyYrer6XCa1/dq6+XlOX6PDzrrhX53cP+EN0PHAZ8DjnhVXa7RTHu W0ScJAutThRAa5LrFk80dT0MEdlAKUSpCXWq0F2u/Mv62ljSJPgF0P+3D1SkmhtS7DFr OTW8LrhyLXb8UFVkfMXHHUBswIg8mkjzxxa9kYSAF678tGamfps8C5lNMKGcOZZHAOyf lU0Jymk5q9lMRJA3zoO1dQwTsEetYsx7wEMzC6tDph+igi9bQFzqv8SM0ckJohmymL6j qxc/4R131GNe6rtr8Rg0C5T8T/UAnJR2A5DizRd/h4I+Yz5ZiPjNOOkdF6X/C+ZYtjMa WwFQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YU9tiBHi; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id y4-20020a056402134400b005233fb096d2si2794827edw.460.2023.09.22.04.05.59 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 04:05:59 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YU9tiBHi; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2EA9A385E021 for ; Fri, 22 Sep 2023 10:59:41 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 5A5F63857732 for ; Fri, 22 Sep 2023 10:56:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5A5F63857732 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695380204; x=1726916204; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=J33X/xY4q59gSwMLS1dMMnLp+ICxHXcaGBbvmryV2SE=; b=YU9tiBHivTmApRBEYac4a3Z1tAfdbT4y3+gI93yve9sSgBu4AVqFeqKl kWgInWypPVndutqEUkli6gvvjmKh9PafACzF76gppdOC03CnMCTPbYcM6 bwQO7ZW4Xt5kqeamK+qP+c97YNi9TWRzER+zQ/FoNYwceFvRKm4ZNc3pL wpAA6ByzVOfPuxIQPx11ALVO3qTuEtfEcnNGG20Qh5qcz9qeoWCGX7B+g 
LqCtAL3NFj8msSj7LX3kdMByP58nK/LzoHPfVawZgLurcyXMCdFQSKV3R LLkdg6TzeJEY85cbw7F2sDEjExx339SsRybF+f49STjcqBuxSryAljuia w==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680813" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680813" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615917" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615917" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:37 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B73AF100513E; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 09/13] [APX EGPR] Handle legacy insn that only support GPR16 (1/5) Date: Fri, 22 Sep 2023 18:56:27 +0800 Message-Id: <20230922105631.2298849-10-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org 
X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735574844530700 X-GMAIL-MSGID: 1777735574844530700 From: Kong Lingling These legacy insns in opcode map0/1 only support GPR16 and do not have vex/evex counterparts, so directly adjust constraints and add the gpr32 attr to the patterns. insn list: 1. xsave/xsave64, xrstor/xrstor64 2. xsaves/xsaves64, xrstors/xrstors64 3. xsavec/xsavec64 4. xsaveopt/xsaveopt64 5. fxsave64/fxrstor64 gcc/ChangeLog: * config/i386/i386.md (): Set attr gpr32 0 and constraint jm. (_rex64): Likewise. (_rex64): Likewise. (64): Likewise. (fxsave64): Likewise. (fxrstor64): Likewise. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add apxf check. * gcc.target/i386/apx-legacy-insn-check-norex2.c: New test. * gcc.target/i386/apx-legacy-insn-check-norex2-asm.c: New assembler test. Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/i386.md | 18 +++++++---- .../i386/apx-legacy-insn-check-norex2-asm.c | 5 ++++ .../i386/apx-legacy-insn-check-norex2.c | 30 +++++++++++++++++++ gcc/testsuite/lib/target-supports.exp | 10 +++++++ 4 files changed, 57 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c create mode 100644 gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index b9eaea78f00..6cf86b798a8 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -25626,11 +25626,12 @@ (define_insn "fxsave" (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "fxsave64" - [(set (match_operand:BLK 0 "memory_operand" "=m") + [(set (match_operand:BLK 0 "memory_operand" "=jm") (unspec_volatile:BLK [(const_int 0)] UNSPECV_FXSAVE64))] "TARGET_64BIT && TARGET_FXSR" "fxsave64\t%0" [(set_attr "type" "other") + (set_attr "gpr32" "0") (set_attr "memory" "store") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) @@ -25646,11 +25647,12 @@ (define_insn
"fxrstor" (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "fxrstor64" - [(unspec_volatile [(match_operand:BLK 0 "memory_operand" "m")] + [(unspec_volatile [(match_operand:BLK 0 "memory_operand" "jm")] UNSPECV_FXRSTOR64)] "TARGET_64BIT && TARGET_FXSR" "fxrstor64\t%0" [(set_attr "type" "other") + (set_attr "gpr32" "0") (set_attr "memory" "load") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) @@ -25704,7 +25706,7 @@ (define_insn "" (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "_rex64" - [(set (match_operand:BLK 0 "memory_operand" "=m") + [(set (match_operand:BLK 0 "memory_operand" "=jm") (unspec_volatile:BLK [(match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] @@ -25713,11 +25715,12 @@ (define_insn "_rex64" "\t%0" [(set_attr "type" "other") (set_attr "memory" "store") + (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "" - [(set (match_operand:BLK 0 "memory_operand" "=m") + [(set (match_operand:BLK 0 "memory_operand" "=jm") (unspec_volatile:BLK [(match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] @@ -25726,6 +25729,7 @@ (define_insn "" "\t%0" [(set_attr "type" "other") (set_attr "memory" "store") + (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) @@ -25743,7 +25747,7 @@ (define_insn "" (define_insn "_rex64" [(unspec_volatile:BLK - [(match_operand:BLK 0 "memory_operand" "m") + [(match_operand:BLK 0 "memory_operand" "jm") (match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] ANY_XRSTOR)] @@ -25751,12 +25755,13 @@ (define_insn "_rex64" "\t%0" [(set_attr "type" "other") (set_attr "memory" "load") + (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "64" [(unspec_volatile:BLK - 
[(match_operand:BLK 0 "memory_operand" "m") + [(match_operand:BLK 0 "memory_operand" "jm") (match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] ANY_XRSTOR64)] @@ -25764,6 +25769,7 @@ (define_insn "64" "64\t%0" [(set_attr "type" "other") (set_attr "memory" "load") + (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c new file mode 100644 index 00000000000..7ecc861435f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c @@ -0,0 +1,5 @@ +/* { dg-do assemble { target apxf } } */ +/* { dg-options "-O1 -mapxf -m64 -DDTYPE32" } */ + +#include "apx-legacy-insn-check-norex2.c" + diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c new file mode 100644 index 00000000000..1e5450dfb73 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mapxf -m64 -DDTYPE32" } */ + +#include + +typedef unsigned int u32; +typedef unsigned long long u64; + +#ifndef DTYPE32 +#define DTYPE32 +#endif + +#ifdef DTYPE32 +typedef u32 DTYPE; +#endif + +__attribute__((target("xsave,fxsr"))) +void legacy_test () +{ + register DTYPE* val __asm__("r16"); + _xsave64 (val, 1); + _xrstor64 (val, 1); + _fxsave64 (val); + _fxrstor64 (val); +} + +/* { dg-final { scan-assembler-not "xsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "xrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "fxsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "fxrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ diff --git 
a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 2de41cef2f6..2907de8bd7c 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -10001,6 +10001,16 @@ proc check_effective_target_sm4 { } { } "-msm4" ] } +proc check_effective_target_apxf { } { + return [check_no_compiler_messages apxf object { + void + foo () + { + __asm__ volatile ("add\t%%r16, %%r31" ::); + } + } "-mapxf" ] +} + # Return 1 if sse instructions can be compiled. proc check_effective_target_sse { } { return [check_no_compiler_messages sse object { From patchwork Fri Sep 22 10:56:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143360 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:a55:b0:13f:353d:d1ed with SMTP id 21csp4569603rwb; Fri, 22 Sep 2023 04:04:09 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE9gofuBX6mpsBWfTj6dNjXLumIuMFyTSw4qGSrBws+493KfO51R0N5AV5UHArK1phxdWKW X-Received: by 2002:a05:6512:358a:b0:500:daf6:3898 with SMTP id m10-20020a056512358a00b00500daf63898mr6812473lfr.26.1695380648925; Fri, 22 Sep 2023 04:04:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695380648; cv=none; d=google.com; s=arc-20160816; b=AOnJJvMhnlDkcehtsQIMDgEMyahD7smnSyCqE/PveT/xOnIy2Uf8OsOCAsYpG83/eG x5uysH7gQ1ZXdzsyqTo/SyB838AKm78XhxQE0BrVCo8uXM3ZSOrIiUp+iiA3M+EQwV6u PGr0IzsvdPpk162kTL6VSYSXUIqiM3L1Zbs6CabYPAjpBYifHyJpYvhn5IkSevnbwnro iAtQHIHQrOytuOSEhKwkTzSfVA9sQ3XIzxKbqKVzutLCUcCSF+39HezlXHrqWt/0gr3i w6FQTjvuMYtD7zQDzjA+If4wASb4XLeXVeo5JZlbsw0hMhY3caNCyYkBzIviceSLIYun 3cXg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; 
bh=pvMG4IUKPZtDk8+i/sQU//iBfA6jb+nVjDZCTtyAcH4=; fh=ohmInM8pkhFahNid/tIrxUIOBXWhoriwAcUbKKgAH4Q=; b=AevEJE07a/yuJZsfHcZSXd9312U9jrzq+pCHso4i86HzMWaYbH9hJ9TsQoLqpyVQZd spdhP6id2k5S8RJT1rjFWXY/6uRFWlE6qyFLoJa5NXcM5A80GYJHQlC4QgPxXBxv0kxj aJXQvIBRAxUszF+6q9rmxFkC3522zSLTCIKm3VO7VW0ONgFMlSNsam0U2tSB+TQV1j78 9NoqZcRVAkscSkKrprH3y8jvd+ETtdEYVZjpsfU6/Ie1jbj9sSta3zO8bjGm12aA9mk8 koJ1QqxrDHPMBX8sJXqXfRLn78ox+uolq1KL3ddw8fDyfP66lyYluAgvUTHr0aevEsP5 mSVw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=I2xRb9sb; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id r6-20020aa7d586000000b0052e925193a1si2911370edq.546.2023.09.22.04.04.08 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Sep 2023 04:04:08 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=I2xRb9sb; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D98F4385CCAE for ; Fri, 22 Sep 2023 10:58:47 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com 
(mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 47E83385696A for ; Fri, 22 Sep 2023 10:56:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 47E83385696A Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695380206; x=1726916206; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sDRjRsqu62WH7U1IZlgVzHw/+X9gKuH/cQjYESr1h9w=; b=I2xRb9sb0tAJgZjSgeot5m+Uh9uoUtOa90xWnl270nqw0yfEk4rcARoa 4cpM9CYPtjCEna2b7dhJFd64xQFOJqoiunCWWrUns1hocHnVhkvrM3uE4 zwqD0hJBgUWFyYiBGu4jNPngD/OCUNeuhV13g8KVvOqnPqCrBevAGSp4D T5tFeBSPYpQGKntAEwMwyxb36y2VttQf9r5cF3SUrDeZJ2vMXZBwQlpXR brO8Rlawq3VVyHZ8iHppNmi6uQ+NPiyB1BhL6Y7uBmq4XrImiNKQPHb70 jV9wOV+GIXRAXWkrYpHSD5f7HHDbGlnvevDXQrZ42KzExWcP8rJa9GcnA Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="379680831" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="379680831" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Sep 2023 03:56:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10840"; a="782615942" X-IronPort-AV: E=Sophos;i="6.03,167,1694761200"; d="scan'208";a="782615942" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 22 Sep 2023 03:56:37 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id BA27B1005142; Fri, 22 Sep 2023 18:56:31 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling , Hongtao Liu Subject: [PATCH 10/13] [APX EGPR] Handle legacy insns that only support GPR16 (2/5) Date: Fri, 22 Sep 2023 18:56:28 +0800 
Message-Id: <20230922105631.2298849-11-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com> References: <20230922105631.2298849-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777735459094531577 X-GMAIL-MSGID: 1777735459094531577 From: Kong Lingling These legacy insns in opcode map2/3 have vex but no evex counterpart, so disable EGPR for them by adjusting alternatives and attr_gpr32. insn list: 1. phaddw/vphaddw, phaddd/vphaddd, phaddsw/vphaddsw 2. phsubw/vphsubw, phsubd/vphsubd, phsubsw/vphsubsw 3. psignb/vpsignb, psignw/vpsignw, psignd/vpsignd 4. blendps/vblendps, blendpd/vblendpd 5. blendvps/vblendvps, blendvpd/vblendvpd 6. pblendvb/vpblendvb, pblendw/vpblendw 7. mpsadbw/vmpsadbw 8. dpps/vdpps, dppd/vdppd 9. pcmpeqq/vpcmpeqq, pcmpgtq/vpcmpgtq gcc/ChangeLog: * config/i386/sse.md (avx2_phwv16hi3): Set attr gpr32 0 and constraint jm/ja to all mem alternatives. (ssse3_phwv8hi3): Likewise. (ssse3_phwv4hi3): Likewise. (avx2_phdv8si3): Likewise. (ssse3_phdv4si3): Likewise. (ssse3_phdv2si3): Likewise. (_psign3): Likewise. (ssse3_psign3): Likewise. (_blend_blendv_blendv_lt): Likewise. (*_blendv_not_ltint): Likewise. (_dp): Likewise. (_mpsadbw): Likewise. (_pblendvb): Likewise. (*_pblendvb_lt): Likewise. (sse4_1_pblend): Likewise.
(*avx2_pblend): Likewise. (avx2_permv2ti): Likewise. (*avx_vperm2f128_nozero): Likewise. (*avx2_eq3): Likewise. (*sse4_1_eqv2di3): Likewise. (sse4_2_gtv2di3): Likewise. (avx2_gt3): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-legacy-insn-check-norex2.c: Add sse/vex intrinsic tests. Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/sse.md | 73 ++++++++---- .../i386/apx-legacy-insn-check-norex2.c | 106 ++++++++++++++++++ 2 files changed, 155 insertions(+), 24 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 256b0eedbbb..a7858a7f8cf 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -16831,7 +16831,7 @@ (define_insn "*avx2_eq3" [(set (match_operand:VI_256 0 "register_operand" "=x") (eq:VI_256 (match_operand:VI_256 1 "nonimmediate_operand" "%x") - (match_operand:VI_256 2 "nonimmediate_operand" "xm")))] + (match_operand:VI_256 2 "nonimmediate_operand" "jm")))] "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "vpcmpeq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "ssecmp") @@ -16839,6 +16839,7 @@ (define_insn "*avx2_eq3" (if_then_else (eq (const_string "mode") (const_string "V4DImode")) (const_string "1") (const_string "*"))) + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -17021,7 +17022,7 @@ (define_insn "*sse4_1_eqv2di3" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x") (eq:V2DI (match_operand:V2DI 1 "vector_operand" "%0,0,x") - (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))] + (match_operand:V2DI 2 "vector_operand" "Yrja,*xja,xjm")))] "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pcmpeqq\t{%2, %0|%0, %2} @@ -17029,6 +17030,7 @@ (define_insn "*sse4_1_eqv2di3" vpcmpeqq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -17037,13 +17039,14 @@ 
(define_insn "*sse2_eq3" [(set (match_operand:VI124_128 0 "register_operand" "=x,x") (eq:VI124_128 (match_operand:VI124_128 1 "vector_operand" "%0,x") - (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))] + (match_operand:VI124_128 2 "vector_operand" "xBm,xjm")))] "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pcmpeq\t{%2, %0|%0, %2} vpcmpeq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) @@ -17052,7 +17055,7 @@ (define_insn "sse4_2_gtv2di3" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x") (gt:V2DI (match_operand:V2DI 1 "register_operand" "0,0,x") - (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))] + (match_operand:V2DI 2 "vector_operand" "Yrja,*xja,xjm")))] "TARGET_SSE4_2" "@ pcmpgtq\t{%2, %0|%0, %2} @@ -17060,6 +17063,7 @@ (define_insn "sse4_2_gtv2di3" vpcmpgtq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -17068,7 +17072,7 @@ (define_insn "avx2_gt3" [(set (match_operand:VI_256 0 "register_operand" "=x") (gt:VI_256 (match_operand:VI_256 1 "register_operand" "x") - (match_operand:VI_256 2 "nonimmediate_operand" "xm")))] + (match_operand:VI_256 2 "nonimmediate_operand" "xjm")))] "TARGET_AVX2" "vpcmpgt\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "ssecmp") @@ -17076,6 +17080,7 @@ (define_insn "avx2_gt3" (if_then_else (eq (const_string "mode") (const_string "V4DImode")) (const_string "1") (const_string "*"))) + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -17099,7 +17104,7 @@ (define_insn "*sse2_gt3" [(set (match_operand:VI124_128 0 "register_operand" "=x,x") (gt:VI124_128 (match_operand:VI124_128 1 "register_operand" "0,x") - (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))] + (match_operand:VI124_128 2 
"vector_operand" "xBm,xjm")))] "TARGET_SSE2" "@ pcmpgt\t{%2, %0|%0, %2} @@ -21222,7 +21227,7 @@ (define_insn "avx2_phwv16hi3" (vec_select:V16HI (vec_concat:V32HI (match_operand:V16HI 1 "register_operand" "x") - (match_operand:V16HI 2 "nonimmediate_operand" "xm")) + (match_operand:V16HI 2 "nonimmediate_operand" "xjm")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 16) (const_int 18) (const_int 20) (const_int 22) @@ -21238,6 +21243,7 @@ (define_insn "avx2_phwv16hi3" "TARGET_AVX2" "vphw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -21248,7 +21254,7 @@ (define_insn "ssse3_phwv8hi3" (vec_select:V8HI (vec_concat:V16HI (match_operand:V8HI 1 "register_operand" "0,x") - (match_operand:V8HI 2 "vector_operand" "xBm,xm")) + (match_operand:V8HI 2 "vector_operand" "xja,xjm")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 8) (const_int 10) (const_int 12) (const_int 14)])) @@ -21263,6 +21269,7 @@ (define_insn "ssse3_phwv8hi3" vphw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex") @@ -21314,7 +21321,7 @@ (define_insn "avx2_phdv8si3" (vec_select:V8SI (vec_concat:V16SI (match_operand:V8SI 1 "register_operand" "x") - (match_operand:V8SI 2 "nonimmediate_operand" "xm")) + (match_operand:V8SI 2 "nonimmediate_operand" "xjm")) (parallel [(const_int 0) (const_int 2) (const_int 8) (const_int 10) (const_int 4) (const_int 6) (const_int 12) (const_int 14)])) @@ -21326,6 +21333,7 @@ (define_insn "avx2_phdv8si3" "TARGET_AVX2" "vphd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -21336,7 +21344,7 @@ (define_insn "ssse3_phdv4si3" (vec_select:V4SI 
(vec_concat:V8SI (match_operand:V4SI 1 "register_operand" "0,x") - (match_operand:V4SI 2 "vector_operand" "xBm,xm")) + (match_operand:V4SI 2 "vector_operand" "xja,xjm")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)])) (vec_select:V4SI @@ -21349,6 +21357,7 @@ (define_insn "ssse3_phdv4si3" vphd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_data16" "1,*") (set_attr "prefix_extra" "1") @@ -21388,6 +21397,7 @@ (define_insn_and_split "ssse3_phdv2si3" } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) @@ -21842,7 +21852,7 @@ (define_insn "_psign3" [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x") (unspec:VI124_AVX2 [(match_operand:VI124_AVX2 1 "register_operand" "0,x") - (match_operand:VI124_AVX2 2 "vector_operand" "xBm,xm")] + (match_operand:VI124_AVX2 2 "vector_operand" "xja,xjm")] UNSPEC_PSIGN))] "TARGET_SSSE3" "@ @@ -21850,6 +21860,7 @@ (define_insn "_psign3" vpsign\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex") (set_attr "mode" "")]) @@ -22147,7 +22158,7 @@ (define_mode_attr blendbits (define_insn "_blend" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (vec_merge:VF_128_256 - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "Yrja,*xja,xjm") (match_operand:VF_128_256 1 "register_operand" "0,0,x") (match_operand:SI 3 "const_0_to__operand")))] "TARGET_SSE4_1" @@ -22157,6 +22168,7 @@ (define_insn "_blend" vblend\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr 
"length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22167,7 +22179,7 @@ (define_insn "_blendv" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "Yrja,*xja,xjm") (match_operand:VF_128_256 3 "register_operand" "Yz,Yz,x")] UNSPEC_BLENDV))] "TARGET_SSE4_1" @@ -22177,6 +22189,7 @@ (define_insn "_blendv" vblendv\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22228,7 +22241,7 @@ (define_insn_and_split "*_blendv_lt" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "Yrja,*xja,xjm") (lt:VF_128_256 (match_operand: 3 "register_operand" "Yz,Yz,x") (match_operand: 4 "const0_operand"))] @@ -22242,6 +22255,7 @@ (define_insn_and_split "*_blendv_lt" "operands[3] = gen_lowpart (mode, operands[3]);" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22260,7 +22274,7 @@ (define_insn_and_split "*_blendv_ltint" [(set (match_operand: 0 "register_operand" "=Yr,*x,x") (unspec: [(match_operand: 1 "register_operand" "0,0,x") - (match_operand: 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand: 2 "vector_operand" "Yrja,*xja,xjm") (subreg: (lt:VI48_AVX (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x") @@ -22280,6 +22294,7 @@ (define_insn_and_split "*_blendv_ltint" } [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr 
"gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22318,7 +22333,7 @@ (define_insn "_dp" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "vector_operand" "%0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "Yrja,*xja,xjm") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_DP))] "TARGET_SSE4_1" @@ -22328,6 +22343,7 @@ (define_insn "_dp" vdp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemul") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22356,7 +22372,7 @@ (define_insn "_mpsadbw" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "Yrja,*xja,xjm") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_MPSADBW))] "TARGET_SSE4_1" @@ -22366,6 +22382,7 @@ (define_insn "_mpsadbw" vmpsadbw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") @@ -22394,7 +22411,7 @@ (define_insn "_pblendvb" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "Yrja,*xja,xjm") (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x")] UNSPEC_BLENDV))] "TARGET_SSE4_1" @@ -22404,6 +22421,7 @@ (define_insn "_pblendvb" vpblendvb\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr 
"gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "*,*,1") (set_attr "prefix" "orig,orig,vex") @@ -22443,7 +22461,7 @@ (define_insn_and_split "*_pblendvb_lt" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "Yrja,*xja,xjm") (lt:VI1_AVX2 (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x") (match_operand:VI1_AVX2 4 "const0_operand"))] UNSPEC_BLENDV))] @@ -22456,6 +22474,7 @@ (define_insn_and_split "*_pblendvb_lt" "" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "*,*,1") (set_attr "prefix" "orig,orig,vex") @@ -22487,7 +22506,7 @@ (define_insn_and_split "*_pblendvb_lt_subreg_not" (define_insn "sse4_1_pblend" [(set (match_operand:V8_128 0 "register_operand" "=Yr,*x,x") (vec_merge:V8_128 - (match_operand:V8_128 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:V8_128 2 "vector_operand" "Yrja,*xja,xjm") (match_operand:V8_128 1 "register_operand" "0,0,x") (match_operand:SI 3 "const_0_to_255_operand")))] "TARGET_SSE4_1" @@ -22497,6 +22516,7 @@ (define_insn "sse4_1_pblend" vpblendw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,orig,vex") @@ -22559,7 +22579,7 @@ (define_expand "avx2_pblend_1" (define_insn "*avx2_pblend" [(set (match_operand:V16_256 0 "register_operand" "=x") (vec_merge:V16_256 - (match_operand:V16_256 2 "nonimmediate_operand" "xm") + (match_operand:V16_256 2 "nonimmediate_operand" "xjm") (match_operand:V16_256 1 "register_operand" "x") (match_operand:SI 3 "avx2_pblendw_operand")))] "TARGET_AVX2" @@ -22568,6 +22588,7 @@ (define_insn "*avx2_pblend" return "vpblendw\t{%3, %2, %1, 
%0|%0, %1, %2, %3}"; } [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -22576,7 +22597,7 @@ (define_insn "*avx2_pblend" (define_insn "avx2_pblendd" [(set (match_operand:VI4_AVX2 0 "register_operand" "=x") (vec_merge:VI4_AVX2 - (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xm") + (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xjm") (match_operand:VI4_AVX2 1 "register_operand" "x") (match_operand:SI 3 "const_0_to_255_operand")))] "TARGET_AVX2" @@ -26437,11 +26458,13 @@ (define_insn "avx512f_perm_1" (set_attr "prefix" "") (set_attr "mode" "")]) +;; TODO (APX): vmovaps supports EGPR but not others, could split +;; pattern to enable gpr32 for this one. (define_insn "avx2_permv2ti" [(set (match_operand:V4DI 0 "register_operand" "=x") (unspec:V4DI [(match_operand:V4DI 1 "register_operand" "x") - (match_operand:V4DI 2 "nonimmediate_operand" "xm") + (match_operand:V4DI 2 "nonimmediate_operand" "xjm") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_VPERMTI))] "TARGET_AVX2" @@ -26468,6 +26491,7 @@ (define_insn "avx2_permv2ti" return "vperm2i128\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -27098,7 +27122,7 @@ (define_insn "*avx_vperm2f128_nozero" (vec_select:AVX256MODE2P (vec_concat: (match_operand:AVX256MODE2P 1 "register_operand" "x") - (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xm")) + (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xjm")) (match_parallel 3 "" [(match_operand 4 "const_int_operand")])))] "TARGET_AVX @@ -27115,6 +27139,7 @@ (define_insn "*avx_vperm2f128_nozero" return "vperm2\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c 
b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c index 1e5450dfb73..510213a6ca7 100644 --- a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c @@ -28,3 +28,109 @@ void legacy_test () /* { dg-final { scan-assembler-not "xrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "fxsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "fxrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ + +#ifdef DTYPE +#undef DTYPE +#define DTYPE u64 +#endif + +typedef union +{ + __m128i xi[8]; + __m128 xf[8]; + __m128d xd[8]; + __m256i yi[4]; + __m256 yf[4]; + __m256d yd[4]; + DTYPE a[16]; +} tmp_u; + +__attribute__((target("sse4.2"))) +void sse_test () +{ + register tmp_u *tdst __asm__("%r16"); + register tmp_u *src1 __asm__("%r17"); + register tmp_u *src2 __asm__("%r18"); + + src1->xi[0] = _mm_hadd_epi16 (tdst->xi[2], src2->xi[3]); + src1->xi[1] = _mm_hadd_epi32 (tdst->xi[0], src2->xi[1]); + tdst->xi[2] = _mm_hadds_epi16 (src1->xi[4], src2->xi[5]); + tdst->xi[3] = _mm_hsub_epi16 (src1->xi[6], src2->xi[7]); + tdst->xi[4] = _mm_hsub_epi32 (src1->xi[0], src2->xi[1]); + tdst->xi[5] = _mm_hsubs_epi16 (src1->xi[2], src2->xi[3]); + + src1->xi[6] = _mm_cmpeq_epi64 (tdst->xi[4], src2->xi[5]); + src1->xi[7] = _mm_cmpgt_epi64 (tdst->xi[6], src2->xi[7]); + + tdst->xf[0] = _mm_dp_ps (src1->xf[0], src2->xf[1], 0xbf); + tdst->xd[1] = _mm_dp_pd (src1->xd[2], src2->xd[3], 0xae); + + tdst->xi[2] = _mm_mpsadbw_epu8 (src1->xi[4], src2->xi[5], 0xc1); + + tdst->xi[3] = _mm_blend_epi16 (src1->xi[6], src2->xi[7], 0xc); + tdst->xi[4] = _mm_blendv_epi8 (src1->xi[0], src2->xi[1], tdst->xi[2]); + tdst->xf[5] = _mm_blend_ps (src1->xf[3], src2->xf[4], 0x4); + tdst->xf[6] = _mm_blendv_ps (src1->xf[5], src2->xf[6], tdst->xf[7]); + tdst->xd[7] = _mm_blend_pd (tdst->xd[0], src1->xd[1], 0x1); + tdst->xd[0] = _mm_blendv_pd 
(src1->xd[2], src2->xd[3], tdst->xd[4]); + + tdst->xi[1] = _mm_sign_epi8 (src1->xi[5], src2->xi[6]); + tdst->xi[2] = _mm_sign_epi16 (src1->xi[7], src2->xi[0]); + tdst->xi[3] = _mm_sign_epi32 (src1->xi[1], src2->xi[2]); +} + +__attribute__((target("avx2"))) +void vex_test () +{ + + register tmp_u *tdst __asm__("%r16"); + register tmp_u *src1 __asm__("%r17"); + register tmp_u *src2 __asm__("%r18"); + + src1->yi[1] = _mm256_hadd_epi16 (tdst->yi[2], src2->yi[3]); + src1->yi[2] = _mm256_hadd_epi32 (tdst->yi[0], src2->yi[1]); + tdst->yi[3] = _mm256_hadds_epi16 (src1->yi[1], src2->yi[2]); + tdst->yi[0] = _mm256_hsub_epi16 (src1->yi[3], src2->yi[0]); + tdst->yi[1] = _mm256_hsub_epi32 (src1->yi[0], src2->yi[1]); + tdst->yi[2] = _mm256_hsubs_epi16 (src1->yi[2], src2->yi[3]); + + src1->yi[2] = _mm256_cmpeq_epi64 (tdst->yi[1], src2->yi[2]); + src1->yi[1] = _mm256_cmpgt_epi64 (tdst->yi[3], src2->yi[0]); + + tdst->yf[2] = _mm256_dp_ps (src1->yf[0], src2->yf[1], 0xbf); + tdst->xd[3] = _mm_dp_pd (src1->xd[0], src2->xd[1], 0xbf); + + tdst->yi[3] = _mm256_mpsadbw_epu8 (src1->yi[1], src2->yi[1], 0xc1); + + tdst->yi[0] = _mm256_blend_epi16 (src1->yi[1], src2->yi[2], 0xc); + tdst->yi[1] = _mm256_blendv_epi8 (src1->yi[1], src2->yi[2], tdst->yi[0]); + tdst->yf[2] = _mm256_blend_ps (src1->yf[0], src2->yf[1], 0x4); + tdst->yf[3] = _mm256_blendv_ps (src1->yf[2], src2->yf[3], tdst->yf[1]); + tdst->yd[3] = _mm256_blend_pd (tdst->yd[1], src1->yd[0], 0x1); + tdst->yd[1] = _mm256_blendv_pd (src1->yd[2], src2->yd[3], tdst->yd[2]); + + tdst->yi[2] = _mm256_sign_epi8 (src1->yi[0], src2->yi[1]); + tdst->yi[3] = _mm256_sign_epi16 (src1->yi[2], src2->yi[3]); + tdst->yi[0] = _mm256_sign_epi32 (src1->yi[0], src2->yi[1]); +} + +/* { dg-final { scan-assembler-not "v?pcmpeqq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpgtq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddw\[ 
\\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?dpps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?dppd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psadbw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pblendw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pblendvb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendvps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendvpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psignb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psignw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psignd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ From patchwork Fri Sep 22 10:56:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 143354 Return-Path: Delivered-To: ouuuleilei@gmail.com 
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling, Hongtao Liu
Subject: [PATCH 11/13] [APX EGPR] Handle legacy insns that only support GPR16 (3/5)
Date: Fri, 22 Sep 2023 18:56:29 +0800
Message-Id: <20230922105631.2298849-12-hongyu.wang@intel.com>
In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com>
References: <20230922105631.2298849-1-hongyu.wang@intel.com>

From: Kong Lingling

Disable EGPR usage for the following legacy insns in opcode map2/3 that
have a VEX but no EVEX counterpart.

insn list:
1. phminposuw/vphminposuw
2. ptest/vptest
3. roundps/vroundps, roundpd/vroundpd,
   roundss/vroundss, roundsd/vroundsd
4. pcmpestri/vpcmpestri, pcmpestrm/vpcmpestrm
5. pcmpistri/vpcmpistri, pcmpistrm/vpcmpistrm
6. aesimc/vaesimc, aeskeygenassist/vaeskeygenassist

gcc/ChangeLog:

	* config/i386/i386-protos.h (x86_evex_reg_mentioned_p): New
	prototype.
	* config/i386/i386.cc (x86_evex_reg_mentioned_p): New
	function.
	* config/i386/i386.md (sse4_1_round2): Set attr gpr32 0 and
	constraint jm to all non-evex alternatives, adjust
	alternative outputs if evex reg is mentioned.
	* config/i386/sse.md (_ptest): Set attr gpr32 0 and
	constraint jm/ja to all non-evex alternatives.
	(ptesttf2): Likewise.
	(_round): Likewise.
	(sse4_2_pcmpestri): Likewise.
	(sse4_2_pcmpestrm): Likewise.
	(sse4_2_pcmpestr_cconly): Likewise.
	(sse4_2_pcmpistr): Likewise.
	(sse4_2_pcmpistri): Likewise.
	(sse4_2_pcmpistrm): Likewise.
	(sse4_2_pcmpistr_cconly): Likewise.
	(aesimc): Likewise.
	(aeskeygenassist): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-legacy-insn-check-norex2.c: Add intrinsic
	tests.
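
For readers skimming the ChangeLog, the "adjust alternative outputs if evex reg is mentioned" item can be pictured with the standalone C sketch below. It is only a simplified model, not GCC code: `struct operand`, `needs_evex` and `round_mnemonic` are hypothetical names, and the sketch assumes each operand has already been flattened to the registers it mentions (in the compiler the check walks rtx operands, including registers used inside memory addresses). The one fact it relies on is that xmm16-xmm31 and r16-r31 cannot be encoded with a VEX prefix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operand model; this is not GCC's rtx.  */
enum reg_kind { REG_XMM, REG_GPR };

struct operand
{
  enum reg_kind kind;
  int regno;   /* 0..31 within its register file.  */
};

/* Return true if any operand names xmm16-xmm31 or r16-r31, i.e. a
   register that cannot be encoded with a VEX prefix.  */
static bool
needs_evex (const struct operand *ops, size_t nops)
{
  for (size_t i = 0; i < nops; i++)
    {
      assert (ops[i].regno >= 0 && ops[i].regno < 32);
      if (ops[i].regno >= 16)
        return true;
    }
  return false;
}

/* Pick the mnemonic stem for the AVX512F-enabled alternative: keep the
   VEX form unless some operand forces the EVEX form.  */
static const char *
round_mnemonic (const struct operand *ops, size_t nops)
{
  return needs_evex (ops, nops) ? "vrndscale" : "vround";
}
```

Under these assumptions, a round whose operands stay within xmm0-xmm15 and r0-r15 keeps the `vround` stem, while any operand in the upper register range switches it to `vrndscale`.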
Co-authored-by: Hongyu Wang
Co-authored-by: Hongtao Liu
---
 gcc/config/i386/i386-protos.h                 |  1 +
 gcc/config/i386/i386.cc                       | 13 +++
 gcc/config/i386/i386.md                       |  3 +-
 gcc/config/i386/sse.md                        | 93 +++++++++++++------
 .../i386/apx-legacy-insn-check-norex2.c       | 55 ++++++++++-
 5 files changed, 132 insertions(+), 33 deletions(-)

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index a54e3f6b1dc..28d0eab11d5 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -65,6 +65,7 @@ extern bool extended_reg_mentioned_p (rtx);
 extern bool x86_extended_QIreg_mentioned_p (rtx_insn *);
 extern bool x86_extended_reg_mentioned_p (rtx);
 extern bool x86_extended_rex2reg_mentioned_p (rtx);
+extern bool x86_evex_reg_mentioned_p (rtx [], int);
 extern bool x86_maybe_negate_const_int (rtx *, machine_mode);
 extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx);
 
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 5d47c2af25e..58fa054635a 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22937,6 +22937,19 @@ x86_extended_rex2reg_mentioned_p (rtx insn)
   return false;
 }
 
+/* Return true when the rtx operands mention a register that must be encoded
+   using an EVEX prefix.  */
+bool
+x86_evex_reg_mentioned_p (rtx operands[], int nops)
+{
+  int i;
+  for (i = 0; i < nops; i++)
+    if (EXT_REX_SSE_REG_P (operands[i])
+        || x86_extended_rex2reg_mentioned_p (operands[i]))
+      return true;
+  return false;
+}
+
 /* If profitable, negate (without causing overflow) integer constant
    of mode MODE at location LOC.  Return true in this case.
*/ bool diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 6cf86b798a8..271d417146c 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -21603,7 +21603,7 @@ (define_expand "significand2" (define_insn "sse4_1_round2" [(set (match_operand:MODEFH 0 "register_operand" "=x,x,x,v,v") (unspec:MODEFH - [(match_operand:MODEFH 1 "nonimmediate_operand" "0,x,m,v,m") + [(match_operand:MODEFH 1 "nonimmediate_operand" "0,x,jm,v,m") (match_operand:SI 2 "const_0_to_15_operand")] UNSPEC_ROUND))] "TARGET_SSE4_1" @@ -21616,6 +21616,7 @@ (define_insn "sse4_1_round2" [(set_attr "type" "ssecvt") (set_attr "prefix_extra" "1,1,1,*,*") (set_attr "length_immediate" "1") + (set_attr "gpr32" "1,1,0,1,1") (set_attr "prefix" "maybe_vex,maybe_vex,maybe_vex,evex,evex") (set_attr "isa" "noavx512f,noavx512f,noavx512f,avx512f,avx512f") (set_attr "avx_partial_xmm_update" "false,false,true,false,true") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index a7858a7f8cf..4db3940e422 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -22610,11 +22610,12 @@ (define_insn "avx2_pblendd" (define_insn "sse4_1_phminposuw" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,x") - (unspec:V8HI [(match_operand:V8HI 1 "vector_operand" "YrBm,*xBm,xm")] + (unspec:V8HI [(match_operand:V8HI 1 "vector_operand" "Yrja,*xja,xjm")] UNSPEC_PHMINPOSUW))] "TARGET_SSE4_1" "%vphminposuw\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") @@ -23803,12 +23804,13 @@ (define_insn "avx_vtest" (define_insn "*_ptest" [(set (reg FLAGS_REG) (unspec [(match_operand:V_AVX 0 "register_operand" "Yr, *x, x") - (match_operand:V_AVX 1 "vector_operand" "YrBm, *xBm, xm")] + (match_operand:V_AVX 1 "vector_operand" "Yrja, *xja, xjm")] UNSPEC_PTEST))] "TARGET_SSE4_1 && ix86_match_ptest_ccmode (insn)" "%vptest\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") 
(set_attr "type" "ssecomi") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set (attr "btver2_decode") @@ -23845,12 +23847,13 @@ (define_expand "_ptest" (define_insn "ptesttf2" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:TF 0 "register_operand" "Yr, *x, x") - (match_operand:TF 1 "vector_operand" "YrBm, *xBm, xm")] + (match_operand:TF 1 "vector_operand" "Yrja, *xja, xjm")] UNSPEC_PTEST))] "TARGET_SSE4_1" "%vptest\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecomi") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -23961,13 +23964,14 @@ (define_expand "lrint2" (define_insn "_round" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 - [(match_operand:VF_128_256 1 "vector_operand" "YrBm,*xBm,xm") + [(match_operand:VF_128_256 1 "vector_operand" "Yrja,*xja,xjm") (match_operand:SI 2 "const_0_to_15_operand")] UNSPEC_ROUND))] "TARGET_SSE4_1" "%vround\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecvt") + (set_attr "gpr32" "0") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -24054,19 +24058,32 @@ (define_insn "sse4_1_round" [(set (match_operand:VF_128 0 "register_operand" "=Yr,*x,x,v") (vec_merge:VF_128 (unspec:VF_128 - [(match_operand:VF_128 2 "nonimmediate_operand" "Yrm,*xm,xm,vm") + [(match_operand:VF_128 2 "nonimmediate_operand" "Yrjm,*xjm,xjm,vm") (match_operand:SI 3 "const_0_to_15_operand")] UNSPEC_ROUND) (match_operand:VF_128 1 "register_operand" "0,0,x,v") (const_int 1)))] "TARGET_SSE4_1" - "@ - round\t{%3, %2, %0|%0, %2, %3} - round\t{%3, %2, %0|%0, %2, %3} - vround\t{%3, %2, %1, %0|%0, %1, %2, %3} - vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}" - [(set_attr "isa" "noavx,noavx,avx,avx512f") +{ + switch (which_alternative) + { + case 0: + case 1: + return "round\t{%3, %2, %0|%0, %2, %3}"; + case 2: 
+ return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + case 3: + if (x86_evex_reg_mentioned_p (operands, 3)) + return "vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + else + return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + default: + gcc_unreachable (); + } +} + [(set_attr "isa" "noavx,noavx,noavx512f,avx512f") (set_attr "type" "ssecvt") + (set_attr "gpr32" "0,0,0,1") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*,*") (set_attr "prefix_extra" "1") @@ -24078,19 +24095,32 @@ (define_insn "*sse4_1_round" (vec_merge:VFH_128 (vec_duplicate:VFH_128 (unspec: - [(match_operand: 2 "nonimmediate_operand" "Yrm,*xm,xm,vm") + [(match_operand: 2 "nonimmediate_operand" "Yrjm,*xjm,xjm,vm") (match_operand:SI 3 "const_0_to_15_operand")] UNSPEC_ROUND)) (match_operand:VFH_128 1 "register_operand" "0,0,x,v") (const_int 1)))] "TARGET_SSE4_1" - "@ - round\t{%3, %2, %0|%0, %2, %3} - round\t{%3, %2, %0|%0, %2, %3} - vround\t{%3, %2, %1, %0|%0, %1, %2, %3} - vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}" - [(set_attr "isa" "noavx,noavx,avx,avx512f") +{ + switch (which_alternative) + { + case 0: + case 1: + return "round\t{%3, %2, %0|%0, %2, %3}"; + case 2: + return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + case 3: + if (x86_evex_reg_mentioned_p (operands, 3) || mode == V8HFmode) + return "vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + else + return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + default: + gcc_unreachable (); + } +} + [(set_attr "isa" "noavx,noavx,noavx512f,avx512f") (set_attr "type" "ssecvt") + (set_attr "gpr32" "0,0,0,1") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*,*") (set_attr "prefix_extra" "1") @@ -24311,7 +24341,7 @@ (define_insn "sse4_2_pcmpestri" (unspec:SI [(match_operand:V16QI 1 "register_operand" "x,x") (match_operand:SI 2 "register_operand" "a,a") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,jm") (match_operand:SI 4 "register_operand" "d,d") (match_operand:SI 5 
"const_0_to_255_operand")] UNSPEC_PCMPESTR)) @@ -24326,6 +24356,7 @@ (define_insn "sse4_2_pcmpestri" "TARGET_SSE4_2" "%vpcmpestri\t{%5, %3, %1|%1, %3, %5}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_vex") (set_attr "length_immediate" "1") @@ -24338,7 +24369,7 @@ (define_insn "sse4_2_pcmpestrm" (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "x,x") (match_operand:SI 2 "register_operand" "a,a") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,jm") (match_operand:SI 4 "register_operand" "d,d") (match_operand:SI 5 "const_0_to_255_operand")] UNSPEC_PCMPESTR)) @@ -24353,6 +24384,7 @@ (define_insn "sse4_2_pcmpestrm" "TARGET_SSE4_2" "%vpcmpestrm\t{%5, %3, %1|%1, %3, %5}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -24365,7 +24397,7 @@ (define_insn "sse4_2_pcmpestr_cconly" (unspec:CC [(match_operand:V16QI 2 "register_operand" "x,x,x,x") (match_operand:SI 3 "register_operand" "a,a,a,a") - (match_operand:V16QI 4 "nonimmediate_operand" "x,m,x,m") + (match_operand:V16QI 4 "nonimmediate_operand" "x,jm,x,jm") (match_operand:SI 5 "register_operand" "d,d,d,d") (match_operand:SI 6 "const_0_to_255_operand")] UNSPEC_PCMPESTR)) @@ -24378,6 +24410,7 @@ (define_insn "sse4_2_pcmpestr_cconly" %vpcmpestri\t{%6, %4, %2|%2, %4, %6} %vpcmpestri\t{%6, %4, %2|%2, %4, %6}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "memory" "none,load,none,load") @@ -24389,7 +24422,7 @@ (define_insn_and_split "sse4_2_pcmpistr" [(set (match_operand:SI 0 "register_operand" "=c,c") (unspec:SI [(match_operand:V16QI 2 "register_operand" "x,x") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,jm") (match_operand:SI 4 "const_0_to_255_operand")] 
UNSPEC_PCMPISTR)) (set (match_operand:V16QI 1 "register_operand" "=Yz,Yz") @@ -24432,6 +24465,7 @@ (define_insn_and_split "sse4_2_pcmpistr" DONE; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "memory" "none,load") @@ -24441,7 +24475,7 @@ (define_insn "sse4_2_pcmpistri" [(set (match_operand:SI 0 "register_operand" "=c,c") (unspec:SI [(match_operand:V16QI 1 "register_operand" "x,x") - (match_operand:V16QI 2 "nonimmediate_operand" "x,m") + (match_operand:V16QI 2 "nonimmediate_operand" "x,jm") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_PCMPISTR)) (set (reg:CC FLAGS_REG) @@ -24453,6 +24487,7 @@ (define_insn "sse4_2_pcmpistri" "TARGET_SSE4_2" "%vpcmpistri\t{%3, %2, %1|%1, %2, %3}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -24464,7 +24499,7 @@ (define_insn "sse4_2_pcmpistrm" [(set (match_operand:V16QI 0 "register_operand" "=Yz,Yz") (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "x,x") - (match_operand:V16QI 2 "nonimmediate_operand" "x,m") + (match_operand:V16QI 2 "nonimmediate_operand" "x,jm") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_PCMPISTR)) (set (reg:CC FLAGS_REG) @@ -24476,6 +24511,7 @@ (define_insn "sse4_2_pcmpistrm" "TARGET_SSE4_2" "%vpcmpistrm\t{%3, %2, %1|%1, %2, %3}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -24487,7 +24523,7 @@ (define_insn "sse4_2_pcmpistr_cconly" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:V16QI 2 "register_operand" "x,x,x,x") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m,x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,jm,x,jm") (match_operand:SI 4 "const_0_to_255_operand")] UNSPEC_PCMPISTR)) (clobber (match_scratch:V16QI 0 "=Yz,Yz,X,X")) @@ -24499,6 +24535,7 @@ (define_insn 
"sse4_2_pcmpistr_cconly" %vpcmpistri\t{%4, %3, %2|%2, %3, %4} %vpcmpistri\t{%4, %3, %2|%2, %3, %4}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "memory" "none,load,none,load") @@ -25983,23 +26020,25 @@ (define_insn "aesdeclast" (define_insn "aesimc" [(set (match_operand:V2DI 0 "register_operand" "=x") - (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBm")] + (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xja")] UNSPEC_AESIMC))] "TARGET_AES" "%vaesimc\t{%1, %0|%0, %1}" [(set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "aeskeygenassist" [(set (match_operand:V2DI 0 "register_operand" "=x") - (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBm") + (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xja") (match_operand:SI 2 "const_0_to_255_operand")] UNSPEC_AESKEYGENASSIST))] "TARGET_AES" "%vaeskeygenassist\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c index 510213a6ca7..771bcb078e1 100644 --- a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c @@ -45,13 +45,22 @@ typedef union DTYPE a[16]; } tmp_u; -__attribute__((target("sse4.2"))) +__attribute__((target("sse4.2,aes"))) void sse_test () { register tmp_u *tdst __asm__("%r16"); register tmp_u *src1 __asm__("%r17"); register tmp_u *src2 __asm__("%r18"); - + + src1->xi[0] = _mm_minpos_epu16 (src1->xi[1]); + src1->a[2] = _mm_testc_si128 (src1->xi[3], src2->xi[4]); + src1->xf[3] = _mm_round_ss (src1->xf[5], src2->xf[6], + _MM_FROUND_CUR_DIRECTION); + src1->xf[4] = _mm_round_ps 
(src1->xf[7], _MM_FROUND_CUR_DIRECTION); + src1->xd[0] = _mm_round_sd (src1->xd[2], src2->xd[3], + _MM_FROUND_CUR_DIRECTION); + src1->xd[1] = _mm_round_pd (src1->xd[4], _MM_FROUND_CUR_DIRECTION); + src1->xi[0] = _mm_hadd_epi16 (tdst->xi[2], src2->xi[3]); src1->xi[1] = _mm_hadd_epi32 (tdst->xi[0], src2->xi[1]); tdst->xi[2] = _mm_hadds_epi16 (src1->xi[4], src2->xi[5]); @@ -77,16 +86,33 @@ void sse_test () tdst->xi[1] = _mm_sign_epi8 (src1->xi[5], src2->xi[6]); tdst->xi[2] = _mm_sign_epi16 (src1->xi[7], src2->xi[0]); tdst->xi[3] = _mm_sign_epi32 (src1->xi[1], src2->xi[2]); + + tdst->a[2] = _mm_cmpestri (src1->xi[3], 16, src2->xi[4], 16, 0x0c); + tdst->xi[4] = _mm_cmpestrm (src1->xi[3], 16, src2->xi[4], 16, 0x20); + tdst->a[5] = _mm_cmpistri (src1->xi[5], src2->xi[6], 0x30); + tdst->xi[6] = _mm_cmpistrm (src1->xi[5], src2->xi[6], 0x40); + + tdst->xi[7] = _mm_aesimc_si128 (src1->xi[7]); + tdst->xi[0] = _mm_aeskeygenassist_si128 (src1->xi[1], 0x1b); } -__attribute__((target("avx2"))) +__attribute__((target("avx2,aes"))) void vex_test () { register tmp_u *tdst __asm__("%r16"); register tmp_u *src1 __asm__("%r17"); register tmp_u *src2 __asm__("%r18"); - + + src1->xi[0] = _mm_minpos_epu16 (src1->xi[1]); + src1->a[2] = _mm256_testc_si256 (src1->yi[2], src2->yi[3]); + src1->xf[3] = _mm_round_ss (src1->xf[5], src2->xf[6], + _MM_FROUND_CUR_DIRECTION); + src1->yf[4] = _mm256_round_ps (src1->yf[2], _MM_FROUND_CUR_DIRECTION); + src1->xd[0] = _mm_round_sd (src1->xd[2], src2->xd[3], + _MM_FROUND_CUR_DIRECTION); + src1->yd[1] = _mm256_round_pd (src1->yd[3], _MM_FROUND_CUR_DIRECTION); + src1->yi[1] = _mm256_hadd_epi16 (tdst->yi[2], src2->yi[3]); src1->yi[2] = _mm256_hadd_epi32 (tdst->yi[0], src2->yi[1]); tdst->yi[3] = _mm256_hadds_epi16 (src1->yi[1], src2->yi[2]); @@ -98,7 +124,6 @@ void vex_test () src1->yi[1] = _mm256_cmpgt_epi64 (tdst->yi[3], src2->yi[0]); tdst->yf[2] = _mm256_dp_ps (src1->yf[0], src2->yf[1], 0xbf); - tdst->xd[3] = _mm_dp_pd (src1->xd[0], src2->xd[1], 0xbf); 
tdst->yi[3] = _mm256_mpsadbw_epu8 (src1->yi[1], src2->yi[1], 0xc1); @@ -112,6 +137,14 @@ void vex_test () tdst->yi[2] = _mm256_sign_epi8 (src1->yi[0], src2->yi[1]); tdst->yi[3] = _mm256_sign_epi16 (src1->yi[2], src2->yi[3]); tdst->yi[0] = _mm256_sign_epi32 (src1->yi[0], src2->yi[1]); + + tdst->a[2] = _mm_cmpestri (src1->xi[3], 16, src2->xi[4], 16, 0x0c); + tdst->xi[4] = _mm_cmpestrm (src1->xi[3], 16, src2->xi[4], 16, 0x20); + tdst->a[5] = _mm_cmpistri (src1->xi[5], src2->xi[6], 0x30); + tdst->xi[6] = _mm_cmpistrm (src1->xi[5], src2->xi[6], 0x40); + + tdst->xi[7] = _mm_aesimc_si128 (src1->xi[7]); + tdst->xi[0] = _mm_aeskeygenassist_si128 (src1->xi[1], 0x1b); } /* { dg-final { scan-assembler-not "v?pcmpeqq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ @@ -134,3 +167,15 @@ void vex_test () /* { dg-final { scan-assembler-not "v?psignb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "v?psignw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "v?psignd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phminposuw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?ptest\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundss\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundsd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpestri\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpistri\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpestrm\[ 
\\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?pcmpistrm\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?aesimc\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?aeskeygenassist\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */

From patchwork Fri Sep 22 10:56:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 143358
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling, Hongtao Liu
Subject: [PATCH 12/13] [APX_EGPR] Handle legacy insns that only support GPR16 (4/5)
Date: Fri, 22 Sep 2023 18:56:30 +0800
Message-Id: <20230922105631.2298849-13-hongyu.wang@intel.com>
In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com>
References: <20230922105631.2298849-1-hongyu.wang@intel.com>

From: Kong Lingling

APX-enabled hardware should also be AVX10-enabled, so for map2/3 insns
with an EVEX counterpart we assume auto promotion to EGPR under APX_F if
the insn uses GPR32. For the insns below, we therefore disable EGPR usage
for their SSE mnemonics, while allowing EGPR generation for their
v-prefixed mnemonics.

insn list:
1. pabsb/pabsw/pabsd
2. pextrb/pextrw/pextrd/pextrq
3. pinsrb/pinsrd/pinsrq
4. pshufb
5. extractps/insertps
6. pmaddubsw
7. pmulhrsw
8. packusdw
9. palignr
10. movntdqa
11. mpsadbw
12. pmuldq/pmulld
13. pmaxsb/pmaxsd, pminsb/pminsd
    pmaxud/pmaxuw, pminud/pminuw
14. (pmovsxbw/pmovsxbd/pmovsxbq, pmovsxwd/pmovsxwq, pmovsxdq
    pmovzxbw/pmovzxbd/pmovzxbq, pmovzxwd/pmovzxwq, pmovzxdq)
15. aesdec/aesdeclast, aesenc/aesenclast
16. pclmulqdq
17. gf2p8affineqb/gf2p8affineinvqb/gf2p8mulb

gcc/ChangeLog:

	* config/i386/i386.md (*movhi_internal): Split out non-gpr
	supported pextrw with mem constraint to avx/noavx alternatives,
	set jm and attr gpr32 0 to the noavx alternative.
	(*mov_internal): Likewise.
	* config/i386/mmx.md (mmx_pshufbv8qi3): Change "r/m/Bm" to
	"jr/jm/ja" and set_attr gpr32 0 for noavx alternative.
	(mmx_pshufbv4qi3): Likewise.
	(*mmx_pinsrd): Likewise.
	(*mmx_pinsrb): Likewise.
	(*pinsrb): Likewise.
	(mmx_pshufbv8qi3): Likewise.
	(mmx_pshufbv4qi3): Likewise.
	(@sse4_1_insertps_): Likewise.
	(*mmx_pextrw): Split alternatives and map non-EGPR
	constraints, attr_gpr32 and attr_isa to noavx mnemonics.
	(*movv2qi_internal): Likewise.
	(*pextrw): Likewise.
	(*mmx_pextrb): Likewise.
	(*mmx_pextrb_zext): Likewise.
	(*pextrb): Likewise.
	(*pextrb_zext): Likewise.
	(vec_extractv2si_1): Likewise.
	(vec_extractv2si_1_zext): Likewise.
	* config/i386/sse.md: (vi128_h_r): New mode attr for
	pinsr{bw}/pextr{bw} with reg operand.
	(*abs2): Split alternatives and %v in mnemonics, map
	non-EGPR constraints, gpr32 and isa attrs to noavx mnemonics.
	(*vec_extract): Likewise.
	(*vec_extract): Likewise for HFBF pattern.
	(*vec_extract_zext): Likewise.
	(*vec_extractv4si_1): Likewise.
	(*vec_extractv4si_zext): Likewise.
	(*vec_extractv2di_1): Likewise.
	(*vec_concatv2si_sse4_1): Likewise.
	(_pinsr): Likewise.
	(vec_concatv2di): Likewise.
	(*sse4_1_v2qiv2di2_1): Likewise.
	(ssse3_avx2>_pshufb3): Change "r/m/Bm" to "jr/jm/ja" and
	set_attr gpr32 0 for noavx alternative, split %v for avx/noavx
	alternatives if necessary.
	(*vec_concatv2sf_sse4_1): Likewise.
	(*sse4_1_extractps): Likewise.
	(vec_set_0): Likewise for VI4F_128.
	(*vec_setv4sf_sse4_1): Likewise.
	(@sse4_1_insertps): Likewise.
	(ssse3_pmaddubsw128): Likewise.
	(*_pmulhrsw3): Likewise.
	(_packusdw): Likewise.
	(_palignr): Likewise.
	(_movntdqa): Likewise.
	(_mpsadbw): Likewise.
	(*sse4_1_mulv2siv2di3): Likewise.
	(*_mul3): Likewise.
	(*sse4_1_3): Likewise.
	(*v8hi3): Likewise.
	(*v16qi3): Likewise.
	(*sse4_1_v8qiv8hi2_1): Likewise.
	(*sse4_1_zero_extendv8qiv8hi2_3): Likewise.
	(*sse4_1_zero_extendv8qiv8hi2_4): Likewise.
	(*sse4_1_v4qiv4si2_1): Likewise.
	(*sse4_1_v4hiv4si2_1): Likewise.
	(*sse4_1_zero_extendv4hiv4si2_3): Likewise.
	(*sse4_1_zero_extendv4hiv4si2_4): Likewise.
	(*sse4_1_v2hiv2di2_1): Likewise.
	(*sse4_1_v2siv2di2_1): Likewise.
	(*sse4_1_zero_extendv2siv2di2_3): Likewise.
	(*sse4_1_zero_extendv2siv2di2_4): Likewise.
	(aesdec): Likewise.
	(aesdeclast): Likewise.
	(aesenc): Likewise.
	(aesenclast): Likewise.
	(pclmulqdq): Likewise.
(vgf2p8affineinvqb_): Likewise. (vgf2p8affineqb_): Likewise. (vgf2p8mulb_): Likewise. Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/i386.md | 42 +++--- gcc/config/i386/mmx.md | 143 ++++++++++++--------- gcc/config/i386/sse.md | 274 ++++++++++++++++++++++++++-------------- 3 files changed, 289 insertions(+), 170 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 271d417146c..c09ee3989cb 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2868,9 +2868,9 @@ (define_peephole2 (define_insn "*movhi_internal" [(set (match_operand:HI 0 "nonimmediate_operand" - "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m") + "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,jm,m") (match_operand:HI 1 "general_operand" - "r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*v"))] + "r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*x,*v"))] "!(MEM_P (operands[0]) && MEM_P (operands[1])) && ix86_hardreg_mov_ok (operands[0], operands[1])" { @@ -2925,15 +2925,21 @@ (define_insn "*movhi_internal" (cond [(eq_attr "alternative" "9,10,11,12,13") (const_string "sse2") (eq_attr "alternative" "14") - (const_string "sse4") + (const_string "sse4_noavx") + (eq_attr "alternative" "15") + (const_string "avx") ] (const_string "*"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "14") + (const_string "0") + (const_string "1"))) (set (attr "type") (cond [(eq_attr "alternative" "4,5,6,7") (const_string "mskmov") (eq_attr "alternative" "8") (const_string "msklog") - (eq_attr "alternative" "13,14") + (eq_attr "alternative" "13,14,15") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "ssemov") (const_string "sselog1")) @@ -2958,7 +2964,7 @@ (define_insn "*movhi_internal" (set (attr "prefix") (cond [(eq_attr "alternative" "4,5,6,7,8") (const_string "vex") - (eq_attr "alternative" "9,10,11,12,13,14") + (eq_attr "alternative" "9,10,11,12,13,14,15") (const_string "maybe_evex") ] (const_string "orig"))) @@ -2967,7 +2973,7 @@ 
(define_insn "*movhi_internal" (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "SI")) - (eq_attr "alternative" "13,14") + (eq_attr "alternative" "13,14,15") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "TI")) @@ -4320,9 +4326,9 @@ (define_mode_attr hfbfconstf (define_insn "*mov_internal" [(set (match_operand:HFBF 0 "nonimmediate_operand" - "=?r,?r,?r,?m,v,v,?r,m,?v,v") + "=?r,?r,?r,?m,v,v,?r,jm,m,?v,v") (match_operand:HFBF 1 "general_operand" - "r ,F ,m ,r,C,v, v,v,r ,m"))] + "r ,F ,m ,r,C,v, v,v,v,r ,m"))] "!(MEM_P (operands[0]) && MEM_P (operands[1])) && (lra_in_progress || reload_completed @@ -4358,18 +4364,24 @@ (define_insn "*mov_internal" } } [(set (attr "isa") - (cond [(eq_attr "alternative" "4,5,6,8,9") + (cond [(eq_attr "alternative" "4,5,6,9,10") (const_string "sse2") (eq_attr "alternative" "7") - (const_string "sse4") + (const_string "sse4_noavx") + (eq_attr "alternative" "8") + (const_string "avx") ] (const_string "*"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "8") + (const_string "0") + (const_string "1"))) (set (attr "type") (cond [(eq_attr "alternative" "4") (const_string "sselog1") - (eq_attr "alternative" "5,6,8") + (eq_attr "alternative" "5,6,9") (const_string "ssemov") - (eq_attr "alternative" "7,9") + (eq_attr "alternative" "7,8,10") (if_then_else (match_test ("TARGET_AVX512FP16")) (const_string "ssemov") @@ -4389,19 +4401,19 @@ (define_insn "*mov_internal" ] (const_string "imov"))) (set (attr "prefix") - (cond [(eq_attr "alternative" "4,5,6,7,8,9") + (cond [(eq_attr "alternative" "4,5,6,7,8,9,10") (const_string "maybe_vex") ] (const_string "orig"))) (set (attr "mode") (cond [(eq_attr "alternative" "4") (const_string "V4SF") - (eq_attr "alternative" "6,8") + (eq_attr "alternative" "6,9") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "SI")) - (eq_attr "alternative" "7,9") + (eq_attr "alternative" "7,8,10") (if_then_else 
(match_test "TARGET_AVX512FP16") (const_string "HI") diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index ef578222945..73809585a5d 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -418,9 +418,9 @@ (define_expand "movv2qi" (define_insn "*movv2qi_internal" [(set (match_operand:V2QI 0 "nonimmediate_operand" - "=r,r,r,m ,v,v,v,m,r,v") + "=r,r,r,m ,v,v,v,jm,m,r,v") (match_operand:V2QI 1 "general_operand" - "r ,C,m,rC,C,v,m,v,v,r"))] + "r ,C,m,rC,C,v,m,x,v,v,r"))] "!(MEM_P (operands[0]) && MEM_P (operands[1]))" { switch (get_attr_type (insn)) @@ -453,20 +453,26 @@ (define_insn "*movv2qi_internal" } } [(set (attr "isa") - (cond [(eq_attr "alternative" "6,8,9") + (cond [(eq_attr "alternative" "6,9,10") (const_string "sse2") (eq_attr "alternative" "7") - (const_string "sse4") + (const_string "sse4_noavx") + (eq_attr "alternative" "8") + (const_string "avx") ] (const_string "*"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "7") + (const_string "0") + (const_string "1"))) (set (attr "type") - (cond [(eq_attr "alternative" "6,7") + (cond [(eq_attr "alternative" "6,7,8") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "ssemov") (const_string "sselog1")) (eq_attr "alternative" "4") (const_string "sselog1") - (eq_attr "alternative" "5,8,9") + (eq_attr "alternative" "5,9,10") (const_string "ssemov") (match_test "optimize_function_for_size_p (cfun)") (const_string "imov") @@ -483,16 +489,16 @@ (define_insn "*movv2qi_internal" ] (const_string "imov"))) (set (attr "prefix") - (cond [(eq_attr "alternative" "4,5,6,7,8,9") + (cond [(eq_attr "alternative" "4,5,6,7,8,9,10") (const_string "maybe_evex") ] (const_string "orig"))) (set (attr "mode") - (cond [(eq_attr "alternative" "6,7") + (cond [(eq_attr "alternative" "6,7,8") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "TI")) - (eq_attr "alternative" "8,9") + (eq_attr "alternative" "9,10") (if_then_else (match_test "TARGET_AVX512FP16") 
(const_string "HI") (const_string "SI")) @@ -526,9 +532,9 @@ (define_insn "*movv2qi_internal" ] (const_string "HI"))) (set (attr "preferred_for_speed") - (cond [(eq_attr "alternative" "8") + (cond [(eq_attr "alternative" "9") (symbol_ref "TARGET_INTER_UNIT_MOVES_FROM_VEC") - (eq_attr "alternative" "9") + (eq_attr "alternative" "10") (symbol_ref "TARGET_INTER_UNIT_MOVES_TO_VEC") ] (symbol_ref "true")))]) @@ -1167,7 +1173,7 @@ (define_expand "vcondv2sf" (define_insn "@sse4_1_insertps_" [(set (match_operand:V2FI 0 "register_operand" "=Yr,*x,v") (unspec:V2FI - [(match_operand:V2FI 2 "nonimmediate_operand" "Yrm,*xm,vm") + [(match_operand:V2FI 2 "nonimmediate_operand" "Yrjm,*xjm,vm") (match_operand:V2FI 1 "register_operand" "0,0,v") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_INSERTPS))] @@ -1193,6 +1199,7 @@ (define_insn "@sse4_1_insertps_" } } [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sselog") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -3952,7 +3959,7 @@ (define_insn "*mmx_pinsrd" [(set (match_operand:V2SI 0 "register_operand" "=x,Yv") (vec_merge:V2SI (vec_duplicate:V2SI - (match_operand:SI 2 "nonimmediate_operand" "rm,rm")) + (match_operand:SI 2 "nonimmediate_operand" "jrjm,rm")) (match_operand:V2SI 1 "register_operand" "0,Yv") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE @@ -3971,6 +3978,7 @@ (define_insn "*mmx_pinsrd" } } [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "type" "sselog") (set_attr "length_immediate" "1") @@ -4031,7 +4039,7 @@ (define_insn "*mmx_pinsrb" [(set (match_operand:V8QI 0 "register_operand" "=x,YW") (vec_merge:V8QI (vec_duplicate:V8QI - (match_operand:QI 2 "nonimmediate_operand" "rm,rm")) + (match_operand:QI 2 "nonimmediate_operand" "jrjm,rm")) (match_operand:V8QI 1 "register_operand" "0,YW") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE @@ -4057,28 
+4065,31 @@ (define_insn "*mmx_pinsrb" } [(set_attr "isa" "noavx,avx") (set_attr "type" "sselog") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) (define_insn "*mmx_pextrw" - [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,r,m") + [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,r,jm,m") (vec_select:HI - (match_operand:V4HI 1 "register_operand" "y,YW,YW") + (match_operand:V4HI 1 "register_operand" "y,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)" "@ pextrw\t{%2, %1, %k0|%k0, %1, %2} %vpextrw\t{%2, %1, %k0|%k0, %1, %2} - %vpextrw\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "*,sse2,sse4") - (set_attr "mmx_isa" "native,*,*") - (set_attr "type" "mmxcvt,sselog1,sselog1") + pextrw\t{%2, %1, %0|%0, %1, %2} + vpextrw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,sse2,sse4_noavx,avx") + (set_attr "gpr32" "1,1,0,1") + (set_attr "mmx_isa" "native,*,*,*") + (set_attr "type" "mmxcvt,sselog1,sselog1,sselog1") (set_attr "length_immediate" "1") - (set_attr "prefix" "orig,maybe_vex,maybe_vex") - (set_attr "mode" "DI,TI,TI")]) + (set_attr "prefix" "orig,maybe_vex,maybe_vex,maybe_evex") + (set_attr "mode" "DI,TI,TI,TI")]) (define_insn "*mmx_pextrw_zext" [(set (match_operand:SWI48 0 "register_operand" "=r,r") @@ -4099,29 +4110,34 @@ (define_insn "*mmx_pextrw_zext" (set_attr "mode" "DI,TI")]) (define_insn "*mmx_pextrb" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r,m") + [(set (match_operand:QI 0 "nonimmediate_operand" "=jr,jm,r,m") (vec_select:QI - (match_operand:V8QI 1 "register_operand" "YW,YW") + (match_operand:V8QI 1 "register_operand" "YW,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_7_operand")])))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE" "@ - %vpextrb\t{%2, %1, %k0|%k0, %1, %2} - %vpextrb\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "sselog1") + 
pextrb\t{%2, %1, %k0|%k0, %1, %2} + pextrb\t{%2, %1, %0|%0, %1, %2} + vpextrb\t{%2, %1, %k0|%k0, %1, %2} + vpextrb\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "noavx,noavx,avx,avx") + (set_attr "gpr32" "1,0,1,1") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "*mmx_pextrb_zext" - [(set (match_operand:SWI248 0 "register_operand" "=r") + [(set (match_operand:SWI248 0 "register_operand" "=jr,r") (zero_extend:SWI248 (vec_select:QI - (match_operand:V8QI 1 "register_operand" "YW") + (match_operand:V8QI 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to_7_operand")]))))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE" "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -4131,13 +4147,14 @@ (define_insn "mmx_pshufbv8qi3" [(set (match_operand:V8QI 0 "register_operand" "=x,Yw") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,Yw") - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm")] + (match_operand:V16QI 2 "vector_operand" "xja,Ywm")] UNSPEC_PSHUFB))] "TARGET_SSSE3 && TARGET_MMX_WITH_SSE" "@ pshufb\t{%2, %0|%0, %2} vpshufb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -4148,13 +4165,14 @@ (define_insn "mmx_pshufbv4qi3" [(set (match_operand:V4QI 0 "register_operand" "=x,Yw") (unspec:V4QI [(match_operand:V4QI 1 "register_operand" "0,Yw") - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm")] + (match_operand:V16QI 2 "vector_operand" "xja,Ywm")] UNSPEC_PSHUFB))] "TARGET_SSSE3" "@ pshufb\t{%2, %0|%0, %2} vpshufb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog1") (set_attr 
"prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -4414,29 +4432,31 @@ (define_split ;; Avoid combining registers from different units in a single alternative, ;; see comment above inline_secondary_memory_needed function in i386.cc (define_insn "*vec_extractv2si_1" - [(set (match_operand:SI 0 "nonimmediate_operand" "=y,rm,x,x,y,x,r") + [(set (match_operand:SI 0 "nonimmediate_operand" "=y,jrjm,rm,x,x,y,x,r") (vec_select:SI - (match_operand:V2SI 1 "nonimmediate_operand" " 0,x ,x,0,o,o,o") + (match_operand:V2SI 1 "nonimmediate_operand" " 0,x, x ,x,0,o,o,o") (parallel [(const_int 1)])))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ punpckhdq\t%0, %0 - %vpextrd\t{$1, %1, %0|%0, %1, 1} + pextrd\t{$1, %1, %0|%0, %1, 1} + vpextrd\t{$1, %1, %0|%0, %1, 1} %vpshufd\t{$0xe5, %1, %0|%0, %1, 0xe5} shufps\t{$0xe5, %0, %0|%0, %0, 0xe5} # # #" - [(set_attr "isa" "*,sse4,sse2,noavx,*,*,*") - (set_attr "mmx_isa" "native,*,*,*,native,*,*") - (set_attr "type" "mmxcvt,ssemov,sseshuf1,sseshuf1,mmxmov,ssemov,imov") + [(set_attr "isa" "*,sse4_noavx,avx,sse2,noavx,*,*,*") + (set_attr "gpr32" "1,0,1,1,1,1,1,1") + (set_attr "mmx_isa" "native,*,*,*,*,native,*,*") + (set_attr "type" "mmxcvt,ssemov,ssemov,sseshuf1,sseshuf1,mmxmov,ssemov,imov") (set (attr "length_immediate") - (if_then_else (eq_attr "alternative" "1,2,3") + (if_then_else (eq_attr "alternative" "1,2,3,4") (const_string "1") (const_string "*"))) - (set_attr "prefix" "orig,maybe_vex,maybe_vex,orig,orig,orig,orig") - (set_attr "mode" "DI,TI,TI,V4SF,SI,SI,SI")]) + (set_attr "prefix" "orig,orig,maybe_evex,maybe_vex,orig,orig,orig,orig") + (set_attr "mode" "DI,TI,TI,TI,V4SF,SI,SI,SI")]) (define_split [(set (match_operand:SI 0 "register_operand") @@ -4448,15 +4468,16 @@ (define_split "operands[1] = adjust_address (operands[1], SImode, 4);") (define_insn "*vec_extractv2si_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" 
"=jr,r") (zero_extend:DI (vec_select:SI - (match_operand:V2SI 1 "register_operand" "x") + (match_operand:V2SI 1 "register_operand" "x,x") (parallel [(const_int 1)]))))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_64BIT && TARGET_SSE4_1" "%vpextrd\t{$1, %1, %k0|%k0, %1, 1}" - [(set_attr "type" "sselog1") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -4606,7 +4627,7 @@ (define_insn "*pinsrb" [(set (match_operand:V4QI 0 "register_operand" "=x,YW") (vec_merge:V4QI (vec_duplicate:V4QI - (match_operand:QI 2 "nonimmediate_operand" "rm,rm")) + (match_operand:QI 2 "nonimmediate_operand" "jrjm,rm")) (match_operand:V4QI 1 "register_operand" "0,YW") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 @@ -4631,6 +4652,7 @@ (define_insn "*pinsrb" } } [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -4638,15 +4660,17 @@ (define_insn "*pinsrb" (set_attr "mode" "TI")]) (define_insn "*pextrw" - [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,m") + [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,jm,m") (vec_select:HI - (match_operand:V2HI 1 "register_operand" "YW,YW") + (match_operand:V2HI 1 "register_operand" "YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_1_operand")])))] "TARGET_SSE2" "@ %vpextrw\t{%2, %1, %k0|%k0, %1, %2} - %vpextrw\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "*,sse4") + pextrw\t{%2, %1, %0|%0, %1, %2} + vpextrw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,sse4_noavx,avx") + (set_attr "gpr32" "1,0,1") (set_attr "type" "sselog1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -4666,29 +4690,34 @@ (define_insn "*pextrw_zext" (set_attr "mode" "TI")]) (define_insn "*pextrb" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r,m") + [(set (match_operand:QI 0 "nonimmediate_operand" 
"=jr,jm,r,m") (vec_select:QI - (match_operand:V4QI 1 "register_operand" "YW,YW") + (match_operand:V4QI 1 "register_operand" "YW,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] "TARGET_SSE4_1" "@ - %vpextrb\t{%2, %1, %k0|%k0, %1, %2} - %vpextrb\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "sselog1") + pextrb\t{%2, %1, %k0|%k0, %1, %2} + pextrb\t{%2, %1, %0|%0, %1, %2} + vpextrb\t{%2, %1, %k0|%k0, %1, %2} + vpextrb\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "noavx,noavx,avx,avx") + (set_attr "gpr32" "1,0,1,1") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "*pextrb_zext" - [(set (match_operand:SWI248 0 "register_operand" "=r") + [(set (match_operand:SWI248 0 "register_operand" "=jr,r") (zero_extend:SWI248 (vec_select:QI - (match_operand:V4QI 1 "register_operand" "YW") + (match_operand:V4QI 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to_3_operand")]))))] "TARGET_SSE4_1" "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 4db3940e422..d3b59c4866b 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -10836,7 +10836,7 @@ (define_insn "*vec_concatv2sf_sse4_1" (match_operand:SF 1 "nonimmediate_operand" " 0, 0,Yv, 0,0, v,m, 0 , m") (match_operand:SF 2 "nonimm_or_0_operand" - " Yr,*x,Yv, m,m, m,C,*ym, C")))] + " Yr,*x,Yv, jm,jm, m,C,*ym, C")))] "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ unpcklps\t{%2, %0|%0, %2} @@ -10868,6 +10868,10 @@ (define_insn "*vec_concatv2sf_sse4_1" (if_then_else (eq_attr "alternative" "7,8") (const_string "native") (const_string "*"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "3,4") + 
(const_string "0") + (const_string "1"))) (set (attr "prefix_data16") (if_then_else (eq_attr "alternative" "3,4") (const_string "1") @@ -10959,7 +10963,7 @@ (define_insn "vec_set_0" (vec_merge:VI4F_128 (vec_duplicate:VI4F_128 (match_operand: 2 "general_operand" - " Yr,*x,v,m,r ,m,x,v,?rm,?rm,?rm,!x,?re,!*fF")) + " Yr,*x,v,m,r ,m,x,v,?jrjm,?jrjm,?rm,!x,?re,!*fF")) (match_operand:VI4F_128 1 "nonimm_or_0_operand" " C , C,C,C,C ,C,0,v,0 ,0 ,x ,0 ,0 ,0") (const_int 1)))] @@ -10999,6 +11003,10 @@ (define_insn "vec_set_0" (const_string "fmov") ] (const_string "ssemov"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "8,9") + (const_string "0") + (const_string "1"))) (set (attr "prefix_extra") (if_then_else (eq_attr "alternative" "8,9,10") (const_string "1") @@ -11169,7 +11177,7 @@ (define_insn "*vec_setv4sf_sse4_1" [(set (match_operand:V4SF 0 "register_operand" "=Yr,*x,v") (vec_merge:V4SF (vec_duplicate:V4SF - (match_operand:SF 2 "nonimmediate_operand" "Yrm,*xm,vm")) + (match_operand:SF 2 "nonimmediate_operand" "Yrjm,*xjm,vm")) (match_operand:V4SF 1 "register_operand" "0,0,v") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 @@ -11190,6 +11198,7 @@ (define_insn "*vec_setv4sf_sse4_1" } [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sselog") + (set_attr "gpr32" "0,0,1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -11264,7 +11273,7 @@ (define_insn_and_split "*vec_setv2di_0_zero_extendsi_1" (define_insn "@sse4_1_insertps_" [(set (match_operand:VI4F_128 0 "register_operand" "=Yr,*x,v") (unspec:VI4F_128 - [(match_operand:VI4F_128 2 "nonimmediate_operand" "Yrm,*xm,vm") + [(match_operand:VI4F_128 2 "nonimmediate_operand" "Yrjm,*xjm,vm") (match_operand:VI4F_128 1 "register_operand" "0,0,v") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_INSERTPS))] @@ -11290,6 +11299,7 @@ (define_insn "@sse4_1_insertps_" } } [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr 
"type" "sselog") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -11367,7 +11377,7 @@ (define_insn_and_split "*vec_extractv4sf_0" "operands[1] = gen_lowpart (SFmode, operands[1]);") (define_insn_and_split "*sse4_1_extractps" - [(set (match_operand:SF 0 "nonimmediate_operand" "=rm,rm,rm,Yv,Yv") + [(set (match_operand:SF 0 "nonimmediate_operand" "=jrjm,jrjm,rm,Yv,Yv") (vec_select:SF (match_operand:V4SF 1 "register_operand" "Yr,*x,v,0,v") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] @@ -11401,6 +11411,7 @@ (define_insn_and_split "*sse4_1_extractps" DONE; } [(set_attr "isa" "noavx,noavx,avx,noavx,avx") + (set_attr "gpr32" "0,0,1,1,1") (set_attr "type" "sselog,sselog,sselog,*,*") (set_attr "prefix_data16" "1,1,1,*,*") (set_attr "prefix_extra" "1,1,1,*,*") @@ -12265,9 +12276,9 @@ (define_insn_and_split "*vec_extract_0" "operands[1] = gen_lowpart (mode, operands[1]);") (define_insn "*vec_extract" - [(set (match_operand:HFBF 0 "register_sse4nonimm_operand" "=?r,m,x,v") + [(set (match_operand:HFBF 0 "register_sse4nonimm_operand" "=?r,jm,m,x,v") (vec_select:HFBF - (match_operand: 1 "register_operand" "v,v,0,v") + (match_operand: 1 "register_operand" "v,x,v,0,v") (parallel [(match_operand:SI 2 "const_0_to_7_operand")])))] "TARGET_SSE2" @@ -12277,12 +12288,14 @@ (define_insn "*vec_extract" case 0: return "%vpextrw\t{%2, %1, %k0|%k0, %1, %2}"; case 1: - return "%vpextrw\t{%2, %1, %0|%0, %1, %2}"; - + return "pextrw\t{%2, %1, %0|%0, %1, %2}"; case 2: + return "vpextrw\t{%2, %1, %0|%0, %1, %2}"; + + case 3: operands[2] = GEN_INT (INTVAL (operands[2]) * 2); return "psrldq\t{%2, %0|%0, %2}"; - case 3: + case 4: operands[2] = GEN_INT (INTVAL (operands[2]) * 2); return "vpsrldq\t{%2, %1, %0|%0, %1, %2}"; @@ -12290,8 +12303,9 @@ (define_insn "*vec_extract" gcc_unreachable (); } } - [(set_attr "isa" "*,sse4,noavx,avx") - (set_attr "type" "sselog1,sselog1,sseishft1,sseishft1") + [(set_attr "isa" "*,sse4_noavx,avx,noavx,avx") + (set_attr "gpr32" 
"1,0,1,1,1") + (set_attr "type" "sselog1,sselog1,sselog1,sseishft1,sseishft1") (set_attr "prefix" "maybe_evex") (set_attr "mode" "TI")]) @@ -15653,7 +15667,7 @@ (define_insn "*sse4_1_mulv2siv2di3" (parallel [(const_int 0) (const_int 2)]))) (sign_extend:V2DI (vec_select:V2SI - (match_operand:V4SI 2 "vector_operand" "YrBm,*xBm,vm") + (match_operand:V4SI 2 "vector_operand" "Yrja,*xja,vm") (parallel [(const_int 0) (const_int 2)])))))] "TARGET_SSE4_1 && && !(MEM_P (operands[1]) && MEM_P (operands[2]))" @@ -15662,6 +15676,7 @@ (define_insn "*sse4_1_mulv2siv2di3" pmuldq\t{%2, %0|%0, %2} vpmuldq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sseimul") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") @@ -15899,7 +15914,7 @@ (define_insn "*_mul3" [(set (match_operand:VI4_AVX512F 0 "register_operand" "=Yr,*x,v") (mult:VI4_AVX512F (match_operand:VI4_AVX512F 1 "bcst_vector_operand" "%0,0,v") - (match_operand:VI4_AVX512F 2 "bcst_vector_operand" "YrBm,*xBm,vmBr")))] + (match_operand:VI4_AVX512F 2 "bcst_vector_operand" "Yrja,*xja,vmBr")))] "TARGET_SSE4_1 && ix86_binary_operator_ok (MULT, mode, operands) && " "@ @@ -15907,6 +15922,7 @@ (define_insn "*_mul3" pmulld\t{%2, %0|%0, %2} vpmulld\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sseimul") (set_attr "prefix_extra" "1") (set_attr "prefix" "") @@ -16711,7 +16727,7 @@ (define_insn "*sse4_1_3" [(set (match_operand:VI14_128 0 "register_operand" "=Yr,*x,") (smaxmin:VI14_128 (match_operand:VI14_128 1 "vector_operand" "%0,0,") - (match_operand:VI14_128 2 "vector_operand" "YrBm,*xBm,m")))] + (match_operand:VI14_128 2 "vector_operand" "Yrja,*xja,m")))] "TARGET_SSE4_1 && && !(MEM_P (operands[1]) && MEM_P (operands[2]))" @@ -16722,6 +16738,7 @@ (define_insn "*sse4_1_3" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sseiadd") (set_attr "prefix_extra" "1") + (set_attr "gpr32" "0,0,1") (set_attr 
"prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -16729,13 +16746,14 @@ (define_insn "*v8hi3" [(set (match_operand:V8HI 0 "register_operand" "=x,Yw") (smaxmin:V8HI (match_operand:V8HI 1 "vector_operand" "%0,Yw") - (match_operand:V8HI 2 "vector_operand" "xBm,Ywm")))] + (match_operand:V8HI 2 "vector_operand" "xja,Ywm")))] "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pw\t{%2, %0|%0, %2} vpw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0,1") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) @@ -16803,6 +16821,7 @@ (define_insn "*sse4_1_3" vp\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0,0,1") (set_attr "prefix_extra" "1,1,*") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -16811,12 +16830,13 @@ (define_insn "*v16qi3" [(set (match_operand:V16QI 0 "register_operand" "=x,Yw") (umaxmin:V16QI (match_operand:V16QI 1 "vector_operand" "%0,Yw") - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm")))] + (match_operand:V16QI 2 "vector_operand" "xja,Ywm")))] "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pb\t{%2, %0|%0, %2} vpb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseiadd") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) @@ -18808,7 +18828,7 @@ (define_insn "_pinsr" [(set (match_operand:PINSR_MODE 0 "register_operand" "=x,x,x,x,v,v,&x") (vec_merge:PINSR_MODE (vec_duplicate:PINSR_MODE - (match_operand: 2 "nonimmediate_operand" "r,m,r,m,r,m,x")) + (match_operand: 2 "nonimmediate_operand" "jr,jm,r,m,r,m,x")) (match_operand:PINSR_MODE 1 "register_operand" "0,0,x,x,v,v,x") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE2 @@ -18845,6 +18865,7 @@ (define_insn "_pinsr" } [(set_attr "isa" "noavx,noavx,avx,avx,,,avx2") (set_attr "type" "sselog") + (set_attr "gpr32" "0,0,1,1,1,1,1") (set (attr "prefix_rex") (if_then_else 
(and (not (match_test "TARGET_AVX")) @@ -20005,17 +20026,23 @@ (define_insn_and_split "*vec_extract_0_mem" operands[4] = gen_lowpart (mode, operands[2]); }) +(define_mode_attr vi128_jr_r + [(V16QI "jr") (V8HI "r")]) + (define_insn "*vec_extract" - [(set (match_operand: 0 "register_sse4nonimm_operand" "=r,m") + [(set (match_operand: 0 "register_sse4nonimm_operand" "=,r,jm,m") (vec_select: - (match_operand:PEXTR_MODE12 1 "register_operand" "YW,YW") + (match_operand:PEXTR_MODE12 1 "register_operand" "YW,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to__operand")])))] "TARGET_SSE2" "@ - %vpextr\t{%2, %1, %k0|%k0, %1, %2} - %vpextr\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "*,sse4") + pextr\t{%2, %1, %k0|%k0, %1, %2} + vpextr\t{%2, %1, %k0|%k0, %1, %2} + pextr\t{%2, %1, %0|%0, %1, %2} + vpextr\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "sse2_noavx,avx,sse4_noavx,avx") + (set_attr "gpr32" "1,1,0,1") (set_attr "type" "sselog1") (set (attr "prefix_extra") (if_then_else @@ -20023,20 +20050,21 @@ (define_insn "*vec_extract" (const_string "*") (const_string "1"))) (set_attr "length_immediate" "1") - (set_attr "prefix" "maybe_vex,maybe_vex") + (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "*vec_extract_zext" - [(set (match_operand:SWI48 0 "register_operand" "=r") + [(set (match_operand:SWI48 0 "register_operand" "=,r") (zero_extend:SWI48 (vec_select: - (match_operand:PEXTR_MODE12 1 "register_operand" "YW") + (match_operand:PEXTR_MODE12 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to__operand")]))))] "TARGET_SSE2" "%vpextr\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set (attr "prefix_extra") (if_then_else (eq (const_string "mode") (const_string "V8HImode")) @@ -20047,15 +20075,16 @@ (define_insn "*vec_extract_zext" (set_attr "mode" "TI")]) (define_insn "*vec_extractv16qi_zext" - [(set (match_operand:HI 0 "register_operand" "=r") + [(set 
(match_operand:HI 0 "register_operand" "=jr,r") (zero_extend:HI (vec_select:QI - (match_operand:V16QI 1 "register_operand" "YW") + (match_operand:V16QI 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to_15_operand")]))))] "TARGET_SSE4_1" "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -20161,24 +20190,26 @@ (define_split "operands[1] = gen_lowpart (SImode, operands[1]);") (define_insn "*vec_extractv4si" - [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw") + [(set (match_operand:SI 0 "nonimmediate_operand" "=jrjm,rm,rm,Yr,*x,Yw") (vec_select:SI - (match_operand:V4SI 1 "register_operand" " x, v, 0, 0,Yw") + (match_operand:V4SI 1 "register_operand" "x, x, v, 0, 0, Yw") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] "TARGET_SSE4_1" { switch (which_alternative) { case 0: + return "pextrd\t{%2, %1, %0|%0, %1, %2}"; case 1: - return "%vpextrd\t{%2, %1, %0|%0, %1, %2}"; - case 2: + return "vpextrd\t{%2, %1, %0|%0, %1, %2}"; + case 3: + case 4: operands[2] = GEN_INT (INTVAL (operands[2]) * 4); return "psrldq\t{%2, %0|%0, %2}"; - case 4: + case 5: operands[2] = GEN_INT (INTVAL (operands[2]) * 4); return "vpsrldq\t{%2, %1, %0|%0, %1, %2}"; @@ -20186,25 +20217,26 @@ (define_insn "*vec_extractv4si" gcc_unreachable (); } } - [(set_attr "isa" "*,avx512dq,noavx,noavx,avx") - (set_attr "type" "sselog1,sselog1,sseishft1,sseishft1,sseishft1") + [(set_attr "isa" "noavx,avx,avx512dq,noavx,noavx,avx") + (set_attr "type" "sselog1,sselog1,sselog1,sseishft1,sseishft1,sseishft1") + (set_attr "gpr32" "0,1,1,1,1,1") (set (attr "prefix_extra") (if_then_else (eq_attr "alternative" "0,1") (const_string "1") (const_string "*"))) (set_attr "length_immediate" "1") - (set_attr "prefix" "maybe_vex,evex,orig,orig,maybe_vex") + (set_attr "prefix" 
"orig,vex,evex,orig,orig,maybe_vex") (set_attr "mode" "TI")]) (define_insn "*vec_extractv4si_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=jr,r,r") (zero_extend:DI (vec_select:SI - (match_operand:V4SI 1 "register_operand" "x,v") + (match_operand:V4SI 1 "register_operand" "x,x,v") (parallel [(match_operand:SI 2 "const_0_to_3_operand")]))))] "TARGET_64BIT && TARGET_SSE4_1" "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "isa" "*,avx512dq") + [(set_attr "isa" "noavx,avx,avx512dq") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -20234,13 +20266,14 @@ (define_insn_and_split "*vec_extractv4si_zext_mem" }) (define_insn "*vec_extractv2di_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=rm,rm,m,x,x,Yv,x,v,r") + [(set (match_operand:DI 0 "nonimmediate_operand" "=jrjm,rm,rm,m,x,x,Yv,x,v,r") (vec_select:DI - (match_operand:V2DI 1 "nonimmediate_operand" "x ,v ,v,0,x, v,x,o,o") + (match_operand:V2DI 1 "nonimmediate_operand" "x, x ,v ,v,0,x, v,x,o,o") (parallel [(const_int 1)])))] "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ - %vpextrq\t{$1, %1, %0|%0, %1, 1} + pextrq\t{$1, %1, %0|%0, %1, 1} + vpextrq\t{$1, %1, %0|%0, %1, 1} vpextrq\t{$1, %1, %0|%0, %1, 1} %vmovhps\t{%1, %0|%0, %1} psrldq\t{$8, %0|%0, 8} @@ -20251,44 +20284,47 @@ (define_insn "*vec_extractv2di_1" #" [(set (attr "isa") (cond [(eq_attr "alternative" "0") - (const_string "x64_sse4") + (const_string "x64_sse4_noavx") (eq_attr "alternative" "1") + (const_string "x64_avx") + (eq_attr "alternative" "2") (const_string "x64_avx512dq") - (eq_attr "alternative" "3") - (const_string "sse2_noavx") (eq_attr "alternative" "4") - (const_string "avx") + (const_string "sse2_noavx") (eq_attr "alternative" "5") - (const_string "avx512bw") + (const_string "avx") (eq_attr "alternative" "6") - (const_string "noavx") + (const_string "avx512bw") (eq_attr "alternative" "8") + (const_string "noavx") 
+ (eq_attr "alternative" "9") (const_string "x64") ] (const_string "*"))) (set (attr "type") - (cond [(eq_attr "alternative" "2,6,7") + (cond [(eq_attr "alternative" "3,7,8") (const_string "ssemov") - (eq_attr "alternative" "3,4,5") + (eq_attr "alternative" "4,5,6") (const_string "sseishft1") - (eq_attr "alternative" "8") + (eq_attr "alternative" "9") (const_string "imov") ] (const_string "sselog1"))) + (set_attr "gpr32" "0,1,1,1,1,1,1,1,1,1") (set (attr "length_immediate") - (if_then_else (eq_attr "alternative" "0,1,3,4,5") + (if_then_else (eq_attr "alternative" "0,1,2,4,5,6") (const_string "1") (const_string "*"))) (set (attr "prefix_rex") - (if_then_else (eq_attr "alternative" "0,1") + (if_then_else (eq_attr "alternative" "0") (const_string "1") (const_string "*"))) (set (attr "prefix_extra") - (if_then_else (eq_attr "alternative" "0,1") + (if_then_else (eq_attr "alternative" "0") (const_string "1") (const_string "*"))) - (set_attr "prefix" "maybe_vex,evex,maybe_vex,orig,vex,evex,orig,*,*") - (set_attr "mode" "TI,TI,V2SF,TI,TI,TI,V4SF,DI,DI")]) + (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig,vex,evex,orig,*,*") + (set_attr "mode" "TI,TI,TI,V2SF,TI,TI,TI,V4SF,DI,DI")]) (define_split [(set (match_operand: 0 "register_operand") @@ -20406,7 +20442,7 @@ (define_insn "*vec_concatv2si_sse4_1" (match_operand:SI 1 "nonimmediate_operand" " 0, 0, x,Yv, 0, 0,Yv,rm, 0,rm") (match_operand:SI 2 "nonimm_or_0_operand" - " rm,rm,rm,rm,Yr,*x,Yv, C,*ym, C")))] + "jrjm,jrjm,rm,rm,Yr,*x,Yv, C,*ym, C")))] "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pinsrd\t{$1, %2, %0|%0, %2, 1} @@ -20433,6 +20469,10 @@ (define_insn "*vec_concatv2si_sse4_1" (const_string "mmxmov") ] (const_string "sselog"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "0,1") + (const_string "0") + (const_string "1"))) (set (attr "prefix_extra") (if_then_else (eq_attr "alternative" "0,1,2,3") (const_string "1") @@ -20557,7 +20597,7 @@ (define_insn "vec_concatv2di" 
(match_operand:DI 1 "register_operand" " 0, 0,x ,Yv,0,Yv,0,0,v") (match_operand:DI 2 "nonimmediate_operand" - " rm,rm,rm,rm,x,Yv,x,m,m")))] + " jrm,jrm,rm,rm,x,Yv,x,m,m")))] "TARGET_SSE" "@ pinsrq\t{$1, %2, %0|%0, %2, 1} @@ -20587,6 +20627,10 @@ (define_insn "vec_concatv2di" (eq_attr "alternative" "0,1,2,3,4,5") (const_string "sselog") (const_string "ssemov"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "0,1") + (const_string "0") + (const_string "1"))) (set (attr "prefix_rex") (if_then_else (eq_attr "alternative" "0,1,2,3") (const_string "1") @@ -21519,7 +21563,7 @@ (define_insn "ssse3_pmaddubsw128" (const_int 12) (const_int 14)]))) (sign_extend:V8HI (vec_select:V8QI - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm") + (match_operand:V16QI 2 "vector_operand" "xja,Ywm") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 8) (const_int 10) @@ -21542,6 +21586,7 @@ (define_insn "ssse3_pmaddubsw128" pmaddubsw\t{%2, %0|%0, %2} vpmaddubsw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseiadd") (set_attr "atom_unit" "simul") (set_attr "prefix_extra" "1") @@ -21660,7 +21705,7 @@ (define_insn "*_pmulhrsw3" (sign_extend: (match_operand:VI2_AVX2_AVX512BW 1 "vector_operand" "%0,")) (sign_extend: - (match_operand:VI2_AVX2_AVX512BW 2 "vector_operand" "xBm,m"))) + (match_operand:VI2_AVX2_AVX512BW 2 "vector_operand" "xja,m"))) (const_int 14)) (match_operand:VI2_AVX2_AVX512BW 3 "const1_operand")) (const_int 1))))] @@ -21670,6 +21715,7 @@ (define_insn "*_pmulhrsw3" pmulhrsw\t{%2, %0|%0, %2} vpmulhrsw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseimul") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -21786,13 +21832,14 @@ (define_insn "_pshufb3" [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,") (unspec:VI1_AVX512 [(match_operand:VI1_AVX512 1 "register_operand" "0,") - (match_operand:VI1_AVX512 
2 "vector_operand" "xBm,m")] + (match_operand:VI1_AVX512 2 "vector_operand" "xja,m")] UNSPEC_PSHUFB))] "TARGET_SSSE3 && && " "@ pshufb\t{%2, %0|%0, %2} vpshufb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -21908,7 +21955,7 @@ (define_insn "_palignr" [(set (match_operand:VIMAX_AVX2_AVX512BW 0 "register_operand" "=x,") (unspec:VIMAX_AVX2_AVX512BW [(match_operand:VIMAX_AVX2_AVX512BW 1 "register_operand" "0,") - (match_operand:VIMAX_AVX2_AVX512BW 2 "vector_operand" "xBm,m") + (match_operand:VIMAX_AVX2_AVX512BW 2 "vector_operand" "xja,m") (match_operand:SI 3 "const_0_to_255_mul_8_operand")] UNSPEC_PALIGNR))] "TARGET_SSSE3" @@ -21926,6 +21973,7 @@ (define_insn "_palignr" } } [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseishft") (set_attr "atom_unit" "sishuf") (set_attr "prefix_extra" "1") @@ -22000,6 +22048,7 @@ (define_insn_and_split "ssse3_palignrdi" } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "type" "sseishft") + (set_attr "gpr32" "0,0,1") (set_attr "atom_unit" "sishuf") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -22015,12 +22064,14 @@ (define_mode_iterator VI1248_AVX512VL_AVX512BW (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")]) (define_insn "*abs2" - [(set (match_operand:VI1248_AVX512VL_AVX512BW 0 "register_operand" "=") + [(set (match_operand:VI1248_AVX512VL_AVX512BW 0 "register_operand" "=x,") (abs:VI1248_AVX512VL_AVX512BW - (match_operand:VI1248_AVX512VL_AVX512BW 1 "vector_operand" "Bm")))] + (match_operand:VI1248_AVX512VL_AVX512BW 1 "vector_operand" "xja,Bm")))] "TARGET_SSSE3" "%vpabs\t{%1, %0|%0, %1}" - [(set_attr "type" "sselog1") + [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) @@ -22358,11 +22409,12 @@ 
(define_mode_attr vi8_sse4_1_avx2_avx512 (define_insn "_movntdqa" [(set (match_operand:VI8_AVX2_AVX512F 0 "register_operand" "=Yr,*x,v") - (unspec:VI8_AVX2_AVX512F [(match_operand:VI8_AVX2_AVX512F 1 "memory_operand" "m,m,m")] + (unspec:VI8_AVX2_AVX512F [(match_operand:VI8_AVX2_AVX512F 1 "memory_operand" "jm,jm,m")] UNSPEC_MOVNTDQA))] "TARGET_SSE4_1" "%vmovntdqa\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -22381,6 +22433,7 @@ (define_insn "_mpsadbw" mpsadbw\t{%3, %2, %0|%0, %2, %3} vmpsadbw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sselog1") (set_attr "gpr32" "0") (set_attr "length_immediate" "1") @@ -22394,7 +22447,7 @@ (define_insn "_packusdw" [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand" "=Yr,*x,") (unspec:VI2_AVX2_AVX512BW [(match_operand: 1 "register_operand" "0,0,") - (match_operand: 2 "vector_operand" "YrBm,*xBm,m")] + (match_operand: 2 "vector_operand" "Yrja,*xja,m")] UNSPEC_US_TRUNCATE))] "TARGET_SSE4_1 && && " "@ @@ -22402,6 +22455,7 @@ (define_insn "_packusdw" packusdw\t{%2, %0|%0, %2} vpackusdw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,") @@ -22748,10 +22802,14 @@ (define_insn "sse4_1_v8qiv8hi2" (define_insn "*sse4_1_v8qiv8hi2_1" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,Yw") (any_extend:V8HI - (match_operand:V8QI 1 "memory_operand" "m,m,m")))] + (match_operand:V8QI 1 "memory_operand" "jm,jm,m")))] "TARGET_SSE4_1 && && " - "%vpmovbw\t{%1, %0|%0, %1}" + "@ + pmovbw\t{%1, %0|%0, %1} + pmovbw\t{%1, %0|%0, %1} + vpmovbw\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" 
"orig,orig,maybe_evex") @@ -22781,7 +22839,7 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_3" [(set (match_operand:V16QI 0 "register_operand" "=Yr,*x,Yw") (vec_select:V16QI (vec_concat:V32QI - (match_operand:V16QI 1 "vector_operand" "YrBm,*xBm,Ywm") + (match_operand:V16QI 1 "vector_operand" "Yrja,*xja,Ywm") (match_operand:V16QI 2 "const0_operand")) (match_parallel 3 "pmovzx_parallel" [(match_operand 4 "const_int_operand")])))] @@ -22806,7 +22864,8 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_3" DONE; } } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_4" [(set (match_operand:V16QI 0 "register_operand" "=Yr,*x,Yw") @@ -22814,7 +22873,7 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_4" (vec_concat:V32QI (subreg:V16QI (vec_concat:VI248_128 - (match_operand: 1 "vector_operand" "YrBm,*xBm,Ywm") + (match_operand: 1 "vector_operand" "Yrja,*xja,Ywm") (match_operand: 2 "const0_operand")) 0) (match_operand:V16QI 3 "const0_operand")) (match_parallel 4 "pmovzx_parallel" @@ -22841,7 +22900,8 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_4" } operands[1] = lowpart_subreg (V16QImode, operands[1], mode); } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_expand "v8qiv8hi2" [(set (match_operand:V8HI 0 "register_operand") @@ -22960,10 +23020,11 @@ (define_insn "sse4_1_v4qiv4si2" (define_insn "*sse4_1_v4qiv4si2_1" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (any_extend:V4SI - (match_operand:V4QI 1 "memory_operand" "m,m,m")))] + (match_operand:V4QI 1 "memory_operand" "jm,jm,m")))] "TARGET_SSE4_1 && " "%vpmovbd\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23132,10 +23193,11 @@ (define_insn "sse4_1_v4hiv4si2" 
(define_insn "*sse4_1_v4hiv4si2_1" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (any_extend:V4SI - (match_operand:V4HI 1 "memory_operand" "m,m,m")))] + (match_operand:V4HI 1 "memory_operand" "jm,jm,m")))] "TARGET_SSE4_1 && " "%vpmovwd\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23184,7 +23246,7 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_3" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") (vec_select:V8HI (vec_concat:V16HI - (match_operand:V8HI 1 "vector_operand" "YrBm,*xBm,vm") + (match_operand:V8HI 1 "vector_operand" "Yrja,*xja,vm") (match_operand:V8HI 2 "const0_operand")) (match_parallel 3 "pmovzx_parallel" [(match_operand 4 "const_int_operand")])))] @@ -23207,7 +23269,8 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_3" DONE; } } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_4" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") @@ -23215,7 +23278,7 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_4" (vec_concat:V16HI (subreg:V8HI (vec_concat:VI148_128 - (match_operand: 1 "vector_operand" "YrBm,*xBm,vm") + (match_operand: 1 "vector_operand" "Yrja,*xja,vm") (match_operand: 2 "const0_operand")) 0) (match_operand:V8HI 3 "const0_operand")) (match_parallel 4 "pmovzx_parallel" @@ -23240,7 +23303,8 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_4" } operands[1] = lowpart_subreg (V8HImode, operands[1], mode); } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn "avx512f_v8qiv8di2" [(set (match_operand:V8DI 0 "register_operand" "=v") @@ -23378,12 +23442,14 @@ (define_insn "sse4_1_v2qiv2di2" (set_attr "mode" "TI")]) (define_insn "*sse4_1_v2qiv2di2_1" - [(set (match_operand:V2DI 0 
"register_operand" "=v") + [(set (match_operand:V2DI 0 "register_operand" "=x,v") (any_extend:V2DI - (match_operand:V2QI 1 "memory_operand" "m")))] + (match_operand:V2QI 1 "memory_operand" "jm,m")))] "TARGET_SSE4_1 && " "%vpmovbq\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") + (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_evex") (set_attr "mode" "TI")]) @@ -23517,10 +23583,11 @@ (define_insn "sse4_1_v2hiv2di2" (define_insn "*sse4_1_v2hiv2di2_1" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,v") (any_extend:V2DI - (match_operand:V2HI 1 "memory_operand" "m,m,m")))] + (match_operand:V2HI 1 "memory_operand" "jm,jm,m")))] "TARGET_SSE4_1 && " "%vpmovwq\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23682,10 +23749,11 @@ (define_insn "sse4_1_v2siv2di2" (define_insn "*sse4_1_v2siv2di2_1" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,v") (any_extend:V2DI - (match_operand:V2SI 1 "memory_operand" "m,m,m")))] + (match_operand:V2SI 1 "memory_operand" "jm,jm,m")))] "TARGET_SSE4_1 && " "%vpmovdq\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23712,7 +23780,7 @@ (define_insn_and_split "*sse4_1_zero_extendv2siv2di2_3" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (vec_select:V4SI (vec_concat:V8SI - (match_operand:V4SI 1 "vector_operand" "YrBm,*xBm,vm") + (match_operand:V4SI 1 "vector_operand" "Yrja,*xja,vm") (match_operand:V4SI 2 "const0_operand")) (match_parallel 3 "pmovzx_parallel" [(match_operand 4 "const_int_operand")])))] @@ -23733,14 +23801,15 @@ (define_insn_and_split "*sse4_1_zero_extendv2siv2di2_3" DONE; } } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" 
"noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn_and_split "*sse4_1_zero_extendv2siv2di2_4" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (vec_select:V4SI (vec_concat:V8SI (vec_concat:V4SI - (match_operand:V2SI 1 "vector_operand" "YrBm, *xBm, vm") + (match_operand:V2SI 1 "vector_operand" "Yrja, *xja, vm") (match_operand:V2SI 2 "const0_operand")) (match_operand:V4SI 3 "const0_operand")) (match_parallel 4 "pmovzx_parallel" @@ -23762,7 +23831,8 @@ (define_insn_and_split "*sse4_1_zero_extendv2siv2di2_4" } operands[1] = lowpart_subreg (V4SImode, operands[1], V2SImode); } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_expand "v2siv2di2" [(set (match_operand:V2DI 0 "register_operand") @@ -25953,7 +26023,7 @@ (define_insn "xop_vpermil23" (define_insn "aesenc" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] UNSPEC_AESENC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -25962,6 +26032,7 @@ (define_insn "aesenc" vaesenc\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") (set_attr "type" "sselog1") + (set_attr "gpr32" "0,1,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex,evex") (set_attr "btver2_decode" "double,double,double") @@ -25970,7 +26041,7 @@ (define_insn "aesenc" (define_insn "aesenclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] UNSPEC_AESENCLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -25979,6 +26050,7 @@ (define_insn "aesenclast" vaesenclast\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") (set_attr "type" "sselog1") + (set_attr 
"gpr32" "0,1,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex,evex") (set_attr "btver2_decode" "double,double,double") @@ -25987,7 +26059,7 @@ (define_insn "aesenclast" (define_insn "aesdec" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] UNSPEC_AESDEC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -25996,6 +26068,7 @@ (define_insn "aesdec" vaesdec\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") (set_attr "type" "sselog1") + (set_attr "gpr32" "0,1,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex,evex") (set_attr "btver2_decode" "double,double,double") @@ -26004,7 +26077,7 @@ (define_insn "aesdec" (define_insn "aesdeclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] UNSPEC_AESDECLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -26012,6 +26085,7 @@ (define_insn "aesdeclast" vaesdeclast\t{%2, %1, %0|%0, %1, %2} vaesdeclast\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") + (set_attr "gpr32" "0,1,1") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex,evex") @@ -26047,7 +26121,7 @@ (define_insn "aeskeygenassist" (define_insn "pclmulqdq" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm") + (match_operand:V2DI 2 "vector_operand" "xja,xm,vm") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_PCLMUL))] "TARGET_PCLMUL" @@ -26057,6 +26131,7 @@ (define_insn "pclmulqdq" vpclmulqdq\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,avx,vpclmulqdqvl") (set_attr 
"type" "sselog1") + (set_attr "gpr32" "0,1,1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex,evex") @@ -29403,7 +29478,7 @@ (define_insn "vgf2p8affineinvqb_" [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v") (unspec:VI1_AVX512F [(match_operand:VI1_AVX512F 1 "register_operand" "0,v") - (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm") + (match_operand:VI1_AVX512F 2 "vector_operand" "xja,vm") (match_operand 3 "const_0_to_255_operand")] UNSPEC_GF2P8AFFINEINV))] "TARGET_GFNI" @@ -29411,6 +29486,7 @@ (define_insn "vgf2p8affineinvqb_" gf2p8affineinvqb\t{%3, %2, %0| %0, %2, %3} vgf2p8affineinvqb\t{%3, %2, %1, %0| %0, %1, %2, %3}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") (set_attr "mode" "")]) @@ -29419,7 +29495,7 @@ (define_insn "vgf2p8affineqb_" [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v") (unspec:VI1_AVX512F [(match_operand:VI1_AVX512F 1 "register_operand" "0,v") - (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm") + (match_operand:VI1_AVX512F 2 "vector_operand" "xja,vm") (match_operand 3 "const_0_to_255_operand")] UNSPEC_GF2P8AFFINE))] "TARGET_GFNI" @@ -29427,6 +29503,7 @@ (define_insn "vgf2p8affineqb_" gf2p8affineqb\t{%3, %2, %0| %0, %2, %3} vgf2p8affineqb\t{%3, %2, %1, %0| %0, %1, %2, %3}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") (set_attr "mode" "")]) @@ -29435,13 +29512,14 @@ (define_insn "vgf2p8mulb_" [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v") (unspec:VI1_AVX512F [(match_operand:VI1_AVX512F 1 "register_operand" "%0,v") - (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")] + (match_operand:VI1_AVX512F 2 "vector_operand" "xja,vm")] UNSPEC_GF2P8MUL))] "TARGET_GFNI" "@ gf2p8mulb\t{%2, %0| %0, %2} vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") 
(set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") (set_attr "mode" "")])

From patchwork Fri Sep 22 10:56:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 143362
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, vmakarov@redhat.com, jakub@redhat.com, Kong Lingling, Hongtao Liu
Subject: [PATCH 13/13] [APX EGPR] Handle vex insns that only support GPR16 (5/5)
Date: Fri, 22 Sep 2023 18:56:31 +0800
Message-Id: <20230922105631.2298849-14-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20230922105631.2298849-1-hongyu.wang@intel.com>
References: <20230922105631.2298849-1-hongyu.wang@intel.com>
List-Id: Gcc-patches mailing list

From: Kong Lingling

These VEX insns may have a legacy counterpart that supports EGPR, but
they have no EVEX counterpart. Since the VEX encoding cannot address
EGPR, split the VEX alternatives out of the patterns and mark them as
GPR16-only by adjusting their constraints and attr_gpr32.

insn list:
1. vmovmskpd/vmovmskps
2. vpmovmskb
3. vrsqrtss/vrsqrtps
4. vrcpss/vrcpps
5. vhaddpd/vhaddps, vhsubpd/vhsubps
6. vldmxcsr/vstmxcsr
7. vaddsubpd/vaddsubps
8. vlddqu
9. vtestps/vtestpd
10. vmaskmovps/vmaskmovpd, vpmaskmovd/vpmaskmovq
11. vperm2f128/vperm2i128
12. vinserti128/vinsertf128
13. vbroadcasti128/vbroadcastf128
14. vcmppd/vcmpps, vcmpss/vcmpsd
15. vgatherdps/vgatherqps, vgatherdpd/vgatherqpd

gcc/ChangeLog:

	* config/i386/constraints.md (jb): New constraint for vsib memory
	that does not allow gpr32.
	* config/i386/i386.md (setcc__sse): Replace m with jm for avx
	alternative and set attr_gpr32 to 0.
	(movmsk_df): Split avx/noavx alternatives and replace "r" with "jr"
	for avx alternative.
	(_rcp2): Split avx/noavx alternatives and replace "m/Bm" with
	"jm/ja" for avx alternative, set its gpr32 attr to 0.
	(*rsqrtsf2_sse): Likewise.
	* config/i386/mmx.md (mmx_pmovmskb): Split alternative 1 into
	avx/noavx and assign jr/r constraint to dest.
	* config/i386/sse.md (_movmsk): Split avx/noavx alternatives and
	replace "r" with "jr" for avx alternative.
	(*_movmsk_ext): Likewise.
	(*_movmsk_lt): Likewise.
	(*_movmsk_ext_lt): Likewise.
	(*_movmsk_shift): Likewise.
	(*_movmsk_ext_shift): Likewise.
	(_pmovmskb): Likewise.
	(*_pmovmskb_zext): Likewise.
	(*sse2_pmovmskb_ext): Likewise.
	(*_pmovmskb_lt): Likewise.
	(*_pmovmskb_zext_lt): Likewise.
	(*sse2_pmovmskb_ext_lt): Likewise.
	(_rcp2): Split avx/noavx alternatives and replace "m/Bm" with
	"jm/ja" for avx alternative, set its attr_gpr32 to 0.
	(sse_vmrcpv4sf2): Likewise.
	(*sse_vmrcpv4sf2): Likewise.
	(rsqrt2): Likewise.
	(sse_vmrsqrtv4sf2): Likewise.
	(*sse_vmrsqrtv4sf2): Likewise.
	(avx_hv4df3): Likewise.
	(sse3_hsubv2df3): Likewise.
	(avx_hv8sf3): Likewise.
	(sse3_hv4sf3): Likewise.
	(_lddqu): Likewise.
	(avx_cmp3): Likewise.
	(avx_vmcmp3): Likewise.
	(*sse2_gt3): Likewise.
	(sse_ldmxcsr): Likewise.
	(sse_stmxcsr): Likewise.
	(avx_vtest): Replace m with jm for avx alternative and set
	attr_gpr32 to 0.
	(avx2_permv2ti): Likewise.
	(*avx_vperm2f128_full): Likewise.
	(*avx_vperm2f128_nozero): Likewise.
	(vec_set_lo_v32qi): Likewise.
	(_maskload): Likewise.
	(_maskstore): Likewise.
	(avx_cmp3): Likewise.
	(avx_vmcmp3): Likewise.
	(*_maskcmp3_comm): Likewise.
	(*avx2_gathersi): Replace Tv with jb and set attr_gpr32 to 0.
	(*avx2_gathersi_2): Likewise.
	(*avx2_gatherdi): Likewise.
	(*avx2_gatherdi_2): Likewise.
	(*avx2_gatherdi_3): Likewise.
	(*avx2_gatherdi_4): Likewise.
	(avx_vbroadcastf128_): Restrict non-egpr alternative to
	noavx512vl, set its constraint to jm and set attr_gpr32 to 0.
	(vec_set_lo_): Likewise.
	(vec_set_lo_): Likewise for SF/SI modes.
	(vec_set_hi_): Likewise.
	(vec_set_hi_): Likewise for SF/SI modes.
	(vec_set_hi_): Likewise.
	(vec_set_lo_): Likewise.
	(avx2_set_hi_v32qi): Likewise.
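
For illustration only, the exclusion test that the "ja" and "jb"
constraints share can be modelled outside GCC.  The sketch below is
Python, not machine-description code, and every name in it (EGPR_NAMES,
mentions_egpr, gpr16_only_ok) is hypothetical.  It assumes an address is
given as a list of register names and leaves out the operand-predicate
half of each constraint (vector_memory_operand, vsib_address_operand).

```python
# Hypothetical model of the EGPR exclusion used by the "ja"/"jb" constraints.
# Not GCC code: register names are plain strings, addresses are lists of them.

# APX extended general-purpose registers r16..r31.
EGPR_NAMES = frozenset(f"r{n}" for n in range(16, 32))


def mentions_egpr(address_regs):
    """Stand-in for x86_extended_rex2reg_mentioned_p: true if any register
    used by the address is one of r16..r31."""
    return any(reg in EGPR_NAMES for reg in address_regs)


def gpr16_only_ok(address_regs, target_apx_egpr):
    """Mirror of (not (and TARGET_APX_EGPR <EGPR mentioned in op>))."""
    return not (target_apx_egpr and mentions_egpr(address_regs))


if __name__ == "__main__":
    print(gpr16_only_ok(["rax", "r15"], True))   # address stays within GPR16
    print(gpr16_only_ok(["rax", "r16"], True))   # index register is an EGPR
    print(gpr16_only_ok(["rax", "r16"], False))  # APX EGPR not enabled
```

In this model an address that uses r16 is rejected only when APX EGPR is
enabled, which matches the intent that the VEX alternatives accept GPR16
addresses only, while the non-APX behaviour is left unchanged.
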
Co-authored-by: Hongyu Wang
Co-authored-by: Hongtao Liu
---
 gcc/config/i386/constraints.md |   6 +
 gcc/config/i386/i386.md        |  47 +++--
 gcc/config/i386/mmx.md         |  11 +-
 gcc/config/i386/sse.md         | 320 +++++++++++++++++++++------------
 4 files changed, 242 insertions(+), 142 deletions(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 36c268d7f9b..dc91bd94b27 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -428,3 +428,9 @@ (define_special_memory_constraint "ja"
   (and (match_operand 0 "vector_memory_operand")
        (not (and (match_test "TARGET_APX_EGPR")
                  (match_test "x86_extended_rex2reg_mentioned_p (op)")))))
+
+(define_address_constraint "jb"
+  "VSIB address operand without EGPR"
+  (and (match_operand 0 "vsib_address_operand")
+       (not (and (match_test "TARGET_APX_EGPR")
+                 (match_test "x86_extended_rex2reg_mentioned_p (op)")))))
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c09ee3989cb..a0ba1752a54 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -554,7 +554,8 @@ (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
                     avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
                     avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl,
                     avx512vl,noavx512vl,avxvnni,avx512vnnivl,avx512fp16,avxifma,
-                    avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl"
+                    avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl,
+                    avx_noavx512f,avx_noavx512vl"
   (const_string "base"))
 
 ;; The (bounding maximum) length of an instruction immediate.
@@ -908,6 +909,8 @@ (define_attr "enabled" "" (eq_attr "isa" "sse4_noavx") (symbol_ref "TARGET_SSE4_1 && !TARGET_AVX") (eq_attr "isa" "avx") (symbol_ref "TARGET_AVX") + (eq_attr "isa" "avx_noavx512f") + (symbol_ref "TARGET_AVX && !TARGET_AVX512F") (eq_attr "isa" "noavx") (symbol_ref "!TARGET_AVX") (eq_attr "isa" "avx2") (symbol_ref "TARGET_AVX2") (eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2") @@ -16661,12 +16664,13 @@ (define_insn "setcc__sse" [(set (match_operand:MODEF 0 "register_operand" "=x,x") (match_operator:MODEF 3 "sse_comparison_operator" [(match_operand:MODEF 1 "register_operand" "0,x") - (match_operand:MODEF 2 "nonimmediate_operand" "xm,xm")]))] + (match_operand:MODEF 2 "nonimmediate_operand" "xm,xjm")]))] "SSE_FLOAT_MODE_P (mode)" "@ cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") @@ -20122,24 +20126,27 @@ (define_insn "*hf" (set_attr "mode" "HF")]) (define_insn "*rcpsf2_sse" - [(set (match_operand:SF 0 "register_operand" "=x,x,x") - (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")] + [(set (match_operand:SF 0 "register_operand" "=x,x,x,x") + (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m,ja")] UNSPEC_RCP))] "TARGET_SSE && TARGET_SSE_MATH" "@ %vrcpss\t{%d1, %0|%0, %d1} %vrcpss\t{%d1, %0|%0, %d1} - %vrcpss\t{%1, %d0|%d0, %1}" - [(set_attr "type" "sse") + rcpss\t{%1, %d0|%d0, %1} + vrcpss\t{%1, %d0|%d0, %1}" + [(set_attr "isa" "*,*,noavx,avx") + (set_attr "gpr32" "1,1,1,0") + (set_attr "type" "sse") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "maybe_vex") (set_attr "mode" "SF") - (set_attr "avx_partial_xmm_update" "false,false,true") + (set_attr "avx_partial_xmm_update" "false,false,true,true") (set (attr "preferred_for_speed") (cond [(match_test "TARGET_AVX") (symbol_ref "true") - (eq_attr "alternative" "1,2") + (eq_attr 
"alternative" "1,2,3") (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY") ] (symbol_ref "true")))]) @@ -20382,24 +20389,27 @@ (define_insn "sqrtxf2" (set_attr "bdver1_decode" "direct")]) (define_insn "*rsqrtsf2_sse" - [(set (match_operand:SF 0 "register_operand" "=x,x,x") - (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")] + [(set (match_operand:SF 0 "register_operand" "=x,x,x,x") + (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m,ja")] UNSPEC_RSQRT))] "TARGET_SSE && TARGET_SSE_MATH" "@ %vrsqrtss\t{%d1, %0|%0, %d1} %vrsqrtss\t{%d1, %0|%0, %d1} - %vrsqrtss\t{%1, %d0|%d0, %1}" - [(set_attr "type" "sse") + rsqrtss\t{%1, %d0|%d0, %1} + vrsqrtss\t{%1, %d0|%d0, %1}" + [(set_attr "isa" "*,*,noavx,avx") + (set_attr "gpr32" "1,1,1,0") + (set_attr "type" "sse") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "maybe_vex") (set_attr "mode" "SF") - (set_attr "avx_partial_xmm_update" "false,false,true") + (set_attr "avx_partial_xmm_update" "false,false,true,true") (set (attr "preferred_for_speed") (cond [(match_test "TARGET_AVX") (symbol_ref "true") - (eq_attr "alternative" "1,2") + (eq_attr "alternative" "1,2,3") (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY") ] (symbol_ref "true")))]) @@ -22103,14 +22113,15 @@ (define_expand "signbitxf2" }) (define_insn "movmsk_df" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,jr") (unspec:SI - [(match_operand:DF 1 "register_operand" "x")] + [(match_operand:DF 1 "register_operand" "x,x")] UNSPEC_MOVMSK))] "SSE_FLOAT_MODE_P (DFmode) && TARGET_SSE_MATH" "%vmovmskpd\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") - (set_attr "prefix" "maybe_vex") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_evex") (set_attr "mode" "DF")]) ;; Use movmskpd in SSE mode to avoid store forwarding stall diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 73809585a5d..3530615c706 100644 
--- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -5174,13 +5174,14 @@ (define_expand "usadv8qi" }) (define_insn_and_split "mmx_pmovmskb" - [(set (match_operand:SI 0 "register_operand" "=r,r") - (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x")] + [(set (match_operand:SI 0 "register_operand" "=r,r,jr") + (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x,x")] UNSPEC_MOVMSK))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)" "@ pmovmskb\t{%1, %0|%0, %1} + # #" "TARGET_SSE2 && reload_completed && SSE_REGNO_P (REGNO (operands[1]))" @@ -5195,9 +5196,9 @@ (define_insn_and_split "mmx_pmovmskb" operands[2] = lowpart_subreg (QImode, operands[0], GET_MODE (operands[0])); } - [(set_attr "mmx_isa" "native,sse") - (set_attr "type" "mmxcvt,ssemov") - (set_attr "mode" "DI,TI")]) + [(set_attr "mmx_isa" "native,sse_noavx,avx") + (set_attr "type" "mmxcvt,ssemov,ssemov") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_maskmovq" [(set (match_operand:V8QI 0 "memory_operand") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index d3b59c4866b..6bffd749c6d 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -1833,12 +1833,14 @@ (define_peephole2 "operands[4] = adjust_address (operands[0], V2DFmode, 0);") (define_insn "_lddqu" - [(set (match_operand:VI1 0 "register_operand" "=x") - (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")] + [(set (match_operand:VI1 0 "register_operand" "=x,x") + (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m,jm")] UNSPEC_LDDQU))] "TARGET_SSE3" "%vlddqu\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "gpr32" "1,0") (set_attr "movu" "1") (set (attr "prefix_data16") (if_then_else @@ -2507,12 +2509,14 @@ (define_insn "_div3" (set_attr "mode" "")]) (define_insn "_rcp2" - [(set (match_operand:VF1_128_256 0 "register_operand" "=x") + [(set (match_operand:VF1_128_256 0 "register_operand" "=x,x") 
(unspec:VF1_128_256 - [(match_operand:VF1_128_256 1 "vector_operand" "xBm")] UNSPEC_RCP))] + [(match_operand:VF1_128_256 1 "vector_operand" "xBm,xja")] UNSPEC_RCP))] "TARGET_SSE" "%vrcpps\t{%1, %0|%0, %1}" - [(set_attr "type" "sse") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "maybe_vex") @@ -2521,7 +2525,7 @@ (define_insn "_rcp2" (define_insn "sse_vmrcpv4sf2" [(set (match_operand:V4SF 0 "register_operand" "=x,x") (vec_merge:V4SF - (unspec:V4SF [(match_operand:V4SF 1 "nonimmediate_operand" "xm,xm")] + (unspec:V4SF [(match_operand:V4SF 1 "nonimmediate_operand" "xm,xjm")] UNSPEC_RCP) (match_operand:V4SF 2 "register_operand" "0,x") (const_int 1)))] @@ -2531,6 +2535,7 @@ (define_insn "sse_vmrcpv4sf2" vrcpss\t{%1, %2, %0|%0, %2, %k1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "orig,vex") @@ -2540,7 +2545,7 @@ (define_insn "*sse_vmrcpv4sf2" [(set (match_operand:V4SF 0 "register_operand" "=x,x") (vec_merge:V4SF (vec_duplicate:V4SF - (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xm")] + (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xjm")] UNSPEC_RCP)) (match_operand:V4SF 2 "register_operand" "0,x") (const_int 1)))] @@ -2550,6 +2555,7 @@ (define_insn "*sse_vmrcpv4sf2" vrcpss\t{%1, %2, %0|%0, %2, %1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "orig,vex") @@ -2726,12 +2732,14 @@ (define_expand "rsqrt2" "TARGET_AVX512FP16") (define_insn "_rsqrt2" - [(set (match_operand:VF1_128_256 0 "register_operand" "=x") + [(set (match_operand:VF1_128_256 0 "register_operand" "=x,x") (unspec:VF1_128_256 - [(match_operand:VF1_128_256 1 "vector_operand" "xBm")] UNSPEC_RSQRT))] + 
[(match_operand:VF1_128_256 1 "vector_operand" "xBm,xja")] UNSPEC_RSQRT))] "TARGET_SSE" "%vrsqrtps\t{%1, %0|%0, %1}" - [(set_attr "type" "sse") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) @@ -2790,7 +2798,7 @@ (define_insn "rsqrt14__mask" (define_insn "sse_vmrsqrtv4sf2" [(set (match_operand:V4SF 0 "register_operand" "=x,x") (vec_merge:V4SF - (unspec:V4SF [(match_operand:V4SF 1 "nonimmediate_operand" "xm,xm")] + (unspec:V4SF [(match_operand:V4SF 1 "nonimmediate_operand" "xm,xjm")] UNSPEC_RSQRT) (match_operand:V4SF 2 "register_operand" "0,x") (const_int 1)))] @@ -2800,6 +2808,7 @@ (define_insn "sse_vmrsqrtv4sf2" vrsqrtss\t{%1, %2, %0|%0, %2, %k1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "prefix" "orig,vex") (set_attr "mode" "SF")]) @@ -2807,7 +2816,7 @@ (define_insn "*sse_vmrsqrtv4sf2" [(set (match_operand:V4SF 0 "register_operand" "=x,x") (vec_merge:V4SF (vec_duplicate:V4SF - (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xm")] + (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xjm")] UNSPEC_RSQRT)) (match_operand:V4SF 2 "register_operand" "0,x") (const_int 1)))] @@ -2817,6 +2826,7 @@ (define_insn "*sse_vmrsqrtv4sf2" vrsqrtss\t{%1, %2, %0|%0, %2, %1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "prefix" "orig,vex") (set_attr "mode" "SF")]) @@ -2992,7 +3002,7 @@ (define_insn "vec_addsub3" (vec_merge:VF_128_256 (minus:VF_128_256 (match_operand:VF_128_256 1 "register_operand" "0,x") - (match_operand:VF_128_256 2 "vector_operand" "xBm, xm")) + (match_operand:VF_128_256 2 "vector_operand" "xBm, xjm")) (plus:VF_128_256 (match_dup 1) (match_dup 2)) (const_int )))] "TARGET_SSE3" @@ -3001,6 +3011,7 @@ (define_insn "vec_addsub3" vaddsub\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") + (set_attr "gpr32" "1,0") (set (attr 
"atom_unit") (if_then_else (match_test "mode == V2DFmode") @@ -3144,7 +3155,7 @@ (define_insn "avx_hv4df3" (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))) (plusminus:DF (vec_select:DF - (match_operand:V4DF 2 "nonimmediate_operand" "xm") + (match_operand:V4DF 2 "nonimmediate_operand" "xjm") (parallel [(const_int 0)])) (vec_select:DF (match_dup 2) (parallel [(const_int 1)])))) (vec_concat:V2DF @@ -3157,6 +3168,7 @@ (define_insn "avx_hv4df3" "TARGET_AVX" "vhpd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseadd") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "V4DF")]) @@ -3187,7 +3199,7 @@ (define_insn "*sse3_haddv2df3" (parallel [(match_operand:SI 4 "const_0_to_1_operand")]))) (plus:DF (vec_select:DF - (match_operand:V2DF 2 "vector_operand" "xBm,xm") + (match_operand:V2DF 2 "vector_operand" "xBm,xjm") (parallel [(match_operand:SI 5 "const_0_to_1_operand")])) (vec_select:DF (match_dup 2) @@ -3199,6 +3211,7 @@ (define_insn "*sse3_haddv2df3" haddpd\t{%2, %0|%0, %2} vhaddpd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "sseadd") (set_attr "prefix" "orig,vex") (set_attr "mode" "V2DF")]) @@ -3213,7 +3226,7 @@ (define_insn "sse3_hsubv2df3" (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))) (minus:DF (vec_select:DF - (match_operand:V2DF 2 "vector_operand" "xBm,xm") + (match_operand:V2DF 2 "vector_operand" "xBm,xjm") (parallel [(const_int 0)])) (vec_select:DF (match_dup 2) (parallel [(const_int 1)])))))] "TARGET_SSE3" @@ -3222,6 +3235,7 @@ (define_insn "sse3_hsubv2df3" vhsubpd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") + (set_attr "gpr32" "1,0") (set_attr "prefix" "orig,vex") (set_attr "mode" "V2DF")]) @@ -3278,7 +3292,7 @@ (define_insn "avx_hv8sf3" (vec_concat:V2SF (plusminus:SF (vec_select:SF - (match_operand:V8SF 2 "nonimmediate_operand" "xm") + (match_operand:V8SF 2 "nonimmediate_operand" "xjm") (parallel [(const_int 0)])) (vec_select:SF 
(match_dup 2) (parallel [(const_int 1)]))) (plusminus:SF @@ -3302,6 +3316,7 @@ (define_insn "avx_hv8sf3" "TARGET_AVX" "vhps\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseadd") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "V8SF")]) @@ -3320,7 +3335,7 @@ (define_insn "sse3_hv4sf3" (vec_concat:V2SF (plusminus:SF (vec_select:SF - (match_operand:V4SF 2 "vector_operand" "xBm,xm") + (match_operand:V4SF 2 "vector_operand" "xBm,xjm") (parallel [(const_int 0)])) (vec_select:SF (match_dup 2) (parallel [(const_int 1)]))) (plusminus:SF @@ -3332,6 +3347,7 @@ (define_insn "sse3_hv4sf3" vhps\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") + (set_attr "gpr32" "1,0") (set_attr "atom_unit" "complex") (set_attr "prefix" "orig,vex") (set_attr "prefix_rep" "1,*") @@ -3525,12 +3541,13 @@ (define_insn "avx_cmp3" [(set (match_operand:VF_128_256 0 "register_operand" "=x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "x") - (match_operand:VF_128_256 2 "nonimmediate_operand" "xm") + (match_operand:VF_128_256 2 "nonimmediate_operand" "xjm") (match_operand:SI 3 "const_0_to_31_operand")] UNSPEC_PCMP))] "TARGET_AVX" "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -3736,7 +3753,7 @@ (define_insn "avx_vmcmp3" (vec_merge:VF_128 (unspec:VF_128 [(match_operand:VF_128 1 "register_operand" "x") - (match_operand:VF_128 2 "nonimmediate_operand" "xm") + (match_operand:VF_128 2 "nonimmediate_operand" "xjm") (match_operand:SI 3 "const_0_to_31_operand")] UNSPEC_PCMP) (match_dup 1) @@ -3744,6 +3761,7 @@ (define_insn "avx_vmcmp3" "TARGET_AVX" "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -3752,13 +3770,14 @@ (define_insn "*_maskcmp3_comm" [(set (match_operand:VF_128_256 0 
"register_operand" "=x,x") (match_operator:VF_128_256 3 "sse_comparison_operator" [(match_operand:VF_128_256 1 "register_operand" "%0,x") - (match_operand:VF_128_256 2 "vector_operand" "xBm,xm")]))] + (match_operand:VF_128_256 2 "vector_operand" "xBm,xjm")]))] "TARGET_SSE && GET_RTX_CLASS (GET_CODE (operands[3])) == RTX_COMM_COMPARE" "@ cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") @@ -3768,12 +3787,13 @@ (define_insn "_maskcmp3" [(set (match_operand:VF_128_256 0 "register_operand" "=x,x") (match_operator:VF_128_256 3 "sse_comparison_operator" [(match_operand:VF_128_256 1 "register_operand" "0,x") - (match_operand:VF_128_256 2 "vector_operand" "xBm,xm")]))] + (match_operand:VF_128_256 2 "vector_operand" "xBm,xjm")]))] "TARGET_SSE" "@ cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") @@ -3784,7 +3804,7 @@ (define_insn "_vmmaskcmp3" (vec_merge:VF_128 (match_operator:VF_128 3 "sse_comparison_operator" [(match_operand:VF_128 1 "register_operand" "0,x") - (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm")]) + (match_operand:VF_128 2 "nonimmediate_operand" "xm,xjm")]) (match_dup 1) (const_int 1)))] "TARGET_SSE" @@ -3792,6 +3812,7 @@ (define_insn "_vmmaskcmp3" cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "length_immediate" "1,*") (set_attr "prefix" "orig,vex") @@ -4709,7 +4730,7 @@ (define_insn "_andnot3" (and:VFB_128_256 (not:VFB_128_256 (match_operand:VFB_128_256 1 "register_operand" "0,x,v,v")) - (match_operand:VFB_128_256 2 "vector_operand" "xBm,xm,vm,vm")))] + (match_operand:VFB_128_256 2 "vector_operand" "xBm,xjm,vm,vm")))] "TARGET_SSE && && (! 
|| mode != HFmode)" { @@ -4753,7 +4774,8 @@ (define_insn "_andnot3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx512dq,avx512f") + [(set_attr "isa" "noavx,avx_noavx512f,avx512dq,avx512f") + (set_attr "gpr32" "1,0,1,1") (set_attr "type" "sselog") (set_attr "prefix" "orig,maybe_vex,evex,evex") (set (attr "mode") @@ -4761,6 +4783,10 @@ (define_insn "_andnot3" (and (eq_attr "alternative" "1") (match_test "!TARGET_AVX512DQ"))) (const_string "") + (and (not (match_test "")) + (eq_attr "alternative" "3") + (match_test "!x86_evex_reg_mentioned_p (operands, 3)")) + (const_string "") (eq_attr "alternative" "3") (const_string "") (match_test "TARGET_AVX") @@ -5063,7 +5089,7 @@ (define_insn "*andnot3" [(set (match_operand:ANDNOT_MODE 0 "register_operand" "=x,x,v,v") (and:ANDNOT_MODE (not:ANDNOT_MODE (match_operand:ANDNOT_MODE 1 "register_operand" "0,x,v,v")) - (match_operand:ANDNOT_MODE 2 "vector_operand" "xBm,xm,vm,v")))] + (match_operand:ANDNOT_MODE 2 "vector_operand" "xBm,xjm,vm,v")))] "TARGET_SSE" { char buf[128]; @@ -5092,7 +5118,8 @@ (define_insn "*andnot3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx512vl,avx512f") + [(set_attr "isa" "noavx,avx_noavx512f,avx512vl,avx512f") + (set_attr "gpr32" "1,0,1,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -5102,7 +5129,10 @@ (define_insn "*andnot3" (const_string "*"))) (set_attr "prefix" "orig,vex,evex,evex") (set (attr "mode") - (cond [(eq_attr "alternative" "2") + (cond [(and (eq_attr "alternative" "3") + (match_test "!x86_evex_reg_mentioned_p (operands, 3)")) + (const_string "TI") + (eq_attr "alternative" "2") (const_string "TI") (eq_attr "alternative" "3") (const_string "XI") @@ -12240,7 +12270,7 @@ (define_insn_and_split "vec_extract_lo_v32qi" "operands[1] = gen_lowpart (V16QImode, operands[1]);") (define_insn "vec_extract_hi_v32qi" - [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm") + [(set (match_operand:V16QI 0 
"nonimmediate_operand" "=xjm,vm") (vec_select:V16QI (match_operand:V32QI 1 "register_operand" "x,v") (parallel [(const_int 16) (const_int 17) @@ -12258,7 +12288,8 @@ (define_insn "vec_extract_hi_v32qi" [(set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") - (set_attr "isa" "*,avx512vl") + (set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") (set_attr "prefix" "vex,evex") (set_attr "mode" "OI")]) @@ -17130,6 +17161,7 @@ (define_insn "*sse2_gt3" pcmpgt\t{%2, %0|%0, %2} vpcmpgt\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) @@ -17446,7 +17478,7 @@ (define_insn "*andnot3" [(set (match_operand:VI 0 "register_operand" "=x,x,v,v,v") (and:VI (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br")) - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,0,0")))] + (match_operand:VI 2 "bcst_vector_operand" "xBm,xjm,vmBr,0,0")))] "TARGET_SSE && (register_operand (operands[1], mode) || register_operand (operands[2], mode))" @@ -17533,7 +17565,8 @@ (define_insn "*andnot3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx,*,*") + [(set_attr "isa" "noavx,avx_noavx512f,avx512f,*,*") + (set_attr "gpr32" "1,0,1,1,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -17688,7 +17721,7 @@ (define_insn "*3" [(set (match_operand:VI48_AVX_AVX512F 0 "register_operand" "=x,x,v") (any_logic:VI48_AVX_AVX512F (match_operand:VI48_AVX_AVX512F 1 "bcst_vector_operand" "%0,x,v") - (match_operand:VI48_AVX_AVX512F 2 "bcst_vector_operand" "xBm,xm,vmBr")))] + (match_operand:VI48_AVX_AVX512F 2 "bcst_vector_operand" "xBm,xjm,vmBr")))] "TARGET_SSE && && ix86_binary_operator_ok (, mode, operands)" { @@ -17718,9 +17751,11 @@ (define_insn "*3" case E_V4DImode: case E_V4SImode: case E_V2DImode: - ssesuffix = (TARGET_AVX512VL - && ( || which_alternative == 2) - ? 
"" : ""); + ssesuffix = ((TARGET_AVX512VL + && ( || which_alternative == 2)) + || (MEM_P (operands[2]) && which_alternative == 2 + && x86_extended_rex2reg_mentioned_p (operands[2]))) + ? "" : ""; break; default: gcc_unreachable (); @@ -17760,7 +17795,8 @@ (define_insn "*3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx") + [(set_attr "isa" "noavx,avx_noavx512f,avx512f") + (set_attr "gpr32" "1,0,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -17787,7 +17823,7 @@ (define_insn "*3" [(set (match_operand:VI12_AVX_AVX512F 0 "register_operand" "=x,x,v") (any_logic:VI12_AVX_AVX512F (match_operand:VI12_AVX_AVX512F 1 "vector_operand" "%0,x,v") - (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,xm,vm")))] + (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,xjm,vm")))] "TARGET_SSE && !(MEM_P (operands[1]) && MEM_P (operands[2]))" { char buf[64]; @@ -17816,7 +17852,10 @@ (define_insn "*3" case E_V16HImode: case E_V16QImode: case E_V8HImode: - ssesuffix = TARGET_AVX512VL && which_alternative == 2 ? "q" : ""; + ssesuffix = (((TARGET_AVX512VL && which_alternative == 2) + || (MEM_P (operands[2]) && which_alternative == 2 + && x86_extended_rex2reg_mentioned_p (operands[2])))) + ? 
"q" : ""; break; default: gcc_unreachable (); @@ -17853,7 +17892,8 @@ (define_insn "*3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx") + [(set_attr "isa" "noavx,avx_noavx512f,avx512f") + (set_attr "gpr32" "1,0,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -17880,13 +17920,14 @@ (define_insn "v1ti3" [(set (match_operand:V1TI 0 "register_operand" "=x,x,v") (any_logic:V1TI (match_operand:V1TI 1 "register_operand" "%0,x,v") - (match_operand:V1TI 2 "vector_operand" "xBm,xm,vm")))] + (match_operand:V1TI 2 "vector_operand" "xBm,xjm,vm")))] "TARGET_SSE2" "@ p\t{%2, %0|%0, %2} vp\t{%2, %1, %0|%0, %1, %2} vpd\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,avx,avx512vl") + [(set_attr "isa" "noavx,avx_noavx512vl,avx512vl") + (set_attr "gpr32" "1,0,1") (set_attr "prefix" "orig,vex,evex") (set_attr "prefix_data16" "1,*,*") (set_attr "type" "sselog") @@ -20866,33 +20907,35 @@ (define_insn "*_psadbw" (set_attr "mode" "")]) (define_insn "_movmsk" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,jr") (unspec:SI - [(match_operand:VF_128_256 1 "register_operand" "x")] + [(match_operand:VF_128_256 1 "register_operand" "x,x")] UNSPEC_MOVMSK))] "TARGET_SSE" "%vmovmsk\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") - (set_attr "prefix" "maybe_vex") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_evex") (set_attr "mode" "")]) (define_insn "*_movmsk_ext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,jr") (any_extend:DI (unspec:SI - [(match_operand:VF_128_256 1 "register_operand" "x")] + [(match_operand:VF_128_256 1 "register_operand" "x,x")] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE" - "%vmovmsk\t{%1, %k0|%k0, %1}" - [(set_attr "type" "ssemov") - (set_attr "prefix" "maybe_vex") + "%vmovmsk\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") 
+   (set_attr "prefix" "maybe_evex")
    (set_attr "mode" "")])
 
 (define_insn_and_split "*_movmsk_lt"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,jr")
         (unspec:SI
           [(lt:VF_128_256
-             (match_operand: 1 "register_operand" "x")
+             (match_operand: 1 "register_operand" "x,x")
              (match_operand: 2 "const0_operand"))]
           UNSPEC_MOVMSK))]
   "TARGET_SSE"
@@ -20901,16 +20944,17 @@ (define_insn_and_split "*_movmsk_lt"
   [(set (match_dup 0)
         (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
   "operands[1] = gen_lowpart (mode, operands[1]);"
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "")])
 
 (define_insn_and_split "*_movmsk_ext_lt"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,jr")
         (any_extend:DI
           (unspec:SI
             [(lt:VF_128_256
-               (match_operand: 1 "register_operand" "x")
+               (match_operand: 1 "register_operand" "x,x")
                (match_operand: 2 "const0_operand"))]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE"
@@ -20919,16 +20963,17 @@ (define_insn_and_split "*_movmsk_ext_lt"
   [(set (match_dup 0)
         (any_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
   "operands[1] = gen_lowpart (mode, operands[1]);"
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "")])
 
 (define_insn_and_split "*_movmsk_shift"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,jr")
         (unspec:SI
           [(subreg:VF_128_256
              (ashiftrt:
-               (match_operand: 1 "register_operand" "x")
+               (match_operand: 1 "register_operand" "x,x")
                (match_operand:QI 2 "const_int_operand")) 0)]
           UNSPEC_MOVMSK))]
   "TARGET_SSE"
@@ -20937,17 +20982,18 @@ (define_insn_and_split "*_movmsk_shift"
   [(set (match_dup 0)
         (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
   "operands[1] = gen_lowpart (mode, operands[1]);"
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "")])
 
 (define_insn_and_split "*_movmsk_ext_shift"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,jr")
         (any_extend:DI
           (unspec:SI
             [(subreg:VF_128_256
                (ashiftrt:
-                 (match_operand: 1 "register_operand" "x")
+                 (match_operand: 1 "register_operand" "x,x")
                  (match_operand:QI 2 "const_int_operand")) 0)]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE"
@@ -20956,18 +21002,20 @@ (define_insn_and_split "*_movmsk_ext_shift
   [(set (match_dup 0)
         (any_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
   "operands[1] = gen_lowpart (mode, operands[1]);"
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "")])
 
 (define_insn "_pmovmskb"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,jr")
         (unspec:SI
-          [(match_operand:VI1_AVX2 1 "register_operand" "x")]
+          [(match_operand:VI1_AVX2 1 "register_operand" "x,x")]
           UNSPEC_MOVMSK))]
   "TARGET_SSE2"
   "%vpmovmskb\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -20977,14 +21025,15 @@ (define_insn "_pmovmskb"
    (set_attr "mode" "SI")])
 
 (define_insn "*_pmovmskb_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,jr")
         (zero_extend:DI
           (unspec:SI
-            [(match_operand:VI1_AVX2 1 "register_operand" "x")]
+            [(match_operand:VI1_AVX2 1 "register_operand" "x,x")]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
   "%vpmovmskb\t{%1, %k0|%k0, %1}"
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -20994,14 +21043,15 @@ (define_insn "*_pmovmskb_zext"
    (set_attr "mode" "SI")])
 
 (define_insn "*sse2_pmovmskb_ext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,jr")
         (sign_extend:DI
           (unspec:SI
-            [(match_operand:V16QI 1 "register_operand" "x")]
+            [(match_operand:V16QI 1 "register_operand" "x,x")]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
   "%vpmovmskb\t{%1, %k0|%k0, %1}"
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21086,9 +21136,9 @@ (define_split
 })
 
 (define_insn_and_split "*_pmovmskb_lt"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,jr")
         (unspec:SI
-          [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+          [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x,x")
                         (match_operand:VI1_AVX2 2 "const0_operand"))]
           UNSPEC_MOVMSK))]
   "TARGET_SSE2"
@@ -21097,7 +21147,8 @@ (define_insn_and_split "*_pmovmskb_lt"
   [(set (match_dup 0)
         (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
   ""
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21107,10 +21158,10 @@ (define_insn_and_split "*_pmovmskb_lt"
    (set_attr "mode" "SI")])
 
 (define_insn_and_split "*_pmovmskb_zext_lt"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,jr")
         (zero_extend:DI
           (unspec:SI
-            [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+            [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x,x")
                           (match_operand:VI1_AVX2 2 "const0_operand"))]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
@@ -21119,7 +21170,8 @@ (define_insn_and_split "*_pmovmskb_zext_lt"
   [(set (match_dup 0)
         (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
   ""
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21129,10 +21181,10 @@ (define_insn_and_split "*_pmovmskb_zext_lt"
    (set_attr "mode" "SI")])
 
 (define_insn_and_split "*sse2_pmovmskb_ext_lt"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,jr")
         (sign_extend:DI
           (unspec:SI
-            [(lt:V16QI (match_operand:V16QI 1 "register_operand" "x")
+            [(lt:V16QI (match_operand:V16QI 1 "register_operand" "x,x")
                        (match_operand:V16QI 2 "const0_operand"))]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
@@ -21141,7 +21193,8 @@ (define_insn_and_split "*sse2_pmovmskb_ext_lt"
   [(set (match_dup 0)
         (sign_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
   ""
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21202,21 +21255,25 @@ (define_insn "*sse2_maskmovdqu"
    (set_attr "mode" "TI")])
 
 (define_insn "sse_ldmxcsr"
-  [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m")]
+  [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m,jm")]
                     UNSPECV_LDMXCSR)]
   "TARGET_SSE"
   "%vldmxcsr\t%0"
-  [(set_attr "type" "sse")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "atom_sse_attr" "mxcsr")
    (set_attr "prefix" "maybe_vex")
    (set_attr "memory" "load")])
 
 (define_insn "sse_stmxcsr"
-  [(set (match_operand:SI 0 "memory_operand" "=m")
+  [(set (match_operand:SI 0 "memory_operand" "=m,jm")
         (unspec_volatile:SI [(const_int 0)] UNSPECV_STMXCSR))]
   "TARGET_SSE"
   "%vstmxcsr\t%0"
-  [(set_attr "type" "sse")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sse")
+   (set_attr "gpr32" "0")
    (set_attr "atom_sse_attr" "mxcsr")
    (set_attr "prefix" "maybe_vex")
    (set_attr "memory" "store")])
@@ -23860,11 +23917,12 @@ (define_expand "v2siv2di2"
 (define_insn "avx_vtest"
   [(set (reg:CC FLAGS_REG)
         (unspec:CC [(match_operand:VF_128_256 0 "register_operand" "x")
-                    (match_operand:VF_128_256 1 "nonimmediate_operand" "xm")]
+                    (match_operand:VF_128_256 1 "nonimmediate_operand" "xjm")]
                    UNSPEC_VTESTP))]
   "TARGET_AVX"
   "vtest\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecomi")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
@@ -26925,7 +26983,7 @@ (define_split
 (define_insn "avx_vbroadcastf128_"
   [(set (match_operand:V_256 0 "register_operand" "=x,x,x,v,v,v,v")
         (vec_concat:V_256
-          (match_operand: 1 "nonimmediate_operand" "m,0,?x,m,0,m,0")
+          (match_operand: 1 "nonimmediate_operand" "jm,0,?x,m,0,m,0")
           (match_dup 1)))]
   "TARGET_AVX"
   "@
@@ -26936,8 +26994,9 @@ (define_insn "avx_vbroadcastf128_"
    vinsert\t{$1, %1, %0, %0|%0, %0, %1, 1}
    vbroadcast32x4\t{%1, %0|%0, %1}
    vinsert32x4\t{$1, %1, %0, %0|%0, %0, %1, 1}"
-  [(set_attr "isa" "*,*,*,avx512dq,avx512dq,avx512vl,avx512vl")
+  [(set_attr "isa" "noavx512vl,*,*,avx512dq,avx512dq,avx512vl,avx512vl")
    (set_attr "type" "ssemov,sselog1,sselog1,ssemov,sselog1,ssemov,sselog1")
+   (set_attr "gpr32" "0,1,1,1,1,1,1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "0,1,1,0,1,0,1")
    (set_attr "prefix" "vex,vex,vex,evex,evex,evex,evex")
@@ -27220,12 +27279,13 @@ (define_insn "*avx_vperm2f128_full"
   [(set (match_operand:AVX256MODE2P 0 "register_operand" "=x")
         (unspec:AVX256MODE2P
           [(match_operand:AVX256MODE2P 1 "register_operand" "x")
-           (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xm")
+           (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xjm")
            (match_operand:SI 3 "const_0_to_255_operand")]
           UNSPEC_VPERMIL2F128))]
   "TARGET_AVX"
   "vperm2\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "sselog")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
@@ -27342,11 +27402,11 @@ (define_expand "avx_vinsertf128"
 })
 
 (define_insn "vec_set_lo_"
-  [(set (match_operand:VI8F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI8F_256 0 "register_operand" "=x,v")
         (vec_concat:VI8F_256
-          (match_operand: 2 "nonimmediate_operand" "vm")
+          (match_operand: 2 "nonimmediate_operand" "xjm,vm")
           (vec_select:
-            (match_operand:VI8F_256 1 "register_operand" "v")
+            (match_operand:VI8F_256 1 "register_operand" "x,v")
             (parallel [(const_int 2) (const_int 3)]))))]
   "TARGET_AVX && "
 {
@@ -27357,7 +27417,9 @@ (define_insn "vec_set_lo_"
   else
     return "vinsert\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
@@ -27386,11 +27448,11 @@ (define_insn "vec_set_hi_"
    (set_attr "mode" "")])
 
 (define_insn "vec_set_lo_"
-  [(set (match_operand:VI4F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI4F_256 0 "register_operand" "=x,v")
         (vec_concat:VI4F_256
-          (match_operand: 2 "nonimmediate_operand" "vm")
+          (match_operand: 2 "nonimmediate_operand" "xjm,vm")
           (vec_select:
-            (match_operand:VI4F_256 1 "register_operand" "v")
+            (match_operand:VI4F_256 1 "register_operand" "x,v")
             (parallel [(const_int 4) (const_int 5)
                        (const_int 6) (const_int 7)]))))]
   "TARGET_AVX"
@@ -27400,20 +27462,22 @@ (define_insn "vec_set_lo_"
   else
     return "vinsert\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
 (define_insn "vec_set_hi_"
-  [(set (match_operand:VI4F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI4F_256 0 "register_operand" "=x,v")
         (vec_concat:VI4F_256
           (vec_select:
-            (match_operand:VI4F_256 1 "register_operand" "v")
+            (match_operand:VI4F_256 1 "register_operand" "x,v")
             (parallel [(const_int 0) (const_int 1)
                        (const_int 2) (const_int 3)]))
-          (match_operand: 2 "nonimmediate_operand" "vm")))]
+          (match_operand: 2 "nonimmediate_operand" "xjm,vm")))]
   "TARGET_AVX"
 {
   if (TARGET_AVX512VL)
@@ -27421,7 +27485,9 @@ (define_insn "vec_set_hi_"
   else
     return "vinsert\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
@@ -27430,7 +27496,7 @@ (define_insn "vec_set_hi_"
 (define_insn "vec_set_lo_"
   [(set (match_operand:V16_256 0 "register_operand" "=x,v")
         (vec_concat:V16_256
-          (match_operand: 2 "nonimmediate_operand" "xm,vm")
+          (match_operand: 2 "nonimmediate_operand" "xjm,vm")
           (vec_select:
             (match_operand:V16_256 1 "register_operand" "x,v")
             (parallel [(const_int 8) (const_int 9)
@@ -27441,7 +27507,9 @@ (define_insn "vec_set_lo_"
   "@
    vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}
    vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex,evex")
@@ -27456,12 +27524,14 @@ (define_insn "vec_set_hi_"
                        (const_int 2) (const_int 3)
                        (const_int 4) (const_int 5)
                        (const_int 6) (const_int 7)]))
-          (match_operand: 2 "nonimmediate_operand" "xm,vm")))]
+          (match_operand: 2 "nonimmediate_operand" "xjm,vm")))]
   "TARGET_AVX"
   "@
    vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}
    vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex,evex")
@@ -27470,7 +27540,7 @@ (define_insn "vec_set_hi_"
 (define_insn "vec_set_lo_v32qi"
   [(set (match_operand:V32QI 0 "register_operand" "=x,v")
         (vec_concat:V32QI
-          (match_operand:V16QI 2 "nonimmediate_operand" "xm,v")
+          (match_operand:V16QI 2 "nonimmediate_operand" "xjm,v")
           (vec_select:V16QI
             (match_operand:V32QI 1 "register_operand" "x,v")
             (parallel [(const_int 16) (const_int 17)
@@ -27485,7 +27555,9 @@ (define_insn "vec_set_lo_v32qi"
   "@
    vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}
    vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "type" "sselog")
+   (set_attr "gpr32" "0,1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex,evex")
@@ -27504,12 +27576,14 @@ (define_insn "vec_set_hi_v32qi"
                        (const_int 10) (const_int 11)
                        (const_int 12) (const_int 13)
                        (const_int 14) (const_int 15)]))
-          (match_operand:V16QI 2 "nonimmediate_operand" "xm,vm")))]
+          (match_operand:V16QI 2 "nonimmediate_operand" "xjm,vm")))]
   "TARGET_AVX"
   "@
    vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}
    vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex,evex")
@@ -27519,7 +27593,7 @@ (define_insn "_maskload"
   [(set (match_operand:V48_128_256 0 "register_operand" "=x")
         (unspec:V48_128_256
           [(match_operand: 2 "register_operand" "x")
-           (match_operand:V48_128_256 1 "memory_operand" "m")]
+           (match_operand:V48_128_256 1 "memory_operand" "jm")]
           UNSPEC_MASKMOV))]
   "TARGET_AVX"
 {
@@ -27529,13 +27603,14 @@ (define_insn "_maskload"
     return "vmaskmov\t{%1, %2, %0|%0, %2, %1}";
 }
   [(set_attr "type" "sselog1")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "")])
 
 (define_insn "_maskstore"
-  [(set (match_operand:V48_128_256 0 "memory_operand" "+m")
+  [(set (match_operand:V48_128_256 0 "memory_operand" "+jm")
         (unspec:V48_128_256
           [(match_operand: 1 "register_operand" "x")
            (match_operand:V48_128_256 2 "register_operand" "x")
@@ -27549,6 +27624,7 @@ (define_insn "_maskstore"
     return "vmaskmov\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog1")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "btver2_decode" "vector")
@@ -27806,7 +27882,7 @@ (define_insn "avx_vec_concat"
   [(set (match_operand:V_256_512 0 "register_operand" "=x,v,x,Yv")
         (vec_concat:V_256_512
           (match_operand: 1 "nonimmediate_operand" "x,v,xm,vm")
-          (match_operand: 2 "nonimm_or_0_operand" "xBt,vm,C,C")))]
+          (match_operand: 2 "nonimm_or_0_operand" "xjm,vm,C,C")))]
   "TARGET_AVX
    && (operands[2] == CONST0_RTX (mode)
        || !MEM_P (operands[1]))"
@@ -28145,7 +28221,7 @@ (define_insn "*avx2_gathersi"
           [(match_operand:VEC_GATHER_MODE 2 "register_operand" "0")
            (match_operator: 7 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 3 "vsib_address_operand" "Tv")
+                [(match_operand:P 3 "vsib_address_operand" "jb")
                  (match_operand: 4 "register_operand" "x")
                  (match_operand:SI 6 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28156,6 +28232,7 @@ (define_insn "*avx2_gathersi"
   "TARGET_AVX2"
   "%M3vgatherd\t{%1, %7, %0|%0, %7, %1}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
@@ -28165,7 +28242,7 @@ (define_insn "*avx2_gathersi_2"
           [(pc)
            (match_operator: 6 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 2 "vsib_address_operand" "Tv")
+                [(match_operand:P 2 "vsib_address_operand" "jb")
                  (match_operand: 3 "register_operand" "x")
                  (match_operand:SI 5 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28176,6 +28253,7 @@ (define_insn "*avx2_gathersi_2"
   "TARGET_AVX2"
   "%M2vgatherd\t{%1, %6, %0|%0, %6, %1}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
@@ -28206,7 +28284,7 @@ (define_insn "*avx2_gatherdi"
           [(match_operand: 2 "register_operand" "0")
            (match_operator: 7 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 3 "vsib_address_operand" "Tv")
+                [(match_operand:P 3 "vsib_address_operand" "jb")
                  (match_operand: 4 "register_operand" "x")
                  (match_operand:SI 6 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28217,6 +28295,7 @@ (define_insn "*avx2_gatherdi"
   "TARGET_AVX2"
   "%M3vgatherq\t{%5, %7, %2|%2, %7, %5}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
@@ -28226,7 +28305,7 @@ (define_insn "*avx2_gatherdi_2"
           [(pc)
            (match_operator: 6 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 2 "vsib_address_operand" "Tv")
+                [(match_operand:P 2 "vsib_address_operand" "jb")
                  (match_operand: 3 "register_operand" "x")
                  (match_operand:SI 5 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28241,6 +28320,7 @@ (define_insn "*avx2_gatherdi_2"
   return "%M2vgatherq\t{%4, %6, %0|%0, %6, %4}";
 }
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
@@ -28251,7 +28331,7 @@ (define_insn "*avx2_gatherdi_3"
           [(match_operand: 2 "register_operand" "0")
            (match_operator: 7 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 3 "vsib_address_operand" "Tv")
+                [(match_operand:P 3 "vsib_address_operand" "jb")
                  (match_operand: 4 "register_operand" "x")
                  (match_operand:SI 6 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28264,6 +28344,7 @@ (define_insn "*avx2_gatherdi_3"
   "TARGET_AVX2"
   "%M3vgatherq\t{%5, %7, %0|%0, %7, %5}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
@@ -28274,7 +28355,7 @@ (define_insn "*avx2_gatherdi_4"
           [(pc)
            (match_operator: 6 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 2 "vsib_address_operand" "Tv")
+                [(match_operand:P 2 "vsib_address_operand" "jb")
                  (match_operand: 3 "register_operand" "x")
                  (match_operand:SI 5 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28287,6 +28368,7 @@ (define_insn "*avx2_gatherdi_4"
   "TARGET_AVX2"
   "%M2vgatherq\t{%4, %6, %0|%0, %6, %4}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])