From patchwork Thu Aug 31 08:20:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137241 To: gcc-patches@gcc.gnu.org Subject: [PATCH 01/13] [APX EGPR] middle-end: Add insn argument to base_reg_class Date: Thu, 31 Aug 2023 16:20:12 +0800 Message-Id: <20230831082024.314097-2-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz Errors-To: 
gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches"

From: Kong Lingling

The current reload infrastructure does not let a backend select
base_reg_class per insn. Add an insn argument to base_reg_class for
lra/reload usage.

gcc/ChangeLog:

	* addresses.h (base_reg_class): Add insn argument.
	Pass to MODE_CODE_BASE_REG_CLASS.
	(ok_for_base_p_1): Add insn argument.
	Pass to REGNO_MODE_CODE_OK_FOR_BASE_P.
	(regno_ok_for_base_p): Add insn argument and pass to
	ok_for_base_p_1.
	* config/avr/avr.h (MODE_CODE_BASE_REG_CLASS): Add insn argument.
	(REGNO_MODE_CODE_OK_FOR_BASE_P): Ditto.
	* config/gcn/gcn.h (MODE_CODE_BASE_REG_CLASS): Ditto.
	(REGNO_MODE_CODE_OK_FOR_BASE_P): Ditto.
	* config/rl78/rl78.h (REGNO_MODE_CODE_OK_FOR_BASE_P): Ditto.
	(MODE_CODE_BASE_REG_CLASS): Ditto.
	* doc/tm.texi: Add insn argument for MODE_CODE_BASE_REG_CLASS
	and REGNO_MODE_CODE_OK_FOR_BASE_P.
	* doc/tm.texi.in: Ditto.
	* lra-constraints.cc (process_address_1): Pass insn to
	base_reg_class.
	(curr_insn_transform): Ditto.
	* reload.cc (find_reloads): Ditto.
	(find_reloads_address): Ditto.
	(find_reloads_address_1): Ditto.
	(find_reloads_subreg_address): Ditto.
	* reload1.cc (maybe_fix_stack_asms): Ditto.
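Note for readers: the following is a minimal standalone sketch of how a
backend might use the new insn argument, not code from this series. The
register classes LEGACY_GPR_REGS and ALL_GPR_REGS, the simplified rtx_insn
struct with its allows_extended_regs field, and the helper
demo_base_reg_class are all hypothetical stand-ins. Only the macro name,
its five-argument shape, and the defaulted trailing insn parameter of
base_reg_class follow the patch.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for GCC's real types.
enum reg_class { NO_REGS, LEGACY_GPR_REGS, ALL_GPR_REGS };
struct rtx_insn { bool allows_extended_regs; };

// Hypothetical backend helper: narrow the base class when the insn
// cannot encode the extended registers.  A null insn keeps the
// original, wider class.
static reg_class
demo_base_reg_class (const rtx_insn *insn)
{
  if (insn && !insn->allows_extended_regs)
    return LEGACY_GPR_REGS;
  return ALL_GPR_REGS;
}

// Five-argument form of the target macro, as in the patch.  This
// sketch ignores everything except the insn.
#define MODE_CODE_BASE_REG_CLASS(mode, as, outer_code, index_code, insn) \
  demo_base_reg_class (insn)

// Wrapper with a defaulted insn argument, so existing four-argument
// callers keep compiling unchanged.
inline reg_class
base_reg_class (int mode, int as, int outer_code, int index_code,
		rtx_insn *insn = nullptr)
{
  return MODE_CODE_BASE_REG_CLASS (mode, as, outer_code, index_code, insn);
}
```

Because the new parameter is defaulted, callers that have no insn at hand
get the unrestricted class, which matches the documented intent that the
insn-specific class be a subset of the original one.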
--- gcc/addresses.h | 15 +++++++++------ gcc/config/avr/avr.h | 5 +++-- gcc/config/gcn/gcn.h | 4 ++-- gcc/config/rl78/rl78.h | 6 ++++-- gcc/doc/tm.texi | 8 ++++++-- gcc/doc/tm.texi.in | 8 ++++++-- gcc/lra-constraints.cc | 15 +++++++++------ gcc/reload.cc | 30 ++++++++++++++++++------------ gcc/reload1.cc | 2 +- 9 files changed, 58 insertions(+), 35 deletions(-) diff --git a/gcc/addresses.h b/gcc/addresses.h index 3519c241c6d..08b100cfe6d 100644 --- a/gcc/addresses.h +++ b/gcc/addresses.h @@ -28,11 +28,12 @@ inline enum reg_class base_reg_class (machine_mode mode ATTRIBUTE_UNUSED, addr_space_t as ATTRIBUTE_UNUSED, enum rtx_code outer_code ATTRIBUTE_UNUSED, - enum rtx_code index_code ATTRIBUTE_UNUSED) + enum rtx_code index_code ATTRIBUTE_UNUSED, + rtx_insn *insn ATTRIBUTE_UNUSED = NULL) { #ifdef MODE_CODE_BASE_REG_CLASS return MODE_CODE_BASE_REG_CLASS (MACRO_MODE (mode), as, outer_code, - index_code); + index_code, insn); #else #ifdef MODE_BASE_REG_REG_CLASS if (index_code == REG) @@ -56,11 +57,12 @@ ok_for_base_p_1 (unsigned regno ATTRIBUTE_UNUSED, machine_mode mode ATTRIBUTE_UNUSED, addr_space_t as ATTRIBUTE_UNUSED, enum rtx_code outer_code ATTRIBUTE_UNUSED, - enum rtx_code index_code ATTRIBUTE_UNUSED) + enum rtx_code index_code ATTRIBUTE_UNUSED, + rtx_insn* insn ATTRIBUTE_UNUSED = NULL) { #ifdef REGNO_MODE_CODE_OK_FOR_BASE_P return REGNO_MODE_CODE_OK_FOR_BASE_P (regno, MACRO_MODE (mode), as, - outer_code, index_code); + outer_code, index_code, insn); #else #ifdef REGNO_MODE_OK_FOR_REG_BASE_P if (index_code == REG) @@ -79,12 +81,13 @@ ok_for_base_p_1 (unsigned regno ATTRIBUTE_UNUSED, inline bool regno_ok_for_base_p (unsigned regno, machine_mode mode, addr_space_t as, - enum rtx_code outer_code, enum rtx_code index_code) + enum rtx_code outer_code, enum rtx_code index_code, + rtx_insn* insn = NULL) { if (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0) regno = reg_renumber[regno]; - return ok_for_base_p_1 (regno, mode, as, outer_code, index_code); + 
return ok_for_base_p_1 (regno, mode, as, outer_code, index_code, insn); } #endif /* GCC_ADDRESSES_H */ diff --git a/gcc/config/avr/avr.h b/gcc/config/avr/avr.h index 8e7e00db13b..1d090fe0838 100644 --- a/gcc/config/avr/avr.h +++ b/gcc/config/avr/avr.h @@ -280,12 +280,13 @@ enum reg_class { #define REGNO_REG_CLASS(R) avr_regno_reg_class(R) -#define MODE_CODE_BASE_REG_CLASS(mode, as, outer_code, index_code) \ +#define MODE_CODE_BASE_REG_CLASS(mode, as, outer_code, index_code, insn) \ avr_mode_code_base_reg_class (mode, as, outer_code, index_code) #define INDEX_REG_CLASS NO_REGS -#define REGNO_MODE_CODE_OK_FOR_BASE_P(num, mode, as, outer_code, index_code) \ +#define REGNO_MODE_CODE_OK_FOR_BASE_P(num, mode, as, outer_code, \ + index_code, insn) \ avr_regno_mode_code_ok_for_base_p (num, mode, as, outer_code, index_code) #define REGNO_OK_FOR_INDEX_P(NUM) 0 diff --git a/gcc/config/gcn/gcn.h b/gcc/config/gcn/gcn.h index 4ff9a5d4d12..b56702a77fd 100644 --- a/gcc/config/gcn/gcn.h +++ b/gcc/config/gcn/gcn.h @@ -437,9 +437,9 @@ enum reg_class 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0 }} #define REGNO_REG_CLASS(REGNO) gcn_regno_reg_class (REGNO) -#define MODE_CODE_BASE_REG_CLASS(MODE, AS, OUTER, INDEX) \ +#define MODE_CODE_BASE_REG_CLASS(MODE, AS, OUTER, INDEX, INSN) \ gcn_mode_code_base_reg_class (MODE, AS, OUTER, INDEX) -#define REGNO_MODE_CODE_OK_FOR_BASE_P(NUM, MODE, AS, OUTER, INDEX) \ +#define REGNO_MODE_CODE_OK_FOR_BASE_P(NUM, MODE, AS, OUTER, INDEX, INSN) \ gcn_regno_mode_code_ok_for_base_p (NUM, MODE, AS, OUTER, INDEX) #define INDEX_REG_CLASS VGPR_REGS #define REGNO_OK_FOR_INDEX_P(regno) regno_ok_for_index_p (regno) diff --git a/gcc/config/rl78/rl78.h b/gcc/config/rl78/rl78.h index 7a7c6a44ba2..d0ed9162292 100644 --- a/gcc/config/rl78/rl78.h +++ b/gcc/config/rl78/rl78.h @@ -375,10 +375,12 @@ enum reg_class #define REGNO_OK_FOR_INDEX_P(regno) REGNO_OK_FOR_BASE_P (regno) -#define REGNO_MODE_CODE_OK_FOR_BASE_P(regno, mode, address_space, outer_code, 
index_code) \ +#define REGNO_MODE_CODE_OK_FOR_BASE_P(regno, mode, address_space, outer_code, \ + index_code, insn) \ rl78_regno_mode_code_ok_for_base_p (regno, mode, address_space, outer_code, index_code) -#define MODE_CODE_BASE_REG_CLASS(mode, address_space, outer_code, index_code) \ +#define MODE_CODE_BASE_REG_CLASS(mode, address_space, outer_code, index_code, \ + insn) \ rl78_mode_code_base_reg_class (mode, address_space, outer_code, index_code) #define RETURN_ADDR_RTX(COUNT, FRAMEADDR) \ diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index d0d47b0d471..a4239e3de10 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -2533,7 +2533,7 @@ register address. You should define this macro if base plus index addresses have different requirements than other base register uses. @end defmac -@defmac MODE_CODE_BASE_REG_CLASS (@var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}) +@defmac MODE_CODE_BASE_REG_CLASS (@var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}, @var{insn}) A C expression whose value is the register class to which a valid base register for a memory reference in mode @var{mode} to address space @var{address_space} must belong. @var{outer_code} and @var{index_code} @@ -2542,6 +2542,8 @@ the code of the immediately enclosing expression (@code{MEM} for the top level of an address, @code{ADDRESS} for something that occurs in an @code{address_operand}). @var{index_code} is the code of the corresponding index expression if @var{outer_code} is @code{PLUS}; @code{SCRATCH} otherwise. +@code{insn} indicates insn specific base register class should be subset +of the original base register class. @end defmac @defmac INDEX_REG_CLASS @@ -2579,7 +2581,7 @@ Use of this macro is deprecated; please use the more general @code{REGNO_MODE_CODE_OK_FOR_BASE_P}. 
@end defmac -@defmac REGNO_MODE_CODE_OK_FOR_BASE_P (@var{num}, @var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}) +@defmac REGNO_MODE_CODE_OK_FOR_BASE_P (@var{num}, @var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}, @var{insn}) A C expression which is nonzero if register number @var{num} is suitable for use as a base register in operand addresses, accessing memory in mode @var{mode} in address space @var{address_space}. @@ -2592,6 +2594,8 @@ address, @code{ADDRESS} for something that occurs in an corresponding index expression if @var{outer_code} is @code{PLUS}; @code{SCRATCH} otherwise. The mode may be @code{VOIDmode} for addresses that appear outside a @code{MEM}, i.e., as an @code{address_operand}. +@code{insn} indicates insn specific base register class should be subset +of the original base register class. @end defmac @defmac REGNO_OK_FOR_INDEX_P (@var{num}) diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 4ac96dc357d..72898f3adba 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -2128,7 +2128,7 @@ register address. You should define this macro if base plus index addresses have different requirements than other base register uses. @end defmac -@defmac MODE_CODE_BASE_REG_CLASS (@var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}) +@defmac MODE_CODE_BASE_REG_CLASS (@var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}, @var{insn}) A C expression whose value is the register class to which a valid base register for a memory reference in mode @var{mode} to address space @var{address_space} must belong. @var{outer_code} and @var{index_code} @@ -2137,6 +2137,8 @@ the code of the immediately enclosing expression (@code{MEM} for the top level of an address, @code{ADDRESS} for something that occurs in an @code{address_operand}). @var{index_code} is the code of the corresponding index expression if @var{outer_code} is @code{PLUS}; @code{SCRATCH} otherwise. 
+@code{insn} indicates insn specific base register class should be subset +of the original base register class. @end defmac @defmac INDEX_REG_CLASS @@ -2174,7 +2176,7 @@ Use of this macro is deprecated; please use the more general @code{REGNO_MODE_CODE_OK_FOR_BASE_P}. @end defmac -@defmac REGNO_MODE_CODE_OK_FOR_BASE_P (@var{num}, @var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}) +@defmac REGNO_MODE_CODE_OK_FOR_BASE_P (@var{num}, @var{mode}, @var{address_space}, @var{outer_code}, @var{index_code}, @var{insn}) A C expression which is nonzero if register number @var{num} is suitable for use as a base register in operand addresses, accessing memory in mode @var{mode} in address space @var{address_space}. @@ -2187,6 +2189,8 @@ address, @code{ADDRESS} for something that occurs in an corresponding index expression if @var{outer_code} is @code{PLUS}; @code{SCRATCH} otherwise. The mode may be @code{VOIDmode} for addresses that appear outside a @code{MEM}, i.e., as an @code{address_operand}. +@code{insn} indicates insn specific base register class should be subset +of the original base register class. @end defmac @defmac REGNO_OK_FOR_INDEX_P (@var{num}) diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc index c718bedff32..9e7915ce934 100644 --- a/gcc/lra-constraints.cc +++ b/gcc/lra-constraints.cc @@ -3672,7 +3672,7 @@ process_address_1 (int nop, bool check_only_p, REGNO (*ad.base_term)) != NULL_RTX) ? 
after : NULL), base_reg_class (ad.mode, ad.as, ad.base_outer_code, - get_index_code (&ad))))) + get_index_code (&ad), curr_insn)))) { change_p = true; if (ad.base_term2 != NULL) @@ -3722,7 +3722,8 @@ process_address_1 (int nop, bool check_only_p, rtx_insn *last = get_last_insn (); int code = -1; enum reg_class cl = base_reg_class (ad.mode, ad.as, - SCRATCH, SCRATCH); + SCRATCH, SCRATCH, + curr_insn); rtx addr = *ad.inner; new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "addr"); @@ -3785,7 +3786,8 @@ process_address_1 (int nop, bool check_only_p, /* index * scale + disp => new base + index * scale, case (1) above. */ enum reg_class cl = base_reg_class (ad.mode, ad.as, PLUS, - GET_CODE (*ad.index)); + GET_CODE (*ad.index), + curr_insn); lra_assert (INDEX_REG_CLASS != NO_REGS); new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "disp"); @@ -3846,7 +3848,7 @@ process_address_1 (int nop, bool check_only_p, *ad.base_term = XEXP (SET_SRC (set), 0); *ad.disp_term = XEXP (SET_SRC (set), 1); cl = base_reg_class (ad.mode, ad.as, ad.base_outer_code, - get_index_code (&ad)); + get_index_code (&ad), curr_insn); regno = REGNO (*ad.base_term); if (regno >= FIRST_PSEUDO_REGISTER && cl != lra_get_allocno_class (regno)) @@ -3890,7 +3892,8 @@ process_address_1 (int nop, bool check_only_p, else { enum reg_class cl = base_reg_class (ad.mode, ad.as, - SCRATCH, SCRATCH); + SCRATCH, SCRATCH, + curr_insn); rtx addr = *ad.inner; new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "addr"); @@ -4639,7 +4642,7 @@ curr_insn_transform (bool check_only_p) push_to_sequence (before); rclass = base_reg_class (GET_MODE (op), MEM_ADDR_SPACE (op), - MEM, SCRATCH); + MEM, SCRATCH, curr_insn); if (GET_RTX_CLASS (code) == RTX_AUTOINC) new_reg = emit_inc (rclass, *loc, *loc, /* This value does not matter for MODIFY. 
*/ diff --git a/gcc/reload.cc b/gcc/reload.cc index 2126bdd117c..72f7e27af15 100644 --- a/gcc/reload.cc +++ b/gcc/reload.cc @@ -3321,7 +3321,7 @@ find_reloads (rtx_insn *insn, int replace, int ind_levels, int live_known, were handled in find_reloads_address. */ this_alternative[i] = base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, - ADDRESS, SCRATCH); + ADDRESS, SCRATCH, insn); win = 1; badop = 0; break; @@ -3508,7 +3508,7 @@ find_reloads (rtx_insn *insn, int replace, int ind_levels, int live_known, the address into a base register. */ this_alternative[i] = base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, - ADDRESS, SCRATCH); + ADDRESS, SCRATCH, insn); badop = 0; break; @@ -4018,7 +4018,7 @@ find_reloads (rtx_insn *insn, int replace, int ind_levels, int live_known, operand_reloadnum[i] = push_reload (XEXP (recog_data.operand[i], 0), NULL_RTX, &XEXP (recog_data.operand[i], 0), (rtx*) 0, - base_reg_class (VOIDmode, as, MEM, SCRATCH), + base_reg_class (VOIDmode, as, MEM, SCRATCH, insn), address_mode, VOIDmode, 0, 0, i, RELOAD_OTHER); rld[operand_reloadnum[i]].inc @@ -4897,7 +4897,8 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad, if (reg_equiv_constant (regno) != 0) { find_reloads_address_part (reg_equiv_constant (regno), loc, - base_reg_class (mode, as, MEM, SCRATCH), + base_reg_class (mode, as, MEM, + SCRATCH, insn), GET_MODE (ad), opnum, type, ind_levels); return 1; } @@ -4966,7 +4967,7 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad, /* If we do not have one of the cases above, we must do the reload. */ push_reload (ad, NULL_RTX, loc, (rtx*) 0, - base_reg_class (mode, as, MEM, SCRATCH), + base_reg_class (mode, as, MEM, SCRATCH, insn), GET_MODE (ad), VOIDmode, 0, 0, opnum, type); return 1; } @@ -5123,7 +5124,8 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad, reload the sum into a base reg. That will at least work. 
*/ find_reloads_address_part (ad, loc, - base_reg_class (mode, as, MEM, SCRATCH), + base_reg_class (mode, as, MEM, + SCRATCH, insn), GET_MODE (ad), opnum, type, ind_levels); } return ! removed_and; @@ -5203,7 +5205,7 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad, op_index == 0 ? addend : offset_reg); *loc = ad; - cls = base_reg_class (mode, as, MEM, GET_CODE (addend)); + cls = base_reg_class (mode, as, MEM, GET_CODE (addend), insn); find_reloads_address_part (XEXP (ad, op_index), &XEXP (ad, op_index), cls, GET_MODE (ad), opnum, type, ind_levels); @@ -5261,7 +5263,8 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad, } find_reloads_address_part (ad, loc, - base_reg_class (mode, as, MEM, SCRATCH), + base_reg_class (mode, as, MEM, + SCRATCH, insn), address_mode, opnum, type, ind_levels); return ! removed_and; } @@ -5513,7 +5516,8 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as, if (context == 1) context_reg_class = INDEX_REG_CLASS; else - context_reg_class = base_reg_class (mode, as, outer_code, index_code); + context_reg_class = base_reg_class (mode, as, outer_code, index_code, + insn); switch (code) { @@ -5738,7 +5742,8 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as, reloadnum = push_reload (tem, tem, &XEXP (x, 0), &XEXP (op1, 0), base_reg_class (mode, as, - code, index_code), + code, index_code, + insn), GET_MODE (x), GET_MODE (x), 0, 0, opnum, RELOAD_OTHER); @@ -5756,7 +5761,8 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as, reloadnum = push_reload (XEXP (op1, 0), XEXP (x, 0), &XEXP (op1, 0), &XEXP (x, 0), base_reg_class (mode, as, - code, index_code), + code, index_code, + insn), GET_MODE (x), GET_MODE (x), 0, 0, opnum, RELOAD_OTHER); @@ -6216,7 +6222,7 @@ find_reloads_subreg_address (rtx x, int opnum, enum reload_type type, { push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0, base_reg_class (GET_MODE (tem), MEM_ADDR_SPACE (tem), - MEM, SCRATCH), + MEM, SCRATCH, insn), 
GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0, opnum, type); reloaded = 1; } diff --git a/gcc/reload1.cc b/gcc/reload1.cc index 9ba822d1ff7..f41f4a4de22 100644 --- a/gcc/reload1.cc +++ b/gcc/reload1.cc @@ -1382,7 +1382,7 @@ maybe_fix_stack_asms (void) if (insn_extra_address_constraint (cn)) cls = (int) reg_class_subunion[cls] [(int) base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, - ADDRESS, SCRATCH)]; + ADDRESS, SCRATCH, chain->insn)]; else cls = (int) reg_class_subunion[cls] [reg_class_for_constraint (cn)]; From patchwork Thu Aug 31 08:20:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137237 To: gcc-patches@gcc.gnu.org Subject: [PATCH 02/13] [APX EGPR] middle-end: Add index_reg_class with insn argument. 
Date: Thu, 31 Aug 2023 16:20:13 +0800 Message-Id: <20230831082024.314097-3-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz

Like base_reg_class, INDEX_REG_CLASS does not support per-insn
selection by the backend. Add index_reg_class with an insn argument
for lra/reload usage.

gcc/ChangeLog:

	* addresses.h (index_reg_class): New wrapper function like
	base_reg_class.
	* doc/tm.texi: Document INSN_INDEX_REG_CLASS.
	* doc/tm.texi.in: Ditto.
	* lra-constraints.cc (index_part_to_reg): Add index_class
	argument and use it instead of INDEX_REG_CLASS.
	(process_address_1): Call index_reg_class with curr_insn and
	replace INDEX_REG_CLASS with its return value index_cl.
	* reload.cc (find_reloads_address): Likewise.
	(find_reloads_address_1): Likewise.
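Note for readers: a minimal standalone sketch of the fallback behaviour of
the index_reg_class wrapper, not code from this series. The classes
INDEX_LEGACY_REGS and INDEX_ALL_REGS, the simplified rtx_insn struct, and
the particular definition of INSN_INDEX_REG_CLASS are hypothetical. The
wrapper body mirrors the shape added to addresses.h: use
INSN_INDEX_REG_CLASS when the target defines it, otherwise fall back to
INDEX_REG_CLASS.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for GCC's real types.
enum reg_class { NO_REGS, INDEX_LEGACY_REGS, INDEX_ALL_REGS };
struct rtx_insn { bool allows_extended_regs; };

// Insn-independent class that every target provides.
#define INDEX_REG_CLASS INDEX_ALL_REGS

// Hypothetical target definition of the new optional macro.  Deleting
// this definition makes the wrapper below take the #else branch.
#define INSN_INDEX_REG_CLASS(insn) \
  ((insn) && !(insn)->allows_extended_regs \
   ? INDEX_LEGACY_REGS : INDEX_ALL_REGS)

// Wrapper in the style of the patch: prefer the insn-aware macro and
// fall back to the plain class when the target does not define it.
inline reg_class
index_reg_class (rtx_insn *insn = nullptr)
{
#ifdef INSN_INDEX_REG_CLASS
  return INSN_INDEX_REG_CLASS (insn);
#else
  (void) insn;
  return INDEX_REG_CLASS;
#endif
}
```

Targets that never define INSN_INDEX_REG_CLASS see no behaviour change,
since the wrapper then always returns INDEX_REG_CLASS.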
--- gcc/addresses.h | 10 ++++++++++ gcc/doc/tm.texi | 9 +++++++++ gcc/doc/tm.texi.in | 9 +++++++++ gcc/lra-constraints.cc | 17 +++++++++-------- gcc/reload.cc | 4 ++-- 5 files changed, 39 insertions(+), 10 deletions(-) diff --git a/gcc/addresses.h b/gcc/addresses.h index 08b100cfe6d..4bd96a3fc83 100644 --- a/gcc/addresses.h +++ b/gcc/addresses.h @@ -47,6 +47,16 @@ base_reg_class (machine_mode mode ATTRIBUTE_UNUSED, #endif } +inline enum reg_class +index_reg_class (rtx_insn *insn ATTRIBUTE_UNUSED = NULL) +{ +#ifdef INSN_INDEX_REG_CLASS + return INSN_INDEX_REG_CLASS (insn); +#else + return INDEX_REG_CLASS; +#endif +} + /* Wrapper function to unify target macros REGNO_MODE_CODE_OK_FOR_BASE_P, REGNO_MODE_OK_FOR_REG_BASE_P, REGNO_MODE_OK_FOR_BASE_P and REGNO_OK_FOR_BASE_P. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index a4239e3de10..5a50f5cf7f3 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -2553,6 +2553,15 @@ address where its value is either multiplied by a scale factor or added to another register (as well as added to a displacement). @end defmac +@defmac INSN_INDEX_REG_CLASS (@var{insn}) +A C expression whose value is the register class to which a valid +index register must belong. An index register is one used in an +address where its value is either multiplied by a scale factor or +added to another register (as well as added to a displacement). +@code{insn} indicates insn specific index register class should be +subset of the original index register class. +@end defmac + @defmac REGNO_OK_FOR_BASE_P (@var{num}) A C expression which is nonzero if register number @var{num} is suitable for use as a base register in operand addresses. diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 72898f3adba..65748e19ccd 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -2148,6 +2148,15 @@ address where its value is either multiplied by a scale factor or added to another register (as well as added to a displacement). 
@end defmac +@defmac INSN_INDEX_REG_CLASS (@var{insn}) +A C expression whose value is the register class to which a valid +index register must belong. An index register is one used in an +address where its value is either multiplied by a scale factor or +added to another register (as well as added to a displacement). +@code{insn} indicates insn specific index register class should be +subset of the original index register class. +@end defmac + @defmac REGNO_OK_FOR_BASE_P (@var{num}) A C expression which is nonzero if register number @var{num} is suitable for use as a base register in operand addresses. diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc index 9e7915ce934..161b67d8b73 100644 --- a/gcc/lra-constraints.cc +++ b/gcc/lra-constraints.cc @@ -3390,12 +3390,12 @@ base_plus_disp_to_reg (struct address_info *ad, rtx disp) /* Make reload of index part of address AD. Return the new pseudo. */ static rtx -index_part_to_reg (struct address_info *ad) +index_part_to_reg (struct address_info *ad, enum reg_class index_class) { rtx new_reg; new_reg = lra_create_new_reg (GET_MODE (*ad->index), NULL_RTX, - INDEX_REG_CLASS, NULL, "index term"); + index_class, NULL, "index term"); expand_mult (GET_MODE (*ad->index), *ad->index_term, GEN_INT (get_index_scale (ad)), new_reg, 1); return new_reg; @@ -3650,13 +3650,14 @@ process_address_1 (int nop, bool check_only_p, /* If INDEX_REG_CLASS is assigned to base_term already and isn't to index_term, swap them so to avoid assigning INDEX_REG_CLASS to both when INDEX_REG_CLASS is a single register class. */ + enum reg_class index_cl = index_reg_class (curr_insn); if (ad.base_term != NULL && ad.index_term != NULL - && ira_class_hard_regs_num[INDEX_REG_CLASS] == 1 + && ira_class_hard_regs_num[index_cl] == 1 && REG_P (*ad.base_term) && REG_P (*ad.index_term) - && in_class_p (*ad.base_term, INDEX_REG_CLASS, NULL) - && ! in_class_p (*ad.index_term, INDEX_REG_CLASS, NULL)) + && in_class_p (*ad.base_term, index_cl, NULL) + && ! 
in_class_p (*ad.index_term, index_cl, NULL)) { std::swap (ad.base, ad.index); std::swap (ad.base_term, ad.index_term); @@ -3680,7 +3681,7 @@ process_address_1 (int nop, bool check_only_p, } if (ad.index_term != NULL && process_addr_reg (ad.index_term, check_only_p, - before, NULL, INDEX_REG_CLASS)) + before, NULL, index_cl)) change_p = true; /* Target hooks sometimes don't treat extra-constraint addresses as @@ -3789,7 +3790,7 @@ process_address_1 (int nop, bool check_only_p, GET_CODE (*ad.index), curr_insn); - lra_assert (INDEX_REG_CLASS != NO_REGS); + lra_assert (index_cl != NO_REGS); new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, NULL, "disp"); lra_emit_move (new_reg, *ad.disp); *ad.inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), @@ -3885,7 +3886,7 @@ process_address_1 (int nop, bool check_only_p, changed pseudo on the equivalent memory and a subreg of the pseudo onto the memory of different mode for which the scale is prohibitted. */ - new_reg = index_part_to_reg (&ad); + new_reg = index_part_to_reg (&ad, index_cl); *ad.inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), *ad.base_term, new_reg); } diff --git a/gcc/reload.cc b/gcc/reload.cc index 72f7e27af15..66b484b12fa 100644 --- a/gcc/reload.cc +++ b/gcc/reload.cc @@ -5114,7 +5114,7 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, rtx ad, /* Reload the displacement into an index reg. We assume the frame pointer or arg pointer is a base reg. 
*/ find_reloads_address_part (XEXP (ad, 1), &XEXP (ad, 1), - INDEX_REG_CLASS, GET_MODE (ad), opnum, + index_reg_class (insn), GET_MODE (ad), opnum, type, ind_levels); return 0; } @@ -5514,7 +5514,7 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as, bool reloaded_inner_of_autoinc = false; if (context == 1) - context_reg_class = INDEX_REG_CLASS; + context_reg_class = index_reg_class (insn); else context_reg_class = base_reg_class (mode, as, outer_code, index_code, insn);
From patchwork Thu Aug 31 08:20:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137240 To: gcc-patches@gcc.gnu.org Subject: [PATCH 03/13] [APX_EGPR] Initial support for APX_F Date: Thu, 31 Aug 2023 16:20:14 +0800 Message-Id: <20230831082024.314097-4-hongyu.wang@intel.com> In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz
From: Kong Lingling Add -mapx-features= enumeration to separate subfeatures of APX_F. -mapxf is treated same as previous ISA flag, while it sets -mapx-features=apx_all that enables all subfeatures. gcc/ChangeLog: * common/config/i386/cpuinfo.h (XSTATE_APX_F): New macro. (XCR_APX_F_ENABLED_MASK): Likewise. (get_available_features): Detect APX_F under * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_APX_F_SET): New. (OPTION_MASK_ISA2_APX_F_UNSET): Likewise. (ix86_handle_option): Handle -mapxf. * common/config/i386/i386-cpuinfo.h (FEATURE_APX_F): New. * common/config/i386/i386-isas.h: Add entry for APX_F. * config/i386/cpuid.h (bit_APX_F): New. * config/i386/i386.h (bit_APX_F): (TARGET_APX_EGPR, TARGET_APX_PUSH2POP2, TARGET_APX_NDD): New define. * config/i386/i386-opts.h (enum apx_features): New enum. * config/i386/i386-isa.def (APX_F): New DEF_PTA. * config/i386/i386-options.cc (ix86_function_specific_save): Save ix86_apx_features. (ix86_function_specific_restore): Restore it. (ix86_valid_target_attribute_inner_p): Add mapxf. (ix86_option_override_internal): Set ix86_apx_features for PTA and TARGET_APX_F. Also reports error when APX_F is set but not having TARGET_64BIT. * config/i386/i386.opt: (-mapxf): New ISA flag option. (-mapx=): New enumeration option. (apx_features): New enum type. 
(apx_none): New enum value. (apx_egpr): Likewise. (apx_push2pop2): Likewise. (apx_ndd): Likewise. (apx_all): Likewise. * doc/invoke.texi: Document mapxf. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-1.c: New test. --- gcc/common/config/i386/cpuinfo.h | 12 +++++++++++- gcc/common/config/i386/i386-common.cc | 17 +++++++++++++++++ gcc/common/config/i386/i386-cpuinfo.h | 1 + gcc/common/config/i386/i386-isas.h | 1 + gcc/config/i386/cpuid.h | 1 + gcc/config/i386/i386-isa.def | 1 + gcc/config/i386/i386-options.cc | 15 +++++++++++++++ gcc/config/i386/i386-opts.h | 8 ++++++++ gcc/config/i386/i386.h | 4 ++++ gcc/config/i386/i386.opt | 25 +++++++++++++++++++++++++ gcc/doc/invoke.texi | 11 +++++++---- gcc/testsuite/gcc.target/i386/apx-1.c | 8 ++++++++ 12 files changed, 99 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-1.c diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h index 24ae0dbf0ac..141d3743316 100644 --- a/gcc/common/config/i386/cpuinfo.h +++ b/gcc/common/config/i386/cpuinfo.h @@ -678,6 +678,7 @@ get_available_features (struct __processor_model *cpu_model, #define XSTATE_HI_ZMM 0x80 #define XSTATE_TILECFG 0x20000 #define XSTATE_TILEDATA 0x40000 +#define XSTATE_APX_F 0x80000 #define XCR_AVX_ENABLED_MASK \ (XSTATE_SSE | XSTATE_YMM) @@ -685,11 +686,13 @@ get_available_features (struct __processor_model *cpu_model, (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM) #define XCR_AMX_ENABLED_MASK \ (XSTATE_TILECFG | XSTATE_TILEDATA) +#define XCR_APX_F_ENABLED_MASK XSTATE_APX_F - /* Check if AVX and AVX512 are usable. */ + /* Check if AVX, AVX512 and APX are usable. */ int avx_usable = 0; int avx512_usable = 0; int amx_usable = 0; + int apx_usable = 0; /* Check if KL is usable. 
*/ int has_kl = 0; if ((ecx & bit_OSXSAVE)) @@ -709,6 +712,8 @@ get_available_features (struct __processor_model *cpu_model, } amx_usable = ((xcrlow & XCR_AMX_ENABLED_MASK) == XCR_AMX_ENABLED_MASK); + apx_usable = ((xcrlow & XCR_APX_F_ENABLED_MASK) + == XCR_APX_F_ENABLED_MASK); } #define set_feature(f) \ @@ -922,6 +927,11 @@ get_available_features (struct __processor_model *cpu_model, if (edx & bit_AMX_COMPLEX) set_feature (FEATURE_AMX_COMPLEX); } + if (apx_usable) + { + if (edx & bit_APX_F) + set_feature (FEATURE_APX_F); + } } } diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc index 95468b7c405..86596e96ad1 100644 --- a/gcc/common/config/i386/i386-common.cc +++ b/gcc/common/config/i386/i386-common.cc @@ -123,6 +123,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_SM3_SET OPTION_MASK_ISA2_SM3 #define OPTION_MASK_ISA2_SHA512_SET OPTION_MASK_ISA2_SHA512 #define OPTION_MASK_ISA2_SM4_SET OPTION_MASK_ISA2_SM4 +#define OPTION_MASK_ISA2_APX_F_SET OPTION_MASK_ISA2_APX_F /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same as -msse4.2. */ @@ -309,6 +310,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_SM3_UNSET OPTION_MASK_ISA2_SM3 #define OPTION_MASK_ISA2_SHA512_UNSET OPTION_MASK_ISA2_SHA512 #define OPTION_MASK_ISA2_SM4_UNSET OPTION_MASK_ISA2_SM4 +#define OPTION_MASK_ISA2_APX_F_UNSET OPTION_MASK_ISA2_APX_F /* SSE4 includes both SSE4.1 and SSE4.2. -mno-sse4 should the same as -mno-sse4.1. 
*/ @@ -1341,6 +1343,21 @@ ix86_handle_option (struct gcc_options *opts, } return true; + case OPT_mapxf: + if (value) + { + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_APX_F_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_APX_F_SET; + opts->x_ix86_apx_features = apx_all; + } + else + { + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_APX_F_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_APX_F_UNSET; + opts->x_ix86_apx_features = apx_none; + } + return true; + case OPT_mfma: if (value) { diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h index 9153b4d0a54..8bf592191ab 100644 --- a/gcc/common/config/i386/i386-cpuinfo.h +++ b/gcc/common/config/i386/i386-cpuinfo.h @@ -261,6 +261,7 @@ enum processor_features FEATURE_SM3, FEATURE_SHA512, FEATURE_SM4, + FEATURE_APX_F, CPU_FEATURE_MAX }; diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h index 2297903a45e..47e0cbd6f5b 100644 --- a/gcc/common/config/i386/i386-isas.h +++ b/gcc/common/config/i386/i386-isas.h @@ -191,4 +191,5 @@ ISA_NAMES_TABLE_START ISA_NAMES_TABLE_ENTRY("sm3", FEATURE_SM3, P_NONE, "-msm3") ISA_NAMES_TABLE_ENTRY("sha512", FEATURE_SHA512, P_NONE, "-msha512") ISA_NAMES_TABLE_ENTRY("sm4", FEATURE_SM4, P_NONE, "-msm4") + ISA_NAMES_TABLE_ENTRY("apxf", FEATURE_APX_F, P_NONE, "-mapxf") ISA_NAMES_TABLE_END diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h index 73c15480350..f3d3a2a1c22 100644 --- a/gcc/config/i386/cpuid.h +++ b/gcc/config/i386/cpuid.h @@ -149,6 +149,7 @@ #define bit_AVXNECONVERT (1 << 5) #define bit_AVXVNNIINT16 (1 << 10) #define bit_PREFETCHI (1 << 14) +#define bit_APX_F (1 << 21) /* Extended State Enumeration Sub-leaf (%eax == 0xd, %ecx == 1) */ #define bit_XSAVEOPT (1 << 0) diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def index aeafcf870ac..c581f343339 100644 --- a/gcc/config/i386/i386-isa.def +++ b/gcc/config/i386/i386-isa.def @@ -121,3 +121,4 @@ 
DEF_PTA(AVXVNNIINT16) DEF_PTA(SM3) DEF_PTA(SHA512) DEF_PTA(SM4) +DEF_PTA(APX_F) diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index e47f9ed5d5f..8881462e3b0 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -694,6 +694,7 @@ ix86_function_specific_save (struct cl_target_option *ptr, ptr->branch_cost = ix86_branch_cost; ptr->tune_defaulted = ix86_tune_defaulted; ptr->arch_specified = ix86_arch_specified; + ptr->x_ix86_apx_features = opts->x_ix86_apx_features; ptr->x_ix86_isa_flags_explicit = opts->x_ix86_isa_flags_explicit; ptr->x_ix86_isa_flags2_explicit = opts->x_ix86_isa_flags2_explicit; ptr->x_recip_mask_explicit = opts->x_recip_mask_explicit; @@ -832,6 +833,7 @@ ix86_function_specific_restore (struct gcc_options *opts, ix86_prefetch_sse = ptr->prefetch_sse; ix86_tune_defaulted = ptr->tune_defaulted; ix86_arch_specified = ptr->arch_specified; + opts->x_ix86_apx_features = ptr->x_ix86_apx_features; opts->x_ix86_isa_flags_explicit = ptr->x_ix86_isa_flags_explicit; opts->x_ix86_isa_flags2_explicit = ptr->x_ix86_isa_flags2_explicit; opts->x_recip_mask_explicit = ptr->x_recip_mask_explicit; @@ -1109,6 +1111,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], IX86_ATTR_ISA ("sm3", OPT_msm3), IX86_ATTR_ISA ("sha512", OPT_msha512), IX86_ATTR_ISA ("sm4", OPT_msm4), + IX86_ATTR_ISA ("apxf", OPT_mapxf), /* enum options */ IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_), @@ -2080,6 +2083,9 @@ ix86_option_override_internal (bool main_args_p, opts->x_ix86_stringop_alg = no_stringop; } + if (TARGET_APX_F && !TARGET_64BIT) + error ("%<-mapxf%> is not supported for 32-bit code"); + if (TARGET_UINTR && !TARGET_64BIT) error ("%<-muintr%> not supported for 32-bit code"); @@ -2293,6 +2299,11 @@ ix86_option_override_internal (bool main_args_p, SET_TARGET_POPCNT (opts); } + if (TARGET_64BIT_P (opts->x_ix86_isa_flags) + && ((processor_alias_table[i].flags & PTA_APX_F) != 0) + && 
!TARGET_EXPLICIT_APX_F_P (opts)) + opts->x_ix86_apx_features = apx_all; + if ((processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE)) != 0) ix86_prefetch_sse = true; @@ -2444,6 +2455,10 @@ ix86_option_override_internal (bool main_args_p, /* Arrange to set up i386_stack_locals for all functions. */ init_machine_status = ix86_init_machine_status; + /* Override APX flag here if ISA bit is set. */ + if (TARGET_APX_F && opts->x_ix86_apx_features != apx_all) + opts->x_ix86_apx_features = apx_all; + /* Validate -mregparm= value. */ if (opts_set->x_ix86_regparm) { diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h index be359f3e3d5..2ec76a16bce 100644 --- a/gcc/config/i386/i386-opts.h +++ b/gcc/config/i386/i386-opts.h @@ -134,4 +134,12 @@ enum lam_type { lam_u57 }; +enum apx_features { + apx_none = 0, + apx_egpr = 1 << 0, + apx_push2pop2 = 1 << 1, + apx_ndd = 1 << 2, + apx_all = apx_egpr | apx_push2pop2 | apx_ndd, +}; + #endif diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 3e8488f2ae8..8c7ed541a8f 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -51,6 +51,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2) +#define TARGET_APX_EGPR (ix86_apx_features & apx_egpr) +#define TARGET_APX_PUSH2POP2 (ix86_apx_features & apx_push2pop2) +#define TARGET_APX_NDD (ix86_apx_features & apx_ndd) + #include "config/vxworks-dummy.h" #include "config/i386/i386-opts.h" diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index 78b499304a4..1ee4d90186e 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1310,3 +1310,28 @@ Enable vectorization for gather instruction. mscatter Target Alias(mtune-ctrl=, use_scatter, ^use_scatter) Enable vectorization for scatter instruction. + +mapxf +Target Mask(ISA2_APX_F) Var(ix86_isa_flags2) Save +Support APX code generation. 
+ +mapx= +Target Joined Enum(apx_features) EnumSet Var(ix86_apx_features) Init(apx_none) Save + +Enum +Name(apx_features) Type(int) + +EnumValue +Enum(apx_features) String(none) Value(apx_none) Set(1) + +EnumValue +Enum(apx_features) String(egpr) Value(apx_egpr) Set(2) + +EnumValue +Enum(apx_features) String(push2pop2) Value(apx_push2pop2) Set(3) + +EnumValue +Enum(apx_features) String(ndd) Value(apx_ndd) Set(4) + +EnumValue +Enum(apx_features) String(all) Value(apx_all) Set(1) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 16aa92b5e86..48d7ccc3be8 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1438,7 +1438,7 @@ See RS/6000 and PowerPC Options. -mrdseed -msgx -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -muintr -mhreset -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 --mprefetchi -mraoint -mamx-complex -mavxvnniint16 -msm3 -msha512 -msm4 +-mprefetchi -mraoint -mamx-complex -mavxvnniint16 -msm3 -msha512 -msm4 -mapxf -mcldemote -mms-bitfields -mno-align-stringops -minline-all-stringops -minline-stringops-dynamically -mstringop-strategy=@var{alg} -mkl -mwidekl @@ -33688,6 +33688,9 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}. @need 200 @opindex msm4 @itemx -msm4 +@need 200 +@opindex mapxf +@itemx -mapxf These switches enable the use of instructions in the MMX, SSE, AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA, AES, PCLMUL, CLFLUSHOPT, CLWB, FSGSBASE, PTWRITE, RDRND, F16C, FMA, PCONFIG, @@ -33698,9 +33701,9 @@ GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16, ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE, UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512-FP16, AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16, PREFETCHI, RAOINT, -AMX-COMPLEX, AVXVNNIINT16, SM3, SHA512, SM4 or CLDEMOTE extended instruction -sets. 
Each has a corresponding @option{-mno-} option to disable use of these -instructions. +AMX-COMPLEX, AVXVNNIINT16, SM3, SHA512, SM4, APX_F or CLDEMOTE extended +instruction sets. Each has a corresponding @option{-mno-} option to disable +use of these instructions. These extensions are also available as built-in functions: see @ref{x86 Built-in Functions}, for details of the functions enabled and diff --git a/gcc/testsuite/gcc.target/i386/apx-1.c b/gcc/testsuite/gcc.target/i386/apx-1.c new file mode 100644 index 00000000000..956229ab6e3 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-1.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mapxf" } */ +/* { dg-error "'-mapxf' is not supported for 32-bit code" "" { target ia32 } 0 } */ + +void +apx_handler () +{ +}
From patchwork Thu Aug 31 08:20:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137245 To: gcc-patches@gcc.gnu.org Subject: [PATCH 04/13] [APX EGPR] Add 16 new integer general purpose registers Date: Thu, 31 Aug 2023 16:20:15 +0800 Message-Id: <20230831082024.314097-5-hongyu.wang@intel.com> In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz
From: Kong Lingling Extend GENERAL_REGS with extra r16-r31 registers like REX registers, named 
as REX2 registers. They will only be enabled under TARGET_APX_EGPR. gcc/ChangeLog: * config/i386/i386-protos.h (x86_extended_rex2reg_mentioned_p): New function prototype. * config/i386/i386.cc (regclass_map): Add mapping for 16 new general registers. (debugger64_register_map): Likewise. (ix86_conditional_register_usage): Clear REX2 registers when APX is disabled. (ix86_code_end): Add handling for REX2 reg. (print_reg): Likewise. (ix86_output_jmp_thunk_or_indirect): Likewise. (ix86_output_indirect_branch_via_reg): Likewise. (ix86_attr_length_vex_default): Likewise. (ix86_emit_save_regs): Adjust to allow saving r31. (ix86_register_priority): Set REX2 reg priority same as REX. (x86_extended_reg_mentioned_p): Add check for REX2 regs. (x86_extended_rex2reg_mentioned_p): New function. * config/i386/i386.h (CALL_USED_REGISTERS): Add new extended registers. (REG_ALLOC_ORDER): Likewise. (FIRST_REX2_INT_REG): Define. (LAST_REX2_INT_REG): Ditto. (GENERAL_REGS): Add 16 new registers. (INT_SSE_REGS): Likewise. (FLOAT_INT_REGS): Likewise. (FLOAT_INT_SSE_REGS): Likewise. (INT_MASK_REGS): Likewise. (ALL_REGS): Likewise. (REX2_INT_REG_P): Define. (REX2_INT_REGNO_P): Ditto. (GENERAL_REGNO_P): Add REX2_INT_REGNO_P. (REGNO_OK_FOR_INDEX_P): Ditto. (REG_OK_FOR_INDEX_NONSTRICT_P): Add new extended registers. * config/i386/i386.md: Add 16 new integer general registers. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-egprs-names.c: New test. * gcc.target/i386/apx-spill_to_egprs-1.c: Likewise. * gcc.target/i386/apx-interrupt-1.c: Likewise. 
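Note for readers of the series: the gating that this change adds to ix86_conditional_register_usage can be pictured with a small standalone C sketch. It is an illustration only, not code from the patch. The sketch_ names, the 32-bit mask, and the register numbers 16..31 are assumptions made up for this note; the real FIRST_REX2_INT_REG and LAST_REX2_INT_REG values are defined in i386.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register numbers, chosen only for this sketch.  The real
   FIRST_REX2_INT_REG / LAST_REX2_INT_REG values live in i386.h.  */
enum
{
  SKETCH_FIRST_REX2_INT_REG = 16,
  SKETCH_LAST_REX2_INT_REG = 31
};

/* Same shape as a REGNO_P-style predicate: a plain range check.  */
static bool
sketch_rex2_int_regno_p (int regno)
{
  return regno >= SKETCH_FIRST_REX2_INT_REG
	 && regno <= SKETCH_LAST_REX2_INT_REG;
}

/* Return ALL_REGS with the r16-r31 bits cleared unless both EGPR and
   64-bit mode are enabled, mirroring the condition
   !(TARGET_APX_EGPR && TARGET_64BIT) used in the patch.  */
static uint32_t
sketch_accessible_regs (uint32_t all_regs, bool apx_egpr, bool is_64bit)
{
  if (!(apx_egpr && is_64bit))
    for (int i = SKETCH_FIRST_REX2_INT_REG; i <= SKETCH_LAST_REX2_INT_REG; i++)
      all_regs &= ~(UINT32_C (1) << i);
  return all_regs;
}
```

With a full mask as input, the upper sixteen bits survive only when both flags are true; in every other combination the allocator in this sketch would never see r16-r31.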
--- gcc/config/i386/i386-protos.h | 1 + gcc/config/i386/i386.cc | 67 ++++++++++-- gcc/config/i386/i386.h | 47 +++++--- gcc/config/i386/i386.md | 18 +++- .../gcc.target/i386/apx-egprs-names.c | 17 +++ .../gcc.target/i386/apx-interrupt-1.c | 102 ++++++++++++++++++ .../gcc.target/i386/apx-spill_to_egprs-1.c | 25 +++++ 7 files changed, 253 insertions(+), 24 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-egprs-names.c create mode 100644 gcc/testsuite/gcc.target/i386/apx-interrupt-1.c create mode 100644 gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 9ffb125fc2b..bd4782800c4 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -64,6 +64,7 @@ extern bool symbolic_reference_mentioned_p (rtx); extern bool extended_reg_mentioned_p (rtx); extern bool x86_extended_QIreg_mentioned_p (rtx_insn *); extern bool x86_extended_reg_mentioned_p (rtx); +extern bool x86_extended_rex2reg_mentioned_p (rtx); extern bool x86_maybe_negate_const_int (rtx *, machine_mode); extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 1bc3f11ff07..d26d9ab0d9d 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -169,7 +169,12 @@ enum reg_class const regclass_map[FIRST_PSEUDO_REGISTER] = ALL_SSE_REGS, ALL_SSE_REGS, ALL_SSE_REGS, ALL_SSE_REGS, /* Mask registers. */ ALL_MASK_REGS, MASK_REGS, MASK_REGS, MASK_REGS, - MASK_REGS, MASK_REGS, MASK_REGS, MASK_REGS + MASK_REGS, MASK_REGS, MASK_REGS, MASK_REGS, + /* REX2 registers */ + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, + GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, GENERAL_REGS, }; /* The "default" register map used in 32bit mode. 
*/ @@ -227,7 +232,10 @@ int const debugger64_register_map[FIRST_PSEUDO_REGISTER] = /* AVX-512 registers 24-31 */ 75, 76, 77, 78, 79, 80, 81, 82, /* Mask registers */ - 118, 119, 120, 121, 122, 123, 124, 125 + 118, 119, 120, 121, 122, 123, 124, 125, + /* REX2 extended integer registers */ + 130, 131, 132, 133, 134, 135, 136, 137, + 138, 139, 140, 141, 142, 143, 144, 145 }; /* Define the register numbers to be used in Dwarf debugging information. @@ -521,6 +529,13 @@ ix86_conditional_register_usage (void) accessible_reg_set &= ~reg_class_contents[ALL_MASK_REGS]; } + + /* If APX EGPR is disabled, disable the REX2 registers. */ + if (! (TARGET_APX_EGPR && TARGET_64BIT)) + { + for (i = FIRST_REX2_INT_REG; i <= LAST_REX2_INT_REG; i++) + CLEAR_HARD_REG_BIT (accessible_reg_set, i); + } } /* Canonicalize a comparison from one we don't have to one we do have. */ @@ -6179,6 +6194,13 @@ ix86_code_end (void) regno, false); } + for (regno = FIRST_REX2_INT_REG; regno <= LAST_REX2_INT_REG; regno++) + { + if (TEST_HARD_REG_BIT (indirect_thunks_used, regno)) + output_indirect_thunk_function (indirect_thunk_prefix_none, + regno, false); + } + for (regno = FIRST_INT_REG; regno <= LAST_INT_REG; regno++) { char name[32]; @@ -7190,10 +7212,10 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align, static void ix86_emit_save_regs (void) { - unsigned int regno; + int regno; rtx_insn *insn; - for (regno = FIRST_PSEUDO_REGISTER - 1; regno-- > 0; ) + for (regno = FIRST_PSEUDO_REGISTER - 1; regno >= 0; regno--) if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true, true)) { insn = emit_insn (gen_push (gen_rtx_REG (word_mode, regno))); @@ -13037,7 +13059,7 @@ print_reg (rtx x, int code, FILE *file) /* Irritatingly, AMD extended registers use different naming convention: "r%d[bwd]" */ - if (REX_INT_REGNO_P (regno)) + if (REX_INT_REGNO_P (regno) || REX2_INT_REGNO_P (regno)) { gcc_assert (TARGET_64BIT); switch (msize) @@ -16251,7 +16273,7 @@ ix86_output_jmp_thunk_or_indirect (const char 
*thunk_name, const int regno) { if (thunk_name != NULL) { - if (REX_INT_REGNO_P (regno) + if ((REX_INT_REGNO_P (regno) || REX2_INT_REGNO_P (regno)) && ix86_indirect_branch_cs_prefix) fprintf (asm_out_file, "\tcs\n"); fprintf (asm_out_file, "\tjmp\t"); @@ -16303,7 +16325,7 @@ ix86_output_indirect_branch_via_reg (rtx call_op, bool sibcall_p) { if (thunk_name != NULL) { - if (REX_INT_REGNO_P (regno) + if ((REX_INT_REGNO_P (regno) || REX2_INT_REGNO_P (regno)) && ix86_indirect_branch_cs_prefix) fprintf (asm_out_file, "\tcs\n"); fprintf (asm_out_file, "\tcall\t"); @@ -17060,19 +17082,26 @@ ix86_attr_length_vex_default (rtx_insn *insn, bool has_0f_opcode, for (i = recog_data.n_operands - 1; i >= 0; --i) if (REG_P (recog_data.operand[i])) { - /* REX.W bit uses 3 byte VEX prefix. */ + /* REX.W bit uses 3 byte VEX prefix. + REX2 with VEX uses the extended EVEX prefix, whose length is 4 bytes. */ if (GET_MODE (recog_data.operand[i]) == DImode && GENERAL_REG_P (recog_data.operand[i])) return 3 + 1; /* REX.B bit requires 3-byte VEX. Right here we don't know which - operand will be encoded using VEX.B, so be conservative. */ + operand will be encoded using VEX.B, so be conservative. + REX2 with VEX uses the extended EVEX prefix, whose length is 4 bytes. */ if (REX_INT_REGNO_P (recog_data.operand[i]) + || REX2_INT_REGNO_P (recog_data.operand[i]) || REX_SSE_REGNO_P (recog_data.operand[i])) reg_only = 3 + 1; } else if (MEM_P (recog_data.operand[i])) { + /* REX2.X or REX2.B bits require the 4-byte extended EVEX prefix. */ + if (x86_extended_rex2reg_mentioned_p (recog_data.operand[i])) + return 4; + /* REX.X or REX.B bits use 3 byte VEX prefix. */ if (x86_extended_reg_mentioned_p (recog_data.operand[i])) return 3 + 1; @@ -19509,6 +19538,8 @@ ix86_register_priority (int hard_regno) /* New x86-64 int registers result in bigger code size. Discourage them. */ if (REX_INT_REGNO_P (hard_regno)) return 2; + if (REX2_INT_REGNO_P (hard_regno)) + return 2; /* New x86-64 SSE registers result in bigger code size. Discourage them.
*/ if (REX_SSE_REGNO_P (hard_regno)) return 2; @@ -22755,7 +22786,23 @@ x86_extended_reg_mentioned_p (rtx insn) { const_rtx x = *iter; if (REG_P (x) - && (REX_INT_REGNO_P (REGNO (x)) || REX_SSE_REGNO_P (REGNO (x)))) + && (REX_INT_REGNO_P (REGNO (x)) || REX_SSE_REGNO_P (REGNO (x)) + || REX2_INT_REGNO_P (REGNO (x)))) + return true; + } + return false; +} + +/* Return true when INSN mentions register that must be encoded using REX2 + prefix. */ +bool +x86_extended_rex2reg_mentioned_p (rtx insn) +{ + subrtx_iterator::array_type array; + FOR_EACH_SUBRTX (iter, array, INSN_P (insn) ? PATTERN (insn) : insn, NONCONST) + { + const_rtx x = *iter; + if (REG_P (x) && REX2_INT_REGNO_P (REGNO (x))) return true; } return false; diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 8c7ed541a8f..1ab291177f5 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -948,7 +948,11 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); /*xmm24,xmm25,xmm26,xmm27,xmm28,xmm29,xmm30,xmm31*/ \ 0, 0, 0, 0, 0, 0, 0, 0, \ /* k0, k1, k2, k3, k4, k5, k6, k7*/ \ - 0, 0, 0, 0, 0, 0, 0, 0 } + 0, 0, 0, 0, 0, 0, 0, 0, \ +/* r16, r17, r18, r19, r20, r21, r22, r23*/ \ + 0, 0, 0, 0, 0, 0, 0, 0, \ +/* r24, r25, r26, r27, r28, r29, r30, r31*/ \ + 0, 0, 0, 0, 0, 0, 0, 0} \ /* 1 for registers not available across function calls. These must include the FIXED_REGISTERS and also any @@ -985,7 +989,11 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); /*xmm24,xmm25,xmm26,xmm27,xmm28,xmm29,xmm30,xmm31*/ \ 1, 1, 1, 1, 1, 1, 1, 1, \ /* k0, k1, k2, k3, k4, k5, k6, k7*/ \ - 1, 1, 1, 1, 1, 1, 1, 1 } + 1, 1, 1, 1, 1, 1, 1, 1, \ +/* r16, r17, r18, r19, r20, r21, r22, r23*/ \ + 1, 1, 1, 1, 1, 1, 1, 1, \ +/* r24, r25, r26, r27, r28, r29, r30, r31*/ \ + 1, 1, 1, 1, 1, 1, 1, 1} \ /* Order in which to allocate registers. Each register must be listed once, even those in FIXED_REGISTERS. 
List frame pointer @@ -1001,7 +1009,8 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, \ 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, \ 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, \ - 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 } + 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, \ + 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91} /* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order to be rearranged based on a particular function. When using sse math, @@ -1203,6 +1212,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); #define FIRST_MASK_REG MASK0_REG #define LAST_MASK_REG MASK7_REG +#define FIRST_REX2_INT_REG R16_REG +#define LAST_REX2_INT_REG R31_REG + /* Override this in other tm.h files to cope with various OS lossage requiring a frame pointer. */ #ifndef SUBTARGET_FRAME_POINTER_REQUIRED @@ -1280,7 +1292,9 @@ enum reg_class INDEX_REGS, /* %eax %ebx %ecx %edx %esi %edi %ebp */ LEGACY_REGS, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp */ GENERAL_REGS, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp - %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ + %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 + %r16 %r17 %r18 %r19 %r20 %r21 %r22 %r23 + %r24 %r25 %r26 %r27 %r28 %r29 %r30 %r31 */ FP_TOP_REG, FP_SECOND_REG, /* %st(0) %st(1) */ FLOAT_REGS, SSE_FIRST_REG, @@ -1380,7 +1394,7 @@ enum reg_class { 0x7e, 0xff0, 0x0 }, /* TLS_GOTBASE_REGS */ \ { 0x7f, 0xff0, 0x0 }, /* INDEX_REGS */ \ { 0x900ff, 0x0, 0x0 }, /* LEGACY_REGS */ \ - { 0x900ff, 0xff0, 0x0 }, /* GENERAL_REGS */ \ + { 0x900ff, 0xff0, 0xffff000 }, /* GENERAL_REGS */ \ { 0x100, 0x0, 0x0 }, /* FP_TOP_REG */ \ { 0x200, 0x0, 0x0 }, /* FP_SECOND_REG */ \ { 0xff00, 0x0, 0x0 }, /* FLOAT_REGS */ \ @@ -1390,13 +1404,13 @@ enum reg_class { 0xff00000, 0xfffff000, 0xf }, /* ALL_SSE_REGS */ \ { 0xf0000000, 0xf, 0x0 }, /* MMX_REGS */ \ { 0xff0ff00, 0xfffff000, 0xf }, 
/* FLOAT_SSE_REGS */ \ - { 0x9ffff, 0xff0, 0x0 }, /* FLOAT_INT_REGS */ \ - { 0xff900ff, 0xfffffff0, 0xf }, /* INT_SSE_REGS */ \ - { 0xff9ffff, 0xfffffff0, 0xf }, /* FLOAT_INT_SSE_REGS */ \ + { 0x9ffff, 0xff0, 0xffff000 }, /* FLOAT_INT_REGS */ \ + { 0xff900ff, 0xfffffff0, 0xffff00f }, /* INT_SSE_REGS */ \ + { 0xff9ffff, 0xfffffff0, 0xffff00f }, /* FLOAT_INT_SSE_REGS */ \ { 0x0, 0x0, 0xfe0 }, /* MASK_REGS */ \ { 0x0, 0x0, 0xff0 }, /* ALL_MASK_REGS */ \ - { 0x900ff, 0xff0, 0xff0 }, /* INT_MASK_REGS */ \ -{ 0xffffffff, 0xffffffff, 0xfff } /* ALL_REGS */ \ + { 0x900ff, 0xff0, 0xffffff0 }, /* INT_MASK_REGS */ \ +{ 0xffffffff, 0xffffffff, 0xfffffff } /* ALL_REGS */ \ } /* The same information, inverted: @@ -1426,13 +1440,17 @@ enum reg_class #define REX_INT_REGNO_P(N) \ IN_RANGE ((N), FIRST_REX_INT_REG, LAST_REX_INT_REG) +#define REX2_INT_REG_P(X) (REG_P (X) && REX2_INT_REGNO_P (REGNO (X))) +#define REX2_INT_REGNO_P(N) \ + IN_RANGE ((N), FIRST_REX2_INT_REG, LAST_REX2_INT_REG) + #define GENERAL_REG_P(X) (REG_P (X) && GENERAL_REGNO_P (REGNO (X))) #define GENERAL_REGNO_P(N) \ - (LEGACY_INT_REGNO_P (N) || REX_INT_REGNO_P (N)) + (LEGACY_INT_REGNO_P (N) || REX_INT_REGNO_P (N) || REX2_INT_REGNO_P (N)) #define INDEX_REG_P(X) (REG_P (X) && INDEX_REGNO_P (REGNO (X))) #define INDEX_REGNO_P(N) \ - (LEGACY_INDEX_REGNO_P (N) || REX_INT_REGNO_P (N)) + (LEGACY_INDEX_REGNO_P (N) || REX_INT_REGNO_P (N) || REX2_INT_REGNO_P (N)) #define ANY_QI_REG_P(X) (REG_P (X) && ANY_QI_REGNO_P (REGNO (X))) #define ANY_QI_REGNO_P(N) \ @@ -1698,6 +1716,7 @@ typedef struct ix86_args { has been allocated, which happens in reginfo.cc during register allocation. 
*/ + #define REGNO_OK_FOR_INDEX_P(REGNO) \ (INDEX_REGNO_P (REGNO) \ || INDEX_REGNO_P (reg_renumber[(REGNO)])) @@ -1990,7 +2009,9 @@ do { \ "xmm20", "xmm21", "xmm22", "xmm23", \ "xmm24", "xmm25", "xmm26", "xmm27", \ "xmm28", "xmm29", "xmm30", "xmm31", \ - "k0", "k1", "k2", "k3", "k4", "k5", "k6", "k7" } + "k0", "k1", "k2", "k3", "k4", "k5", "k6", "k7", \ + "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", \ + "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31" } #define REGISTER_NAMES HI_REGISTER_NAMES diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index eef8a0e01eb..e3270658cb7 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -464,7 +464,23 @@ (define_constants (MASK5_REG 73) (MASK6_REG 74) (MASK7_REG 75) - (FIRST_PSEUDO_REG 76) + (R16_REG 76) + (R17_REG 77) + (R18_REG 78) + (R19_REG 79) + (R20_REG 80) + (R21_REG 81) + (R22_REG 82) + (R23_REG 83) + (R24_REG 84) + (R25_REG 85) + (R26_REG 86) + (R27_REG 87) + (R28_REG 88) + (R29_REG 89) + (R30_REG 90) + (R31_REG 91) + (FIRST_PSEUDO_REG 92) ]) ;; Insn callee abi index. 
diff --git a/gcc/testsuite/gcc.target/i386/apx-egprs-names.c b/gcc/testsuite/gcc.target/i386/apx-egprs-names.c new file mode 100644 index 00000000000..445bcf2c250 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-egprs-names.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-mapxf -m64" } */ +/* { dg-final { scan-assembler "r31" } } */ +/* { dg-final { scan-assembler "r30" } } */ +/* { dg-final { scan-assembler "r29" } } */ +/* { dg-final { scan-assembler "r28" } } */ +void foo () +{ + register long a __asm ("r31"); + register int b __asm ("r30"); + register short c __asm ("r29"); + register char d __asm ("r28"); + __asm__ __volatile__ ("mov %0, %%rax" : : "r" (a) : "rax"); + __asm__ __volatile__ ("mov %0, %%eax" : : "r" (b) : "eax"); + __asm__ __volatile__ ("mov %0, %%eax" : : "r" (c) : "eax"); + __asm__ __volatile__ ("mov %0, %%eax" : : "r" (d) : "eax"); +} diff --git a/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c b/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c new file mode 100644 index 00000000000..441dbf04bf2 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c @@ -0,0 +1,102 @@ +/* { dg-do compile } */ +/* { dg-options "-mapxf -m64 -O2 -mgeneral-regs-only -mno-cld -mno-push-args -maccumulate-outgoing-args" } */ + +extern void foo (void *) __attribute__ ((interrupt)); +extern int bar (int); + +void foo (void *frame) +{ + int a,b,c,d,e,f,i; + a = bar (5); + b = bar (a); + c = bar (b); + d = bar (c); + e = bar (d); + f = bar (e); + for (i = 1; i < 10; i++) + { + a += bar (a + i) + bar (b + i) + + bar (c + i) + bar (d + i) + + bar (e + i) + bar (f + i); + } +} +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)ax" 1 } } */ +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)bx" 1 } } */ +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)cx" 1 } } */ +/* { dg-final { scan-assembler-times "push(?:l|q)\[\\t \]*%(?:e|r)dx" 1 } } */ +/* { dg-final { scan-assembler-times 
"push(?:l|q)\[\\t \]*%(?:e|r)si" 1 } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%rdi" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r8" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r9" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r10" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r11" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r12" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r13" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r14" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r15" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r16" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r17" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r18" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r19" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r20" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r21" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r22" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r23" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r24" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r25" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r26" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r27" 1 { target { ! 
ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r28" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r29" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r30" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "pushq\[\\t \]*%r31" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 145, -16} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 144, -24} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 143, -32} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 142, -40} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 141, -48} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 140, -56} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 139, -64} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 138, -72} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 137, -80} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 136, -88} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 135, -96} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 134, -104} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 133, -112} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 132, -120} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 131, -128} 1 } } */ +/* { dg-final { scan-assembler-times {\t\.cfi_offset 130, -136} 1 } } */ +/* { dg-final { scan-assembler-times ".cfi_restore" 15} } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)ax" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)bx" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)cx" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)dx" 1 } } */ +/* { dg-final { scan-assembler-times "pop(?:l|q)\[\\t \]*%(?:e|r)si" 1 
} } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%rdi" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r8" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r9" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r10" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r11" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r12" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r13" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r14" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r15" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r16" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r17" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r18" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r19" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r20" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r21" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r22" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r23" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r24" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r25" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r26" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r27" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r28" 1 { target { ! 
ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r29" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r30" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "popq\[\\t \]*%r31" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "iret" 1 { target ia32 } } } */ +/* { dg-final { scan-assembler-times "iretq" 1 { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-times "\tcld" 1 } } */ diff --git a/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c new file mode 100644 index 00000000000..290863d63a7 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c @@ -0,0 +1,25 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -march=skylake-avx512 -mapxf -DDTYPE32" } */ + +#include "spill_to_mask-1.c" + +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r16d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r17d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r18d" } } */ +/* { dg-final { scan-assembler "movq\[ \t]+\[^\\n\\r\]*, %r19" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r20d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r21d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r22d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r23d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r24d" } } */ +/* { dg-final { scan-assembler "addl\[ \t]+\[^\\n\\r\]*, %r25d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r26d" } } */ +/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r27d" } } */ +/* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r28d" } } */ +/* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r29d" } } */ +/* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r30d" } } */ +/* { dg-final { 
scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r31d" } } */ +/* { dg-final { scan-assembler-not "knot" } } */ +/* { dg-final { scan-assembler-not "kxor" } } */ +/* { dg-final { scan-assembler-not "kor" } } */ +/* { dg-final { scan-assembler-not "kandn" } } */

From patchwork Thu Aug 31 08:20:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 137243
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 05/13] [APX EGPR] Add register and memory constraints that disallow EGPR
Date: Thu, 31 Aug 2023 16:20:16 +0800
Message-Id: <20230831082024.314097-6-hongyu.wang@intel.com>
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>
References: <20230831082024.314097-1-hongyu.wang@intel.com>
From: Hongyu Wang
Reply-To: Hongyu Wang
Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz

From: Kong Lingling

For APX, since the GENERAL_REGS class has been extended to include the
EGPRs (r16-r31), new constraints are needed to restrict insns that cannot
use EGPR in either their register or memory operands.

gcc/ChangeLog:

	* config/i386/constraints.md (h): New register constraint for GENERAL_GPR16.
	(Bt): New non-EGPR memory constraint.
	(BT): Likewise for Bm constraint.
	* config/i386/i386.h (enum reg_class): Add new reg class GENERAL_GPR16.

--- gcc/config/i386/constraints.md | 19 ++++++++++++++++++- gcc/config/i386/i386.h | 4 ++++ 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index fd490f39110..f487bf2e5a3 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -19,7 +19,7 @@ ;;; Unused letters: ;;; H -;;; h j z +;;; j z ;; Integer register constraints. ;; It is not necessary to define 'r' here. @@ -165,6 +165,8 @@ (define_register_constraint "YW" ;; k TLS address that allows insn using non-integer registers ;; n Memory operand without REX prefix ;; r Broadcast memory operand +;; t Memory operand without EGPR +;; T Vector memory operand without EGPR ;; s Sibcall memory operand, not valid for TARGET_X32 ;; w Call memory operand, not valid for TARGET_X32 ;; z Constant call address operand. @@ -201,6 +203,18 @@ (define_special_memory_constraint "Bn" "@internal Memory operand without REX prefix." (match_operand 0 "norex_memory_operand")) +(define_memory_constraint "Bt" + "@internal Memory operand without GPR32."
+ (and (match_operand 0 "memory_operand") + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + +(define_special_memory_constraint "BT" + "@internal vector memory operand without GPR32." + (and (match_operand 0 "vector_memory_operand") + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + (define_special_memory_constraint "Br" "@internal bcst memory operand." (match_operand 0 "bcst_mem_operand")) @@ -371,3 +385,6 @@ (define_address_constraint "Tv" (define_address_constraint "Ts" "Address operand without segment register" (match_operand 0 "address_no_seg_operand")) + +(define_register_constraint "h" + "TARGET_APX_EGPR ? GENERAL_GPR16 : GENERAL_REGS") diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 1ab291177f5..7ec3086641c 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1295,6 +1295,8 @@ enum reg_class %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 %r16 %r17 %r18 %r19 %r20 %r21 %r22 %r23 %r24 %r25 %r26 %r27 %r28 %r29 %r30 %r31 */ + GENERAL_GPR16, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp + %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ FP_TOP_REG, FP_SECOND_REG, /* %st(0) %st(1) */ FLOAT_REGS, SSE_FIRST_REG, @@ -1357,6 +1359,7 @@ enum reg_class "INDEX_REGS", \ "LEGACY_REGS", \ "GENERAL_REGS", \ + "GENERAL_GPR16", \ "FP_TOP_REG", "FP_SECOND_REG", \ "FLOAT_REGS", \ "SSE_FIRST_REG", \ @@ -1395,6 +1398,7 @@ enum reg_class { 0x7f, 0xff0, 0x0 }, /* INDEX_REGS */ \ { 0x900ff, 0x0, 0x0 }, /* LEGACY_REGS */ \ { 0x900ff, 0xff0, 0xffff000 }, /* GENERAL_REGS */ \ + { 0x900ff, 0xff0, 0x0 }, /* GENERAL_GPR16 */ \ { 0x100, 0x0, 0x0 }, /* FP_TOP_REG */ \ { 0x200, 0x0, 0x0 }, /* FP_SECOND_REG */ \ { 0xff00, 0x0, 0x0 }, /* FLOAT_REGS */ \ From patchwork Thu Aug 31 08:20:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137239 Return-Path: Delivered-To: 
ouuuleilei@gmail.com Received: by 2002:a59:c792:0:b0:3f2:4152:657d with SMTP id b18csp96352vqu; Thu, 31 Aug 2023 01:22:58 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGont2s0EvSGP9GgYGgxiSteWHJe0wG+iRE+iKninfrAiSUe3FFasSRHwtKCbdexpEqlM5e X-Received: by 2002:a17:907:78c4:b0:9a2:256a:65ca with SMTP id kv4-20020a17090778c400b009a2256a65camr3781595ejc.14.1693470178112; Thu, 31 Aug 2023 01:22:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693470178; cv=none; d=google.com; s=arc-20160816; b=jL+w8aFqF/udCWK+G7ytq+OnL+h/gGCIVmyEHbLBl2xXkO6zZLGPA61MvM0SdjSAAd 2T05AZJEkHA9Trt1nelJVNu/Xc39QGmi9eR7lOWMrm78ntxV2asZaYReej9nPmV1N6SI hin+DFInxLXB6aD5Q/cauBwfWmNxuCajPJ82nD7idd8XKQ5aKu92AKlVYoJo7SO4R/zj Jno6Vj13ZHRE4OB+sC7QbDPvQj9rHUTTIESkGZwzl3KWYY/4VAQvKJsjPbXzUqmoBNDf riAqkK6TrHJoDUWP6SAt2jRsefoYwhDb0U5bTUPi5/FTi1jV7qlGuHBmHsAnchbuZ30b nu+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=I82DX3oAFc+OnILozMXfkPpMkR/K0LeIhBkiF53s/LE=; fh=t6VkRRFhh90/YyDrY4l675lM3BlOpES7S7srbNWOHSE=; b=mk7/++JeSySQXpzfOHEDz2y6NwUpKDeZpAIfev8ocSKRapYvNJjnmB3IM+DoXbXi6P 40m+Of6JJMrRWlTgqC24sybD3fiENZzkpYPFRE2dTTAihuF05nqYcuTt/ORG1YKsUPIg jbmPcqEfl+01Z9fDi8TDBHCAo1yk5mc2hqb5hg7TqK++HUNvvcMtkE4r1IvQR6XBUTe9 fKsRcCQ88lLswWcjsqoYcgSB7a/uYvzxTP9mxX4Y1gm7nlEcUP8LdUIUcQzVbm23GwDH rS8redU4pjsfkE8pfOUGnN4l5XDQHW+yBg4Lsd45L3bsqYwXgiAbB95V7TESHeU/ZINb d0yw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=pODL585J; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) 
header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id l13-20020a170906a40d00b0099bd627bb9dsi590016ejz.983.2023.08.31.01.22.57 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Aug 2023 01:22:58 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=pODL585J; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0C8A43836E96 for ; Thu, 31 Aug 2023 08:21:39 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0C8A43836E96 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693470099; bh=I82DX3oAFc+OnILozMXfkPpMkR/K0LeIhBkiF53s/LE=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=pODL585JSZ1rMYLJqjkETGWngw5K+Yt98DUZ7s5Jfm8sm/iJaIKLxxWB38Adjz1yl ndTYRp7Lp0f9UOvvcJF7LDfuPAjx44uJHv8qDdvJrc9eocKUQxfvw0nYO2gwcr9Onw PtkdPLrP0LJ0G0pI3fvgqaSpl9HqKMQ6VPb6P0Ec= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id DCC243858284 for ; Thu, 31 Aug 2023 08:20:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DCC243858284 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="462235624" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="462235624" Received: from orsmga004.jf.intel.com 
([10.7.209.38]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 01:20:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="862938655" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="862938655" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga004.jf.intel.com with ESMTP; 31 Aug 2023 01:20:29 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 8B7E6100512E; Thu, 31 Aug 2023 16:20:24 +0800 (CST) To: gcc-patches@gcc.gnu.org Subject: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint. Date: Thu, 31 Aug 2023 16:20:17 +0800 Message-Id: <20230831082024.314097-7-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775732185204064522 X-GMAIL-MSGID: 1775732185204064522 From: Kong Lingling In inline asm, we do not know if the insn can use EGPR, so disable EGPR usage by 
default by mapping the common reg/mem constraints to non-EGPR constraints. Use the option -mapx-inline-asm-use-gpr32 to enable EGPR usage for inline asm. gcc/ChangeLog: * config/i386/i386.cc (INCLUDE_STRING): Add include for ix86_md_asm_adjust. (ix86_md_asm_adjust): When APX EGPR is enabled without specifying the target option, map reg/mem constraints to non-EGPR constraints. * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-inline-gpr-norex2.c: New test. --- gcc/config/i386/i386.cc | 44 +++++++ gcc/config/i386/i386.opt | 5 + .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ 3 files changed, 156 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index d26d9ab0d9d..9460ebbfda4 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see . */ +#define INCLUDE_STRING #define IN_TARGET_CODE 1 #include "config.h" @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec &outputs, vec & /*inputs*/, bool saw_asm_flag = false; start_sequence (); + /* TODO: Here we just mapped the general r/m constraints to non-EGPR + constraints, will eventually map all the usable constraints in the future. */ + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) + { + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 + and replace only r, exclude Br and Yr. 
*/ + for (unsigned i = 0; i < constraints.length (); i++) + { + std::string *s = new std::string (constraints[i]); + size_t pos = s->find ('r'); + while (pos != std::string::npos) + { + if (pos > 0 + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) + pos = s->find ('r', pos + 1); + else + { + s->replace (pos, 1, "h"); + constraints[i] = (const char*) s->c_str (); + break; + } + } + } + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with + "Bt/Bt/BT". */ + for (unsigned i = 0; i < constraints.length (); i++) + { + std::string *s = new std::string (constraints[i]); + size_t pos = s->find ("m"); + size_t pos2 = s->find ("memory"); + if (pos != std::string::npos) + { + if (pos > 0 && (s->at (pos - 1) == 'B')) + s->replace (pos - 1, 2, "BT"); + else if (pos2 != std::string::npos) + s->replace (pos, 6, "Bt"); + else + s->replace (pos, 1, "Bt"); + constraints[i] = (const char*) s->c_str (); + } + } + } + for (unsigned i = 0, n = outputs.length (); i < n; ++i) { const char *con = constraints[i]; diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index 1ee4d90186e..5c8d3a207e3 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1335,3 +1335,8 @@ Enum(apx_features) String(ndd) Value(apx_ndd) Set(4) EnumValue Enum(apx_features) String(all) Value(apx_all) Set(1) + +mapx-inline-asm-use-gpr32 +Target Var(ix86_apx_inline_asm_use_gpr32) Init(0) +Enable GPR32 in inline asm when APX_EGPR enabled, do not +hook reg or mem constraint in inline asm to GPR16. 
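For reference, the constraint rewriting that the i386.cc hunk above adds to ix86_md_asm_adjust can be sketched outside GCC roughly as follows. This is only an illustration under stated assumptions: map_reg_constraint and map_mem_constraint are hypothetical names that do not exist in GCC, and the sketch returns a modified copy where the patch updates constraints[i] in place.

```cpp
#include <string>

// Hypothetical helper (not part of GCC): mirror the loop that maps the
// first plain "r" to "h", which disallows r16-r31.
std::string map_reg_constraint (std::string s)
{
  std::string::size_type pos = s.find ('r');
  while (pos != std::string::npos)
    {
      // "Yr" and "Br" are separate two-letter constraints; skip them.
      if (pos > 0 && (s[pos - 1] == 'Y' || s[pos - 1] == 'B'))
        pos = s.find ('r', pos + 1);
      else
        {
          s.replace (pos, 1, "h");
          break;
        }
    }
  return s;
}

// Hypothetical helper (not part of GCC): mirror the "m"/"memory"/"Bm"
// rewrite to "Bt"/"Bt"/"BT".
std::string map_mem_constraint (std::string s)
{
  std::string::size_type pos = s.find ('m');
  if (pos == std::string::npos)
    return s;
  if (pos > 0 && s[pos - 1] == 'B')
    s.replace (pos - 1, 2, "BT");
  else if (s.find ("memory") != std::string::npos)
    s.replace (pos, 6, "Bt");
  else
    s.replace (pos, 1, "Bt");
  return s;
}
```

With these helpers, "Brr" becomes "Brh" and "rBr" becomes "hBr", while "Yr" is left untouched, which matches the constraint strings exercised by the new test.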
diff --git a/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c new file mode 100644 index 00000000000..21534450045 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c @@ -0,0 +1,107 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mapxf -m64 -march=skylake-avx512 -DDTYPE32" } */ + +typedef unsigned int u32; +typedef unsigned long long u64; + +#ifdef DTYPE32 +typedef u32 DTYPE; +#define byteswap byteswapu32 +#endif + +#define R(x,n) ( (x >> n) | (x << (32 - n))) + +#define S0(x) (R(x, 2) ^ R(x,13) ^ R(x,22)) +#define S1(x) (R(x, 6) ^ R(x,11) ^ R(x,25)) + +#define TT(a,b,c,d,e,f,g,h,x,K) \ +{ \ + tmp1 = h + S1(e) + (g ^ (e & (f ^ g))) + K + x; \ + tmp2 = S0(a) + ((a & b) | (c & (a | b))); \ + h = tmp1 + tmp2; \ + d += tmp1; \ +} + +static inline u32 byteswapu32(u32 x) +{ + x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16; + x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8; + return x; +} + +void foo (DTYPE in[16], DTYPE out[8], const DTYPE C[16]) +{ + DTYPE tmp1 = 0, tmp2 = 0, a, b, c, d, e, f, g, h; + DTYPE w0, w1, w2, w3, w4, w5, w6, w7, + w8, w9, w10, w11, w12, w13, w14, w15; + w0 = byteswap(in[0]); + w1 = byteswap(in[1]); + w2 = byteswap(in[2]); + w3 = byteswap(in[3]); + w4 = byteswap(in[4]); + w5 = byteswap(in[5]); + w6 = byteswap(in[6]); + w7 = byteswap(in[7]); + w8 = byteswap(in[8]); + w9 = byteswap(in[9]); + w10 = byteswap(in[10]); + w11 = byteswap(in[11]); + w12 = byteswap(in[12]); + w13 = byteswap(in[13]); + w14 = byteswap(in[14]); + w15 = byteswap(in[15]); + a = out[0]; + b = out[1]; + c = out[2]; + d = out[3]; + e = out[4]; + f = out[5]; + g = out[6]; + h = out[7]; + + TT(a, b, c, d, e, f, g, h, w0, C[0]); + TT(h, a, b, c, d, e, f, g, w1, C[1]); + TT(g, h, a, b, c, d, e, f, w2, C[2]); + TT(f, g, h, a, b, c, d, e, w3, C[3]); + TT(e, f, g, h, a, b, c, d, w4, C[4]); + TT(d, e, f, g, h, a, b, c, w5, C[5]); + TT(c, d, e, f, g, h, a, b, w6, C[6]); + TT(b, c, d, e, f, g, h, 
a, w7, C[7]); + TT(a, b, c, d, e, f, g, h, w8, C[8]); + TT(h, a, b, c, d, e, f, g, w9, C[9]); + TT(g, h, a, b, c, d, e, f, w10, C[10]); + TT(f, g, h, a, b, c, d, e, w11, C[11]); + TT(e, f, g, h, a, b, c, d, w12, C[12]); + TT(d, e, f, g, h, a, b, c, w13, C[13]); + TT(c, d, e, f, g, h, a, b, w14, C[14]); + TT(b, c, d, e, f, g, h, a, w15, C[15]); + + out[0] += a; + out[1] += b; + out[2] += c; + out[3] += d; + out[4] += e; + out[5] += f; + out[6] += g; + out[7] += h; + + __asm__ __volatile__ ("test_asm_xmm %0, %%rax" : : "Yr" (out[7]) : "rax"); + __asm__ __volatile__ ("test_asm_Brr %0, %%rax" : : "Brr" (w14) : "rbx"); + __asm__ __volatile__ ("test_asm_rBr %0, %%rax" : : "rBr" (w13) : "rbx"); + __asm__ __volatile__ ("test_asm_r %0, %%rax" : : "r" (w15) : "rbx"); + __asm__ __volatile__ ("test_asm_m %0, %%rax" : : "m" (out[0]) : "rbx"); + __asm__ __volatile__ ("test_asm_mem %0, %%rax" : : "memory" (out[1]) : "rbx"); +} + +/* { dg-final { scan-assembler-not "knot" } } */ +/* { dg-final { scan-assembler-not "kxor" } } */ +/* { dg-final { scan-assembler-not "kor" } } */ +/* { dg-final { scan-assembler-not "kandn" } } */ +/* { dg-final { scan-assembler-times "test_asm_xmm %xmm5, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_Brr %r15d, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_rBr %r14d, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_r %r13d, %rax" 1 } } */ +/* { dg-final { scan-assembler-not "test_asm_rBr %r31d, %rax" } } */ +/* { dg-final { scan-assembler-not "test_asm_r %r30d, %rax" } } */ +/* { dg-final { scan-assembler-not "test_asm_m \\(%r29d\\), %rax" } } */ +/* { dg-final { scan-assembler-not "test_asm_mem \\(%r28d\\), %rax" } } */ From patchwork Thu Aug 31 08:20:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137244 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c792:0:b0:3f2:4152:657d with 
SMTP id b18csp97400vqu; Thu, 31 Aug 2023 01:26:02 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFVomCVMoLCdkeo/PnbW0WCdR6jWhG/Avc9acj2+3D0A0Jlnd430gE9AQLZaGlVALhSjyCN X-Received: by 2002:aa7:c159:0:b0:525:44c5:48e2 with SMTP id r25-20020aa7c159000000b0052544c548e2mr3220472edp.22.1693470362458; Thu, 31 Aug 2023 01:26:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693470362; cv=none; d=google.com; s=arc-20160816; b=A01JfFwl7vOZWp5Ge1nZzduLxz+ZkUHEo9LMhxZO6IgccMYaGbpYTWwKUpF13v19Sx EUYjNuogCcIBOBUrXd0lRbT/Plbcb2G5rMN1sm6t38zC27qqSKD7z/n8sPAoi2IjhwyP tNC58ADi50/+asUq1/DoaMp5d3n68R0ZWXyAiuy2f+R7If7zkIXrfj43KOHRmX/suxKB 3KRAuA1Lzwan9UYfKCl76vU6KGqcu2rt43VLe1r7gRQgV/6qRHLxBiWcgU1yMUM/IUNA DmgVtxP6pCPiBk4Zm1zGZnE/46QA6LNSlhBLXHN4ZYaZVZbygLvlwWr9MZvjP+btmycJ CVWQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=UegoOXaWbbMVZd3nLNlxRKfbkoMXe9Uy1r4pFoGxXsc=; fh=t6VkRRFhh90/YyDrY4l675lM3BlOpES7S7srbNWOHSE=; b=rhZ/dS6rQzyFB0xRqWkyQJfIRPEho/92KXA7kJDQD+xl0kT2J0NsNnAx4P9YbH9kSs BO6u0ODxuam0Tod5ONl82J9G8/JSl8ADs1lVrl5kXe/nvV25ZjpAXpn67N64YF9/LIZ7 Ifobwu62jGw74r9+zW7YHgQfEZN6EqfdsWqJLQDJSJ5qZEOBYJnyhHI9V3MD+a0N+mjC Ma3V4xwd3IIBTdq7zwncU8vIpuj4C+eq0BQQSrofSIQc7JWgposoMHpXNCZDKJ2aY5ea 1O7BSQS5kh+sIECuRPa71wxXbwZaaG/qk0hgty/xx2Hb7z8qTUlmpy54UWY7MZVqQg1u B32A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=fGyrlV9s; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org 
(server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id a10-20020aa7cf0a000000b0052bd29110c3si678569edy.441.2023.08.31.01.26.02 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Aug 2023 01:26:02 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=fGyrlV9s; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 18E23382C11E for ; Thu, 31 Aug 2023 08:22:55 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 18E23382C11E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693470175; bh=UegoOXaWbbMVZd3nLNlxRKfbkoMXe9Uy1r4pFoGxXsc=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=fGyrlV9sgegsFYzzBnoMOLe71r3RzOWz/ZYEvITuKYnNT/M/LXJCYdz6lcUSR5pKA yvdwYNY+jR7cZRoowB2PheRXS57LKwPbXHIHHrArwEgW0EQZonfj79I7gfdrMQ+Rl1 NNab8jImXEcHX70xLLBuP6+RPbM5DfJMPbRFIc8c= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id 873F73858288 for ; Thu, 31 Aug 2023 08:20:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 873F73858288 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="462235640" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="462235640" Received: from 
orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 01:20:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="862938668" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="862938668" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga004.jf.intel.com with ESMTP; 31 Aug 2023 01:20:29 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 8EF48100512F; Thu, 31 Aug 2023 16:20:24 +0800 (CST) To: gcc-patches@gcc.gnu.org Subject: [PATCH 07/13] [APX EGPR] Add backend hook for base_reg_class/index_reg_class. Date: Thu, 31 Aug 2023 16:20:18 +0800 Message-Id: <20230831082024.314097-8-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775732378806694944 X-GMAIL-MSGID: 1775732378806694944 From: Kong Lingling Add backend helper functions to verify if a rtx_insn can adopt EGPR to 
its base/index reg of memory operand. The verification rules are:

1. For an asm insn, enable/disable EGPR according to ix86_apx_inline_asm_use_gpr32.
2. Disable EGPR for an unrecognized insn.
3. If which_alternative is not decided, loop through the enabled alternatives and check the gpr32 attribute of each. Only enable EGPR when all enabled alternatives have gpr32 = 1.
4. If which_alternative is decided, enable/disable EGPR according to the gpr32 attribute of that alternative.

gcc/ChangeLog:

	* config/i386/i386-protos.h (ix86_mode_code_base_reg_class): New prototype.
	(ix86_regno_mode_code_ok_for_base_p): Likewise.
	(ix86_insn_index_reg_class): Likewise.
	* config/i386/i386.cc (ix86_memory_address_use_extended_reg_class_p): New helper function to scan the insn.
	(ix86_mode_code_base_reg_class): New function to choose BASE_REG_CLASS.
	(ix86_regno_mode_code_ok_for_base_p): Likewise for base regno.
	(ix86_insn_index_reg_class): Likewise for INDEX_REG_CLASS.
	* config/i386/i386.h (MODE_CODE_BASE_REG_CLASS): Define.
	(REGNO_MODE_CODE_OK_FOR_BASE_P): Likewise.
	(INSN_INDEX_REG_CLASS): Likewise.
	(enum reg_class): Add INDEX_GPR16.
	(GENERAL_GPR16_REGNO_P): Define.
	* config/i386/i386.md (gpr32): New attribute.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-inline-gpr-norex2.c: Adjust.
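The verification rules in this commit message can be sketched as a small standalone function. This is a hedged illustration, not the GCC implementation: InsnInfo and may_use_egpr are hypothetical stand-ins for what the real helper reads from an rtx_insn and recog_data, and the sketch assumes APX EGPR is enabled and a real insn is being checked.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the insn properties the real helper inspects.
struct InsnInfo
{
  bool is_asm;                // the insn is inline asm (rule 1)
  bool recognized;            // the insn has a valid insn code (rule 2)
  int which_alternative;      // -1 while the alternative is undecided
  std::vector<bool> enabled;  // enabled[i]: alternative i is enabled
  std::vector<bool> gpr32;    // gpr32[i]: alternative i has gpr32 = 1
};

// Return true if the memory address of the insn may use r16-r31.
bool may_use_egpr (const InsnInfo &insn, bool inline_asm_use_gpr32)
{
  if (insn.is_asm)
    return inline_asm_use_gpr32;        // rule 1
  if (!insn.recognized)
    return false;                       // rule 2
  if (insn.which_alternative == -1)
    {
      // Rule 3: every enabled alternative must allow gpr32.
      for (std::size_t i = 0; i < insn.gpr32.size (); ++i)
        if (insn.enabled[i] && !insn.gpr32[i])
          return false;
      return true;
    }
  // Rule 4: only the chosen alternative matters.
  return insn.gpr32[insn.which_alternative];
}
```

A disabled alternative with gpr32 = 0 does not block EGPR in this sketch, since rule 3 only considers enabled alternatives.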
--- gcc/config/i386/i386-protos.h | 7 ++ gcc/config/i386/i386.cc | 98 +++++++++++++++++++ gcc/config/i386/i386.h | 16 ++- gcc/config/i386/i386.md | 3 + .../gcc.target/i386/apx-inline-gpr-norex2.c | 7 +- 5 files changed, 127 insertions(+), 4 deletions(-) diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index bd4782800c4..78eb3e0f584 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -79,6 +79,13 @@ extern bool ix86_expand_set_or_cpymem (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, bool); extern bool ix86_expand_cmpstrn_or_cmpmem (rtx, rtx, rtx, rtx, rtx, bool); +extern enum reg_class ix86_mode_code_base_reg_class (machine_mode, addr_space_t, + RTX_CODE, RTX_CODE, + rtx_insn *); +extern bool ix86_regno_mode_code_ok_for_base_p (int, machine_mode, addr_space_t, + RTX_CODE, RTX_CODE, + rtx_insn *); +extern enum reg_class ix86_insn_index_reg_class (rtx_insn *); extern bool constant_address_p (rtx); extern bool legitimate_pic_operand_p (rtx); extern bool legitimate_pic_address_disp_p (rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 9460ebbfda4..412f3aefc43 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -11054,6 +11054,104 @@ ix86_validate_address_register (rtx op) return NULL_RTX; } +/* Return true if insn memory address can use any available reg + in BASE_REG_CLASS or INDEX_REG_CLASS, otherwise false. + For APX, some instruction can't be encoded with gpr32 + which is BASE_REG_CLASS or INDEX_REG_CLASS, for that case + returns false. */ +static bool +ix86_memory_address_use_extended_reg_class_p (rtx_insn* insn) +{ + /* LRA will do some initialization with insn == NULL, + return the maximum reg class for that. + For other cases, real insn will be passed and checked. 
*/ + bool ret = true; + if (TARGET_APX_EGPR && insn) + { + if (asm_noperands (PATTERN (insn)) >= 0 + || GET_CODE (PATTERN (insn)) == ASM_INPUT) + return ix86_apx_inline_asm_use_gpr32; + + if (INSN_CODE (insn) < 0) + return false; + + /* Try recog the insn before calling get_attr_gpr32. Save + the current recog_data first. */ + /* Also save which_alternative for current recog. */ + + struct recog_data_d recog_data_save = recog_data; + int which_alternative_saved = which_alternative; + + /* Update the recog_data for alternative check. */ + if (recog_data.insn != insn) + extract_insn_cached (insn); + + /* If alternative is not set, loop through each alternative + of insn and get gpr32 attr for all enabled alternatives. + If any enabled alternative has a 0 value for gpr32, disallow + gpr32 for addressing. */ + if (which_alternative_saved == -1) + { + alternative_mask enabled = get_enabled_alternatives (insn); + bool curr_insn_gpr32 = false; + for (int i = 0; i < recog_data.n_alternatives; i++) + { + if (!TEST_BIT (enabled, i)) + continue; + which_alternative = i; + curr_insn_gpr32 = get_attr_gpr32 (insn); + if (!curr_insn_gpr32) + ret = false; + } + } + else + { + which_alternative = which_alternative_saved; + ret = get_attr_gpr32 (insn); + } + + recog_data = recog_data_save; + which_alternative = which_alternative_saved; + } + + return ret; +} + +/* For APX, some instructions can't be encoded with gpr32. 
*/ +enum reg_class +ix86_mode_code_base_reg_class (machine_mode mode ATTRIBUTE_UNUSED, + addr_space_t as ATTRIBUTE_UNUSED, + enum rtx_code outer_code ATTRIBUTE_UNUSED, + enum rtx_code index_code ATTRIBUTE_UNUSED, + rtx_insn* insn) +{ + if (ix86_memory_address_use_extended_reg_class_p (insn)) + return BASE_REG_CLASS; + return GENERAL_GPR16; +} + +bool +ix86_regno_mode_code_ok_for_base_p (int regno, + machine_mode mode ATTRIBUTE_UNUSED, + addr_space_t as ATTRIBUTE_UNUSED, + enum rtx_code outer_code ATTRIBUTE_UNUSED, + enum rtx_code index_code ATTRIBUTE_UNUSED, + rtx_insn* insn) +{ + + if (ix86_memory_address_use_extended_reg_class_p (insn)) + return GENERAL_REGNO_P (regno); + return GENERAL_GPR16_REGNO_P (regno); +} + +enum reg_class +ix86_insn_index_reg_class (rtx_insn* insn) +{ + if (ix86_memory_address_use_extended_reg_class_p (insn)) + return INDEX_REG_CLASS; + return INDEX_GPR16; +} + /* Recognizes RTL expressions that are valid memory addresses for an instruction. The MODE argument is the machine mode for the MEM expression that wants to use this address. 
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 7ec3086641c..c8362ef451c 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1018,6 +1018,13 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); #define ADJUST_REG_ALLOC_ORDER x86_order_regs_for_local_alloc () +#define MODE_CODE_BASE_REG_CLASS(MODE, AS, OUTER, INDEX, INSN) \ + ix86_mode_code_base_reg_class (MODE, AS, OUTER, INDEX, INSN) +#define REGNO_MODE_CODE_OK_FOR_BASE_P(NUM, MODE, AS, OUTER, INDEX, INSN) \ + ix86_regno_mode_code_ok_for_base_p (NUM, MODE, AS, OUTER, INDEX, INSN) + +#define INSN_INDEX_REG_CLASS(INSN) \ + ix86_insn_index_reg_class (INSN) #define OVERRIDE_ABI_FORMAT(FNDECL) ix86_call_abi_override (FNDECL) @@ -1297,6 +1304,8 @@ enum reg_class %r24 %r25 %r26 %r27 %r28 %r29 %r30 %r31 */ GENERAL_GPR16, /* %eax %ebx %ecx %edx %esi %edi %ebp %esp %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ + INDEX_GPR16, /* %eax %ebx %ecx %edx %esi %edi %ebp + %r8 %r9 %r10 %r11 %r12 %r13 %r14 %r15 */ FP_TOP_REG, FP_SECOND_REG, /* %st(0) %st(1) */ FLOAT_REGS, SSE_FIRST_REG, @@ -1360,6 +1369,7 @@ enum reg_class "LEGACY_REGS", \ "GENERAL_REGS", \ "GENERAL_GPR16", \ + "INDEX_GPR16", \ "FP_TOP_REG", "FP_SECOND_REG", \ "FLOAT_REGS", \ "SSE_FIRST_REG", \ @@ -1395,10 +1405,11 @@ enum reg_class { 0x0f, 0x0, 0x0 }, /* Q_REGS */ \ { 0x900f0, 0x0, 0x0 }, /* NON_Q_REGS */ \ { 0x7e, 0xff0, 0x0 }, /* TLS_GOTBASE_REGS */ \ - { 0x7f, 0xff0, 0x0 }, /* INDEX_REGS */ \ + { 0x7f, 0xff0, 0xffff000 }, /* INDEX_REGS */ \ { 0x900ff, 0x0, 0x0 }, /* LEGACY_REGS */ \ { 0x900ff, 0xff0, 0xffff000 }, /* GENERAL_REGS */ \ { 0x900ff, 0xff0, 0x0 }, /* GENERAL_GPR16 */ \ + { 0x0007f, 0xff0, 0x0 }, /* INDEX_GPR16 */ \ { 0x100, 0x0, 0x0 }, /* FP_TOP_REG */ \ { 0x200, 0x0, 0x0 }, /* FP_SECOND_REG */ \ { 0xff00, 0x0, 0x0 }, /* FLOAT_REGS */ \ @@ -1456,6 +1467,9 @@ enum reg_class #define INDEX_REGNO_P(N) \ (LEGACY_INDEX_REGNO_P (N) || REX_INT_REGNO_P (N) || REX2_INT_REGNO_P (N)) +#define 
GENERAL_GPR16_REGNO_P(N) \ + (LEGACY_INT_REGNO_P (N) || REX_INT_REGNO_P (N)) + #define ANY_QI_REG_P(X) (REG_P (X) && ANY_QI_REGNO_P (REGNO (X))) #define ANY_QI_REGNO_P(N) \ (TARGET_64BIT ? GENERAL_REGNO_P (N) : QI_REGNO_P (N)) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index e3270658cb7..b9eaea78f00 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -873,6 +873,9 @@ (define_attr "use_carry" "0,1" (const_string "0")) ;; Define attribute to indicate unaligned ssemov insns (define_attr "movu" "0,1" (const_string "0")) +;; Define attribute to indicate gpr32 insns. +(define_attr "gpr32" "0, 1" (const_string "1")) + ;; Define instruction set of MMX instructions (define_attr "mmx_isa" "base,native,sse,sse_noavx,avx" (const_string "base")) diff --git a/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c index 21534450045..6dfc6714c2f 100644 --- a/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c +++ b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c @@ -98,9 +98,10 @@ void foo (DTYPE in[16], DTYPE out[8], const DTYPE C[16]) /* { dg-final { scan-assembler-not "kor" } } */ /* { dg-final { scan-assembler-not "kandn" } } */ /* { dg-final { scan-assembler-times "test_asm_xmm %xmm5, %rax" 1 } } */ -/* { dg-final { scan-assembler-times "test_asm_Brr %r15d, %rax" 1 } } */ -/* { dg-final { scan-assembler-times "test_asm_rBr %r14d, %rax" 1 } } */ -/* { dg-final { scan-assembler-times "test_asm_r %r13d, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_Brr %r12d, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_rBr %eax, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_r %eax, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_m \\(%rax\\), %rax" 1 } } */ /* { dg-final { scan-assembler-not "test_asm_rBr %r31d, %rax" } } */ /* { dg-final { scan-assembler-not "test_asm_r %r30d, %rax" } } */ /* { dg-final { scan-assembler-not 
"test_asm_m \\(%r29d\\), %rax" } } */ From patchwork Thu Aug 31 08:20:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137238 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c792:0:b0:3f2:4152:657d with SMTP id b18csp96340vqu; Thu, 31 Aug 2023 01:22:56 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHZ/ij1GoeWllMnqQb3ZKB53TkOQ+GjDTGcrIlZPYly+f1SUOyfwEuEykbi0ZgnKyHQQCzZ X-Received: by 2002:a17:906:1089:b0:9a1:c370:1af2 with SMTP id u9-20020a170906108900b009a1c3701af2mr3717694eju.3.1693470176541; Thu, 31 Aug 2023 01:22:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693470176; cv=none; d=google.com; s=arc-20160816; b=DIBkrCgYUL9fqd/eURcBjcGSh1UBijIVj95m5DZa4g+oXHPKaBwvPgWX+EaoOfT0wk XrfSZxGtBDszGY6Mb5SVjcQbOwN9AajuGsPp1PFj+ndgsg19o3LtjqzXw/Wjsr9+XGge ZH8gCpFJBXeIZCONMyGoC18whAHq6XNJOn+aKeyp9qGek0vEsG2cnmJhpvWRqVF3I1J8 zTvRh4YU38ydf22AY0Q0T0oKpAcpZ01gc4Wth7NjsZdZKFehyH7cI/OUzkOlRJTvtgpE J23NeIOYYNmB4GlK7dBCOtWyiWwE6VFzWyFjHhWjbKBJfc3FZBhfx00dlkMoN+/jl7WB Q7JQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=veyPbRMSCFJDyQnuvu1piNdyRmfOtaTpWmGzLXd+BxY=; fh=t6VkRRFhh90/YyDrY4l675lM3BlOpES7S7srbNWOHSE=; b=mjFKpALmbLHlt47f3ubgSyMUxRdiaTSXLfvS/cvCvscKGEYG+abrcJcKbX+ka5M3Vk 7ebEfMJOO/LDWS/Z0eve+79s83DLJNor7S99ehtI3/Ln5FrO0fSGvcDCPOGs5PP+Plju LezdKA/GjUEHvFsPPFr/1/KkX4zRivsle65B1dNLz3AMISWEa49gJryJbWancZGjQycf hlC66epBUV9yx6hpug7fsB//Uon4tJMPaeqD4Svt1rU4cm9vEIdhYID94njMkjCP8kgt SAtok0vvXH51SAVNlnQuqRAEqDSQLUC9x+4Roti/mzNg3IfE2jHQn4QOr7WympEWpE7z Mqdw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default 
header.b="GsDniB/l"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id rh28-20020a17090720fc00b00987b20b66bbsi620332ejb.711.2023.08.31.01.22.56 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Aug 2023 01:22:56 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="GsDniB/l"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6D565386482B for ; Thu, 31 Aug 2023 08:21:38 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6D565386482B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693470098; bh=veyPbRMSCFJDyQnuvu1piNdyRmfOtaTpWmGzLXd+BxY=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=GsDniB/lk6d/fLOQZ1VY0CzlDcSOSsN5G/p34m95ECPqzHh3JzDfeLH9N/iWo+aI4 rVTGo8+jdxMXUGVSFgSAEhGjUZHOVEIxjZSJ+Ic16PC27A8fcywMOumQ+wqUPSOv0d CYJH8+HnuDDsZnCmt3lLFPTe5ywyQud34fJG6wx8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id 83662385842E for ; Thu, 31 Aug 2023 
08:20:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 83662385842E X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="462235629" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="462235629" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 01:20:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="862938664" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="862938664" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga004.jf.intel.com with ESMTP; 31 Aug 2023 01:20:29 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 92A191005130; Thu, 31 Aug 2023 16:20:24 +0800 (CST) To: gcc-patches@gcc.gnu.org Subject: [PATCH 08/13] [APX EGPR] Handle GPR16 only vector move insns Date: Thu, 31 Aug 2023 16:20:19 +0800 Message-Id: <20230831082024.314097-9-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: 
"Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775732183878628056 X-GMAIL-MSGID: 1775732183878628056 For vector move insns like vmovdqa/vmovdqu, their EVEX counterparts require an explicit suffix 64/32/16/8. These suffixed instructions are only available with AVX10_1 or AVX512F, so for AVX2+APX_F we select vmovaps/vmovups for vector load/store insns that contain EGPR. gcc/ChangeLog: * config/i386/i386.cc (ix86_get_ssemov): Check if egpr is used, adjust mnemonic for vmovdqu/vmovdqa. * config/i386/sse.md (*_vinsert_0): Check if egpr is used, adjust mnemonic for vmovdqu/vmovdqa. (avx_vec_concat): Likewise, and separate alternative 0 to avx_noavx512f. --- gcc/config/i386/i386.cc | 31 ++++++++++++++++++++++++++++++- gcc/config/i386/sse.md | 34 ++++++++++++++++++++++++---------- 2 files changed, 54 insertions(+), 11 deletions(-) diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 412f3aefc43..f5d642948bc 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -5469,6 +5469,11 @@ ix86_get_ssemov (rtx *operands, unsigned size, bool evex_reg_p = (size == 64 || EXT_REX_SSE_REG_P (operands[0]) || EXT_REX_SSE_REG_P (operands[1])); + + bool egpr_p = (TARGET_APX_EGPR + && (x86_extended_rex2reg_mentioned_p (operands[0]) + || x86_extended_rex2reg_mentioned_p (operands[1]))); + machine_mode scalar_mode; const char *opcode = NULL; @@ -5547,6 +5552,12 @@ ix86_get_ssemov (rtx *operands, unsigned size, ? "vmovdqu16" : "vmovdqu64") : "vmovdqa64"); + else if (egpr_p) + opcode = (misaligned_p + ? (TARGET_AVX512BW + ? "vmovdqu16" + : "%vmovups") + : "%vmovaps"); else opcode = (misaligned_p ? (TARGET_AVX512BW @@ -5563,6 +5574,8 @@ ix86_get_ssemov (rtx *operands, unsigned size, case E_TFmode: if (evex_reg_p) opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64"; + else if (egpr_p) + opcode = misaligned_p ? "%vmovups" : "%vmovaps"; else opcode = misaligned_p ? 
"%vmovdqu" : "%vmovdqa"; break; @@ -5581,6 +5594,12 @@ ix86_get_ssemov (rtx *operands, unsigned size, ? "vmovdqu8" : "vmovdqu64") : "vmovdqa64"); + else if (egpr_p) + opcode = (misaligned_p + ? (TARGET_AVX512BW + ? "vmovdqu8" + : "%vmovups") + : "%vmovaps"); else opcode = (misaligned_p ? (TARGET_AVX512BW @@ -5589,12 +5608,18 @@ ix86_get_ssemov (rtx *operands, unsigned size, : "%vmovdqa"); break; case E_HImode: - if (evex_reg_p) + if (evex_reg_p || egpr_p) opcode = (misaligned_p ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64") : "vmovdqa64"); + else if (egpr_p) + opcode = (misaligned_p + ? (TARGET_AVX512BW + ? "vmovdqu16" + : "%vmovups") + : "%vmovaps"); else opcode = (misaligned_p ? (TARGET_AVX512BW @@ -5605,6 +5630,8 @@ ix86_get_ssemov (rtx *operands, unsigned size, case E_SImode: if (evex_reg_p) opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32"; + else if (egpr_p) + opcode = misaligned_p ? "%vmovups" : "%vmovaps"; else opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa"; break; @@ -5613,6 +5640,8 @@ ix86_get_ssemov (rtx *operands, unsigned size, case E_OImode: if (evex_reg_p) opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64"; + else if (egpr_p) + opcode = misaligned_p ? "%vmovups" : "%vmovaps"; else opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa"; break; diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 192e746fda3..bd6674d34f9 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -18918,6 +18918,12 @@ (define_insn "*_vinsert_0" { if (which_alternative == 0) return "vinsert\t{$0, %2, %1, %0|%0, %1, %2, 0}"; + bool egpr_used = (TARGET_APX_EGPR + && x86_extended_rex2reg_mentioned_p (operands[2])); + const char *align_templ = egpr_used ? "vmovaps\t{%2, %x0|%x0, %2}" + : "vmovdqa\t{%2, %x0|%x0, %2}"; + const char *unalign_templ = egpr_used ? 
"vmovups\t{%2, %x0|%x0, %2}" + : "vmovdqu\t{%2, %x0|%x0, %2}"; switch (mode) { case E_V8DFmode: @@ -18933,17 +18939,17 @@ (define_insn "*_vinsert_0" case E_V8DImode: if (misaligned_operand (operands[2], mode)) return which_alternative == 2 ? "vmovdqu64\t{%2, %x0|%x0, %2}" - : "vmovdqu\t{%2, %x0|%x0, %2}"; + : unalign_templ; else return which_alternative == 2 ? "vmovdqa64\t{%2, %x0|%x0, %2}" - : "vmovdqa\t{%2, %x0|%x0, %2}"; + : align_templ; case E_V16SImode: if (misaligned_operand (operands[2], mode)) return which_alternative == 2 ? "vmovdqu32\t{%2, %x0|%x0, %2}" - : "vmovdqu\t{%2, %x0|%x0, %2}"; + : unalign_templ; else return which_alternative == 2 ? "vmovdqa32\t{%2, %x0|%x0, %2}" - : "vmovdqa\t{%2, %x0|%x0, %2}"; + : align_templ; default: gcc_unreachable (); } @@ -27652,11 +27658,13 @@ (define_insn "avx_vec_concat" [(set (match_operand:V_256_512 0 "register_operand" "=x,v,x,Yv") (vec_concat:V_256_512 (match_operand: 1 "nonimmediate_operand" "x,v,xm,vm") - (match_operand: 2 "nonimm_or_0_operand" "xm,vm,C,C")))] + (match_operand: 2 "nonimm_or_0_operand" "xBt,vm,C,C")))] "TARGET_AVX && (operands[2] == CONST0_RTX (mode) || !MEM_P (operands[1]))" { + bool egpr_used = (TARGET_APX_EGPR + && x86_extended_rex2reg_mentioned_p (operands[1])); switch (which_alternative) { case 0: @@ -27704,7 +27712,8 @@ (define_insn "avx_vec_concat" if (misaligned_operand (operands[1], mode)) { if (which_alternative == 2) - return "vmovdqu\t{%1, %t0|%t0, %1}"; + return egpr_used ? "vmovups\t{%1, %t0|%t0, %1}" + : "vmovdqu\t{%1, %t0|%t0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqu64\t{%1, %t0|%t0, %1}"; else @@ -27713,7 +27722,8 @@ (define_insn "avx_vec_concat" else { if (which_alternative == 2) - return "vmovdqa\t{%1, %t0|%t0, %1}"; + return egpr_used ? 
"vmovaps\t{%1, %t0|%t0, %1}" + : "vmovdqa\t{%1, %t0|%t0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqa64\t{%1, %t0|%t0, %1}"; else @@ -27723,7 +27733,8 @@ (define_insn "avx_vec_concat" if (misaligned_operand (operands[1], mode)) { if (which_alternative == 2) - return "vmovdqu\t{%1, %x0|%x0, %1}"; + return egpr_used ? "vmovups\t{%1, %x0|%x0, %1}" + : "vmovdqu\t{%1, %x0|%x0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqu64\t{%1, %x0|%x0, %1}"; else @@ -27732,7 +27743,8 @@ (define_insn "avx_vec_concat" else { if (which_alternative == 2) - return "vmovdqa\t{%1, %x0|%x0, %1}"; + return egpr_used ? "vmovaps\t{%1, %x0|%x0, %1}" + : "vmovdqa\t{%1, %x0|%x0, %1}"; else if (GET_MODE_SIZE (mode) == 8) return "vmovdqa64\t{%1, %x0|%x0, %1}"; else @@ -27745,7 +27757,9 @@ (define_insn "avx_vec_concat" gcc_unreachable (); } } - [(set_attr "type" "sselog,sselog,ssemov,ssemov") + [(set_attr "isa" "noavx512f,avx512f,*,*") + (set_attr "gpr32" "0,1,1,1") + (set_attr "type" "sselog,sselog,ssemov,ssemov") (set_attr "prefix_extra" "1,1,*,*") (set_attr "length_immediate" "1,1,*,*") (set_attr "prefix" "maybe_evex") From patchwork Thu Aug 31 08:20:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137246 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c792:0:b0:3f2:4152:657d with SMTP id b18csp98227vqu; Thu, 31 Aug 2023 01:28:13 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHkBdHKuLaIdSMrZFwrwm3KzmLp+XzKrbVcFBxi34j/JVcZe4ECc10r2LZjoJ11mhlLlVnp X-Received: by 2002:a17:907:7710:b0:9a1:914e:490e with SMTP id kw16-20020a170907771000b009a1914e490emr3385673ejc.53.1693470493555; Thu, 31 Aug 2023 01:28:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693470493; cv=none; d=google.com; s=arc-20160816; b=ETnrpUNm+9M2ugaqNsB+b8/yGcNQnf74YzpsCAFxr8tX4P6eGs30Ygs9LvdE7y4iPZ ljSr23fUbM28WPwgh/esvU/GQKOiOfA1pN/+2KJDXJkywSxkGoJBVhpIHQMYMD7BOWLw 
6ns2OaTI0lMgCXpVRIp0ehh78nMCubsFKpej0SlSqM9+meIXLiF+zPunc+5eyz/GeNI5 EKnl1c9YD9SOWkaneDI2sW1NY0zXym7inRYqqF972t2hW+vvxl7YTdi4viYj1MqkaAlY tnkSCP540lRhk9pLJBRz7dh8V4FboiGDjoC4lqTrmfxARo2s5tJySR27afYL//Ved3Or +0qA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=l0gjFiMwB9plySwttdyUbU2ZBEwkxfskG3APomOnSdg=; fh=t6VkRRFhh90/YyDrY4l675lM3BlOpES7S7srbNWOHSE=; b=CHtxO0aiE5JFdW3nwPo8D1FYkLwIqjFSmW6PGOvgN6aLqQrYI2Pas1KqIKVBev/FXE GfWA+crdaRWyWSlrdLl+O6yYJzVnFDjDZKjMp54HmGH1BnQwyH9qL03GrY/AKMI68QT3 1oZItGpkl0reu+7wda1d6Fmmc4KgnjklSQ/4/HDvqn2UQFWgu636pSyDByMaVhvZK6Ga 8TpgxHyy1wkR/EOry0LyZpnDBx223uxXbcu4lsdr3igOI0xRzILBByhuvcs/Qhf85nCy 43eAmAjH+SnhKuIpwdoNSxpiypFp66iDkvvuy8obNRqMSXZjWpR0ZT+Ich4b4GquZ7Zc ewZg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=RWyCSuFd; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id se28-20020a170906ce5c00b0099331b3e6f2si610404ejb.663.2023.08.31.01.28.13 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Aug 2023 01:28:13 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=RWyCSuFd; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AE91B3893668 for ; Thu, 31 Aug 2023 08:23:59 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AE91B3893668 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693470239; bh=l0gjFiMwB9plySwttdyUbU2ZBEwkxfskG3APomOnSdg=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=RWyCSuFd6oOmuIBg+zGbzL0thZc7V6ojwsITOessjR7V+NI1p+PDcFZdifOa+0qL/ 0HfU7pQJy9LuKRFK8eAJYnwY49uLt58JSXLzGnUBAD49WrvLoF3VGAFyf/hvOZdNLt +EewZWT2hfFta7Mqh3fxLgJGMxCWytGtI5Uv0z8o= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id AE7703858438 for ; Thu, 31 Aug 2023 08:20:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AE7703858438 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="462235658" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="462235658" Received: from orsmga004.jf.intel.com ([10.7.209.38]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 01:20:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="862938687" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="862938687" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga004.jf.intel.com with ESMTP; 31 Aug 2023 01:20:31 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 95BB41005131; Thu, 31 Aug 2023 16:20:24 +0800 (CST) To: gcc-patches@gcc.gnu.org Subject: [PATCH 09/13] [APX EGPR] Handle legacy insn that only support GPR16 (1/5) Date: Thu, 31 Aug 2023 16:20:20 +0800 Message-Id: <20230831082024.314097-10-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775732516263295435 X-GMAIL-MSGID: 1775732516263295435 From: Kong Lingling These legacy insns in opcode map0/1 only support GPR16 and do not have vex/evex counterparts, so directly adjust 
the constraints and add the gpr32 attr to their patterns. insn list: 1. xsave/xsave64, xrstor/xrstor64 2. xsaves/xsaves64, xrstors/xrstors64 3. xsavec/xsavec64 4. xsaveopt/xsaveopt64 5. fxsave64/fxrstor64 gcc/ChangeLog: * config/i386/i386.md (): Set attr gpr32 0 and constraint Bt. (_rex64): Likewise. (_rex64): Likewise. (64): Likewise. (fxsave64): Likewise. (fxrstor64): Likewise. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add apxf check. * gcc.target/i386/apx-legacy-insn-check-norex2.c: New test. * gcc.target/i386/apx-legacy-insn-check-norex2-asm.c: New assembler test. --- gcc/config/i386/i386.md | 18 +++++++---- .../i386/apx-legacy-insn-check-norex2-asm.c | 5 ++++ .../i386/apx-legacy-insn-check-norex2.c | 30 +++++++++++++++++++ gcc/testsuite/lib/target-supports.exp | 10 +++++++ 4 files changed, 57 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c create mode 100644 gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index b9eaea78f00..83ad01b43c1 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -25626,11 +25626,12 @@ (define_insn "fxsave" (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "fxsave64" - [(set (match_operand:BLK 0 "memory_operand" "=m") + [(set (match_operand:BLK 0 "memory_operand" "=Bt") (unspec_volatile:BLK [(const_int 0)] UNSPECV_FXSAVE64))] "TARGET_64BIT && TARGET_FXSR" "fxsave64\t%0" [(set_attr "type" "other") + (set_attr "gpr32" "0") (set_attr "memory" "store") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) @@ -25646,11 +25647,12 @@ (define_insn "fxrstor" (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "fxrstor64" - [(unspec_volatile [(match_operand:BLK 0 "memory_operand" "m")] + [(unspec_volatile [(match_operand:BLK 0 "memory_operand" "Bt")] UNSPECV_FXRSTOR64)] "TARGET_64BIT && TARGET_FXSR" "fxrstor64\t%0" 
[(set_attr "type" "other") + (set_attr "gpr32" "0") (set_attr "memory" "load") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) @@ -25704,7 +25706,7 @@ (define_insn "" (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "_rex64" - [(set (match_operand:BLK 0 "memory_operand" "=m") + [(set (match_operand:BLK 0 "memory_operand" "=Bt") (unspec_volatile:BLK [(match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] @@ -25713,11 +25715,12 @@ (define_insn "_rex64" "\t%0" [(set_attr "type" "other") (set_attr "memory" "store") + (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "" - [(set (match_operand:BLK 0 "memory_operand" "=m") + [(set (match_operand:BLK 0 "memory_operand" "=Bt") (unspec_volatile:BLK [(match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] @@ -25726,6 +25729,7 @@ (define_insn "" "\t%0" [(set_attr "type" "other") (set_attr "memory" "store") + (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) @@ -25743,7 +25747,7 @@ (define_insn "" (define_insn "_rex64" [(unspec_volatile:BLK - [(match_operand:BLK 0 "memory_operand" "m") + [(match_operand:BLK 0 "memory_operand" "Bt") (match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] ANY_XRSTOR)] @@ -25751,12 +25755,13 @@ (define_insn "_rex64" "\t%0" [(set_attr "type" "other") (set_attr "memory" "load") + (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 3"))]) (define_insn "64" [(unspec_volatile:BLK - [(match_operand:BLK 0 "memory_operand" "m") + [(match_operand:BLK 0 "memory_operand" "Bt") (match_operand:SI 1 "register_operand" "a") (match_operand:SI 2 "register_operand" "d")] ANY_XRSTOR64)] @@ -25764,6 +25769,7 @@ (define_insn "64" "64\t%0" [(set_attr "type" "other") (set_attr "memory" "load") 
+ (set_attr "gpr32" "0") (set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 4"))]) diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c new file mode 100644 index 00000000000..7ecc861435f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2-asm.c @@ -0,0 +1,5 @@ +/* { dg-do assemble { target apxf } } */ +/* { dg-options "-O1 -mapxf -m64 -DDTYPE32" } */ + +#include "apx-legacy-insn-check-norex2.c" + diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c new file mode 100644 index 00000000000..1e5450dfb73 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mapxf -m64 -DDTYPE32" } */ + +#include + +typedef unsigned int u32; +typedef unsigned long long u64; + +#ifndef DTYPE32 +#define DTYPE32 +#endif + +#ifdef DTYPE32 +typedef u32 DTYPE; +#endif + +__attribute__((target("xsave,fxsr"))) +void legacy_test () +{ + register DTYPE* val __asm__("r16"); + _xsave64 (val, 1); + _xrstor64 (val, 1); + _fxsave64 (val); + _fxrstor64 (val); +} + +/* { dg-final { scan-assembler-not "xsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "xrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "fxsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "fxrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index d353cc0aaf0..6359408542a 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -9938,6 +9938,16 @@ proc check_effective_target_sm4 { } { } "-msm4" ] } +proc check_effective_target_apxf { } { + 
return [check_no_compiler_messages apxf object { + void + foo () + { + __asm__ volatile ("add\t%%r16, %%r31" ::); + } + } "-mapxf" ] +} + # Return 1 if sse instructions can be compiled. proc check_effective_target_sse { } { return [check_no_compiler_messages sse object { From patchwork Thu Aug 31 08:20:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137247 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c792:0:b0:3f2:4152:657d with SMTP id b18csp98231vqu; Thu, 31 Aug 2023 01:28:14 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGOV7NoXIuVOOWzcsAfjTjrNCAq+PvxZZ96dFfYDSGo4KWUm04MmVe5PJP1DmhhwPufY3FL X-Received: by 2002:a17:907:948b:b0:9a5:794f:f3c5 with SMTP id dm11-20020a170907948b00b009a5794ff3c5mr2266851ejc.6.1693470494269; Thu, 31 Aug 2023 01:28:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693470494; cv=none; d=google.com; s=arc-20160816; b=qEe/ZIWOw1B80E2FbcpXSk12wLm26WJMvyb+LEBfUpOurxwFcctgCw/7lgl+Rv5aX+ s5HJehAC2PHNpiO1lKE3dUZ2uqJMLD1nPPadBwGltAM1aK4zb4l1w1qieyuVJQRjjwey 6EVy3HIFQLiuXhMEtvha1nrQNPvj2mjOXDbO/ZJbWmrL4JHHsbcsAjCnUgLO33qq1IcZ 8pw4kdL1E5GK8lE2w5aO38t+133cSincMZTiMbYNpMFM3TISfC1tw5ThpOKxnHp9xVl6 RJv/1SACza3h2AORdVtM3ep8VyRel5DeGPPJ7M/kDTBMJzk4q0DqSbS2wiM6UaqFQu/o FjNg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=Px38u6yquKLkjVnIVlT1YzWBCHYj4ZT2VH+d+hyh1nA=; fh=t6VkRRFhh90/YyDrY4l675lM3BlOpES7S7srbNWOHSE=; b=AdjCFrTxN+Fw4rFUQovO0DrfPzjTdeNm6GIHvo9Lqe/drZrvkdKedKZTRHEavzhw/i 1ffveYKgR/wSSdUrIWqqS+I0oQI9d4IRloje+Dl8UsSXxjY1Ugk44PJP93Z5I21jv1wh mQFxqg3a85qMbLdfaXQ7uhGWHu8G/okEzHFWTpJEADUujMtS14tmAuwOjm1xdl7p24Er 
3tDHyykFpmx9wRBcvlWMbefzqwoduY+28qae+AJFMLv8igm3lK2hVC95I1pbDu5U+tvP jKgVBTeG3rkjSHjH1p/QQ5EQLO6jQhIuPzyk2p8H9fofMT/RaSIueIkdQdne4vSqu2fI eATg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=dPkKEfa0; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id se28-20020a170906ce5c00b00997e71d036csi652413ejb.678.2023.08.31.01.28.13 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Aug 2023 01:28:14 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=dPkKEfa0; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 34604385696F for ; Thu, 31 Aug 2023 08:24:00 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 34604385696F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693470240; bh=Px38u6yquKLkjVnIVlT1YzWBCHYj4ZT2VH+d+hyh1nA=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=dPkKEfa0ZFlZmcPuA9WNm5NRrlUA6+vj/9CDzqi7lyvixwCVem43mWHRn/33GBTjp jeOl+i9Bq4SY4t6LNsMbcb4asVulD89wuNvXSkr7GC/tnQvgwnHa18JGNVbhSxscGf 
iBzErtp2cT7b8p54F/MLl0aMGXwGfEtN/ykpRUa4= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id 14AEA3857700 for ; Thu, 31 Aug 2023 08:20:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 14AEA3857700 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="462235710" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="462235710" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 01:20:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="862938744" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="862938744" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga004.jf.intel.com with ESMTP; 31 Aug 2023 01:20:31 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 993DF1005132; Thu, 31 Aug 2023 16:20:24 +0800 (CST) To: gcc-patches@gcc.gnu.org Subject: [PATCH 10/13] [APX EGPR] Handle legacy insns that only support GPR16 (2/5) Date: Thu, 31 Aug 2023 16:20:21 +0800 Message-Id: <20230831082024.314097-11-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775732517010804527 X-GMAIL-MSGID: 1775732517010804527 From: Kong Lingling These legacy insns in opcode map2/3 have vex but no evex counterparts, so disable EGPR for them by adjusting alternatives and attr_gpr32. insn list: 1. phaddw/vphaddw, phaddd/vphaddd, phaddsw/vphaddsw 2. phsubw/vphsubw, phsubd/vphsubd, phsubsw/vphsubsw 3. psignb/vpsignb, psignw/vpsignw, psignd/vpsignd 4. blendps/vblendps, blendpd/vblendpd 5. blendvps/vblendvps, blendvpd/vblendvpd 6. pblendvb/vpblendvb, pblendw/vpblendw 7. mpsadbw/vmpsadbw 8. dpps/vdpps, dppd/vdppd 9. pcmpeqq/vpcmpeqq, pcmpgtq/vpcmpgtq gcc/ChangeLog: * config/i386/sse.md (avx2_phwv16hi3): Set attr gpr32 0 and use constraint Bt/BT for all mem alternatives. (ssse3_phwv8hi3): Likewise. (ssse3_phwv4hi3): Likewise. (avx2_phdv8si3): Likewise. (ssse3_phdv4si3): Likewise. (ssse3_phdv2si3): Likewise. (_psign3): Likewise. (ssse3_psign3): Likewise. (_blend_blendv_blendv_lt): Likewise. (*_blendv_not_ltint): Likewise. (_dp): Likewise. (_mpsadbw): Likewise. (_pblendvb): Likewise. (*_pblendvb_lt): Likewise. (sse4_1_pblend): Likewise. (*avx2_pblend): Likewise. (avx2_permv2ti): Likewise. (*avx_vperm2f128_nozero): Likewise. (*avx2_eq3): Likewise. (*sse4_1_eqv2di3): Likewise. (sse4_2_gtv2di3): Likewise. (avx2_gt3): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-legacy-insn-check-norex2.c: Add sse/vex intrinsic tests. 
--- gcc/config/i386/sse.md | 80 ++++++++----- .../i386/apx-legacy-insn-check-norex2.c | 106 ++++++++++++++++++ 2 files changed, 159 insertions(+), 27 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index bd6674d34f9..05963de9219 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -16837,7 +16837,7 @@ (define_insn "*avx2_eq3" [(set (match_operand:VI_256 0 "register_operand" "=x") (eq:VI_256 (match_operand:VI_256 1 "nonimmediate_operand" "%x") - (match_operand:VI_256 2 "nonimmediate_operand" "xm")))] + (match_operand:VI_256 2 "nonimmediate_operand" "xBt")))] "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "vpcmpeq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "ssecmp") @@ -16845,6 +16845,7 @@ (define_insn "*avx2_eq3" (if_then_else (eq (const_string "mode") (const_string "V4DImode")) (const_string "1") (const_string "*"))) + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -17027,7 +17028,7 @@ (define_insn "*sse4_1_eqv2di3" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x") (eq:V2DI (match_operand:V2DI 1 "vector_operand" "%0,0,x") - (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))] + (match_operand:V2DI 2 "vector_operand" "YrBT,*xBT,xBt")))] "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pcmpeqq\t{%2, %0|%0, %2} @@ -17035,6 +17036,7 @@ (define_insn "*sse4_1_eqv2di3" vpcmpeqq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -17043,7 +17045,7 @@ (define_insn "*sse2_eq3" [(set (match_operand:VI124_128 0 "register_operand" "=x,x") (eq:VI124_128 (match_operand:VI124_128 1 "vector_operand" "%0,x") - (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))] + (match_operand:VI124_128 2 "vector_operand" "xBm,xBt")))] "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ @@ -17058,7 +17060,7 
@@ (define_insn "sse4_2_gtv2di3" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x") (gt:V2DI (match_operand:V2DI 1 "register_operand" "0,0,x") - (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))] + (match_operand:V2DI 2 "vector_operand" "YrBT,*xBT,xBt")))] "TARGET_SSE4_2" "@ pcmpgtq\t{%2, %0|%0, %2} @@ -17066,6 +17068,7 @@ (define_insn "sse4_2_gtv2di3" vpcmpgtq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -17074,7 +17077,7 @@ (define_insn "avx2_gt3" [(set (match_operand:VI_256 0 "register_operand" "=x") (gt:VI_256 (match_operand:VI_256 1 "register_operand" "x") - (match_operand:VI_256 2 "nonimmediate_operand" "xm")))] + (match_operand:VI_256 2 "nonimmediate_operand" "xBt")))] "TARGET_AVX2" "vpcmpgt\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "ssecmp") @@ -17082,6 +17085,7 @@ (define_insn "avx2_gt3" (if_then_else (eq (const_string "mode") (const_string "V4DImode")) (const_string "1") (const_string "*"))) + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -17105,7 +17109,7 @@ (define_insn "*sse2_gt3" [(set (match_operand:VI124_128 0 "register_operand" "=x,x") (gt:VI124_128 (match_operand:VI124_128 1 "register_operand" "0,x") - (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))] + (match_operand:VI124_128 2 "vector_operand" "xBm,xBt")))] "TARGET_SSE2" "@ pcmpgt\t{%2, %0|%0, %2} @@ -21228,7 +21232,7 @@ (define_insn "avx2_phwv16hi3" (vec_select:V16HI (vec_concat:V32HI (match_operand:V16HI 1 "register_operand" "x") - (match_operand:V16HI 2 "nonimmediate_operand" "xm")) + (match_operand:V16HI 2 "nonimmediate_operand" "xBt")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 16) (const_int 18) (const_int 20) (const_int 22) @@ -21244,6 +21248,7 @@ (define_insn "avx2_phwv16hi3" "TARGET_AVX2" "vphw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" 
"sseiadd") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -21254,7 +21259,7 @@ (define_insn "ssse3_phwv8hi3" (vec_select:V8HI (vec_concat:V16HI (match_operand:V8HI 1 "register_operand" "0,x") - (match_operand:V8HI 2 "vector_operand" "xBm,xm")) + (match_operand:V8HI 2 "vector_operand" "xBT,xBt")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 8) (const_int 10) (const_int 12) (const_int 14)])) @@ -21269,6 +21274,7 @@ (define_insn "ssse3_phwv8hi3" vphw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex") @@ -21280,7 +21286,7 @@ (define_insn_and_split "ssse3_phwv4hi3" (vec_select:V4HI (vec_concat:V8HI (match_operand:V4HI 1 "register_operand" "0,0,x") - (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,x")) + (match_operand:V4HI 2 "register_mmxmem_operand" "yBt,x,x")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)])) (vec_select:V4HI @@ -21309,6 +21315,7 @@ (define_insn_and_split "ssse3_phwv4hi3" } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) @@ -21320,7 +21327,7 @@ (define_insn "avx2_phdv8si3" (vec_select:V8SI (vec_concat:V16SI (match_operand:V8SI 1 "register_operand" "x") - (match_operand:V8SI 2 "nonimmediate_operand" "xm")) + (match_operand:V8SI 2 "nonimmediate_operand" "xBt")) (parallel [(const_int 0) (const_int 2) (const_int 8) (const_int 10) (const_int 4) (const_int 6) (const_int 12) (const_int 14)])) @@ -21332,6 +21339,7 @@ (define_insn "avx2_phdv8si3" "TARGET_AVX2" "vphd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") 
(set_attr "mode" "OI")]) @@ -21342,7 +21350,7 @@ (define_insn "ssse3_phdv4si3" (vec_select:V4SI (vec_concat:V8SI (match_operand:V4SI 1 "register_operand" "0,x") - (match_operand:V4SI 2 "vector_operand" "xBm,xm")) + (match_operand:V4SI 2 "vector_operand" "xBT,xBt")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)])) (vec_select:V4SI @@ -21355,6 +21363,7 @@ (define_insn "ssse3_phdv4si3" vphd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_data16" "1,*") (set_attr "prefix_extra" "1") @@ -21367,7 +21376,7 @@ (define_insn_and_split "ssse3_phdv2si3" (vec_select:V2SI (vec_concat:V4SI (match_operand:V2SI 1 "register_operand" "0,0,x") - (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,x")) + (match_operand:V2SI 2 "register_mmxmem_operand" "yBt,x,x")) (parallel [(const_int 0) (const_int 2)])) (vec_select:V2SI (vec_concat:V4SI (match_dup 1) (match_dup 2)) @@ -21394,6 +21403,7 @@ (define_insn_and_split "ssse3_phdv2si3" } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) @@ -21848,7 +21858,7 @@ (define_insn "_psign3" [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x") (unspec:VI124_AVX2 [(match_operand:VI124_AVX2 1 "register_operand" "0,x") - (match_operand:VI124_AVX2 2 "vector_operand" "xBm,xm")] + (match_operand:VI124_AVX2 2 "vector_operand" "xBT,xBt")] UNSPEC_PSIGN))] "TARGET_SSSE3" "@ @@ -21856,6 +21866,7 @@ (define_insn "_psign3" vpsign\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex") (set_attr "mode" "")]) @@ -21864,7 +21875,7 @@ (define_insn "ssse3_psign3" [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,x") 
(unspec:MMXMODEI [(match_operand:MMXMODEI 1 "register_operand" "0,0,x") - (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,x")] + (match_operand:MMXMODEI 2 "register_mmxmem_operand" "yBt,x,x")] UNSPEC_PSIGN))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" "@ @@ -21874,6 +21885,7 @@ (define_insn "ssse3_psign3" [(set_attr "isa" "*,noavx,avx") (set_attr "mmx_isa" "native,*,*") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) (set_attr "mode" "DI,TI,TI")]) @@ -22153,7 +22165,7 @@ (define_mode_attr blendbits (define_insn "_blend" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (vec_merge:VF_128_256 - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:VF_128_256 1 "register_operand" "0,0,x") (match_operand:SI 3 "const_0_to__operand")))] "TARGET_SSE4_1" @@ -22163,6 +22175,7 @@ (define_insn "_blend" vblend\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22173,7 +22186,7 @@ (define_insn "_blendv" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:VF_128_256 3 "register_operand" "Yz,Yz,x")] UNSPEC_BLENDV))] "TARGET_SSE4_1" @@ -22183,6 +22196,7 @@ (define_insn "_blendv" vblendv\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22234,7 +22248,7 @@ (define_insn_and_split 
"*_blendv_lt" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (lt:VF_128_256 (match_operand: 3 "register_operand" "Yz,Yz,x") (match_operand: 4 "const0_operand"))] @@ -22248,6 +22262,7 @@ (define_insn_and_split "*_blendv_lt" "operands[3] = gen_lowpart (mode, operands[3]);" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22266,7 +22281,7 @@ (define_insn_and_split "*_blendv_ltint" [(set (match_operand: 0 "register_operand" "=Yr,*x,x") (unspec: [(match_operand: 1 "register_operand" "0,0,x") - (match_operand: 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand: 2 "vector_operand" "YrBT,*xBT,xBt") (subreg: (lt:VI48_AVX (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x") @@ -22286,6 +22301,7 @@ (define_insn_and_split "*_blendv_ltint" } [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22324,7 +22340,7 @@ (define_insn "_dp" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "vector_operand" "%0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_DP))] "TARGET_SSE4_1" @@ -22334,6 +22350,7 @@ (define_insn "_dp" vdp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemul") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22362,7 +22379,7 @@ (define_insn "_mpsadbw" 
[(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_MPSADBW))] "TARGET_SSE4_1" @@ -22372,6 +22389,7 @@ (define_insn "_mpsadbw" vmpsadbw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") @@ -22400,7 +22418,7 @@ (define_insn "_pblendvb" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x")] UNSPEC_BLENDV))] "TARGET_SSE4_1" @@ -22410,6 +22428,7 @@ (define_insn "_pblendvb" vpblendvb\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "*,*,1") (set_attr "prefix" "orig,orig,vex") @@ -22449,7 +22468,7 @@ (define_insn_and_split "*_pblendvb_lt" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt") (lt:VI1_AVX2 (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x") (match_operand:VI1_AVX2 4 "const0_operand"))] UNSPEC_BLENDV))] @@ -22462,6 +22481,7 @@ (define_insn_and_split "*_pblendvb_lt" "" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "*,*,1") (set_attr "prefix" 
"orig,orig,vex") @@ -22493,7 +22513,7 @@ (define_insn_and_split "*_pblendvb_lt_subreg_not" (define_insn "sse4_1_pblend" [(set (match_operand:V8_128 0 "register_operand" "=Yr,*x,x") (vec_merge:V8_128 - (match_operand:V8_128 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:V8_128 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:V8_128 1 "register_operand" "0,0,x") (match_operand:SI 3 "const_0_to_255_operand")))] "TARGET_SSE4_1" @@ -22503,6 +22523,7 @@ (define_insn "sse4_1_pblend" vpblendw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,orig,vex") @@ -22565,7 +22586,7 @@ (define_expand "avx2_pblend_1" (define_insn "*avx2_pblend" [(set (match_operand:V16_256 0 "register_operand" "=x") (vec_merge:V16_256 - (match_operand:V16_256 2 "nonimmediate_operand" "xm") + (match_operand:V16_256 2 "nonimmediate_operand" "xBt") (match_operand:V16_256 1 "register_operand" "x") (match_operand:SI 3 "avx2_pblendw_operand")))] "TARGET_AVX2" @@ -22574,6 +22595,7 @@ (define_insn "*avx2_pblend" return "vpblendw\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -22582,7 +22604,7 @@ (define_insn "*avx2_pblend" (define_insn "avx2_pblendd" [(set (match_operand:VI4_AVX2 0 "register_operand" "=x") (vec_merge:VI4_AVX2 - (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xm") + (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xBt") (match_operand:VI4_AVX2 1 "register_operand" "x") (match_operand:SI 3 "const_0_to_255_operand")))] "TARGET_AVX2" @@ -26443,11 +26465,13 @@ (define_insn "avx512f_perm_1" (set_attr "prefix" "") (set_attr "mode" "")]) +;; TODO (APX): vmovaps supports EGPR but not others, could split +;; pattern to enable gpr32 for this one. 
(define_insn "avx2_permv2ti" [(set (match_operand:V4DI 0 "register_operand" "=x") (unspec:V4DI [(match_operand:V4DI 1 "register_operand" "x") - (match_operand:V4DI 2 "nonimmediate_operand" "xm") + (match_operand:V4DI 2 "nonimmediate_operand" "xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_VPERMTI))] "TARGET_AVX2" @@ -26474,6 +26498,7 @@ (define_insn "avx2_permv2ti" return "vperm2i128\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -27089,7 +27114,7 @@ (define_insn "*avx_vperm2f128_nozero" (vec_select:AVX256MODE2P (vec_concat: (match_operand:AVX256MODE2P 1 "register_operand" "x") - (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xm")) + (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xBt")) (match_parallel 3 "" [(match_operand 4 "const_int_operand")])))] "TARGET_AVX @@ -27106,6 +27131,7 @@ (define_insn "*avx_vperm2f128_nozero" return "vperm2\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c index 1e5450dfb73..510213a6ca7 100644 --- a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c @@ -28,3 +28,109 @@ void legacy_test () /* { dg-final { scan-assembler-not "xrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "fxsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "fxrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ + +#ifdef DTYPE +#undef DTYPE +#define DTYPE u64 +#endif + +typedef union +{ + __m128i xi[8]; + __m128 xf[8]; + __m128d xd[8]; + __m256i yi[4]; + __m256 yf[4]; + __m256d yd[4]; + DTYPE a[16]; +} 
tmp_u; + +__attribute__((target("sse4.2"))) +void sse_test () +{ + register tmp_u *tdst __asm__("%r16"); + register tmp_u *src1 __asm__("%r17"); + register tmp_u *src2 __asm__("%r18"); + + src1->xi[0] = _mm_hadd_epi16 (tdst->xi[2], src2->xi[3]); + src1->xi[1] = _mm_hadd_epi32 (tdst->xi[0], src2->xi[1]); + tdst->xi[2] = _mm_hadds_epi16 (src1->xi[4], src2->xi[5]); + tdst->xi[3] = _mm_hsub_epi16 (src1->xi[6], src2->xi[7]); + tdst->xi[4] = _mm_hsub_epi32 (src1->xi[0], src2->xi[1]); + tdst->xi[5] = _mm_hsubs_epi16 (src1->xi[2], src2->xi[3]); + + src1->xi[6] = _mm_cmpeq_epi64 (tdst->xi[4], src2->xi[5]); + src1->xi[7] = _mm_cmpgt_epi64 (tdst->xi[6], src2->xi[7]); + + tdst->xf[0] = _mm_dp_ps (src1->xf[0], src2->xf[1], 0xbf); + tdst->xd[1] = _mm_dp_pd (src1->xd[2], src2->xd[3], 0xae); + + tdst->xi[2] = _mm_mpsadbw_epu8 (src1->xi[4], src2->xi[5], 0xc1); + + tdst->xi[3] = _mm_blend_epi16 (src1->xi[6], src2->xi[7], 0xc); + tdst->xi[4] = _mm_blendv_epi8 (src1->xi[0], src2->xi[1], tdst->xi[2]); + tdst->xf[5] = _mm_blend_ps (src1->xf[3], src2->xf[4], 0x4); + tdst->xf[6] = _mm_blendv_ps (src1->xf[5], src2->xf[6], tdst->xf[7]); + tdst->xd[7] = _mm_blend_pd (tdst->xd[0], src1->xd[1], 0x1); + tdst->xd[0] = _mm_blendv_pd (src1->xd[2], src2->xd[3], tdst->xd[4]); + + tdst->xi[1] = _mm_sign_epi8 (src1->xi[5], src2->xi[6]); + tdst->xi[2] = _mm_sign_epi16 (src1->xi[7], src2->xi[0]); + tdst->xi[3] = _mm_sign_epi32 (src1->xi[1], src2->xi[2]); +} + +__attribute__((target("avx2"))) +void vex_test () +{ + + register tmp_u *tdst __asm__("%r16"); + register tmp_u *src1 __asm__("%r17"); + register tmp_u *src2 __asm__("%r18"); + + src1->yi[1] = _mm256_hadd_epi16 (tdst->yi[2], src2->yi[3]); + src1->yi[2] = _mm256_hadd_epi32 (tdst->yi[0], src2->yi[1]); + tdst->yi[3] = _mm256_hadds_epi16 (src1->yi[1], src2->yi[2]); + tdst->yi[0] = _mm256_hsub_epi16 (src1->yi[3], src2->yi[0]); + tdst->yi[1] = _mm256_hsub_epi32 (src1->yi[0], src2->yi[1]); + tdst->yi[2] = _mm256_hsubs_epi16 (src1->yi[2], src2->yi[3]); + 
+ src1->yi[2] = _mm256_cmpeq_epi64 (tdst->yi[1], src2->yi[2]); + src1->yi[1] = _mm256_cmpgt_epi64 (tdst->yi[3], src2->yi[0]); + + tdst->yf[2] = _mm256_dp_ps (src1->yf[0], src2->yf[1], 0xbf); + tdst->xd[3] = _mm_dp_pd (src1->xd[0], src2->xd[1], 0xbf); + + tdst->yi[3] = _mm256_mpsadbw_epu8 (src1->yi[1], src2->yi[1], 0xc1); + + tdst->yi[0] = _mm256_blend_epi16 (src1->yi[1], src2->yi[2], 0xc); + tdst->yi[1] = _mm256_blendv_epi8 (src1->yi[1], src2->yi[2], tdst->yi[0]); + tdst->yf[2] = _mm256_blend_ps (src1->yf[0], src2->yf[1], 0x4); + tdst->yf[3] = _mm256_blendv_ps (src1->yf[2], src2->yf[3], tdst->yf[1]); + tdst->yd[3] = _mm256_blend_pd (tdst->yd[1], src1->yd[0], 0x1); + tdst->yd[1] = _mm256_blendv_pd (src1->yd[2], src2->yd[3], tdst->yd[2]); + + tdst->yi[2] = _mm256_sign_epi8 (src1->yi[0], src2->yi[1]); + tdst->yi[3] = _mm256_sign_epi16 (src1->yi[2], src2->yi[3]); + tdst->yi[0] = _mm256_sign_epi32 (src1->yi[0], src2->yi[1]); +} + +/* { dg-final { scan-assembler-not "v?pcmpeqq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpgtq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?dpps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?dppd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psadbw\[ 
\\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?pblendw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?pblendvb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendvps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendvpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?psignb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?psignw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?psignd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */

From patchwork Thu Aug 31 08:20:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 137242
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 11/13] [APX EGPR] Handle legacy insns that only support GPR16 (3/5)
Date: Thu, 31 Aug 2023 16:20:22 +0800
Message-Id: <20230831082024.314097-12-hongyu.wang@intel.com>
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>
References: <20230831082024.314097-1-hongyu.wang@intel.com>
From: Hongyu Wang
Reply-To: Hongyu Wang
Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz

From: Kong Lingling

Disable EGPR usage for the following legacy insns in opcode map2/3 that
have a vex but no evex counterpart.

insn list:
1. phminposuw/vphminposuw
2. ptest/vptest
3. roundps/vroundps, roundpd/vroundpd, roundss/vroundss, roundsd/vroundsd
4. pcmpestri/vpcmpestri, pcmpestrm/vpcmpestrm
5. pcmpistri/vpcmpistri, pcmpistrm/vpcmpistrm
6. aesimc/vaesimc, aeskeygenassist/vaeskeygenassist

gcc/ChangeLog:

        * config/i386/i386-protos.h (x86_evex_reg_mentioned_p): New
        prototype.
        * config/i386/i386.cc (x86_evex_reg_mentioned_p): New function.
        * config/i386/i386.md (sse4_1_round2): Set attr gpr32 0 and
        constraint Bt/BM to all non-evex alternatives, adjust
        alternative outputs if evex reg is mentioned.
        * config/i386/sse.md (_ptest): Set attr gpr32 0 and constraint
        Bt/BM to all non-evex alternatives.
        (ptesttf2): Likewise.
        (_round): Likewise.
        (sse4_2_pcmpestri): Likewise.
        (sse4_2_pcmpestrm): Likewise.
        (sse4_2_pcmpestr_cconly): Likewise.
        (sse4_2_pcmpistr): Likewise.
        (sse4_2_pcmpistri): Likewise.
        (sse4_2_pcmpistrm): Likewise.
        (sse4_2_pcmpistr_cconly): Likewise.
        (aesimc): Likewise.
        (aeskeygenassist): Likewise.

gcc/testsuite/ChangeLog:

        * gcc.target/i386/apx-legacy-insn-check-norex2.c: Add intrinsic
        tests.
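
For readers unfamiliar with the check that the ChangeLog names x86_evex_reg_mentioned_p, the following is a minimal standalone sketch of the idea in C. It is not GCC code: `struct operand`, `enum operand_kind`, `is_ext_sse_reg`, `mentions_egpr` and `demo` are hypothetical stand-ins for GCC's rtx machinery, and the model assumes that only xmm16..xmm31 and the APX extended GPRs r16..r31 force an EVEX encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand model; GCC itself walks rtx expressions instead.  */
enum operand_kind { OP_IMM, OP_XMM, OP_MEM };

struct operand
{
  enum operand_kind kind;
  int xmm_regno;   /* Used when kind == OP_XMM: 0..31.  */
  int base_gpr;    /* Used when kind == OP_MEM: 0..31, or -1 if none.  */
  int index_gpr;   /* Used when kind == OP_MEM: 0..31, or -1 if none.  */
};

/* xmm16..xmm31 cannot be encoded with a VEX prefix.  */
static bool
is_ext_sse_reg (const struct operand *op)
{
  return op->kind == OP_XMM && op->xmm_regno >= 16 && op->xmm_regno <= 31;
}

/* r16..r31 are the APX extended GPRs (EGPRs); in this model they can
   only appear as the base or index of a memory operand.  */
static bool
mentions_egpr (const struct operand *op)
{
  if (op->kind != OP_MEM)
    return false;
  return (op->base_gpr >= 16 && op->base_gpr <= 31)
         || (op->index_gpr >= 16 && op->index_gpr <= 31);
}

/* Return true if any of the first NOPS operands forces an EVEX encoding.  */
bool
evex_reg_mentioned_p (const struct operand operands[], size_t nops)
{
  for (size_t i = 0; i < nops; i++)
    if (is_ext_sse_reg (&operands[i]) || mentions_egpr (&operands[i]))
      return true;
  return false;
}

/* Two illustrative operand sets: one that a VEX encoding can handle,
   and one whose memory operand is addressed through r17.  */
void
demo (void)
{
  struct operand vex_ok[3] = {
    { OP_XMM, 1, -1, -1 },
    { OP_XMM, 2, -1, -1 },
    { OP_MEM, -1, 3, -1 },
  };
  struct operand needs_evex[3] = {
    { OP_XMM, 1, -1, -1 },
    { OP_XMM, 2, -1, -1 },
    { OP_MEM, -1, 17, -1 },
  };
  assert (!evex_reg_mentioned_p (vex_ok, 3));
  assert (evex_reg_mentioned_p (needs_evex, 3));
}
```

In the patch itself, the scalar round patterns use a check of this kind in their last alternative to choose between vrndscale (EVEX) and vround (VEX).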
--- gcc/config/i386/i386-protos.h | 1 + gcc/config/i386/i386.cc | 13 +++ gcc/config/i386/i386.md | 3 +- gcc/config/i386/sse.md | 93 +++++++++++++------ .../i386/apx-legacy-insn-check-norex2.c | 55 ++++++++++- 5 files changed, 132 insertions(+), 33 deletions(-) diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 78eb3e0f584..bbb219e3039 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -65,6 +65,7 @@ extern bool extended_reg_mentioned_p (rtx); extern bool x86_extended_QIreg_mentioned_p (rtx_insn *); extern bool x86_extended_reg_mentioned_p (rtx); extern bool x86_extended_rex2reg_mentioned_p (rtx); +extern bool x86_evex_reg_mentioned_p (rtx [], int); extern bool x86_maybe_negate_const_int (rtx *, machine_mode); extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index f5d642948bc..ec93c5bab97 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -22936,6 +22936,19 @@ x86_extended_rex2reg_mentioned_p (rtx insn) return false; } +/* Return true when any of the first NOPS rtx OPERANDS mentions a register + that must be encoded using the EVEX prefix. */ +bool +x86_evex_reg_mentioned_p (rtx operands[], int nops) +{ + int i; + for (i = 0; i < nops; i++) + if (EXT_REX_SSE_REG_P (operands[i]) + || x86_extended_rex2reg_mentioned_p (operands[i])) + return true; + return false; +} + /* If profitable, negate (without causing overflow) integer constant of mode MODE at location LOC. Return true in this case. 
*/ bool diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 83ad01b43c1..4c305e72389 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -21603,7 +21603,7 @@ (define_expand "significand2" (define_insn "sse4_1_round2" [(set (match_operand:MODEFH 0 "register_operand" "=x,x,x,v,v") (unspec:MODEFH - [(match_operand:MODEFH 1 "nonimmediate_operand" "0,x,m,v,m") + [(match_operand:MODEFH 1 "nonimmediate_operand" "0,x,Bt,v,m") (match_operand:SI 2 "const_0_to_15_operand")] UNSPEC_ROUND))] "TARGET_SSE4_1" @@ -21616,6 +21616,7 @@ (define_insn "sse4_1_round2" [(set_attr "type" "ssecvt") (set_attr "prefix_extra" "1,1,1,*,*") (set_attr "length_immediate" "1") + (set_attr "gpr32" "1,1,0,1,1") (set_attr "prefix" "maybe_vex,maybe_vex,maybe_vex,evex,evex") (set_attr "isa" "noavx512f,noavx512f,noavx512f,avx512f,avx512f") (set_attr "avx_partial_xmm_update" "false,false,true,false,true") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 05963de9219..456713b991a 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -22617,11 +22617,12 @@ (define_insn "avx2_pblendd" (define_insn "sse4_1_phminposuw" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,x") - (unspec:V8HI [(match_operand:V8HI 1 "vector_operand" "YrBm,*xBm,xm")] + (unspec:V8HI [(match_operand:V8HI 1 "vector_operand" "YrBT,*xBT,xBt")] UNSPEC_PHMINPOSUW))] "TARGET_SSE4_1" "%vphminposuw\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") @@ -23810,12 +23811,13 @@ (define_insn "avx_vtest" (define_insn "*_ptest" [(set (reg FLAGS_REG) (unspec [(match_operand:V_AVX 0 "register_operand" "Yr, *x, x") - (match_operand:V_AVX 1 "vector_operand" "YrBm, *xBm, xm")] + (match_operand:V_AVX 1 "vector_operand" "YrBT, *xBT, xBt")] UNSPEC_PTEST))] "TARGET_SSE4_1 && ix86_match_ptest_ccmode (insn)" "%vptest\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") 
(set_attr "type" "ssecomi") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set (attr "btver2_decode") @@ -23852,12 +23854,13 @@ (define_expand "_ptest" (define_insn "ptesttf2" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:TF 0 "register_operand" "Yr, *x, x") - (match_operand:TF 1 "vector_operand" "YrBm, *xBm, xm")] + (match_operand:TF 1 "vector_operand" "YrBT, *xBT, xBt")] UNSPEC_PTEST))] "TARGET_SSE4_1" "%vptest\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecomi") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -23968,13 +23971,14 @@ (define_expand "lrint2" (define_insn "_round" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 - [(match_operand:VF_128_256 1 "vector_operand" "YrBm,*xBm,xm") + [(match_operand:VF_128_256 1 "vector_operand" "YrBT,*xBT,xBt") (match_operand:SI 2 "const_0_to_15_operand")] UNSPEC_ROUND))] "TARGET_SSE4_1" "%vround\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecvt") + (set_attr "gpr32" "0") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -24061,19 +24065,32 @@ (define_insn "sse4_1_round" [(set (match_operand:VF_128 0 "register_operand" "=Yr,*x,x,v") (vec_merge:VF_128 (unspec:VF_128 - [(match_operand:VF_128 2 "nonimmediate_operand" "Yrm,*xm,xm,vm") + [(match_operand:VF_128 2 "nonimmediate_operand" "YrBt,*xBt,xBt,vm") (match_operand:SI 3 "const_0_to_15_operand")] UNSPEC_ROUND) (match_operand:VF_128 1 "register_operand" "0,0,x,v") (const_int 1)))] "TARGET_SSE4_1" - "@ - round\t{%3, %2, %0|%0, %2, %3} - round\t{%3, %2, %0|%0, %2, %3} - vround\t{%3, %2, %1, %0|%0, %1, %2, %3} - vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}" - [(set_attr "isa" "noavx,noavx,avx,avx512f") +{ + switch (which_alternative) + { + case 0: + case 1: + return "round\t{%3, %2, %0|%0, %2, %3}"; + case 2: 
+ return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + case 3: + if (x86_evex_reg_mentioned_p (operands, 3)) + return "vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + else + return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + default: + gcc_unreachable (); + } +} + [(set_attr "isa" "noavx,noavx,noavx512f,avx512f") (set_attr "type" "ssecvt") + (set_attr "gpr32" "0,0,0,1") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*,*") (set_attr "prefix_extra" "1") @@ -24085,19 +24102,32 @@ (define_insn "*sse4_1_round" (vec_merge:VFH_128 (vec_duplicate:VFH_128 (unspec: - [(match_operand: 2 "nonimmediate_operand" "Yrm,*xm,xm,vm") + [(match_operand: 2 "nonimmediate_operand" "YrBt,*xBt,xBt,vm") (match_operand:SI 3 "const_0_to_15_operand")] UNSPEC_ROUND)) (match_operand:VFH_128 1 "register_operand" "0,0,x,v") (const_int 1)))] "TARGET_SSE4_1" - "@ - round\t{%3, %2, %0|%0, %2, %3} - round\t{%3, %2, %0|%0, %2, %3} - vround\t{%3, %2, %1, %0|%0, %1, %2, %3} - vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}" - [(set_attr "isa" "noavx,noavx,avx,avx512f") +{ + switch (which_alternative) + { + case 0: + case 1: + return "round\t{%3, %2, %0|%0, %2, %3}"; + case 2: + return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + case 3: + if (x86_evex_reg_mentioned_p (operands, 3) || mode == V8HFmode) + return "vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + else + return "vround\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + default: + gcc_unreachable (); + } +} + [(set_attr "isa" "noavx,noavx,noavx512f,avx512f") (set_attr "type" "ssecvt") + (set_attr "gpr32" "0,0,0,1") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*,*") (set_attr "prefix_extra" "1") @@ -24318,7 +24348,7 @@ (define_insn "sse4_2_pcmpestri" (unspec:SI [(match_operand:V16QI 1 "register_operand" "x,x") (match_operand:SI 2 "register_operand" "a,a") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt") (match_operand:SI 4 "register_operand" "d,d") (match_operand:SI 5 
"const_0_to_255_operand")] UNSPEC_PCMPESTR)) @@ -24333,6 +24363,7 @@ (define_insn "sse4_2_pcmpestri" "TARGET_SSE4_2" "%vpcmpestri\t{%5, %3, %1|%1, %3, %5}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_vex") (set_attr "length_immediate" "1") @@ -24345,7 +24376,7 @@ (define_insn "sse4_2_pcmpestrm" (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "x,x") (match_operand:SI 2 "register_operand" "a,a") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt") (match_operand:SI 4 "register_operand" "d,d") (match_operand:SI 5 "const_0_to_255_operand")] UNSPEC_PCMPESTR)) @@ -24360,6 +24391,7 @@ (define_insn "sse4_2_pcmpestrm" "TARGET_SSE4_2" "%vpcmpestrm\t{%5, %3, %1|%1, %3, %5}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -24372,7 +24404,7 @@ (define_insn "sse4_2_pcmpestr_cconly" (unspec:CC [(match_operand:V16QI 2 "register_operand" "x,x,x,x") (match_operand:SI 3 "register_operand" "a,a,a,a") - (match_operand:V16QI 4 "nonimmediate_operand" "x,m,x,m") + (match_operand:V16QI 4 "nonimmediate_operand" "x,Bt,x,Bt") (match_operand:SI 5 "register_operand" "d,d,d,d") (match_operand:SI 6 "const_0_to_255_operand")] UNSPEC_PCMPESTR)) @@ -24385,6 +24417,7 @@ (define_insn "sse4_2_pcmpestr_cconly" %vpcmpestri\t{%6, %4, %2|%2, %4, %6} %vpcmpestri\t{%6, %4, %2|%2, %4, %6}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "memory" "none,load,none,load") @@ -24396,7 +24429,7 @@ (define_insn_and_split "sse4_2_pcmpistr" [(set (match_operand:SI 0 "register_operand" "=c,c") (unspec:SI [(match_operand:V16QI 2 "register_operand" "x,x") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt") (match_operand:SI 4 "const_0_to_255_operand")] 
UNSPEC_PCMPISTR)) (set (match_operand:V16QI 1 "register_operand" "=Yz,Yz") @@ -24439,6 +24472,7 @@ (define_insn_and_split "sse4_2_pcmpistr" DONE; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "memory" "none,load") @@ -24448,7 +24482,7 @@ (define_insn "sse4_2_pcmpistri" [(set (match_operand:SI 0 "register_operand" "=c,c") (unspec:SI [(match_operand:V16QI 1 "register_operand" "x,x") - (match_operand:V16QI 2 "nonimmediate_operand" "x,m") + (match_operand:V16QI 2 "nonimmediate_operand" "x,Bt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_PCMPISTR)) (set (reg:CC FLAGS_REG) @@ -24460,6 +24494,7 @@ (define_insn "sse4_2_pcmpistri" "TARGET_SSE4_2" "%vpcmpistri\t{%3, %2, %1|%1, %2, %3}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -24471,7 +24506,7 @@ (define_insn "sse4_2_pcmpistrm" [(set (match_operand:V16QI 0 "register_operand" "=Yz,Yz") (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "x,x") - (match_operand:V16QI 2 "nonimmediate_operand" "x,m") + (match_operand:V16QI 2 "nonimmediate_operand" "x,Bt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_PCMPISTR)) (set (reg:CC FLAGS_REG) @@ -24483,6 +24518,7 @@ (define_insn "sse4_2_pcmpistrm" "TARGET_SSE4_2" "%vpcmpistrm\t{%3, %2, %1|%1, %2, %3}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -24494,7 +24530,7 @@ (define_insn "sse4_2_pcmpistr_cconly" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:V16QI 2 "register_operand" "x,x,x,x") - (match_operand:V16QI 3 "nonimmediate_operand" "x,m,x,m") + (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt,x,Bt") (match_operand:SI 4 "const_0_to_255_operand")] UNSPEC_PCMPISTR)) (clobber (match_scratch:V16QI 0 "=Yz,Yz,X,X")) @@ -24506,6 +24542,7 @@ (define_insn 
"sse4_2_pcmpistr_cconly" %vpcmpistri\t{%4, %3, %2|%2, %3, %4} %vpcmpistri\t{%4, %3, %2|%2, %3, %4}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "memory" "none,load,none,load") @@ -25990,23 +26027,25 @@ (define_insn "aesdeclast" (define_insn "aesimc" [(set (match_operand:V2DI 0 "register_operand" "=x") - (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBm")] + (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBT")] UNSPEC_AESIMC))] "TARGET_AES" "%vaesimc\t{%1, %0|%0, %1}" [(set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "aeskeygenassist" [(set (match_operand:V2DI 0 "register_operand" "=x") - (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBm") + (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBT") (match_operand:SI 2 "const_0_to_255_operand")] UNSPEC_AESKEYGENASSIST))] "TARGET_AES" "%vaeskeygenassist\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c index 510213a6ca7..771bcb078e1 100644 --- a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c @@ -45,13 +45,22 @@ typedef union DTYPE a[16]; } tmp_u; -__attribute__((target("sse4.2"))) +__attribute__((target("sse4.2,aes"))) void sse_test () { register tmp_u *tdst __asm__("%r16"); register tmp_u *src1 __asm__("%r17"); register tmp_u *src2 __asm__("%r18"); - + + src1->xi[0] = _mm_minpos_epu16 (src1->xi[1]); + src1->a[2] = _mm_testc_si128 (src1->xi[3], src2->xi[4]); + src1->xf[3] = _mm_round_ss (src1->xf[5], src2->xf[6], + _MM_FROUND_CUR_DIRECTION); + src1->xf[4] = _mm_round_ps 
(src1->xf[7], _MM_FROUND_CUR_DIRECTION); + src1->xd[0] = _mm_round_sd (src1->xd[2], src2->xd[3], + _MM_FROUND_CUR_DIRECTION); + src1->xd[1] = _mm_round_pd (src1->xd[4], _MM_FROUND_CUR_DIRECTION); + src1->xi[0] = _mm_hadd_epi16 (tdst->xi[2], src2->xi[3]); src1->xi[1] = _mm_hadd_epi32 (tdst->xi[0], src2->xi[1]); tdst->xi[2] = _mm_hadds_epi16 (src1->xi[4], src2->xi[5]); @@ -77,16 +86,33 @@ void sse_test () tdst->xi[1] = _mm_sign_epi8 (src1->xi[5], src2->xi[6]); tdst->xi[2] = _mm_sign_epi16 (src1->xi[7], src2->xi[0]); tdst->xi[3] = _mm_sign_epi32 (src1->xi[1], src2->xi[2]); + + tdst->a[2] = _mm_cmpestri (src1->xi[3], 16, src2->xi[4], 16, 0x0c); + tdst->xi[4] = _mm_cmpestrm (src1->xi[3], 16, src2->xi[4], 16, 0x20); + tdst->a[5] = _mm_cmpistri (src1->xi[5], src2->xi[6], 0x30); + tdst->xi[6] = _mm_cmpistrm (src1->xi[5], src2->xi[6], 0x40); + + tdst->xi[7] = _mm_aesimc_si128 (src1->xi[7]); + tdst->xi[0] = _mm_aeskeygenassist_si128 (src1->xi[1], 0x1b); } -__attribute__((target("avx2"))) +__attribute__((target("avx2,aes"))) void vex_test () { register tmp_u *tdst __asm__("%r16"); register tmp_u *src1 __asm__("%r17"); register tmp_u *src2 __asm__("%r18"); - + + src1->xi[0] = _mm_minpos_epu16 (src1->xi[1]); + src1->a[2] = _mm256_testc_si256 (src1->yi[2], src2->yi[3]); + src1->xf[3] = _mm_round_ss (src1->xf[5], src2->xf[6], + _MM_FROUND_CUR_DIRECTION); + src1->yf[4] = _mm256_round_ps (src1->yf[2], _MM_FROUND_CUR_DIRECTION); + src1->xd[0] = _mm_round_sd (src1->xd[2], src2->xd[3], + _MM_FROUND_CUR_DIRECTION); + src1->yd[1] = _mm256_round_pd (src1->yd[3], _MM_FROUND_CUR_DIRECTION); + src1->yi[1] = _mm256_hadd_epi16 (tdst->yi[2], src2->yi[3]); src1->yi[2] = _mm256_hadd_epi32 (tdst->yi[0], src2->yi[1]); tdst->yi[3] = _mm256_hadds_epi16 (src1->yi[1], src2->yi[2]); @@ -98,7 +124,6 @@ void vex_test () src1->yi[1] = _mm256_cmpgt_epi64 (tdst->yi[3], src2->yi[0]); tdst->yf[2] = _mm256_dp_ps (src1->yf[0], src2->yf[1], 0xbf); - tdst->xd[3] = _mm_dp_pd (src1->xd[0], src2->xd[1], 0xbf); 
tdst->yi[3] = _mm256_mpsadbw_epu8 (src1->yi[1], src2->yi[1], 0xc1); @@ -112,6 +137,14 @@ void vex_test () tdst->yi[2] = _mm256_sign_epi8 (src1->yi[0], src2->yi[1]); tdst->yi[3] = _mm256_sign_epi16 (src1->yi[2], src2->yi[3]); tdst->yi[0] = _mm256_sign_epi32 (src1->yi[0], src2->yi[1]); + + tdst->a[2] = _mm_cmpestri (src1->xi[3], 16, src2->xi[4], 16, 0x0c); + tdst->xi[4] = _mm_cmpestrm (src1->xi[3], 16, src2->xi[4], 16, 0x20); + tdst->a[5] = _mm_cmpistri (src1->xi[5], src2->xi[6], 0x30); + tdst->xi[6] = _mm_cmpistrm (src1->xi[5], src2->xi[6], 0x40); + + tdst->xi[7] = _mm_aesimc_si128 (src1->xi[7]); + tdst->xi[0] = _mm_aeskeygenassist_si128 (src1->xi[1], 0x1b); } /* { dg-final { scan-assembler-not "v?pcmpeqq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ @@ -134,3 +167,15 @@ void vex_test () /* { dg-final { scan-assembler-not "v?psignb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "v?psignw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "v?psignd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phminposuw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?ptest\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundss\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundsd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?roundpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpestri\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpistri\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpestrm\[ 
\\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?pcmpistrm\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?aesimc\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?aeskeygenassist\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */

From patchwork Thu Aug 31 08:20:23 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 137248
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 12/13] [APX_EGPR] Handle legacy insns that only support GPR16 (4/5)
Date: Thu, 31 Aug 2023 16:20:23 +0800
Message-Id: <20230831082024.314097-13-hongyu.wang@intel.com>
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>
References: <20230831082024.314097-1-hongyu.wang@intel.com>
From: Hongyu Wang
Reply-To: Hongyu Wang
Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz

From: Kong Lingling

APX-enabled hardware should also be AVX10-enabled. Thus, for map2/3 insns
that have an EVEX counterpart, we assume automatic promotion to EGPR under
APX_F if the insn uses GPR32. For the insns below we therefore disable
EGPR usage for their SSE mnemonics, while allowing EGPR generation for
their v-prefixed mnemonics.

insn list:
1. pabsb/pabsw/pabsd
2. pextrb/pextrw/pextrd/pextrq
3. pinsrb/pinsrd/pinsrq
4. pshufb
5. extractps/insertps
6. pmaddubsw
7. pmulhrsw
8. packusdw
9. palignr
10. movntdqa
11. mpsadbw
12. pmuldq/pmulld
13. pmaxsb/pmaxsd, pminsb/pminsd
    pmaxud/pmaxuw, pminud/pminuw
14. (pmovsxbw/pmovsxbd/pmovsxbq, pmovsxwd/pmovsxwq, pmovsxdq
    pmovzxbw/pmovzxbd/pmovzxbq, pmovzxwd/pmovzxwq, pmovzxdq)
15. aesdec/aesdeclast, aesenc/aesenclast
16. pclmulqdq
17. gf2p8affineqb/gf2p8affineinvqb/gf2p8mulb

gcc/ChangeLog:

	* config/i386/i386.md (*movhi_internal): Split out non-gpr
	supported pextrw with mem constraint to avx/noavx alternatives,
	set Bt and attr gpr32 0 to the noavx alternative.
	(*mov_internal): Likewise.
	* config/i386/mmx.md (mmx_pshufbv8qi3): Change "r/m/Bm" to
	"h/Bt/BT" and set_attr gpr32 0 for noavx alternative.
	(mmx_pshufbv4qi3): Likewise.
	(*mmx_pinsrd): Likewise.
	(*mmx_pinsrb): Likewise.
	(*pinsrb): Likewise.
	(mmx_pshufbv8qi3): Likewise.
	(mmx_pshufbv4qi3): Likewise.
	(@sse4_1_insertps_): Likewise.
	(*mmx_pextrw): Split alternatives and map non-EGPR
	constraints, attr_gpr32 and attr_isa to noavx mnemonics.
	(*movv2qi_internal): Likewise.
	(*pextrw): Likewise.
	(*mmx_pextrb): Likewise.
	(*mmx_pextrb_zext): Likewise.
	(*pextrb): Likewise.
	(*pextrb_zext): Likewise.
	(vec_extractv2si_1): Likewise.
	(vec_extractv2si_1_zext): Likewise.
	* config/i386/sse.md (vi128_h_r): New mode attr for
	pinsr{bw}/pextr{bw} with reg operand.
	(*abs2): Split alternatives and %v in mnemonics, map
	non-EGPR constraints, gpr32 and isa attrs to noavx mnemonics.
	(*vec_extract): Likewise.
	(*vec_extract): Likewise for HFBF pattern.
	(*vec_extract_zext): Likewise.
	(*vec_extractv4si_1): Likewise.
	(*vec_extractv4si_zext): Likewise.
	(*vec_extractv2di_1): Likewise.
	(*vec_concatv2si_sse4_1): Likewise.
	(_pinsr): Likewise.
	(vec_concatv2di): Likewise.
	(*sse4_1_v2qiv2di2_1): Likewise.
	(ssse3_avx2>_pshufb3): Change "r/m/Bm" to "h/Bt/BT" and set_attr
	gpr32 0 for noavx alternative, split %v for avx/noavx
	alternatives if necessary.
	(*vec_concatv2sf_sse4_1): Likewise.
	(*sse4_1_extractps): Likewise.
	(vec_set_0): Likewise for VI4F_128.
	(*vec_setv4sf_sse4_1): Likewise.
	(@sse4_1_insertps): Likewise.
	(ssse3_pmaddubsw128): Likewise.
	(*_pmulhrsw3): Likewise.
	(_packusdw): Likewise.
	(_palignr): Likewise.
	(_movntdqa): Likewise.
	(_mpsadbw): Likewise.
	(*sse4_1_mulv2siv2di3): Likewise.
	(*_mul3): Likewise.
	(*sse4_1_3): Likewise.
	(*v8hi3): Likewise.
	(*v16qi3): Likewise.
	(*sse4_1_v8qiv8hi2_1): Likewise.
	(*sse4_1_zero_extendv8qiv8hi2_3): Likewise.
	(*sse4_1_zero_extendv8qiv8hi2_4): Likewise.
	(*sse4_1_v4qiv4si2_1): Likewise.
	(*sse4_1_v4hiv4si2_1): Likewise.
	(*sse4_1_zero_extendv4hiv4si2_3): Likewise.
	(*sse4_1_zero_extendv4hiv4si2_4): Likewise.
	(*sse4_1_v2hiv2di2_1): Likewise.
	(*sse4_1_v2siv2di2_1): Likewise.
	(*sse4_1_zero_extendv2siv2di2_3): Likewise.
	(*sse4_1_zero_extendv2siv2di2_4): Likewise.
	(aesdec): Likewise.
	(aesdeclast): Likewise.
	(aesenc): Likewise.
	(aesenclast): Likewise.
	(pclmulqdq): Likewise.
	(vgf2p8affineinvqb_): Likewise.
	(vgf2p8affineqb_): Likewise.
	(vgf2p8mulb_): Likewise.
---
 gcc/config/i386/i386.md |  50 ++++---
 gcc/config/i386/mmx.md  | 159 ++++++++++++--------
 gcc/config/i386/sse.md  | 315 ++++++++++++++++++++++++++--------------
 3 files changed, 339 insertions(+), 185 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4c305e72389..8ec249b268d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2868,9 +2868,9 @@ (define_peephole2
 
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand"
-    "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")
+    "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,Bt,m")
         (match_operand:HI 1 "general_operand"
-    "r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*v"))]
+    "r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*x,*v"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
    && ix86_hardreg_mov_ok (operands[0], operands[1])"
 {
@@ -2904,8 +2904,10 @@ (define_insn "*movhi_internal"
 
       if (SSE_REG_P (operands[0]))
         return "%vpinsrw\t{$0, %1, %d0|%d0, %1, 0}";
+      else if (!TARGET_AVX)
+        return "pextrw\t{$0, %1, %0|%0, %1, 0}";
       else
-        return "%vpextrw\t{$0, %1, %0|%0, %1, 0}";
+        return "vpextrw\t{$0, %1, %0|%0, %1, 0}";
 
     case TYPE_MSKLOG:
       if (operands[1] == const0_rtx)
@@ -2925,15 +2927,21 @@ (define_insn "*movhi_internal"
         (cond [(eq_attr "alternative" "9,10,11,12,13")
                   (const_string "sse2")
                (eq_attr "alternative" "14")
-                  (const_string "sse4")
+                  (const_string "sse4_noavx")
+               (eq_attr "alternative" "15")
+                  (const_string "avx")
                ]
                (const_string "*")))
+   (set (attr "gpr32")
+        (if_then_else (eq_attr "alternative" "14")
+                      (const_string "0")
+                      (const_string "1")))
    (set (attr "type")
      (cond [(eq_attr "alternative" "4,5,6,7")
               (const_string "mskmov")
             (eq_attr "alternative" "8")
               (const_string "msklog")
-            (eq_attr "alternative" "13,14")
+            (eq_attr "alternative" "13,14,15")
               (if_then_else (match_test "TARGET_AVX512FP16")
                 (const_string "ssemov")
                 (const_string "sselog1"))
@@ -2958,7 +2966,7 @@ (define_insn "*movhi_internal"
    (set (attr 
"prefix") (cond [(eq_attr "alternative" "4,5,6,7,8") (const_string "vex") - (eq_attr "alternative" "9,10,11,12,13,14") + (eq_attr "alternative" "9,10,11,12,13,14,15") (const_string "maybe_evex") ] (const_string "orig"))) @@ -2967,7 +2975,7 @@ (define_insn "*movhi_internal" (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "SI")) - (eq_attr "alternative" "13,14") + (eq_attr "alternative" "13,14,15") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "TI")) @@ -4320,9 +4328,9 @@ (define_mode_attr hfbfconstf (define_insn "*mov_internal" [(set (match_operand:HFBF 0 "nonimmediate_operand" - "=?r,?r,?r,?m,v,v,?r,m,?v,v") + "=?r,?r,?r,?m,v,v,?r,Bt,m,?v,v") (match_operand:HFBF 1 "general_operand" - "r ,F ,m ,r,C,v, v,v,r ,m"))] + "r ,F ,m ,r,C,v, v,v,v,r ,m"))] "!(MEM_P (operands[0]) && MEM_P (operands[1])) && (lra_in_progress || reload_completed @@ -4347,8 +4355,10 @@ (define_insn "*mov_internal" if (SSE_REG_P (operands[0])) return "%vpinsrw\t{$0, %1, %d0|%d0, %1, 0}"; + else if (!TARGET_AVX) + return "pextrw\t{$0, %1, %0|%0, %1, 0}"; else - return "%vpextrw\t{$0, %1, %0|%0, %1, 0}"; + return "vpextrw\t{$0, %1, %0|%0, %1, 0}"; default: if (get_attr_mode (insn) == MODE_SI) @@ -4358,18 +4368,24 @@ (define_insn "*mov_internal" } } [(set (attr "isa") - (cond [(eq_attr "alternative" "4,5,6,8,9") + (cond [(eq_attr "alternative" "4,5,6,9,10") (const_string "sse2") (eq_attr "alternative" "7") - (const_string "sse4") + (const_string "sse4_noavx") + (eq_attr "alternative" "8") + (const_string "avx") ] (const_string "*"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "8") + (const_string "0") + (const_string "1"))) (set (attr "type") (cond [(eq_attr "alternative" "4") (const_string "sselog1") - (eq_attr "alternative" "5,6,8") + (eq_attr "alternative" "5,6,9") (const_string "ssemov") - (eq_attr "alternative" "7,9") + (eq_attr "alternative" "7,8,10") (if_then_else (match_test ("TARGET_AVX512FP16")) 
(const_string "ssemov") @@ -4389,19 +4405,19 @@ (define_insn "*mov_internal" ] (const_string "imov"))) (set (attr "prefix") - (cond [(eq_attr "alternative" "4,5,6,7,8,9") + (cond [(eq_attr "alternative" "4,5,6,7,8,9,10") (const_string "maybe_vex") ] (const_string "orig"))) (set (attr "mode") (cond [(eq_attr "alternative" "4") (const_string "V4SF") - (eq_attr "alternative" "6,8") + (eq_attr "alternative" "6,9") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "SI")) - (eq_attr "alternative" "7,9") + (eq_attr "alternative" "7,8,10") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index ef578222945..63803c89f2b 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -418,9 +418,9 @@ (define_expand "movv2qi" (define_insn "*movv2qi_internal" [(set (match_operand:V2QI 0 "nonimmediate_operand" - "=r,r,r,m ,v,v,v,m,r,v") + "=r,r,r,m ,v,v,v,Bt,m,r,v") (match_operand:V2QI 1 "general_operand" - "r ,C,m,rC,C,v,m,v,v,r"))] + "r ,C,m,rC,C,v,m,x,v,v,r"))] "!(MEM_P (operands[0]) && MEM_P (operands[1]))" { switch (get_attr_type (insn)) @@ -442,8 +442,10 @@ (define_insn "*movv2qi_internal" if (SSE_REG_P (operands[0])) return "%vpinsrw\t{$0, %1, %d0|%d0, %1, 0}"; + else if (!TARGET_AVX) + return "pextrw\t{$0, %1, %0|%0, %1, 0}"; else - return "%vpextrw\t{$0, %1, %0|%0, %1, 0}"; + return "vpextrw\t{$0, %1, %0|%0, %1, 0}"; case TYPE_SSEMOV: return ix86_output_ssemov (insn, operands); @@ -453,20 +455,26 @@ (define_insn "*movv2qi_internal" } } [(set (attr "isa") - (cond [(eq_attr "alternative" "6,8,9") + (cond [(eq_attr "alternative" "6,9,10") (const_string "sse2") (eq_attr "alternative" "7") - (const_string "sse4") + (const_string "sse4_noavx") + (eq_attr "alternative" "8") + (const_string "avx") ] (const_string "*"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "7") + (const_string "0") + (const_string "1"))) (set (attr "type") - (cond [(eq_attr 
"alternative" "6,7") + (cond [(eq_attr "alternative" "6,7,8") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "ssemov") (const_string "sselog1")) (eq_attr "alternative" "4") (const_string "sselog1") - (eq_attr "alternative" "5,8,9") + (eq_attr "alternative" "5,9,10") (const_string "ssemov") (match_test "optimize_function_for_size_p (cfun)") (const_string "imov") @@ -483,16 +491,16 @@ (define_insn "*movv2qi_internal" ] (const_string "imov"))) (set (attr "prefix") - (cond [(eq_attr "alternative" "4,5,6,7,8,9") + (cond [(eq_attr "alternative" "4,5,6,7,8,9,10") (const_string "maybe_evex") ] (const_string "orig"))) (set (attr "mode") - (cond [(eq_attr "alternative" "6,7") + (cond [(eq_attr "alternative" "6,7,8") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "TI")) - (eq_attr "alternative" "8,9") + (eq_attr "alternative" "9,10") (if_then_else (match_test "TARGET_AVX512FP16") (const_string "HI") (const_string "SI")) @@ -526,9 +534,9 @@ (define_insn "*movv2qi_internal" ] (const_string "HI"))) (set (attr "preferred_for_speed") - (cond [(eq_attr "alternative" "8") + (cond [(eq_attr "alternative" "9") (symbol_ref "TARGET_INTER_UNIT_MOVES_FROM_VEC") - (eq_attr "alternative" "9") + (eq_attr "alternative" "10") (symbol_ref "TARGET_INTER_UNIT_MOVES_TO_VEC") ] (symbol_ref "true")))]) @@ -1167,7 +1175,7 @@ (define_expand "vcondv2sf" (define_insn "@sse4_1_insertps_" [(set (match_operand:V2FI 0 "register_operand" "=Yr,*x,v") (unspec:V2FI - [(match_operand:V2FI 2 "nonimmediate_operand" "Yrm,*xm,vm") + [(match_operand:V2FI 2 "nonimmediate_operand" "YrBt,*xBt,vm") (match_operand:V2FI 1 "register_operand" "0,0,v") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_INSERTPS))] @@ -1193,6 +1201,7 @@ (define_insn "@sse4_1_insertps_" } } [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sselog") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -3952,7 +3961,7 @@ (define_insn "*mmx_pinsrd" 
[(set (match_operand:V2SI 0 "register_operand" "=x,Yv") (vec_merge:V2SI (vec_duplicate:V2SI - (match_operand:SI 2 "nonimmediate_operand" "rm,rm")) + (match_operand:SI 2 "nonimmediate_operand" "hBt,rm")) (match_operand:V2SI 1 "register_operand" "0,Yv") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE @@ -3971,6 +3980,7 @@ (define_insn "*mmx_pinsrd" } } [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "type" "sselog") (set_attr "length_immediate" "1") @@ -4031,7 +4041,7 @@ (define_insn "*mmx_pinsrb" [(set (match_operand:V8QI 0 "register_operand" "=x,YW") (vec_merge:V8QI (vec_duplicate:V8QI - (match_operand:QI 2 "nonimmediate_operand" "rm,rm")) + (match_operand:QI 2 "nonimmediate_operand" "hBt,rm")) (match_operand:V8QI 1 "register_operand" "0,YW") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE @@ -4057,28 +4067,31 @@ (define_insn "*mmx_pinsrb" } [(set_attr "isa" "noavx,avx") (set_attr "type" "sselog") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) (define_insn "*mmx_pextrw" - [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,r,m") + [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,r,Bt,m") (vec_select:HI - (match_operand:V4HI 1 "register_operand" "y,YW,YW") + (match_operand:V4HI 1 "register_operand" "y,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)" "@ pextrw\t{%2, %1, %k0|%k0, %1, %2} %vpextrw\t{%2, %1, %k0|%k0, %1, %2} - %vpextrw\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "*,sse2,sse4") - (set_attr "mmx_isa" "native,*,*") - (set_attr "type" "mmxcvt,sselog1,sselog1") + pextrw\t{%2, %1, %0|%0, %1, %2} + vpextrw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,sse2,sse4_noavx,avx") + (set_attr "gpr32" "1,1,0,1") + (set_attr "mmx_isa" 
"native,*,*,*") + (set_attr "type" "mmxcvt,sselog1,sselog1,sselog1") (set_attr "length_immediate" "1") - (set_attr "prefix" "orig,maybe_vex,maybe_vex") - (set_attr "mode" "DI,TI,TI")]) + (set_attr "prefix" "orig,maybe_vex,maybe_vex,maybe_evex") + (set_attr "mode" "DI,TI,TI,TI")]) (define_insn "*mmx_pextrw_zext" [(set (match_operand:SWI48 0 "register_operand" "=r,r") @@ -4099,29 +4112,36 @@ (define_insn "*mmx_pextrw_zext" (set_attr "mode" "DI,TI")]) (define_insn "*mmx_pextrb" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r,m") + [(set (match_operand:QI 0 "nonimmediate_operand" "=h,Bt,r,m") (vec_select:QI - (match_operand:V8QI 1 "register_operand" "YW,YW") + (match_operand:V8QI 1 "register_operand" "YW,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_7_operand")])))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE" "@ - %vpextrb\t{%2, %1, %k0|%k0, %1, %2} - %vpextrb\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "sselog1") + pextrb\t{%2, %1, %k0|%k0, %1, %2} + pextrb\t{%2, %1, %0|%0, %1, %2} + vpextrb\t{%2, %1, %k0|%k0, %1, %2} + vpextrb\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "noavx,noavx,avx,avx") + (set_attr "gpr32" "1,0,1,1") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "*mmx_pextrb_zext" - [(set (match_operand:SWI248 0 "register_operand" "=r") + [(set (match_operand:SWI248 0 "register_operand" "=h,r") (zero_extend:SWI248 (vec_select:QI - (match_operand:V8QI 1 "register_operand" "YW") + (match_operand:V8QI 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to_7_operand")]))))] "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE" - "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + "@ + pextrb\t{%2, %1, %k0|%k0, %1, %2} + vpextrb\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ 
-4131,13 +4151,14 @@ (define_insn "mmx_pshufbv8qi3" [(set (match_operand:V8QI 0 "register_operand" "=x,Yw") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,Yw") - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm")] + (match_operand:V16QI 2 "vector_operand" "xBT,Ywm")] UNSPEC_PSHUFB))] "TARGET_SSSE3 && TARGET_MMX_WITH_SSE" "@ pshufb\t{%2, %0|%0, %2} vpshufb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -4148,13 +4169,14 @@ (define_insn "mmx_pshufbv4qi3" [(set (match_operand:V4QI 0 "register_operand" "=x,Yw") (unspec:V4QI [(match_operand:V4QI 1 "register_operand" "0,Yw") - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm")] + (match_operand:V16QI 2 "vector_operand" "xBT,Ywm")] UNSPEC_PSHUFB))] "TARGET_SSSE3" "@ pshufb\t{%2, %0|%0, %2} vpshufb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -4414,29 +4436,31 @@ (define_split ;; Avoid combining registers from different units in a single alternative, ;; see comment above inline_secondary_memory_needed function in i386.cc (define_insn "*vec_extractv2si_1" - [(set (match_operand:SI 0 "nonimmediate_operand" "=y,rm,x,x,y,x,r") + [(set (match_operand:SI 0 "nonimmediate_operand" "=y,hBt,rm,x,x,y,x,r") (vec_select:SI - (match_operand:V2SI 1 "nonimmediate_operand" " 0,x ,x,0,o,o,o") + (match_operand:V2SI 1 "nonimmediate_operand" " 0,x, x ,x,0,o,o,o") (parallel [(const_int 1)])))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ punpckhdq\t%0, %0 - %vpextrd\t{$1, %1, %0|%0, %1, 1} + pextrd\t{$1, %1, %0|%0, %1, 1} + vpextrd\t{$1, %1, %0|%0, %1, 1} %vpshufd\t{$0xe5, %1, %0|%0, %1, 0xe5} shufps\t{$0xe5, %0, %0|%0, %0, 0xe5} # # #" - [(set_attr "isa" "*,sse4,sse2,noavx,*,*,*") - (set_attr "mmx_isa" 
"native,*,*,*,native,*,*") - (set_attr "type" "mmxcvt,ssemov,sseshuf1,sseshuf1,mmxmov,ssemov,imov") + [(set_attr "isa" "*,sse4_noavx,avx,sse2,noavx,*,*,*") + (set_attr "gpr32" "1,0,1,1,1,1,1,1") + (set_attr "mmx_isa" "native,*,*,*,*,native,*,*") + (set_attr "type" "mmxcvt,ssemov,ssemov,sseshuf1,sseshuf1,mmxmov,ssemov,imov") (set (attr "length_immediate") - (if_then_else (eq_attr "alternative" "1,2,3") + (if_then_else (eq_attr "alternative" "1,2,3,4") (const_string "1") (const_string "*"))) - (set_attr "prefix" "orig,maybe_vex,maybe_vex,orig,orig,orig,orig") - (set_attr "mode" "DI,TI,TI,V4SF,SI,SI,SI")]) + (set_attr "prefix" "orig,orig,maybe_evex,maybe_vex,orig,orig,orig,orig") + (set_attr "mode" "DI,TI,TI,TI,V4SF,SI,SI,SI")]) (define_split [(set (match_operand:SI 0 "register_operand") @@ -4448,15 +4472,18 @@ (define_split "operands[1] = adjust_address (operands[1], SImode, 4);") (define_insn "*vec_extractv2si_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=h,r") (zero_extend:DI (vec_select:SI - (match_operand:V2SI 1 "register_operand" "x") + (match_operand:V2SI 1 "register_operand" "x,x") (parallel [(const_int 1)]))))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_64BIT && TARGET_SSE4_1" - "%vpextrd\t{$1, %1, %k0|%k0, %1, 1}" - [(set_attr "type" "sselog1") + "@ + pextrd\t{$1, %1, %k0|%k0, %1, 1} + vpextrd\t{$1, %1, %k0|%k0, %1, 1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -4606,7 +4633,7 @@ (define_insn "*pinsrb" [(set (match_operand:V4QI 0 "register_operand" "=x,YW") (vec_merge:V4QI (vec_duplicate:V4QI - (match_operand:QI 2 "nonimmediate_operand" "rm,rm")) + (match_operand:QI 2 "nonimmediate_operand" "hBt,rm")) (match_operand:V4QI 1 "register_operand" "0,YW") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 @@ -4631,6 +4658,7 @@ (define_insn "*pinsrb" } } [(set_attr "isa" 
"noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -4638,15 +4666,17 @@ (define_insn "*pinsrb" (set_attr "mode" "TI")]) (define_insn "*pextrw" - [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,m") + [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,Bt,m") (vec_select:HI - (match_operand:V2HI 1 "register_operand" "YW,YW") + (match_operand:V2HI 1 "register_operand" "YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_1_operand")])))] "TARGET_SSE2" "@ %vpextrw\t{%2, %1, %k0|%k0, %1, %2} - %vpextrw\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "*,sse4") + pextrw\t{%2, %1, %0|%0, %1, %2} + vpextrw\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,sse4_noavx,avx") + (set_attr "gpr32" "1,0,1") (set_attr "type" "sselog1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -4666,29 +4696,36 @@ (define_insn "*pextrw_zext" (set_attr "mode" "TI")]) (define_insn "*pextrb" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r,m") + [(set (match_operand:QI 0 "nonimmediate_operand" "=h,Bt,r,m") (vec_select:QI - (match_operand:V4QI 1 "register_operand" "YW,YW") + (match_operand:V4QI 1 "register_operand" "YW,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] "TARGET_SSE4_1" "@ - %vpextrb\t{%2, %1, %k0|%k0, %1, %2} - %vpextrb\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "sselog1") + pextrb\t{%2, %1, %k0|%k0, %1, %2} + pextrb\t{%2, %1, %0|%0, %1, %2} + vpextrb\t{%2, %1, %k0|%k0, %1, %2} + vpextrb\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "noavx,noavx,avx,avx") + (set_attr "gpr32" "1,0,1,1") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "*pextrb_zext" - [(set (match_operand:SWI248 0 "register_operand" "=r") + [(set (match_operand:SWI248 0 "register_operand" "=h,r") (zero_extend:SWI248 (vec_select:QI - (match_operand:V4QI 1 
"register_operand" "YW") + (match_operand:V4QI 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to_3_operand")]))))] "TARGET_SSE4_1" - "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + "@ + pextrb\t{%2, %1, %k0|%k0, %1, %2} + vpextrb\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 456713b991a..4913c34ed37 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -10840,7 +10840,7 @@ (define_insn "*vec_concatv2sf_sse4_1" (match_operand:SF 1 "nonimmediate_operand" " 0, 0,Yv, 0,0, v,m, 0 , m") (match_operand:SF 2 "nonimm_or_0_operand" - " Yr,*x,Yv, m,m, m,C,*ym, C")))] + " Yr,*x,Yv, Bt,Bt, m,C,*ym, C")))] "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ unpcklps\t{%2, %0|%0, %2} @@ -10872,6 +10872,10 @@ (define_insn "*vec_concatv2sf_sse4_1" (if_then_else (eq_attr "alternative" "7,8") (const_string "native") (const_string "*"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "3,4") + (const_string "0") + (const_string "1"))) (set (attr "prefix_data16") (if_then_else (eq_attr "alternative" "3,4") (const_string "1") @@ -10963,7 +10967,7 @@ (define_insn "vec_set_0" (vec_merge:VI4F_128 (vec_duplicate:VI4F_128 (match_operand: 2 "general_operand" - " Yr,*x,v,m,r ,m,x,v,?rm,?rm,?rm,!x,?re,!*fF")) + " Yr,*x,v,m,r ,m,x,v,?hBt,?hBt,?rm,!x,?re,!*fF")) (match_operand:VI4F_128 1 "nonimm_or_0_operand" " C , C,C,C,C ,C,0,v,0 ,0 ,x ,0 ,0 ,0") (const_int 1)))] @@ -11003,6 +11007,10 @@ (define_insn "vec_set_0" (const_string "fmov") ] (const_string "ssemov"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "8,9") + (const_string "0") + (const_string "1"))) (set (attr "prefix_extra") (if_then_else (eq_attr "alternative" "8,9,10") (const_string "1") @@ -11175,7 +11183,7 @@ (define_insn 
"*vec_setv4sf_sse4_1" [(set (match_operand:V4SF 0 "register_operand" "=Yr,*x,v") (vec_merge:V4SF (vec_duplicate:V4SF - (match_operand:SF 2 "nonimmediate_operand" "Yrm,*xm,vm")) + (match_operand:SF 2 "nonimmediate_operand" "YrBt,*xBt,vm")) (match_operand:V4SF 1 "register_operand" "0,0,v") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE4_1 @@ -11196,6 +11204,7 @@ (define_insn "*vec_setv4sf_sse4_1" } [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sselog") + (set_attr "gpr32" "0,0,1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -11270,7 +11279,7 @@ (define_insn_and_split "*vec_setv2di_0_zero_extendsi_1" (define_insn "@sse4_1_insertps_" [(set (match_operand:VI4F_128 0 "register_operand" "=Yr,*x,v") (unspec:VI4F_128 - [(match_operand:VI4F_128 2 "nonimmediate_operand" "Yrm,*xm,vm") + [(match_operand:VI4F_128 2 "nonimmediate_operand" "YrBt,*xBt,vm") (match_operand:VI4F_128 1 "register_operand" "0,0,v") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_INSERTPS))] @@ -11296,6 +11305,7 @@ (define_insn "@sse4_1_insertps_" } } [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sselog") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -11373,7 +11383,7 @@ (define_insn_and_split "*vec_extractv4sf_0" "operands[1] = gen_lowpart (SFmode, operands[1]);") (define_insn_and_split "*sse4_1_extractps" - [(set (match_operand:SF 0 "nonimmediate_operand" "=rm,rm,rm,Yv,Yv") + [(set (match_operand:SF 0 "nonimmediate_operand" "=hBt,hBt,rm,Yv,Yv") (vec_select:SF (match_operand:V4SF 1 "register_operand" "Yr,*x,v,0,v") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] @@ -11407,6 +11417,7 @@ (define_insn_and_split "*sse4_1_extractps" DONE; } [(set_attr "isa" "noavx,noavx,avx,noavx,avx") + (set_attr "gpr32" "0,0,1,1,1") (set_attr "type" "sselog,sselog,sselog,*,*") (set_attr "prefix_data16" "1,1,1,*,*") (set_attr "prefix_extra" "1,1,1,*,*") @@ -12271,9 +12282,9 
@@ (define_insn_and_split "*vec_extract_0" "operands[1] = gen_lowpart (mode, operands[1]);") (define_insn "*vec_extract" - [(set (match_operand:HFBF 0 "register_sse4nonimm_operand" "=?r,m,x,v") + [(set (match_operand:HFBF 0 "register_sse4nonimm_operand" "=?r,Bt,m,x,v") (vec_select:HFBF - (match_operand: 1 "register_operand" "v,v,0,v") + (match_operand: 1 "register_operand" "v,x,v,0,v") (parallel [(match_operand:SI 2 "const_0_to_7_operand")])))] "TARGET_SSE2" @@ -12283,12 +12294,14 @@ (define_insn "*vec_extract" case 0: return "%vpextrw\t{%2, %1, %k0|%k0, %1, %2}"; case 1: - return "%vpextrw\t{%2, %1, %0|%0, %1, %2}"; - + return "pextrw\t{%2, %1, %0|%0, %1, %2}"; case 2: + return "vpextrw\t{%2, %1, %0|%0, %1, %2}"; + + case 3: operands[2] = GEN_INT (INTVAL (operands[2]) * 2); return "psrldq\t{%2, %0|%0, %2}"; - case 3: + case 4: operands[2] = GEN_INT (INTVAL (operands[2]) * 2); return "vpsrldq\t{%2, %1, %0|%0, %1, %2}"; @@ -12296,8 +12309,9 @@ (define_insn "*vec_extract" gcc_unreachable (); } } - [(set_attr "isa" "*,sse4,noavx,avx") - (set_attr "type" "sselog1,sselog1,sseishft1,sseishft1") + [(set_attr "isa" "*,sse4_noavx,avx,noavx,avx") + (set_attr "gpr32" "1,0,1,1,1") + (set_attr "type" "sselog1,sselog1,sselog1,sseishft1,sseishft1") (set_attr "prefix" "maybe_evex") (set_attr "mode" "TI")]) @@ -15659,7 +15673,7 @@ (define_insn "*sse4_1_mulv2siv2di3" (parallel [(const_int 0) (const_int 2)]))) (sign_extend:V2DI (vec_select:V2SI - (match_operand:V4SI 2 "vector_operand" "YrBm,*xBm,vm") + (match_operand:V4SI 2 "vector_operand" "YrBT,*xBT,vm") (parallel [(const_int 0) (const_int 2)])))))] "TARGET_SSE4_1 && && !(MEM_P (operands[1]) && MEM_P (operands[2]))" @@ -15668,6 +15682,7 @@ (define_insn "*sse4_1_mulv2siv2di3" pmuldq\t{%2, %0|%0, %2} vpmuldq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sseimul") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") @@ -15905,7 +15920,7 @@ (define_insn 
"*_mul3" [(set (match_operand:VI4_AVX512F 0 "register_operand" "=Yr,*x,v") (mult:VI4_AVX512F (match_operand:VI4_AVX512F 1 "bcst_vector_operand" "%0,0,v") - (match_operand:VI4_AVX512F 2 "bcst_vector_operand" "YrBm,*xBm,vmBr")))] + (match_operand:VI4_AVX512F 2 "bcst_vector_operand" "YrBT,*xBT,vmBr")))] "TARGET_SSE4_1 && ix86_binary_operator_ok (MULT, mode, operands) && " "@ @@ -15913,6 +15928,7 @@ (define_insn "*_mul3" pmulld\t{%2, %0|%0, %2} vpmulld\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sseimul") (set_attr "prefix_extra" "1") (set_attr "prefix" "") @@ -16717,7 +16733,7 @@ (define_insn "*sse4_1_3" [(set (match_operand:VI14_128 0 "register_operand" "=Yr,*x,") (smaxmin:VI14_128 (match_operand:VI14_128 1 "vector_operand" "%0,0,") - (match_operand:VI14_128 2 "vector_operand" "YrBm,*xBm,m")))] + (match_operand:VI14_128 2 "vector_operand" "YrBT,*xBT,m")))] "TARGET_SSE4_1 && && !(MEM_P (operands[1]) && MEM_P (operands[2]))" @@ -16728,6 +16744,7 @@ (define_insn "*sse4_1_3" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sseiadd") (set_attr "prefix_extra" "1") + (set_attr "gpr32" "0,0,1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -16735,13 +16752,14 @@ (define_insn "*v8hi3" [(set (match_operand:V8HI 0 "register_operand" "=x,Yw") (smaxmin:V8HI (match_operand:V8HI 1 "vector_operand" "%0,Yw") - (match_operand:V8HI 2 "vector_operand" "xBm,Ywm")))] + (match_operand:V8HI 2 "vector_operand" "xBT,Ywm")))] "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pw\t{%2, %0|%0, %2} vpw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0,1") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) @@ -16809,6 +16827,7 @@ (define_insn "*sse4_1_3" vp\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0,0,1") (set_attr "prefix_extra" "1,1,*") (set_attr "prefix" 
"orig,orig,vex") (set_attr "mode" "TI")]) @@ -16817,12 +16836,13 @@ (define_insn "*v16qi3" [(set (match_operand:V16QI 0 "register_operand" "=x,Yw") (umaxmin:V16QI (match_operand:V16QI 1 "vector_operand" "%0,Yw") - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm")))] + (match_operand:V16QI 2 "vector_operand" "xBT,Ywm")))] "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pb\t{%2, %0|%0, %2} vpb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseiadd") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) @@ -18813,7 +18833,7 @@ (define_insn "_pinsr" [(set (match_operand:PINSR_MODE 0 "register_operand" "=x,x,x,x,v,v,&x") (vec_merge:PINSR_MODE (vec_duplicate:PINSR_MODE - (match_operand: 2 "nonimmediate_operand" "r,m,r,m,r,m,x")) + (match_operand: 2 "nonimmediate_operand" "h,Bt,r,m,r,m,x")) (match_operand:PINSR_MODE 1 "register_operand" "0,0,x,x,v,v,x") (match_operand:SI 3 "const_int_operand")))] "TARGET_SSE2 @@ -18850,6 +18870,7 @@ (define_insn "_pinsr" } [(set_attr "isa" "noavx,noavx,avx,avx,,,avx2") (set_attr "type" "sselog") + (set_attr "gpr32" "0,0,1,1,1,1,1") (set (attr "prefix_rex") (if_then_else (and (not (match_test "TARGET_AVX")) @@ -20010,17 +20031,23 @@ (define_insn_and_split "*vec_extract_0_mem" operands[4] = gen_lowpart (mode, operands[2]); }) +(define_mode_attr vi128_h_r + [(V16QI "h") (V8HI "r")]) + (define_insn "*vec_extract" - [(set (match_operand: 0 "register_sse4nonimm_operand" "=r,m") + [(set (match_operand: 0 "register_sse4nonimm_operand" "=,r,Bt,m") (vec_select: - (match_operand:PEXTR_MODE12 1 "register_operand" "YW,YW") + (match_operand:PEXTR_MODE12 1 "register_operand" "YW,YW,YW,YW") (parallel [(match_operand:SI 2 "const_0_to__operand")])))] "TARGET_SSE2" "@ - %vpextr\t{%2, %1, %k0|%k0, %1, %2} - %vpextr\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "*,sse4") + pextr\t{%2, %1, %k0|%k0, %1, %2} + vpextr\t{%2, %1, %k0|%k0, %1, %2} + pextr\t{%2, %1, %0|%0, %1, %2} + 
vpextr\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "sse2_noavx,avx,sse4_noavx,avx") + (set_attr "gpr32" "1,1,0,1") (set_attr "type" "sselog1") (set (attr "prefix_extra") (if_then_else @@ -20028,20 +20055,23 @@ (define_insn "*vec_extract" (const_string "*") (const_string "1"))) (set_attr "length_immediate" "1") - (set_attr "prefix" "maybe_vex,maybe_vex") + (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) (define_insn "*vec_extract_zext" - [(set (match_operand:SWI48 0 "register_operand" "=r") + [(set (match_operand:SWI48 0 "register_operand" "=,r") (zero_extend:SWI48 (vec_select: - (match_operand:PEXTR_MODE12 1 "register_operand" "YW") + (match_operand:PEXTR_MODE12 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to__operand")]))))] "TARGET_SSE2" - "%vpextr\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + "@ + pextr\t{%2, %1, %k0|%k0, %1, %2} + vpextr\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set (attr "prefix_extra") (if_then_else (eq (const_string "mode") (const_string "V8HImode")) @@ -20052,15 +20082,18 @@ (define_insn "*vec_extract_zext" (set_attr "mode" "TI")]) (define_insn "*vec_extractv16qi_zext" - [(set (match_operand:HI 0 "register_operand" "=r") + [(set (match_operand:HI 0 "register_operand" "=h,r") (zero_extend:HI (vec_select:QI - (match_operand:V16QI 1 "register_operand" "YW") + (match_operand:V16QI 1 "register_operand" "YW,YW") (parallel [(match_operand:SI 2 "const_0_to_15_operand")]))))] "TARGET_SSE4_1" - "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "type" "sselog1") + "@ + pextrb\t{%2, %1, %k0|%k0, %1, %2} + vpextrb\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "maybe_vex") @@ -20166,24 +20199,26 @@ (define_split "operands[1] = gen_lowpart (SImode, operands[1]);") (define_insn "*vec_extractv4si" - [(set (match_operand:SI 0 
"nonimmediate_operand" "=rm,rm,Yr,*x,Yw") + [(set (match_operand:SI 0 "nonimmediate_operand" "=hBt,rm,rm,Yr,*x,Yw") (vec_select:SI - (match_operand:V4SI 1 "register_operand" " x, v, 0, 0,Yw") + (match_operand:V4SI 1 "register_operand" "x, x, v, 0, 0, Yw") (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] "TARGET_SSE4_1" { switch (which_alternative) { case 0: + return "pextrd\t{%2, %1, %0|%0, %1, %2}"; case 1: - return "%vpextrd\t{%2, %1, %0|%0, %1, %2}"; - case 2: + return "vpextrd\t{%2, %1, %0|%0, %1, %2}"; + case 3: + case 4: operands[2] = GEN_INT (INTVAL (operands[2]) * 4); return "psrldq\t{%2, %0|%0, %2}"; - case 4: + case 5: operands[2] = GEN_INT (INTVAL (operands[2]) * 4); return "vpsrldq\t{%2, %1, %0|%0, %1, %2}"; @@ -20191,25 +20226,29 @@ (define_insn "*vec_extractv4si" gcc_unreachable (); } } - [(set_attr "isa" "*,avx512dq,noavx,noavx,avx") - (set_attr "type" "sselog1,sselog1,sseishft1,sseishft1,sseishft1") + [(set_attr "isa" "noavx,avx,avx512dq,noavx,noavx,avx") + (set_attr "type" "sselog1,sselog1,sselog1,sseishft1,sseishft1,sseishft1") + (set_attr "gpr32" "0,1,1,1,1,1") (set (attr "prefix_extra") (if_then_else (eq_attr "alternative" "0,1") (const_string "1") (const_string "*"))) (set_attr "length_immediate" "1") - (set_attr "prefix" "maybe_vex,evex,orig,orig,maybe_vex") + (set_attr "prefix" "orig,vex,evex,orig,orig,maybe_vex") (set_attr "mode" "TI")]) (define_insn "*vec_extractv4si_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=h,r,r") (zero_extend:DI (vec_select:SI - (match_operand:V4SI 1 "register_operand" "x,v") + (match_operand:V4SI 1 "register_operand" "x,x,v") (parallel [(match_operand:SI 2 "const_0_to_3_operand")]))))] "TARGET_64BIT && TARGET_SSE4_1" - "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "isa" "*,avx512dq") + "@ + pextrd\t{%2, %1, %k0|%k0, %1, %2} + vpextrd\t{%2, %1, %k0|%k0, %1, %2} + vpextrd\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "noavx,avx,avx512dq") 
(set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -20239,13 +20278,14 @@ (define_insn_and_split "*vec_extractv4si_zext_mem" }) (define_insn "*vec_extractv2di_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=rm,rm,m,x,x,Yv,x,v,r") + [(set (match_operand:DI 0 "nonimmediate_operand" "=hBt,rm,rm,m,x,x,Yv,x,v,r") (vec_select:DI - (match_operand:V2DI 1 "nonimmediate_operand" "x ,v ,v,0,x, v,x,o,o") + (match_operand:V2DI 1 "nonimmediate_operand" "x, x ,v ,v,0,x, v,x,o,o") (parallel [(const_int 1)])))] "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ - %vpextrq\t{$1, %1, %0|%0, %1, 1} + pextrq\t{$1, %1, %0|%0, %1, 1} + vpextrq\t{$1, %1, %0|%0, %1, 1} vpextrq\t{$1, %1, %0|%0, %1, 1} %vmovhps\t{%1, %0|%0, %1} psrldq\t{$8, %0|%0, 8} @@ -20256,44 +20296,47 @@ (define_insn "*vec_extractv2di_1" #" [(set (attr "isa") (cond [(eq_attr "alternative" "0") - (const_string "x64_sse4") + (const_string "x64_sse4_noavx") (eq_attr "alternative" "1") + (const_string "x64_avx") + (eq_attr "alternative" "2") (const_string "x64_avx512dq") - (eq_attr "alternative" "3") - (const_string "sse2_noavx") (eq_attr "alternative" "4") - (const_string "avx") + (const_string "sse2_noavx") (eq_attr "alternative" "5") - (const_string "avx512bw") + (const_string "avx") (eq_attr "alternative" "6") - (const_string "noavx") + (const_string "avx512bw") (eq_attr "alternative" "8") + (const_string "noavx") + (eq_attr "alternative" "9") (const_string "x64") ] (const_string "*"))) (set (attr "type") - (cond [(eq_attr "alternative" "2,6,7") + (cond [(eq_attr "alternative" "3,7,8") (const_string "ssemov") - (eq_attr "alternative" "3,4,5") + (eq_attr "alternative" "4,5,6") (const_string "sseishft1") - (eq_attr "alternative" "8") + (eq_attr "alternative" "9") (const_string "imov") ] (const_string "sselog1"))) + (set_attr "gpr32" "0,1,1,1,1,1,1,1,1,1") (set (attr "length_immediate") - (if_then_else (eq_attr "alternative" "0,1,3,4,5") + (if_then_else 
(eq_attr "alternative" "0,1,2,4,5,6") (const_string "1") (const_string "*"))) (set (attr "prefix_rex") - (if_then_else (eq_attr "alternative" "0,1") + (if_then_else (eq_attr "alternative" "0") (const_string "1") (const_string "*"))) (set (attr "prefix_extra") - (if_then_else (eq_attr "alternative" "0,1") + (if_then_else (eq_attr "alternative" "0") (const_string "1") (const_string "*"))) - (set_attr "prefix" "maybe_vex,evex,maybe_vex,orig,vex,evex,orig,*,*") - (set_attr "mode" "TI,TI,V2SF,TI,TI,TI,V4SF,DI,DI")]) + (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig,vex,evex,orig,*,*") + (set_attr "mode" "TI,TI,TI,V2SF,TI,TI,TI,V4SF,DI,DI")]) (define_split [(set (match_operand: 0 "register_operand") @@ -20411,7 +20454,7 @@ (define_insn "*vec_concatv2si_sse4_1" (match_operand:SI 1 "nonimmediate_operand" " 0, 0, x,Yv, 0, 0,Yv,rm, 0,rm") (match_operand:SI 2 "nonimm_or_0_operand" - " rm,rm,rm,rm,Yr,*x,Yv, C,*ym, C")))] + " hBt,hBt,rm,rm,Yr,*x,Yv, C,*ym, C")))] "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pinsrd\t{$1, %2, %0|%0, %2, 1} @@ -20438,6 +20481,10 @@ (define_insn "*vec_concatv2si_sse4_1" (const_string "mmxmov") ] (const_string "sselog"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "0,1") + (const_string "0") + (const_string "1"))) (set (attr "prefix_extra") (if_then_else (eq_attr "alternative" "0,1,2,3") (const_string "1") @@ -20562,7 +20609,7 @@ (define_insn "vec_concatv2di" (match_operand:DI 1 "register_operand" " 0, 0,x ,Yv,0,Yv,0,0,v") (match_operand:DI 2 "nonimmediate_operand" - " rm,rm,rm,rm,x,Yv,x,m,m")))] + " hm,hm,rm,rm,x,Yv,x,m,m")))] "TARGET_SSE" "@ pinsrq\t{$1, %2, %0|%0, %2, 1} @@ -20592,6 +20639,10 @@ (define_insn "vec_concatv2di" (eq_attr "alternative" "0,1,2,3,4,5") (const_string "sselog") (const_string "ssemov"))) + (set (attr "gpr32") + (if_then_else (eq_attr "alternative" "0,1") + (const_string "0") + (const_string "1"))) (set (attr "prefix_rex") (if_then_else (eq_attr "alternative" "0,1,2,3") 
(const_string "1") @@ -21525,7 +21576,7 @@ (define_insn "ssse3_pmaddubsw128" (const_int 12) (const_int 14)]))) (sign_extend:V8HI (vec_select:V8QI - (match_operand:V16QI 2 "vector_operand" "xBm,Ywm") + (match_operand:V16QI 2 "vector_operand" "xBT,Ywm") (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 8) (const_int 10) @@ -21548,6 +21599,7 @@ (define_insn "ssse3_pmaddubsw128" pmaddubsw\t{%2, %0|%0, %2} vpmaddubsw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseiadd") (set_attr "atom_unit" "simul") (set_attr "prefix_extra" "1") @@ -21666,7 +21718,7 @@ (define_insn "*_pmulhrsw3" (sign_extend: (match_operand:VI2_AVX2_AVX512BW 1 "vector_operand" "%0,")) (sign_extend: - (match_operand:VI2_AVX2_AVX512BW 2 "vector_operand" "xBm,m"))) + (match_operand:VI2_AVX2_AVX512BW 2 "vector_operand" "xBT,m"))) (const_int 14)) (match_operand:VI2_AVX2_AVX512BW 3 "const1_operand")) (const_int 1))))] @@ -21676,6 +21728,7 @@ (define_insn "*_pmulhrsw3" pmulhrsw\t{%2, %0|%0, %2} vpmulhrsw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseimul") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -21792,13 +21845,14 @@ (define_insn "_pshufb3" [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,") (unspec:VI1_AVX512 [(match_operand:VI1_AVX512 1 "register_operand" "0,") - (match_operand:VI1_AVX512 2 "vector_operand" "xBm,m")] + (match_operand:VI1_AVX512 2 "vector_operand" "xBT,m")] UNSPEC_PSHUFB))] "TARGET_SSSE3 && && " "@ pshufb\t{%2, %0|%0, %2} vpshufb\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") @@ -21915,7 +21969,7 @@ (define_insn "_palignr" [(set (match_operand:VIMAX_AVX2_AVX512BW 0 "register_operand" "=x,") (unspec:VIMAX_AVX2_AVX512BW [(match_operand:VIMAX_AVX2_AVX512BW 1 "register_operand" "0,") - 
(match_operand:VIMAX_AVX2_AVX512BW 2 "vector_operand" "xBm,m") + (match_operand:VIMAX_AVX2_AVX512BW 2 "vector_operand" "xBT,m") (match_operand:SI 3 "const_0_to_255_mul_8_operand")] UNSPEC_PALIGNR))] "TARGET_SSSE3" @@ -21933,6 +21987,7 @@ (define_insn "_palignr" } } [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "type" "sseishft") (set_attr "atom_unit" "sishuf") (set_attr "prefix_extra" "1") @@ -22007,6 +22062,7 @@ (define_insn_and_split "ssse3_palignrdi" } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "type" "sseishft") + (set_attr "gpr32" "0,0,1") (set_attr "atom_unit" "sishuf") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -22022,12 +22078,16 @@ (define_mode_iterator VI1248_AVX512VL_AVX512BW (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")]) (define_insn "*abs2" - [(set (match_operand:VI1248_AVX512VL_AVX512BW 0 "register_operand" "=") + [(set (match_operand:VI1248_AVX512VL_AVX512BW 0 "register_operand" "=x,") (abs:VI1248_AVX512VL_AVX512BW - (match_operand:VI1248_AVX512VL_AVX512BW 1 "vector_operand" "Bm")))] + (match_operand:VI1248_AVX512VL_AVX512BW 1 "vector_operand" "xBT,Bm")))] "TARGET_SSSE3" - "%vpabs\t{%1, %0|%0, %1}" - [(set_attr "type" "sselog1") + "@ + pabs\t{%1, %0|%0, %1} + vpabs\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) @@ -22365,11 +22425,15 @@ (define_mode_attr vi8_sse4_1_avx2_avx512 (define_insn "_movntdqa" [(set (match_operand:VI8_AVX2_AVX512F 0 "register_operand" "=Yr,*x,v") - (unspec:VI8_AVX2_AVX512F [(match_operand:VI8_AVX2_AVX512F 1 "memory_operand" "m,m,m")] + (unspec:VI8_AVX2_AVX512F [(match_operand:VI8_AVX2_AVX512F 1 "memory_operand" "Bt,Bt,m")] UNSPEC_MOVNTDQA))] "TARGET_SSE4_1" - "%vmovntdqa\t{%1, %0|%0, %1}" + "@ + movntdqa\t{%1, %0|%0, %1} + movntdqa\t{%1, %0|%0, %1} + vmovntdqa\t{%1, %0|%0, %1}" [(set_attr "isa" 
"noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -22388,6 +22452,7 @@ (define_insn "_mpsadbw" mpsadbw\t{%3, %2, %0|%0, %2, %3} vmpsadbw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sselog1") (set_attr "gpr32" "0") (set_attr "length_immediate" "1") @@ -22401,7 +22466,7 @@ (define_insn "_packusdw" [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand" "=Yr,*x,") (unspec:VI2_AVX2_AVX512BW [(match_operand: 1 "register_operand" "0,0,") - (match_operand: 2 "vector_operand" "YrBm,*xBm,m")] + (match_operand: 2 "vector_operand" "YrBT,*xBT,m")] UNSPEC_US_TRUNCATE))] "TARGET_SSE4_1 && && " "@ @@ -22409,6 +22474,7 @@ (define_insn "_packusdw" packusdw\t{%2, %0|%0, %2} vpackusdw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,") @@ -22755,10 +22821,14 @@ (define_insn "sse4_1_v8qiv8hi2" (define_insn "*sse4_1_v8qiv8hi2_1" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,Yw") (any_extend:V8HI - (match_operand:V8QI 1 "memory_operand" "m,m,m")))] + (match_operand:V8QI 1 "memory_operand" "Bt,Bt,m")))] "TARGET_SSE4_1 && && " - "%vpmovbw\t{%1, %0|%0, %1}" + "@ + pmovbw\t{%1, %0|%0, %1} + pmovbw\t{%1, %0|%0, %1} + vpmovbw\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -22788,7 +22858,7 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_3" [(set (match_operand:V16QI 0 "register_operand" "=Yr,*x,Yw") (vec_select:V16QI (vec_concat:V32QI - (match_operand:V16QI 1 "vector_operand" "YrBm,*xBm,Ywm") + (match_operand:V16QI 1 "vector_operand" "YrBT,*xBT,Ywm") (match_operand:V16QI 2 "const0_operand")) (match_parallel 3 "pmovzx_parallel" 
[(match_operand 4 "const_int_operand")])))] @@ -22813,7 +22883,8 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_3" DONE; } } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_4" [(set (match_operand:V16QI 0 "register_operand" "=Yr,*x,Yw") @@ -22821,7 +22892,7 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_4" (vec_concat:V32QI (subreg:V16QI (vec_concat:VI248_128 - (match_operand: 1 "vector_operand" "YrBm,*xBm,Ywm") + (match_operand: 1 "vector_operand" "YrBT,*xBT,Ywm") (match_operand: 2 "const0_operand")) 0) (match_operand:V16QI 3 "const0_operand")) (match_parallel 4 "pmovzx_parallel" @@ -22848,7 +22919,8 @@ (define_insn_and_split "*sse4_1_zero_extendv8qiv8hi2_4" } operands[1] = lowpart_subreg (V16QImode, operands[1], mode); } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_expand "v8qiv8hi2" [(set (match_operand:V8HI 0 "register_operand") @@ -22967,10 +23039,11 @@ (define_insn "sse4_1_v4qiv4si2" (define_insn "*sse4_1_v4qiv4si2_1" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (any_extend:V4SI - (match_operand:V4QI 1 "memory_operand" "m,m,m")))] + (match_operand:V4QI 1 "memory_operand" "Bt,Bt,m")))] "TARGET_SSE4_1 && " "%vpmovbd\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23139,10 +23212,14 @@ (define_insn "sse4_1_v4hiv4si2" (define_insn "*sse4_1_v4hiv4si2_1" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (any_extend:V4SI - (match_operand:V4HI 1 "memory_operand" "m,m,m")))] + (match_operand:V4HI 1 "memory_operand" "Bt,Bt,m")))] "TARGET_SSE4_1 && " - "%vpmovwd\t{%1, %0|%0, %1}" + "@ + pmovwd\t{%1, %0|%0, %1} + pmovwd\t{%1, %0|%0, %1} + vpmovwd\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr 
"gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23191,7 +23268,7 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_3" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") (vec_select:V8HI (vec_concat:V16HI - (match_operand:V8HI 1 "vector_operand" "YrBm,*xBm,vm") + (match_operand:V8HI 1 "vector_operand" "YrBT,*xBT,vm") (match_operand:V8HI 2 "const0_operand")) (match_parallel 3 "pmovzx_parallel" [(match_operand 4 "const_int_operand")])))] @@ -23214,7 +23291,8 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_3" DONE; } } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_4" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") @@ -23222,7 +23300,7 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_4" (vec_concat:V16HI (subreg:V8HI (vec_concat:VI148_128 - (match_operand: 1 "vector_operand" "YrBm,*xBm,vm") + (match_operand: 1 "vector_operand" "YrBT,*xBT,vm") (match_operand: 2 "const0_operand")) 0) (match_operand:V8HI 3 "const0_operand")) (match_parallel 4 "pmovzx_parallel" @@ -23247,7 +23325,8 @@ (define_insn_and_split "*sse4_1_zero_extendv4hiv4si2_4" } operands[1] = lowpart_subreg (V8HImode, operands[1], mode); } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn "avx512f_v8qiv8di2" [(set (match_operand:V8DI 0 "register_operand" "=v") @@ -23385,12 +23464,16 @@ (define_insn "sse4_1_v2qiv2di2" (set_attr "mode" "TI")]) (define_insn "*sse4_1_v2qiv2di2_1" - [(set (match_operand:V2DI 0 "register_operand" "=v") + [(set (match_operand:V2DI 0 "register_operand" "=x,v") (any_extend:V2DI - (match_operand:V2QI 1 "memory_operand" "m")))] + (match_operand:V2QI 1 "memory_operand" "Bt,m")))] "TARGET_SSE4_1 && " - "%vpmovbq\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") + "@ + pmovbq\t{%1, %0|%0, %1} + 
vpmovbq\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") + (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "maybe_evex") (set_attr "mode" "TI")]) @@ -23524,10 +23607,14 @@ (define_insn "sse4_1_v2hiv2di2" (define_insn "*sse4_1_v2hiv2di2_1" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,v") (any_extend:V2DI - (match_operand:V2HI 1 "memory_operand" "m,m,m")))] + (match_operand:V2HI 1 "memory_operand" "Bt,Bt,m")))] "TARGET_SSE4_1 && " - "%vpmovwq\t{%1, %0|%0, %1}" + "@ + pmovwq\t{%1, %0|%0, %1} + pmovwq\t{%1, %0|%0, %1} + vpmovwq\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23689,10 +23776,14 @@ (define_insn "sse4_1_v2siv2di2" (define_insn "*sse4_1_v2siv2di2_1" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,v") (any_extend:V2DI - (match_operand:V2SI 1 "memory_operand" "m,m,m")))] + (match_operand:V2SI 1 "memory_operand" "Bt,Bt,m")))] "TARGET_SSE4_1 && " - "%vpmovdq\t{%1, %0|%0, %1}" + "@ + pmovdq\t{%1, %0|%0, %1} + pmovdq\t{%1, %0|%0, %1} + vpmovdq\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1") (set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,maybe_evex") @@ -23719,7 +23810,7 @@ (define_insn_and_split "*sse4_1_zero_extendv2siv2di2_3" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (vec_select:V4SI (vec_concat:V8SI - (match_operand:V4SI 1 "vector_operand" "YrBm,*xBm,vm") + (match_operand:V4SI 1 "vector_operand" "YrBT,*xBT,vm") (match_operand:V4SI 2 "const0_operand")) (match_parallel 3 "pmovzx_parallel" [(match_operand 4 "const_int_operand")])))] @@ -23740,14 +23831,15 @@ (define_insn_and_split "*sse4_1_zero_extendv2siv2di2_3" DONE; } } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_insn_and_split 
"*sse4_1_zero_extendv2siv2di2_4" [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v") (vec_select:V4SI (vec_concat:V8SI (vec_concat:V4SI - (match_operand:V2SI 1 "vector_operand" "YrBm, *xBm, vm") + (match_operand:V2SI 1 "vector_operand" "YrBT, *xBT, vm") (match_operand:V2SI 2 "const0_operand")) (match_operand:V4SI 3 "const0_operand")) (match_parallel 4 "pmovzx_parallel" @@ -23769,7 +23861,8 @@ (define_insn_and_split "*sse4_1_zero_extendv2siv2di2_4" } operands[1] = lowpart_subreg (V4SImode, operands[1], V2SImode); } - [(set_attr "isa" "noavx,noavx,avx")]) + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "gpr32" "0,0,1")]) (define_expand "v2siv2di2" [(set (match_operand:V2DI 0 "register_operand") @@ -25960,7 +26053,7 @@ (define_insn "xop_vpermil23" (define_insn "aesenc" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xBT,xm,vm")] UNSPEC_AESENC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -25969,6 +26062,7 @@ (define_insn "aesenc" vaesenc\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") (set_attr "type" "sselog1") + (set_attr "gpr32" "0,1,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex,evex") (set_attr "btver2_decode" "double,double,double") @@ -25977,7 +26071,7 @@ (define_insn "aesenc" (define_insn "aesenclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xBT,xm,vm")] UNSPEC_AESENCLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -25986,6 +26080,7 @@ (define_insn "aesenclast" vaesenclast\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") (set_attr "type" "sselog1") + (set_attr "gpr32" "0,1,1") (set_attr "prefix_extra" "1") (set_attr "prefix" 
"orig,vex,evex") (set_attr "btver2_decode" "double,double,double") @@ -25994,7 +26089,7 @@ (define_insn "aesenclast" (define_insn "aesdec" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xBT,xm,vm")] UNSPEC_AESDEC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -26003,6 +26098,7 @@ (define_insn "aesdec" vaesdec\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") (set_attr "type" "sselog1") + (set_attr "gpr32" "0,1,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex,evex") (set_attr "btver2_decode" "double,double,double") @@ -26011,7 +26107,7 @@ (define_insn "aesdec" (define_insn "aesdeclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xBT,xm,vm")] UNSPEC_AESDECLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ @@ -26019,6 +26115,7 @@ (define_insn "aesdeclast" vaesdeclast\t{%2, %1, %0|%0, %1, %2} vaesdeclast\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,aes,avx512vl") + (set_attr "gpr32" "0,1,1") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex,evex") @@ -26054,7 +26151,7 @@ (define_insn "aeskeygenassist" (define_insn "pclmulqdq" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm") + (match_operand:V2DI 2 "vector_operand" "xBT,xm,vm") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_PCLMUL))] "TARGET_PCLMUL" @@ -26064,6 +26161,7 @@ (define_insn "pclmulqdq" vpclmulqdq\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,avx,vpclmulqdqvl") (set_attr "type" "sselog1") + (set_attr "gpr32" "0,1,1") (set_attr 
"prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex,evex") @@ -29395,7 +29493,7 @@ (define_insn "vgf2p8affineinvqb_" [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v") (unspec:VI1_AVX512F [(match_operand:VI1_AVX512F 1 "register_operand" "0,v") - (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm") + (match_operand:VI1_AVX512F 2 "vector_operand" "xBT,vm") (match_operand 3 "const_0_to_255_operand")] UNSPEC_GF2P8AFFINEINV))] "TARGET_GFNI" @@ -29403,6 +29501,7 @@ (define_insn "vgf2p8affineinvqb_" gf2p8affineinvqb\t{%3, %2, %0| %0, %2, %3} vgf2p8affineinvqb\t{%3, %2, %1, %0| %0, %1, %2, %3}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") (set_attr "mode" "")]) @@ -29411,7 +29510,7 @@ (define_insn "vgf2p8affineqb_" [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v") (unspec:VI1_AVX512F [(match_operand:VI1_AVX512F 1 "register_operand" "0,v") - (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm") + (match_operand:VI1_AVX512F 2 "vector_operand" "xBT,vm") (match_operand 3 "const_0_to_255_operand")] UNSPEC_GF2P8AFFINE))] "TARGET_GFNI" @@ -29419,6 +29518,7 @@ (define_insn "vgf2p8affineqb_" gf2p8affineqb\t{%3, %2, %0| %0, %2, %3} vgf2p8affineqb\t{%3, %2, %1, %0| %0, %1, %2, %3}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,maybe_evex") (set_attr "mode" "")]) @@ -29427,13 +29527,14 @@ (define_insn "vgf2p8mulb_" [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v") (unspec:VI1_AVX512F [(match_operand:VI1_AVX512F 1 "register_operand" "%0,v") - (match_operand:VI1_AVX512F 2 "vector_operand" "xBm,vm")] + (match_operand:VI1_AVX512F 2 "vector_operand" "xBT,vm")] UNSPEC_GF2P8MUL))] "TARGET_GFNI" "@ gf2p8mulb\t{%2, %0| %0, %2} vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "0,1") (set_attr "prefix_extra" "1") (set_attr "prefix" 
"orig,maybe_evex") (set_attr "mode" "")]) From patchwork Thu Aug 31 08:20:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 137249 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c792:0:b0:3f2:4152:657d with SMTP id b18csp98733vqu; Thu, 31 Aug 2023 01:29:39 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHo9JrKfxKMZ9FS2FU0+Sdyxuaa7K7jfqagHIadoF2e0vfpbHGDAJkI1tfQcr1NvS3Rt6kr X-Received: by 2002:a17:906:5da5:b0:9a5:b8c1:8bfa with SMTP id n5-20020a1709065da500b009a5b8c18bfamr3201242ejv.28.1693470579629; Thu, 31 Aug 2023 01:29:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693470579; cv=none; d=google.com; s=arc-20160816; b=N7Ii2Bz5ETqPnZ1XxD2WNZBf2FIwooH3LYNyCGF+LwUPksaMjVHkhFodr5wO++wnif JrFcEMf9HnCEr3JsMO4BJlQW0MxZR2c4nhm34HWcNC1WzPODBhpz22XDYfy57g0SY7Fr 3rgmEunkULTUsmM9FX7bopFMpo7xwbA6WgnctEiUYFLPlbAe9C2wZbuZhOhIud0kqRxx qbqRYpl8HLDqpIuBWOMEhQq+FFKKIw72lKUgxV9lzFcfcXKdSzoH0QBAwHI3mZ275157 AKBv++x4M+6YFGyVAx7h/2tNmSlGJ7AqJ6uFyI2Rl9QOBzc/uDvhL6aZWVSq9tCVOOKQ 2iSg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=ANnsCeW6wMUVwwPDimMr8sYHqUbUHJtJEMVtCySqAZY=; fh=t6VkRRFhh90/YyDrY4l675lM3BlOpES7S7srbNWOHSE=; b=fUrDTm2SlXMpbEGyVAjR3Gphv64oeDbtJErJjIsjirnCpazmYDPjFCyoL8c+1++7Ou KBfGD790oL7GX6mKxZhkk+M+45XC7f2EmkCuYLDfNgzhbge0oNsqZ8MDtkoICc2R6Kg0 LQ17t2yUK9mp5V+ZJuW9JTBEzaYx1xZuJZJGzkKzjT9z2VA3F2Fdob0zOqkaQndywJGr IQ2JpQgvWWipx1TX9fi732s4i0mcOAcH8DSoS4XANBLJThJ7q4wSClrucO49x0M9fhXg 19KpeuynRf/QX7nMfrhSNI189Pkqb3AkfVxJlc9I1qOTWn4J0n9wn+C4TxcOhJcRimsv CA0A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default 
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 13/13] [APX EGPR] Handle vex insns that only support GPR16 (5/5)
Date: Thu, 31 Aug 2023 16:20:24 +0800
Message-Id: <20230831082024.314097-14-hongyu.wang@intel.com>
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>
References: <20230831082024.314097-1-hongyu.wang@intel.com>
From: Hongyu Wang
Reply-To: Hongyu Wang
Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz
From: Kong Lingling

These VEX insns may have legacy counterparts that support EGPR, but
they have no EVEX counterparts. Split the VEX alternatives out of the
patterns and mark them as not supporting EGPR by adjusting the
constraints and attr_gpr32.

insn list:
1. vmovmskpd/vmovmskps
2. vpmovmskb
3. vrsqrtss/vrsqrtps
4. vrcpss/vrcpps
5. vhaddpd/vhaddps, vhsubpd/vhsubps
6. vldmxcsr/vstmxcsr
7. vaddsubpd/vaddsubps
8. vlddqu
9. vtestps/vtestpd
10. vmaskmovps/vmaskmovpd, vpmaskmovd/vpmaskmovq
11. vperm2f128/vperm2i128
12. vinserti128/vinsertf128
13. vbroadcasti128/vbroadcastf128
14. vcmppd/vcmpps, vcmpss/vcmpsd
15. vgatherdps/vgatherqps, vgatherdpd/vgatherqpd

gcc/ChangeLog:

	* config/i386/constraints.md (TV): New constraint for vsib memory
	that does not allow gpr32.
	* config/i386/i386.md (setcc__sse): Replace m with Bt for avx
	alternative and set attr_gpr32 to 0.
	(movmsk_df): Split avx/noavx alternatives and replace "r" with "h"
	for avx alternative.
	(_rcp2): Split avx/noavx alternatives and replace "m/Bm" with
	"Bt/BT" for avx alternative, set its gpr32 attr to 0.
	(*rsqrtsf2_sse): Likewise.
	* config/i386/mmx.md (mmx_pmovmskb): Split alternative 1 into
	avx/noavx and assign h/r constraint to dest.
	* config/i386/sse.md (_movmsk): Split avx/noavx alternatives and
	replace "r" with "h" for avx alternative.
	(*_movmsk_ext): Likewise.
	(*_movmsk_lt): Likewise.
	(*_movmsk_ext_lt): Likewise.
	(*_movmsk_shift): Likewise.
	(*_movmsk_ext_shift): Likewise.
	(_pmovmskb): Likewise.
	(*_pmovmskb_zext): Likewise.
	(*sse2_pmovmskb_ext): Likewise.
	(*_pmovmskb_lt): Likewise.
	(*_pmovmskb_zext_lt): Likewise.
	(*sse2_pmovmskb_ext_lt): Likewise.
	(_rcp2): Split avx/noavx alternatives and replace "m/Bm" with
	"Bt/BT" for avx alternative, set its attr_gpr32 to 0.
	(sse_vmrcpv4sf2): Likewise.
	(*sse_vmrcpv4sf2): Likewise.
	(rsqrt2): Likewise.
	(sse_vmrsqrtv4sf2): Likewise.
	(*sse_vmrsqrtv4sf2): Likewise.
	(avx_hv4df3): Likewise.
	(sse3_hsubv2df3): Likewise.
	(avx_hv8sf3): Likewise.
	(sse3_hv4sf3): Likewise.
	(_lddqu): Likewise.
	(avx_cmp3): Likewise.
	(avx_vmcmp3): Likewise.
	(*sse2_gt3): Likewise.
	(sse_ldmxcsr): Likewise.
	(sse_stmxcsr): Likewise.
	(avx_vtest): Replace m with Bt for avx alternative and set
	attr_gpr32 to 0.
	(avx2_permv2ti): Likewise.
	(*avx_vperm2f128_full): Likewise.
	(*avx_vperm2f128_nozero): Likewise.
	(vec_set_lo_v32qi): Likewise.
	(_maskload): Likewise.
	(_maskstore): Likewise.
	(avx_cmp3): Likewise.
	(avx_vmcmp3): Likewise.
	(*_maskcmp3_comm): Likewise.
	(*avx2_gathersi): Replace Tv with TV and set attr_gpr32 to 0.
	(*avx2_gathersi_2): Likewise.
	(*avx2_gatherdi): Likewise.
	(*avx2_gatherdi_2): Likewise.
	(*avx2_gatherdi_3): Likewise.
	(*avx2_gatherdi_4): Likewise.
	(avx_vbroadcastf128_): Restrict non-egpr alternative to
	noavx512vl, set its constraint to Bt and set attr_gpr32 to 0.
	(vec_set_lo_): Likewise.
	(vec_set_lo_): Likewise for SF/SI modes.
	(vec_set_hi_): Likewise.
	(vec_set_hi_): Likewise for SF/SI modes.
	(vec_set_hi_): Likewise.
	(vec_set_lo_): Likewise.
	(avx2_set_hi_v32qi): Likewise.
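
Note for readers: the conditions this patch relies on can be modelled in
plain C. The following is only an illustrative sketch, not GCC code. The
struct, the function names and the 0-31 register index (0-15 for the
legacy GPRs, 16-31 standing for r16-r31) are assumptions made for the
example; GCC's real hard register numbering differs. The three
conditions are taken from the hunks in this patch: the "h" register
constraint, the "TV" address constraint, and the new "avx_noavx512f"
isa value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target flags tested in the patch.
   These are plain booleans for the sketch, not GCC's real macros.  */
struct target_flags
{
  bool apx_egpr; /* models TARGET_APX_EGPR */
  bool avx;      /* models TARGET_AVX */
  bool avx512f;  /* models TARGET_AVX512F */
};

/* Illustrative register index, not GCC's hard register numbering:
   0-15 are the legacy GPRs, 16-31 stand for r16-r31 (EGPR).  */
static bool
is_egpr (int regno)
{
  return regno >= 16 && regno <= 31;
}

/* Assumption: r16-r31 are only usable when APX EGPR is enabled.  */
static bool
reg_available (const struct target_flags *t, int regno)
{
  if (regno < 0 || regno > 31)
    return false;
  return !is_egpr (regno) || t->apx_egpr;
}

/* Models the plain "r" constraint: any available general register.  */
bool
r_constraint_ok (const struct target_flags *t, int regno)
{
  return reg_available (t, regno);
}

/* Models "h": TARGET_APX_EGPR ? GENERAL_GPR16 : GENERAL_REGS.  */
bool
h_constraint_ok (const struct target_flags *t, int regno)
{
  if (t->apx_egpr)
    return reg_available (t, regno) && !is_egpr (regno); /* GENERAL_GPR16 */
  return reg_available (t, regno);                       /* GENERAL_REGS */
}

/* Models "TV": a VSIB address that mentions no rex2 register when
   APX EGPR is enabled.  The two boolean inputs stand in for
   vsib_address_operand and x86_extended_rex2reg_mentioned_p.  */
bool
tv_constraint_ok (const struct target_flags *t, bool is_vsib_address,
                  bool mentions_egpr)
{
  return is_vsib_address && !(t->apx_egpr && mentions_egpr);
}

/* Models the "enabled" test for isa "avx_noavx512f".  */
bool
isa_avx_noavx512f_enabled (const struct target_flags *t)
{
  return t->avx && !t->avx512f;
}

/* Small usage example: with APX EGPR on, "r" may pick r17 but "h"
   and "TV" never let an EGPR through.  */
void
egpr_sketch_selfcheck (void)
{
  struct target_flags apx = { true, true, false };
  assert (r_constraint_ok (&apx, 17));
  assert (!h_constraint_ok (&apx, 17));
  assert (!tv_constraint_ok (&apx, true, true));
}
```

Under these assumptions, an alternative that uses "h" or "TV" can never
end up with r16-r31, which is what the VEX-only encodings listed above
need, while the legacy alternatives keep "r" and the ordinary memory
constraints.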
--- gcc/config/i386/constraints.md | 7 + gcc/config/i386/i386.md | 52 +++-- gcc/config/i386/mmx.md | 11 +- gcc/config/i386/sse.md | 337 +++++++++++++++++++++------------ 4 files changed, 261 insertions(+), 146 deletions(-) diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index f487bf2e5a3..052b6a95841 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -374,6 +374,7 @@ (define_constraint "Z" ;; T prefix is used for different address constraints ;; v - VSIB address +;; V - VSIB address with no rex2 register ;; s - address with no segment register ;; i - address with no index and no rip ;; b - address with no base and no rip @@ -386,5 +387,11 @@ (define_address_constraint "Ts" "Address operand without segment register" (match_operand 0 "address_no_seg_operand")) +(define_address_constraint "TV" + "VSIB address operand" + (and (match_operand 0 "vsib_address_operand") + (not (and (match_test "TARGET_APX_EGPR") + (match_test "x86_extended_rex2reg_mentioned_p (op)"))))) + (define_register_constraint "h" "TARGET_APX_EGPR ? GENERAL_GPR16 : GENERAL_REGS") diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 8ec249b268d..d31c1910026 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -554,7 +554,8 @@ (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx, avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f, avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl, avx512vl,noavx512vl,avxvnni,avx512vnnivl,avx512fp16,avxifma, - avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl" + avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl, + avx_noavx512f,avx_noavx512vl" (const_string "base")) ;; The (bounding maximum) length of an instruction immediate. 
@@ -908,6 +909,8 @@ (define_attr "enabled" "" (eq_attr "isa" "sse4_noavx") (symbol_ref "TARGET_SSE4_1 && !TARGET_AVX") (eq_attr "isa" "avx") (symbol_ref "TARGET_AVX") + (eq_attr "isa" "avx_noavx512f") + (symbol_ref "TARGET_AVX && !TARGET_AVX512F") (eq_attr "isa" "noavx") (symbol_ref "!TARGET_AVX") (eq_attr "isa" "avx2") (symbol_ref "TARGET_AVX2") (eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2") @@ -16665,12 +16668,13 @@ (define_insn "setcc__sse" [(set (match_operand:MODEF 0 "register_operand" "=x,x") (match_operator:MODEF 3 "sse_comparison_operator" [(match_operand:MODEF 1 "register_operand" "0,x") - (match_operand:MODEF 2 "nonimmediate_operand" "xm,xm")]))] + (match_operand:MODEF 2 "nonimmediate_operand" "xm,xBt")]))] "SSE_FLOAT_MODE_P (mode)" "@ cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") @@ -20126,24 +20130,28 @@ (define_insn "*hf" (set_attr "mode" "HF")]) (define_insn "*rcpsf2_sse" - [(set (match_operand:SF 0 "register_operand" "=x,x,x") - (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")] + [(set (match_operand:SF 0 "register_operand" "=x,x,x,x") + (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m,BT")] UNSPEC_RCP))] "TARGET_SSE && TARGET_SSE_MATH" "@ %vrcpss\t{%d1, %0|%0, %d1} %vrcpss\t{%d1, %0|%0, %d1} - %vrcpss\t{%1, %d0|%d0, %1}" - [(set_attr "type" "sse") + rcpss\t{%1, %d0|%d0, %1} + vrcpss\t{%1, %d0|%d0, %1}" + [(set_attr "isa" "*,*,noavx,avx") + (set_attr "gpr32" "1,1,1,0") + (set_attr "type" "sse") + (set_attr "gpr32" "0") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "maybe_vex") (set_attr "mode" "SF") - (set_attr "avx_partial_xmm_update" "false,false,true") + (set_attr "avx_partial_xmm_update" "false,false,true,true") (set (attr "preferred_for_speed") (cond [(match_test "TARGET_AVX") (symbol_ref "true") - (eq_attr 
"alternative" "1,2") + (eq_attr "alternative" "1,2,3") (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY") ] (symbol_ref "true")))]) @@ -20386,24 +20394,27 @@ (define_insn "sqrtxf2" (set_attr "bdver1_decode" "direct")]) (define_insn "*rsqrtsf2_sse" - [(set (match_operand:SF 0 "register_operand" "=x,x,x") - (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")] + [(set (match_operand:SF 0 "register_operand" "=x,x,x,x") + (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m,BT")] UNSPEC_RSQRT))] "TARGET_SSE && TARGET_SSE_MATH" "@ %vrsqrtss\t{%d1, %0|%0, %d1} %vrsqrtss\t{%d1, %0|%0, %d1} - %vrsqrtss\t{%1, %d0|%d0, %1}" - [(set_attr "type" "sse") + rsqrtss\t{%1, %d0|%d0, %1} + vrsqrtss\t{%1, %d0|%d0, %1}" + [(set_attr "isa" "*,*,noavx,avx") + (set_attr "gpr32" "1,1,1,0") + (set_attr "type" "sse") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "maybe_vex") (set_attr "mode" "SF") - (set_attr "avx_partial_xmm_update" "false,false,true") + (set_attr "avx_partial_xmm_update" "false,false,true,true") (set (attr "preferred_for_speed") (cond [(match_test "TARGET_AVX") (symbol_ref "true") - (eq_attr "alternative" "1,2") + (eq_attr "alternative" "1,2,3") (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY") ] (symbol_ref "true")))]) @@ -22107,14 +22118,17 @@ (define_expand "signbitxf2" }) (define_insn "movmsk_df" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI - [(match_operand:DF 1 "register_operand" "x")] + [(match_operand:DF 1 "register_operand" "x,x")] UNSPEC_MOVMSK))] "SSE_FLOAT_MODE_P (DFmode) && TARGET_SSE_MATH" - "%vmovmskpd\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") - (set_attr "prefix" "maybe_vex") + "@ + movmskpd\t{%1, %0|%0, %1} + vmovmskpd\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_evex") (set_attr "mode" "DF")]) ;; Use movmskpd in SSE mode to avoid store forwarding stall diff 
--git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 63803c89f2b..9dcb165d270 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -5182,13 +5182,14 @@ (define_expand "usadv8qi" }) (define_insn_and_split "mmx_pmovmskb" - [(set (match_operand:SI 0 "register_operand" "=r,r") - (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x")] + [(set (match_operand:SI 0 "register_operand" "=r,r,h") + (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x,x")] UNSPEC_MOVMSK))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)" "@ pmovmskb\t{%1, %0|%0, %1} + # #" "TARGET_SSE2 && reload_completed && SSE_REGNO_P (REGNO (operands[1]))" @@ -5203,9 +5204,9 @@ (define_insn_and_split "mmx_pmovmskb" operands[2] = lowpart_subreg (QImode, operands[0], GET_MODE (operands[0])); } - [(set_attr "mmx_isa" "native,sse") - (set_attr "type" "mmxcvt,ssemov") - (set_attr "mode" "DI,TI")]) + [(set_attr "mmx_isa" "native,sse_noavx,avx") + (set_attr "type" "mmxcvt,ssemov,ssemov") + (set_attr "mode" "DI,TI,TI")]) (define_expand "mmx_maskmovq" [(set (match_operand:V8QI 0 "memory_operand") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 4913c34ed37..4b6bed36061 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -1845,12 +1845,16 @@ (define_peephole2 "operands[4] = adjust_address (operands[0], V2DFmode, 0);") (define_insn "_lddqu" - [(set (match_operand:VI1 0 "register_operand" "=x") - (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")] + [(set (match_operand:VI1 0 "register_operand" "=x,x") + (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m,Bt")] UNSPEC_LDDQU))] "TARGET_SSE3" - "%vlddqu\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") + "@ + lddqu\t{%1, %0|%0, %1} + vlddqu\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "gpr32" "1,0") (set_attr "movu" "1") (set (attr "prefix_data16") (if_then_else @@ -2519,12 +2523,16 @@ (define_insn "_div3" (set_attr "mode" "")]) 
(define_insn "_rcp2" - [(set (match_operand:VF1_128_256 0 "register_operand" "=x") + [(set (match_operand:VF1_128_256 0 "register_operand" "=x,x") (unspec:VF1_128_256 - [(match_operand:VF1_128_256 1 "vector_operand" "xBm")] UNSPEC_RCP))] + [(match_operand:VF1_128_256 1 "vector_operand" "xBm,xBT")] UNSPEC_RCP))] "TARGET_SSE" - "%vrcpps\t{%1, %0|%0, %1}" - [(set_attr "type" "sse") + "@ + rcpps\t{%1, %0|%0, %1} + vrcpps\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "maybe_vex") @@ -2543,6 +2551,7 @@ (define_insn "sse_vmrcpv4sf2" vrcpss\t{%1, %2, %0|%0, %2, %k1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "orig,vex") @@ -2562,6 +2571,7 @@ (define_insn "*sse_vmrcpv4sf2" vrcpss\t{%1, %2, %0|%0, %2, %1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "atom_sse_attr" "rcp") (set_attr "btver2_sse_attr" "rcp") (set_attr "prefix" "orig,vex") @@ -2738,12 +2748,16 @@ (define_expand "rsqrt2" "TARGET_AVX512FP16") (define_insn "_rsqrt2" - [(set (match_operand:VF1_128_256 0 "register_operand" "=x") + [(set (match_operand:VF1_128_256 0 "register_operand" "=x,x") (unspec:VF1_128_256 - [(match_operand:VF1_128_256 1 "vector_operand" "xBm")] UNSPEC_RSQRT))] + [(match_operand:VF1_128_256 1 "vector_operand" "xBm,xBT")] UNSPEC_RSQRT))] "TARGET_SSE" - "%vrsqrtps\t{%1, %0|%0, %1}" - [(set_attr "type" "sse") + "@ + rsqrtps\t{%1, %0|%0, %1} + vrsqrtps\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) @@ -2802,7 +2816,7 @@ (define_insn "rsqrt14__mask" (define_insn "sse_vmrsqrtv4sf2" [(set (match_operand:V4SF 0 "register_operand" "=x,x") (vec_merge:V4SF - (unspec:V4SF 
[(match_operand:V4SF 1 "nonimmediate_operand" "xm,xm")] + (unspec:V4SF [(match_operand:V4SF 1 "nonimmediate_operand" "xm,xBt")] UNSPEC_RSQRT) (match_operand:V4SF 2 "register_operand" "0,x") (const_int 1)))] @@ -2812,6 +2826,7 @@ (define_insn "sse_vmrsqrtv4sf2" vrsqrtss\t{%1, %2, %0|%0, %2, %k1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "prefix" "orig,vex") (set_attr "mode" "SF")]) @@ -2819,7 +2834,7 @@ (define_insn "*sse_vmrsqrtv4sf2" [(set (match_operand:V4SF 0 "register_operand" "=x,x") (vec_merge:V4SF (vec_duplicate:V4SF - (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xm")] + (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xBt")] UNSPEC_RSQRT)) (match_operand:V4SF 2 "register_operand" "0,x") (const_int 1)))] @@ -2829,6 +2844,7 @@ (define_insn "*sse_vmrsqrtv4sf2" vrsqrtss\t{%1, %2, %0|%0, %2, %1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "prefix" "orig,vex") (set_attr "mode" "SF")]) @@ -3004,7 +3020,7 @@ (define_insn "vec_addsub3" (vec_merge:VF_128_256 (minus:VF_128_256 (match_operand:VF_128_256 1 "register_operand" "0,x") - (match_operand:VF_128_256 2 "vector_operand" "xBm, xm")) + (match_operand:VF_128_256 2 "vector_operand" "xBm, xBt")) (plus:VF_128_256 (match_dup 1) (match_dup 2)) (const_int )))] "TARGET_SSE3" @@ -3013,6 +3029,7 @@ (define_insn "vec_addsub3" vaddsub\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") + (set_attr "gpr32" "1,0") (set (attr "atom_unit") (if_then_else (match_test "mode == V2DFmode") @@ -3156,7 +3173,7 @@ (define_insn "avx_hv4df3" (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))) (plusminus:DF (vec_select:DF - (match_operand:V4DF 2 "nonimmediate_operand" "xm") + (match_operand:V4DF 2 "nonimmediate_operand" "xBt") (parallel [(const_int 0)])) (vec_select:DF (match_dup 2) (parallel [(const_int 1)])))) (vec_concat:V2DF @@ -3169,6 +3186,7 @@ (define_insn "avx_hv4df3" 
"TARGET_AVX" "vhpd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseadd") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "V4DF")]) @@ -3199,7 +3217,7 @@ (define_insn "*sse3_haddv2df3" (parallel [(match_operand:SI 4 "const_0_to_1_operand")]))) (plus:DF (vec_select:DF - (match_operand:V2DF 2 "vector_operand" "xBm,xm") + (match_operand:V2DF 2 "vector_operand" "xBm,xBt") (parallel [(match_operand:SI 5 "const_0_to_1_operand")])) (vec_select:DF (match_dup 2) @@ -3211,6 +3229,7 @@ (define_insn "*sse3_haddv2df3" haddpd\t{%2, %0|%0, %2} vhaddpd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "sseadd") (set_attr "prefix" "orig,vex") (set_attr "mode" "V2DF")]) @@ -3225,7 +3244,7 @@ (define_insn "sse3_hsubv2df3" (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))) (minus:DF (vec_select:DF - (match_operand:V2DF 2 "vector_operand" "xBm,xm") + (match_operand:V2DF 2 "vector_operand" "xBm,xBt") (parallel [(const_int 0)])) (vec_select:DF (match_dup 2) (parallel [(const_int 1)])))))] "TARGET_SSE3" @@ -3234,6 +3253,7 @@ (define_insn "sse3_hsubv2df3" vhsubpd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") + (set_attr "gpr32" "1,0") (set_attr "prefix" "orig,vex") (set_attr "mode" "V2DF")]) @@ -3290,7 +3310,7 @@ (define_insn "avx_hv8sf3" (vec_concat:V2SF (plusminus:SF (vec_select:SF - (match_operand:V8SF 2 "nonimmediate_operand" "xm") + (match_operand:V8SF 2 "nonimmediate_operand" "xBt") (parallel [(const_int 0)])) (vec_select:SF (match_dup 2) (parallel [(const_int 1)]))) (plusminus:SF @@ -3314,6 +3334,7 @@ (define_insn "avx_hv8sf3" "TARGET_AVX" "vhps\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseadd") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "V8SF")]) @@ -3332,7 +3353,7 @@ (define_insn "sse3_hv4sf3" (vec_concat:V2SF (plusminus:SF (vec_select:SF - (match_operand:V4SF 2 "vector_operand" "xBm,xm") + (match_operand:V4SF 2 "vector_operand" "xBm,xBt") 
(parallel [(const_int 0)])) (vec_select:SF (match_dup 2) (parallel [(const_int 1)]))) (plusminus:SF @@ -3344,6 +3365,7 @@ (define_insn "sse3_hv4sf3" vhps\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") + (set_attr "gpr32" "1,0") (set_attr "atom_unit" "complex") (set_attr "prefix" "orig,vex") (set_attr "prefix_rep" "1,*") @@ -3537,12 +3559,13 @@ (define_insn "avx_cmp3" [(set (match_operand:VF_128_256 0 "register_operand" "=x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "x") - (match_operand:VF_128_256 2 "nonimmediate_operand" "xm") + (match_operand:VF_128_256 2 "nonimmediate_operand" "xBt") (match_operand:SI 3 "const_0_to_31_operand")] UNSPEC_PCMP))] "TARGET_AVX" "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -3748,7 +3771,7 @@ (define_insn "avx_vmcmp3" (vec_merge:VF_128 (unspec:VF_128 [(match_operand:VF_128 1 "register_operand" "x") - (match_operand:VF_128 2 "nonimmediate_operand" "xm") + (match_operand:VF_128 2 "nonimmediate_operand" "xBt") (match_operand:SI 3 "const_0_to_31_operand")] UNSPEC_PCMP) (match_dup 1) @@ -3756,6 +3779,7 @@ (define_insn "avx_vmcmp3" "TARGET_AVX" "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -3764,13 +3788,14 @@ (define_insn "*_maskcmp3_comm" [(set (match_operand:VF_128_256 0 "register_operand" "=x,x") (match_operator:VF_128_256 3 "sse_comparison_operator" [(match_operand:VF_128_256 1 "register_operand" "%0,x") - (match_operand:VF_128_256 2 "vector_operand" "xBm,xm")]))] + (match_operand:VF_128_256 2 "vector_operand" "xBm,xBt")]))] "TARGET_SSE && GET_RTX_CLASS (GET_CODE (operands[3])) == RTX_COMM_COMPARE" "@ cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") 
(set_attr "type" "ssecmp") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") @@ -3780,12 +3805,13 @@ (define_insn "_maskcmp3" [(set (match_operand:VF_128_256 0 "register_operand" "=x,x") (match_operator:VF_128_256 3 "sse_comparison_operator" [(match_operand:VF_128_256 1 "register_operand" "0,x") - (match_operand:VF_128_256 2 "vector_operand" "xBm,xm")]))] + (match_operand:VF_128_256 2 "vector_operand" "xBm,xBt")]))] "TARGET_SSE" "@ cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,vex") @@ -3796,7 +3822,7 @@ (define_insn "_vmmaskcmp3" (vec_merge:VF_128 (match_operator:VF_128 3 "sse_comparison_operator" [(match_operand:VF_128 1 "register_operand" "0,x") - (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm")]) + (match_operand:VF_128 2 "nonimmediate_operand" "xm,xBt")]) (match_dup 1) (const_int 1)))] "TARGET_SSE" @@ -3804,6 +3830,7 @@ (define_insn "_vmmaskcmp3" cmp%D3\t{%2, %0|%0, %2} vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "length_immediate" "1,*") (set_attr "prefix" "orig,vex") @@ -4721,7 +4748,7 @@ (define_insn "_andnot3" (and:VFB_128_256 (not:VFB_128_256 (match_operand:VFB_128_256 1 "register_operand" "0,x,v,v")) - (match_operand:VFB_128_256 2 "vector_operand" "xBm,xm,vm,vm")))] + (match_operand:VFB_128_256 2 "vector_operand" "xBm,xBt,vm,vm")))] "TARGET_SSE && && (! 
|| mode != HFmode)" { @@ -4765,7 +4792,8 @@ (define_insn "_andnot3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx512dq,avx512f") + [(set_attr "isa" "noavx,avx_noavx512f,avx512dq,avx512f") + (set_attr "gpr32" "1,0,1,1") (set_attr "type" "sselog") (set_attr "prefix" "orig,maybe_vex,evex,evex") (set (attr "mode") @@ -5075,7 +5103,7 @@ (define_insn "*andnot3" [(set (match_operand:ANDNOT_MODE 0 "register_operand" "=x,x,v,v") (and:ANDNOT_MODE (not:ANDNOT_MODE (match_operand:ANDNOT_MODE 1 "register_operand" "0,x,v,v")) - (match_operand:ANDNOT_MODE 2 "vector_operand" "xBm,xm,vm,v")))] + (match_operand:ANDNOT_MODE 2 "vector_operand" "xBm,xBt,vm,v")))] "TARGET_SSE" { char buf[128]; @@ -5104,7 +5132,8 @@ (define_insn "*andnot3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx512vl,avx512f") + [(set_attr "isa" "noavx,avx_noavx512f,avx512vl,avx512f") + (set_attr "gpr32" "1,0,1,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -12246,7 +12275,7 @@ (define_insn_and_split "vec_extract_lo_v32qi" "operands[1] = gen_lowpart (V16QImode, operands[1]);") (define_insn "vec_extract_hi_v32qi" - [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm") + [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xBt,vm") (vec_select:V16QI (match_operand:V32QI 1 "register_operand" "x,v") (parallel [(const_int 16) (const_int 17) @@ -12264,7 +12293,8 @@ (define_insn "vec_extract_hi_v32qi" [(set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") - (set_attr "isa" "*,avx512vl") + (set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") (set_attr "prefix" "vex,evex") (set_attr "mode" "OI")]) @@ -17135,6 +17165,7 @@ (define_insn "*sse2_gt3" pcmpgt\t{%2, %0|%0, %2} vpcmpgt\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") + (set_attr "gpr32" "1,0") (set_attr "type" "ssecmp") (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) @@ -17451,7 +17482,7 @@ 
(define_insn "*andnot3" [(set (match_operand:VI 0 "register_operand" "=x,x,v,v,v") (and:VI (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br")) - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,0,0")))] + (match_operand:VI 2 "bcst_vector_operand" "xBm,xBt,vmBr,0,0")))] "TARGET_SSE && (register_operand (operands[1], mode) || register_operand (operands[2], mode))" @@ -17538,7 +17569,8 @@ (define_insn "*andnot3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx,*,*") + [(set_attr "isa" "noavx,avx_noavx512f,avx512f,*,*") + (set_attr "gpr32" "1,0,1,1,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -17693,7 +17725,7 @@ (define_insn "*3" [(set (match_operand:VI48_AVX_AVX512F 0 "register_operand" "=x,x,v") (any_logic:VI48_AVX_AVX512F (match_operand:VI48_AVX_AVX512F 1 "bcst_vector_operand" "%0,x,v") - (match_operand:VI48_AVX_AVX512F 2 "bcst_vector_operand" "xBm,xm,vmBr")))] + (match_operand:VI48_AVX_AVX512F 2 "bcst_vector_operand" "xBm,xBt,vmBr")))] "TARGET_SSE && && ix86_binary_operator_ok (, mode, operands)" { @@ -17723,9 +17755,11 @@ (define_insn "*3" case E_V4DImode: case E_V4SImode: case E_V2DImode: - ssesuffix = (TARGET_AVX512VL - && ( || which_alternative == 2) - ? "" : ""); + ssesuffix = ((TARGET_AVX512VL + && ( || which_alternative == 2)) + || (MEM_P (operands[2]) && which_alternative == 2 + && x86_extended_rex2reg_mentioned_p (operands[2]))) + ? 
"" : ""; break; default: gcc_unreachable (); @@ -17765,7 +17799,8 @@ (define_insn "*3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx") + [(set_attr "isa" "noavx,avx_noavx512f,avx512f") + (set_attr "gpr32" "1,0,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -17792,7 +17827,7 @@ (define_insn "*3" [(set (match_operand:VI12_AVX_AVX512F 0 "register_operand" "=x,x,v") (any_logic:VI12_AVX_AVX512F (match_operand:VI12_AVX_AVX512F 1 "vector_operand" "%0,x,v") - (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,xm,vm")))] + (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,xBt,vm")))] "TARGET_SSE && !(MEM_P (operands[1]) && MEM_P (operands[2]))" { char buf[64]; @@ -17821,7 +17856,10 @@ (define_insn "*3" case E_V16HImode: case E_V16QImode: case E_V8HImode: - ssesuffix = TARGET_AVX512VL && which_alternative == 2 ? "q" : ""; + ssesuffix = (((TARGET_AVX512VL && which_alternative == 2) + || (MEM_P (operands[2]) && which_alternative == 2 + && x86_extended_rex2reg_mentioned_p (operands[2])))) + ? 
"q" : ""; break; default: gcc_unreachable (); @@ -17858,7 +17896,8 @@ (define_insn "*3" output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx") + [(set_attr "isa" "noavx,avx_noavx512f,avx512f") + (set_attr "gpr32" "1,0,1") (set_attr "type" "sselog") (set (attr "prefix_data16") (if_then_else @@ -17885,13 +17924,14 @@ (define_insn "v1ti3" [(set (match_operand:V1TI 0 "register_operand" "=x,x,v") (any_logic:V1TI (match_operand:V1TI 1 "register_operand" "%0,x,v") - (match_operand:V1TI 2 "vector_operand" "xBm,xm,vm")))] + (match_operand:V1TI 2 "vector_operand" "xBm,xBt,vm")))] "TARGET_SSE2" "@ p\t{%2, %0|%0, %2} vp\t{%2, %1, %0|%0, %1, %2} vpd\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,avx,avx512vl") + [(set_attr "isa" "noavx,avx_noavx512vl,avx512vl") + (set_attr "gpr32" "1,0,1") (set_attr "prefix" "orig,vex,evex") (set_attr "prefix_data16" "1,*,*") (set_attr "type" "sselog") @@ -20878,33 +20918,39 @@ (define_insn "*_psadbw" (set_attr "mode" "")]) (define_insn "_movmsk" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI - [(match_operand:VF_128_256 1 "register_operand" "x")] + [(match_operand:VF_128_256 1 "register_operand" "x,x")] UNSPEC_MOVMSK))] "TARGET_SSE" - "%vmovmsk\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") - (set_attr "prefix" "maybe_vex") + "@ + movmsk\t{%1, %0|%0, %1} + vmovmsk\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_evex") (set_attr "mode" "")]) (define_insn "*_movmsk_ext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (any_extend:DI (unspec:SI - [(match_operand:VF_128_256 1 "register_operand" "x")] + [(match_operand:VF_128_256 1 "register_operand" "x,x")] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE" - "%vmovmsk\t{%1, %k0|%k0, %1}" - [(set_attr "type" "ssemov") - (set_attr "prefix" "maybe_vex") + "@ + movmsk\t{%1, %0|%0, %1} 
+ vmovmsk\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_evex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_lt" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI [(lt:VF_128_256 - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand: 2 "const0_operand"))] UNSPEC_MOVMSK))] "TARGET_SSE" @@ -20913,16 +20959,17 @@ (define_insn_and_split "*_movmsk_lt" [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))] "operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_ext_lt" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (any_extend:DI (unspec:SI [(lt:VF_128_256 - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand: 2 "const0_operand"))] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE" @@ -20931,16 +20978,17 @@ (define_insn_and_split "*_movmsk_ext_lt" [(set (match_dup 0) (any_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] "operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_shift" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI [(subreg:VF_128_256 (ashiftrt: - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand:QI 2 "const_int_operand")) 0)] UNSPEC_MOVMSK))] "TARGET_SSE" @@ -20949,17 +20997,18 @@ (define_insn_and_split "*_movmsk_shift" [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))] 
"operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_ext_shift" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (any_extend:DI (unspec:SI [(subreg:VF_128_256 (ashiftrt: - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand:QI 2 "const_int_operand")) 0)] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE" @@ -20968,18 +21017,22 @@ (define_insn_and_split "*_movmsk_ext_shift [(set (match_dup 0) (any_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] "operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn "_pmovmskb" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI - [(match_operand:VI1_AVX2 1 "register_operand" "x")] + [(match_operand:VI1_AVX2 1 "register_operand" "x,x")] UNSPEC_MOVMSK))] "TARGET_SSE2" - "%vpmovmskb\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") + "@ + pmovmskb\t{%1, %0|%0, %1} + vpmovmskb\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set (attr "prefix_data16") (if_then_else (match_test "TARGET_AVX") @@ -20989,14 +21042,17 @@ (define_insn "_pmovmskb" (set_attr "mode" "SI")]) (define_insn "*_pmovmskb_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (zero_extend:DI (unspec:SI - [(match_operand:VI1_AVX2 1 "register_operand" "x")] + [(match_operand:VI1_AVX2 1 "register_operand" "x,x")] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE2" - "%vpmovmskb\t{%1, %k0|%k0, %1}" - [(set_attr "type" "ssemov") + "@ + pmovmskb\t{%1, %k0|%k0, %1} + vpmovmskb\t{%1, %k0|%k0, 
%1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set (attr "prefix_data16") (if_then_else (match_test "TARGET_AVX") @@ -21006,14 +21062,17 @@ (define_insn "*_pmovmskb_zext" (set_attr "mode" "SI")]) (define_insn "*sse2_pmovmskb_ext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (sign_extend:DI (unspec:SI - [(match_operand:V16QI 1 "register_operand" "x")] + [(match_operand:V16QI 1 "register_operand" "x,x")] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE2" - "%vpmovmskb\t{%1, %k0|%k0, %1}" - [(set_attr "type" "ssemov") + "@ + pmovmskb\t{%1, %k0|%k0, %1} + vpmovmskb\t{%1, %k0|%k0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set (attr "prefix_data16") (if_then_else (match_test "TARGET_AVX") @@ -21098,9 +21157,9 @@ (define_split }) (define_insn_and_split "*_pmovmskb_lt" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI - [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x") + [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x,x") (match_operand:VI1_AVX2 2 "const0_operand"))] UNSPEC_MOVMSK))] "TARGET_SSE2" @@ -21109,7 +21168,8 @@ (define_insn_and_split "*_pmovmskb_lt" [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))] "" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set (attr "prefix_data16") (if_then_else (match_test "TARGET_AVX") @@ -21119,10 +21179,10 @@ (define_insn_and_split "*_pmovmskb_lt" (set_attr "mode" "SI")]) (define_insn_and_split "*_pmovmskb_zext_lt" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (zero_extend:DI (unspec:SI - [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x") + [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x,x") (match_operand:VI1_AVX2 2 "const0_operand"))] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE2" @@ 
-21131,7 +21191,8 @@ (define_insn_and_split "*_pmovmskb_zext_lt" [(set (match_dup 0) (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] "" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set (attr "prefix_data16") (if_then_else (match_test "TARGET_AVX") @@ -21141,10 +21202,10 @@ (define_insn_and_split "*_pmovmskb_zext_lt" (set_attr "mode" "SI")]) (define_insn_and_split "*sse2_pmovmskb_ext_lt" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (sign_extend:DI (unspec:SI - [(lt:V16QI (match_operand:V16QI 1 "register_operand" "x") + [(lt:V16QI (match_operand:V16QI 1 "register_operand" "x,x") (match_operand:V16QI 2 "const0_operand"))] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE2" @@ -21153,7 +21214,8 @@ (define_insn_and_split "*sse2_pmovmskb_ext_lt" [(set (match_dup 0) (sign_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] "" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set (attr "prefix_data16") (if_then_else (match_test "TARGET_AVX") @@ -21214,21 +21276,28 @@ (define_insn "*sse2_maskmovdqu" (set_attr "mode" "TI")]) (define_insn "sse_ldmxcsr" - [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m")] + [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m,Bt")] UNSPECV_LDMXCSR)] "TARGET_SSE" - "%vldmxcsr\t%0" - [(set_attr "type" "sse") + "@ + ldmxcsr\t%0 + vldmxcsr\t%0" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sse") + (set_attr "gpr32" "1,0") (set_attr "atom_sse_attr" "mxcsr") (set_attr "prefix" "maybe_vex") (set_attr "memory" "load")]) (define_insn "sse_stmxcsr" - [(set (match_operand:SI 0 "memory_operand" "=m") + [(set (match_operand:SI 0 "memory_operand" "=m,Bt") (unspec_volatile:SI [(const_int 0)] UNSPECV_STMXCSR))] "TARGET_SSE" - "%vstmxcsr\t%0" + "@ + stmxcsr\t%0 + vstmxcsr\t%0" [(set_attr "type" "sse") + (set_attr "gpr32" "0") (set_attr "atom_sse_attr" "mxcsr") (set_attr 
"prefix" "maybe_vex") (set_attr "memory" "store")]) @@ -23890,11 +23959,12 @@ (define_expand "v2siv2di2" (define_insn "avx_vtest" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:VF_128_256 0 "register_operand" "x") - (match_operand:VF_128_256 1 "nonimmediate_operand" "xm")] + (match_operand:VF_128_256 1 "nonimmediate_operand" "xBt")] UNSPEC_VTESTP))] "TARGET_AVX" "vtest\t{%1, %0|%0, %1}" [(set_attr "type" "ssecomi") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -26955,7 +27025,7 @@ (define_split (define_insn "avx_vbroadcastf128_" [(set (match_operand:V_256 0 "register_operand" "=x,x,x,v,v,v,v") (vec_concat:V_256 - (match_operand: 1 "nonimmediate_operand" "m,0,?x,m,0,m,0") + (match_operand: 1 "nonimmediate_operand" "Bt,0,?x,m,0,m,0") (match_dup 1)))] "TARGET_AVX" "@ @@ -26966,8 +27036,9 @@ (define_insn "avx_vbroadcastf128_" vinsert\t{$1, %1, %0, %0|%0, %0, %1, 1} vbroadcast32x4\t{%1, %0|%0, %1} vinsert32x4\t{$1, %1, %0, %0|%0, %0, %1, 1}" - [(set_attr "isa" "*,*,*,avx512dq,avx512dq,avx512vl,avx512vl") + [(set_attr "isa" "noavx512vl,*,*,avx512dq,avx512dq,avx512vl,avx512vl") (set_attr "type" "ssemov,sselog1,sselog1,ssemov,sselog1,ssemov,sselog1") + (set_attr "gpr32" "0,1,1,1,1,1,1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "0,1,1,0,1,0,1") (set_attr "prefix" "vex,vex,vex,evex,evex,evex,evex") @@ -27235,12 +27306,13 @@ (define_insn "*avx_vperm2f128_full" [(set (match_operand:AVX256MODE2P 0 "register_operand" "=x") (unspec:AVX256MODE2P [(match_operand:AVX256MODE2P 1 "register_operand" "x") - (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xm") + (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_VPERMIL2F128))] "TARGET_AVX" "vperm2\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -27357,11 
+27429,11 @@ (define_expand "avx_vinsertf128" }) (define_insn "vec_set_lo_" - [(set (match_operand:VI8F_256 0 "register_operand" "=v") + [(set (match_operand:VI8F_256 0 "register_operand" "=x,v") (vec_concat:VI8F_256 - (match_operand: 2 "nonimmediate_operand" "vm") + (match_operand: 2 "nonimmediate_operand" "xBt,vm") (vec_select: - (match_operand:VI8F_256 1 "register_operand" "v") + (match_operand:VI8F_256 1 "register_operand" "x,v") (parallel [(const_int 2) (const_int 3)]))))] "TARGET_AVX && " { @@ -27372,7 +27444,9 @@ (define_insn "vec_set_lo_" else return "vinsert\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"; } - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -27401,11 +27475,11 @@ (define_insn "vec_set_hi_" (set_attr "mode" "")]) (define_insn "vec_set_lo_" - [(set (match_operand:VI4F_256 0 "register_operand" "=v") + [(set (match_operand:VI4F_256 0 "register_operand" "=x,v") (vec_concat:VI4F_256 - (match_operand: 2 "nonimmediate_operand" "vm") + (match_operand: 2 "nonimmediate_operand" "xBt,vm") (vec_select: - (match_operand:VI4F_256 1 "register_operand" "v") + (match_operand:VI4F_256 1 "register_operand" "x,v") (parallel [(const_int 4) (const_int 5) (const_int 6) (const_int 7)]))))] "TARGET_AVX" @@ -27415,20 +27489,22 @@ (define_insn "vec_set_lo_" else return "vinsert\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"; } - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) (define_insn "vec_set_hi_" - [(set (match_operand:VI4F_256 0 "register_operand" "=v") + [(set (match_operand:VI4F_256 0 "register_operand" "=x,v") (vec_concat:VI4F_256 (vec_select: - (match_operand:VI4F_256 1 "register_operand" "v") + (match_operand:VI4F_256 1 
"register_operand" "x,v") (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)])) - (match_operand: 2 "nonimmediate_operand" "vm")))] + (match_operand: 2 "nonimmediate_operand" "xBt,vm")))] "TARGET_AVX" { if (TARGET_AVX512VL) @@ -27436,7 +27512,9 @@ (define_insn "vec_set_hi_" else return "vinsert\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"; } - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -27445,7 +27523,7 @@ (define_insn "vec_set_hi_" (define_insn "vec_set_lo_" [(set (match_operand:V16_256 0 "register_operand" "=x,v") (vec_concat:V16_256 - (match_operand: 2 "nonimmediate_operand" "xm,vm") + (match_operand: 2 "nonimmediate_operand" "xBt,vm") (vec_select: (match_operand:V16_256 1 "register_operand" "x,v") (parallel [(const_int 8) (const_int 9) @@ -27456,7 +27534,9 @@ (define_insn "vec_set_lo_" "@ vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0} vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}" - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex,evex") @@ -27471,12 +27551,14 @@ (define_insn "vec_set_hi_" (const_int 2) (const_int 3) (const_int 4) (const_int 5) (const_int 6) (const_int 7)])) - (match_operand: 2 "nonimmediate_operand" "xm,vm")))] + (match_operand: 2 "nonimmediate_operand" "xBt,vm")))] "TARGET_AVX" "@ vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1} vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}" - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex,evex") @@ -27485,7 +27567,7 @@ (define_insn "vec_set_hi_" (define_insn "vec_set_lo_v32qi" [(set 
(match_operand:V32QI 0 "register_operand" "=x,v") (vec_concat:V32QI - (match_operand:V16QI 2 "nonimmediate_operand" "xm,v") + (match_operand:V16QI 2 "nonimmediate_operand" "xBt,v") (vec_select:V16QI (match_operand:V32QI 1 "register_operand" "x,v") (parallel [(const_int 16) (const_int 17) @@ -27501,6 +27583,7 @@ (define_insn "vec_set_lo_v32qi" vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0} vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex,evex") @@ -27519,12 +27602,14 @@ (define_insn "vec_set_hi_v32qi" (const_int 10) (const_int 11) (const_int 12) (const_int 13) (const_int 14) (const_int 15)])) - (match_operand:V16QI 2 "nonimmediate_operand" "xm,vm")))] + (match_operand:V16QI 2 "nonimmediate_operand" "xBt,vm")))] "TARGET_AVX" "@ vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1} vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}" - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex,evex") @@ -27534,7 +27619,7 @@ (define_insn "_maskload" [(set (match_operand:V48_128_256 0 "register_operand" "=x") (unspec:V48_128_256 [(match_operand: 2 "register_operand" "x") - (match_operand:V48_128_256 1 "memory_operand" "m")] + (match_operand:V48_128_256 1 "memory_operand" "Bt")] UNSPEC_MASKMOV))] "TARGET_AVX" { @@ -27544,13 +27629,14 @@ (define_insn "_maskload" return "vmaskmov\t{%1, %2, %0|%0, %2, %1}"; } [(set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "btver2_decode" "vector") (set_attr "mode" "")]) (define_insn "_maskstore" - [(set (match_operand:V48_128_256 0 "memory_operand" "+m") + [(set (match_operand:V48_128_256 0 "memory_operand" "+Bt") (unspec:V48_128_256 [(match_operand: 1 "register_operand" "x") 
(match_operand:V48_128_256 2 "register_operand" "x") @@ -27564,6 +27650,7 @@ (define_insn "_maskstore" return "vmaskmov\t{%2, %1, %0|%0, %1, %2}"; } [(set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "btver2_decode" "vector") @@ -28160,7 +28247,7 @@ (define_insn "*avx2_gathersi" [(match_operand:VEC_GATHER_MODE 2 "register_operand" "0") (match_operator: 7 "vsib_mem_operator" [(unspec:P - [(match_operand:P 3 "vsib_address_operand" "Tv") + [(match_operand:P 3 "vsib_address_operand" "TV") (match_operand: 4 "register_operand" "x") (match_operand:SI 6 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28171,6 +28258,7 @@ (define_insn "*avx2_gathersi" "TARGET_AVX2" "%M3vgatherd\t{%1, %7, %0|%0, %7, %1}" [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -28180,7 +28268,7 @@ (define_insn "*avx2_gathersi_2" [(pc) (match_operator: 6 "vsib_mem_operator" [(unspec:P - [(match_operand:P 2 "vsib_address_operand" "Tv") + [(match_operand:P 2 "vsib_address_operand" "TV") (match_operand: 3 "register_operand" "x") (match_operand:SI 5 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28191,6 +28279,7 @@ (define_insn "*avx2_gathersi_2" "TARGET_AVX2" "%M2vgatherd\t{%1, %6, %0|%0, %6, %1}" [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -28221,7 +28310,7 @@ (define_insn "*avx2_gatherdi" [(match_operand: 2 "register_operand" "0") (match_operator: 7 "vsib_mem_operator" [(unspec:P - [(match_operand:P 3 "vsib_address_operand" "Tv") + [(match_operand:P 3 "vsib_address_operand" "TV") (match_operand: 4 "register_operand" "x") (match_operand:SI 6 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28232,6 +28321,7 @@ (define_insn "*avx2_gatherdi" "TARGET_AVX2" "%M3vgatherq\t{%5, %7, %2|%2, %7, %5}" [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -28241,7 +28331,7 @@ (define_insn 
"*avx2_gatherdi_2" [(pc) (match_operator: 6 "vsib_mem_operator" [(unspec:P - [(match_operand:P 2 "vsib_address_operand" "Tv") + [(match_operand:P 2 "vsib_address_operand" "TV") (match_operand: 3 "register_operand" "x") (match_operand:SI 5 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28256,6 +28346,7 @@ (define_insn "*avx2_gatherdi_2" return "%M2vgatherq\t{%4, %6, %0|%0, %6, %4}"; } [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -28266,7 +28357,7 @@ (define_insn "*avx2_gatherdi_3" [(match_operand: 2 "register_operand" "0") (match_operator: 7 "vsib_mem_operator" [(unspec:P - [(match_operand:P 3 "vsib_address_operand" "Tv") + [(match_operand:P 3 "vsib_address_operand" "TV") (match_operand: 4 "register_operand" "x") (match_operand:SI 6 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28279,6 +28370,7 @@ (define_insn "*avx2_gatherdi_3" "TARGET_AVX2" "%M3vgatherq\t{%5, %7, %0|%0, %7, %5}" [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -28289,7 +28381,7 @@ (define_insn "*avx2_gatherdi_4" [(pc) (match_operator: 6 "vsib_mem_operator" [(unspec:P - [(match_operand:P 2 "vsib_address_operand" "Tv") + [(match_operand:P 2 "vsib_address_operand" "TV") (match_operand: 3 "register_operand" "x") (match_operand:SI 5 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28302,6 +28394,7 @@ (define_insn "*avx2_gatherdi_4" "TARGET_AVX2" "%M2vgatherq\t{%4, %6, %0|%0, %6, %4}" [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")])