From patchwork Thu Sep 21 07:20:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Hu, Lin1"
X-Patchwork-Id: 142765
From: "Hu, Lin1"
To: gcc-patches@gcc.gnu.org
Cc: hongtao.liu@intel.com, ubizjak@gmail.com, haochen.jiang@intel.com
Subject: [PATCH 15/18] Support -mevex512 for AVX512BW intrins
Date: Thu, 21 Sep 2023 15:20:10 +0800
Message-Id: <20230921072013.2124750-16-lin1.hu@intel.com>
In-Reply-To: <20230921072013.2124750-1-lin1.hu@intel.com>
References: <20230921072013.2124750-1-lin1.hu@intel.com>

From: Haochen Jiang

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate):
	Make sure there is EVEX512 enabled.
	(ix86_expand_vecop_qihi2): Refuse V32QI->V32HI when no EVEX512.
	* config/i386/i386.cc (ix86_hard_regno_mode_ok): Disable 64 bit mask
	when !TARGET_EVEX512.
	* config/i386/i386.md (avx512bw_512): New.
	(SWI1248_AVX512BWDQ_64): Add TARGET_EVEX512.
	(*zero_extendsidi2): Change isa to avx512bw_512.
	(kmov_isa): Ditto.
	(*anddi_1): Ditto.
	(*andn_1): Change isa to kmov_isa.
	(*_1): Ditto.
	(*notxor_1): Ditto.
	(*one_cmpl2_1): Ditto.
	(*one_cmplsi2_1_zext): Change isa to avx512bw_512.
	(*ashl3_1): Change isa to kmov_isa.
	(*lshr3_1): Ditto.
	* config/i386/sse.md (VI12HFBF_AVX512VL): Add TARGET_EVEX512.
	(VI1248_AVX512VLBW): Ditto.
	(VHFBF_AVX512VL): Ditto.
	(VI): Ditto.
	(VIHFBF): Ditto.
	(VI_AVX2): Ditto.
	(VI1_AVX512): Ditto.
	(VI12_256_512_AVX512VL): Ditto.
	(VI2_AVX2_AVX512BW): Ditto.
	(VI2_AVX512VNNIBW): Ditto.
	(VI2_AVX512VL): Ditto.
	(VI2HFBF_AVX512VL): Ditto.
	(VI8_AVX2_AVX512BW): Ditto.
	(VIMAX_AVX2_AVX512BW): Ditto.
	(VIMAX_AVX512VL): Ditto.
	(VI12_AVX2_AVX512BW): Ditto.
	(VI124_AVX2_24_AVX512F_1_AVX512BW): Ditto.
	(VI248_AVX512VL): Ditto.
	(VI248_AVX512VLBW): Ditto.
	(VI248_AVX2_8_AVX512F_24_AVX512BW): Ditto.
	(VI248_AVX512BW): Ditto.
	(VI248_AVX512BW_AVX512VL): Ditto.
	(VI248_512): Ditto.
	(VI124_256_AVX512F_AVX512BW): Ditto.
	(VI_AVX512BW): Ditto.
	(VIHFBF_AVX512BW): Ditto.
	(SWI1248_AVX512BWDQ): Ditto.
	(SWI1248_AVX512BW): Ditto.
	(SWI1248_AVX512BWDQ2): Ditto.
	(*knotsi_1_zext): Ditto.
	(define_split for zero_extend + not): Ditto.
	(kunpckdi): Ditto.
	(REDUC_SMINMAX_MODE): Ditto.
	(VEC_EXTRACT_MODE): Ditto.
	(*avx512bw_permvar_truncv16siv16hi_1): Ditto.
	(*avx512bw_permvar_truncv16siv16hi_1_hf): Ditto.
	(truncv32hiv32qi2): Ditto.
	(avx512bw_v32hiv32qi2): Ditto.
	(avx512bw_v32hiv32qi2_mask): Ditto.
	(avx512bw_v32hiv32qi2_mask_store): Ditto.
	(usadv64qi): Ditto.
	(VEC_PERM_AVX2): Ditto.
	(AVX512ZEXTMASK): Ditto.
	(SWI24_MASK): New.
	(vec_pack_trunc_): Change iterator to SWI24_MASK.
	(avx512bw_packsswb): Add TARGET_EVEX512.
	(avx512bw_packssdw): Ditto.
	(avx512bw_interleave_highv64qi): Ditto.
	(avx512bw_interleave_lowv64qi): Ditto.
	(avx512bw_pshuflwv32hi): Ditto.
	(avx512bw_pshufhwv32hi): Ditto.
	(vec_unpacks_lo_di): Ditto.
	(SWI48x_MASK): New.
	(vec_unpacks_hi_): Change iterator to SWI48x_MASK.
	(avx512bw_umulhrswv32hi3): Add TARGET_EVEX512.
	(VI1248_AVX512VL_AVX512BW): Ditto.
	(avx512bw_v32qiv32hi2): Ditto.
	(*avx512bw_zero_extendv32qiv32hi2_1): Ditto.
	(*avx512bw_zero_extendv32qiv32hi2_2): Ditto.
	(v32qiv32hi2): Ditto.
	(pbroadcast_evex_isa): Change isa attribute to avx512bw_512.
	(VPERMI2): Add TARGET_EVEX512.
	(VPERMI2I): Ditto.
---
 gcc/config/i386/i386-expand.cc |   3 +-
 gcc/config/i386/i386.cc        |   4 +-
 gcc/config/i386/i386.md        |  54 ++++-----
 gcc/config/i386/sse.md         | 193 ++++++++++++++++++---------------
 4 files changed, 128 insertions(+), 126 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 063561e1265..ff2423f91ed 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -15617,6 +15617,7 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, machine_mode mode,
     case E_V32HFmode:
     case E_V32BFmode:
     case E_V64QImode:
+      gcc_assert (TARGET_EVEX512);
       if (TARGET_AVX512BW)
         return ix86_vector_duplicate_value (mode, target, val);
       else
@@ -23512,7 +23513,7 @@ ix86_expand_vecop_qihi2 (enum rtx_code code, rtx dest, rtx op1, rtx op2)
   bool uns_p = code != ASHIFTRT;
 
   if ((qimode == V16QImode && !TARGET_AVX2)
-      || (qimode == V32QImode && !TARGET_AVX512BW)
+      || (qimode == V32QImode && (!TARGET_AVX512BW || !TARGET_EVEX512))
       /* There are no V64HImode instructions.  */
       || qimode == V64QImode)
     return false;
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 589b29a324d..03c96ff048d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20308,8 +20308,8 @@ ix86_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
         return MASK_PAIR_REGNO_P(regno);
 
       return ((TARGET_AVX512F && VALID_MASK_REG_MODE (mode))
-              || (TARGET_AVX512BW
-                  && VALID_MASK_AVX512BW_MODE (mode)));
+              || (TARGET_AVX512BW && mode == SImode)
+              || (TARGET_AVX512BW && TARGET_EVEX512 && mode == DImode));
     }
 
   if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6eb4e540140..bdececc2309 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -536,10 +536,10 @@
                     x64_avx,x64_avx512bw,x64_avx512dq,aes,
                     sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
                     avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512,
-                    noavx512f,avx512bw,noavx512bw,avx512dq,noavx512dq,
-                    fma_or_avx512vl,avx512vl,noavx512vl,avxvnni,avx512vnnivl,
-                    avx512fp16,avxifma,avx512ifmavl,avxneconvert,avx512bf16vl,
-                    vpclmulqdqvl"
+                    noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq,
+                    noavx512dq,fma_or_avx512vl,avx512vl,noavx512vl,avxvnni,
+                    avx512vnnivl,avx512fp16,avxifma,avx512ifmavl,avxneconvert,
+                    avx512bf16vl,vpclmulqdqvl"
   (const_string "base"))
 
 ;; The (bounding maximum) length of an instruction immediate.
@@ -904,6 +904,8 @@
            (symbol_ref "TARGET_AVX512F && TARGET_EVEX512")
          (eq_attr "isa" "noavx512f") (symbol_ref "!TARGET_AVX512F")
          (eq_attr "isa" "avx512bw") (symbol_ref "TARGET_AVX512BW")
+         (eq_attr "isa" "avx512bw_512")
+           (symbol_ref "TARGET_AVX512BW && TARGET_EVEX512")
          (eq_attr "isa" "noavx512bw") (symbol_ref "!TARGET_AVX512BW")
          (eq_attr "isa" "avx512dq") (symbol_ref "TARGET_AVX512DQ")
          (eq_attr "isa" "noavx512dq") (symbol_ref "!TARGET_AVX512DQ")
@@ -1440,7 +1442,8 @@
 
 (define_mode_iterator SWI1248_AVX512BWDQ_64
   [(QI "TARGET_AVX512DQ") HI
-   (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW && TARGET_64BIT")])
+   (SI "TARGET_AVX512BW")
+   (DI "TARGET_AVX512BW && TARGET_EVEX512 && TARGET_64BIT")])
 
 (define_insn "*cmp_ccz_1"
   [(set (reg FLAGS_REG)
@@ -4580,7 +4583,7 @@
             (eq_attr "alternative" "12")
               (const_string "x64_avx512bw")
             (eq_attr "alternative" "13")
-              (const_string "avx512bw")
+              (const_string "avx512bw_512")
            ]
            (const_string "*")))
    (set (attr "mmx_isa")
@@ -4657,7 +4660,7 @@
   "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
 
 (define_mode_attr kmov_isa
-  [(QI "avx512dq") (HI "avx512f") (SI "avx512bw") (DI "avx512bw")])
+  [(QI "avx512dq") (HI "avx512f") (SI "avx512bw") (DI "avx512bw_512")])
 
 (define_insn "zero_extenddi2"
   [(set (match_operand:DI 0 "register_operand" "=r,*r,*k")
@@ -11124,7 +11127,7 @@
    and{q}\t{%2, %0|%0, %2}
    #
    #"
-  [(set_attr "isa" "x64,x64,x64,x64,avx512bw")
+  [(set_attr "isa" "x64,x64,x64,x64,avx512bw_512")
    (set_attr "type" "alu,alu,alu,imovx,msklog")
    (set_attr "length_immediate" "*,*,*,0,*")
    (set (attr "prefix_rex")
@@ -11647,12 +11650,13 @@
           (not:SWI48 (match_operand:SWI48 1 "register_operand" "r,r,k"))
           (match_operand:SWI48 2 "nonimmediate_operand" "r,m,k")))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_BMI || TARGET_AVX512BW"
+  "TARGET_BMI
+   || (TARGET_AVX512BW && (mode == SImode || TARGET_EVEX512))"
   "@
    andn\t{%2, %1, %0|%0, %1, %2}
    andn\t{%2, %1, %0|%0, %1, %2}
    #"
-  [(set_attr "isa" "bmi,bmi,avx512bw")
+  [(set_attr "isa" "bmi,bmi,")
    (set_attr "type" "bitmanip,bitmanip,msklog")
    (set_attr "btver2_decode" "direct, double,*")
    (set_attr "mode" "")])
@@ -11880,13 +11884,7 @@
    {}\t{%2, %0|%0, %2}
    {}\t{%2, %0|%0, %2}
    #"
-  [(set (attr "isa")
-        (cond [(eq_attr "alternative" "2")
-                 (if_then_else (eq_attr "mode" "SI,DI")
-                   (const_string "avx512bw")
-                   (const_string "avx512f"))
-              ]
-              (const_string "*")))
+  [(set_attr "isa" "*,*,")
    (set_attr "type" "alu, alu, msklog")
    (set_attr "mode" "")])
 
@@ -11913,13 +11911,7 @@
       DONE;
     }
 }
-  [(set (attr "isa")
-        (cond [(eq_attr "alternative" "2")
-                 (if_then_else (eq_attr "mode" "SI,DI")
-                   (const_string "avx512bw")
-                   (const_string "avx512f"))
-              ]
-              (const_string "*")))
+  [(set_attr "isa" "*,*,")
    (set_attr "type" "alu, alu, msklog")
    (set_attr "mode" "")])
 
@@ -13300,13 +13292,7 @@
   "@
    not{}\t%0
    #"
-  [(set (attr "isa")
-        (cond [(eq_attr "alternative" "1")
-                 (if_then_else (eq_attr "mode" "SI,DI")
-                   (const_string "avx512bw")
-                   (const_string "avx512f"))
-              ]
-              (const_string "*")))
+  [(set_attr "isa" "*,")
    (set_attr "type" "negnot,msklog")
    (set_attr "mode" "")])
 
@@ -13318,7 +13304,7 @@
   "@
    not{l}\t%k0
    #"
-  [(set_attr "isa" "x64,avx512bw")
+  [(set_attr "isa" "x64,avx512bw_512")
    (set_attr "type" "negnot,msklog")
    (set_attr "mode" "SI,SI")])
 
@@ -13943,7 +13929,7 @@
       return "sal{}\t{%2, %0|%0, %2}";
     }
 }
-  [(set_attr "isa" "*,*,bmi2,avx512bw")
+  [(set_attr "isa" "*,*,bmi2,")
    (set (attr "type")
      (cond [(eq_attr "alternative" "1")
               (const_string "lea")
@@ -14995,7 +14981,7 @@
       return "shr{}\t{%2, %0|%0, %2}";
     }
 }
-  [(set_attr "isa" "*,bmi2,avx512bw")
+  [(set_attr "isa" "*,bmi2,")
    (set_attr "type" "ishift,ishiftx,msklog")
    (set (attr "length_immediate")
      (if_then_else
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a8f93ceddc5..e59f6bf4410 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -292,10 +292,10 @@
    (V32HI "TARGET_EVEX512") (V16HI "TARGET_AVX512VL") (V8HI "TARGET_AVX512VL")])
 
 (define_mode_iterator VI12HFBF_AVX512VL
-  [V64QI (V16QI "TARGET_AVX512VL") (V32QI "TARGET_AVX512VL")
-   V32HI (V16HI "TARGET_AVX512VL") (V8HI "TARGET_AVX512VL")
-   V32HF (V16HF "TARGET_AVX512VL") (V8HF "TARGET_AVX512VL")
-   V32BF (V16BF "TARGET_AVX512VL") (V8BF "TARGET_AVX512VL")])
+  [(V64QI "TARGET_EVEX512") (V16QI "TARGET_AVX512VL") (V32QI "TARGET_AVX512VL")
+   (V32HI "TARGET_EVEX512") (V16HI "TARGET_AVX512VL") (V8HI "TARGET_AVX512VL")
+   (V32HF "TARGET_EVEX512") (V16HF "TARGET_AVX512VL") (V8HF "TARGET_AVX512VL")
+   (V32BF "TARGET_EVEX512") (V16BF "TARGET_AVX512VL") (V8BF "TARGET_AVX512VL")])
 
 (define_mode_iterator VI1_AVX512VL
   [V64QI (V16QI "TARGET_AVX512VL") (V32QI "TARGET_AVX512VL")])
@@ -445,9 +445,11 @@
    (V8DI "TARGET_EVEX512") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")])
 
 (define_mode_iterator VI1248_AVX512VLBW
-  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX512VL && TARGET_AVX512BW")
+  [(V64QI "TARGET_AVX512BW && TARGET_EVEX512")
+   (V32QI "TARGET_AVX512VL && TARGET_AVX512BW")
    (V16QI "TARGET_AVX512VL && TARGET_AVX512BW")
-   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX512VL && TARGET_AVX512BW")
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512")
+   (V16HI "TARGET_AVX512VL && TARGET_AVX512BW")
    (V8HI "TARGET_AVX512VL && TARGET_AVX512BW")
    (V16SI "TARGET_EVEX512") (V8SI "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL")
    (V8DI "TARGET_EVEX512") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")])
@@ -481,15 +483,15 @@
   [V32HF (V16HF "TARGET_AVX512VL") (V8HF "TARGET_AVX512VL")])
 
 (define_mode_iterator VHFBF_AVX512VL
-  [V32HF (V16HF "TARGET_AVX512VL") (V8HF "TARGET_AVX512VL")
-   V32BF (V16BF "TARGET_AVX512VL") (V8BF "TARGET_AVX512VL")])
+  [(V32HF "TARGET_EVEX512") (V16HF "TARGET_AVX512VL") (V8HF "TARGET_AVX512VL")
+   (V32BF "TARGET_EVEX512") (V16BF "TARGET_AVX512VL") (V8BF "TARGET_AVX512VL")])
 
 ;; All vector integer modes
 (define_mode_iterator VI
   [(V16SI "TARGET_AVX512F && TARGET_EVEX512")
    (V8DI "TARGET_AVX512F && TARGET_EVEX512")
-   (V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX") V16QI
-   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX") V8HI
+   (V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX") V16QI
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX") V8HI
    (V8SI "TARGET_AVX") V4SI
    (V4DI "TARGET_AVX") V2DI])
 
@@ -497,16 +499,16 @@
 (define_mode_iterator VIHFBF
   [(V16SI "TARGET_AVX512F && TARGET_EVEX512")
    (V8DI "TARGET_AVX512F && TARGET_EVEX512")
-   (V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX") V16QI
-   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX") V8HI
+   (V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX") V16QI
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX") V8HI
    (V8SI "TARGET_AVX") V4SI
    (V4DI "TARGET_AVX") V2DI
-   (V32HF "TARGET_AVX512BW") (V16HF "TARGET_AVX") V8HF
-   (V32BF "TARGET_AVX512BW") (V16BF "TARGET_AVX") V8BF])
+   (V32HF "TARGET_AVX512BW && TARGET_EVEX512") (V16HF "TARGET_AVX") V8HF
+   (V32BF "TARGET_AVX512BW && TARGET_EVEX512") (V16BF "TARGET_AVX") V8BF])
 
 (define_mode_iterator VI_AVX2
-  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI
-   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI
+  [(V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX2") V16QI
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX2") V8HI
    (V16SI "TARGET_AVX512F && TARGET_EVEX512") (V8SI "TARGET_AVX2") V4SI
    (V8DI "TARGET_AVX512F && TARGET_EVEX512") (V4DI "TARGET_AVX2") V2DI])
 
@@ -541,7 +543,7 @@
   [(V32QI "TARGET_AVX2") V16QI])
 
 (define_mode_iterator VI1_AVX512
-  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI])
+  [(V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX2") V16QI])
 
 (define_mode_iterator VI1_AVX512F
   [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI])
@@ -550,20 +552,20 @@
   [(V64QI "TARGET_AVX512VNNI") (V32QI "TARGET_AVX2") V16QI])
 
 (define_mode_iterator VI12_256_512_AVX512VL
-  [V64QI (V32QI "TARGET_AVX512VL")
-   V32HI (V16HI "TARGET_AVX512VL")])
+  [(V64QI "TARGET_EVEX512") (V32QI "TARGET_AVX512VL")
+   (V32HI "TARGET_EVEX512") (V16HI "TARGET_AVX512VL")])
 
 (define_mode_iterator VI2_AVX2
   [(V16HI "TARGET_AVX2") V8HI])
 
 (define_mode_iterator VI2_AVX2_AVX512BW
-  [(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI])
+  [(V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX2") V8HI])
 
 (define_mode_iterator VI2_AVX512F
   [(V32HI "TARGET_AVX512F && TARGET_EVEX512") (V16HI "TARGET_AVX2") V8HI])
 
 (define_mode_iterator VI2_AVX512VNNIBW
-  [(V32HI "TARGET_AVX512BW || TARGET_AVX512VNNI")
+  [(V32HI "(TARGET_AVX512BW || TARGET_AVX512VNNI) && TARGET_EVEX512")
    (V16HI "TARGET_AVX2") V8HI])
 
 (define_mode_iterator VI4_AVX
@@ -584,12 +586,12 @@
    (V8DI "TARGET_AVX512F && TARGET_EVEX512")])
 
 (define_mode_iterator VI2_AVX512VL
-  [(V8HI "TARGET_AVX512VL") (V16HI "TARGET_AVX512VL") V32HI])
+  [(V8HI "TARGET_AVX512VL") (V16HI "TARGET_AVX512VL") (V32HI "TARGET_EVEX512")])
 
 (define_mode_iterator VI2HFBF_AVX512VL
-  [(V8HI "TARGET_AVX512VL") (V16HI "TARGET_AVX512VL") V32HI
-   (V8HF "TARGET_AVX512VL") (V16HF "TARGET_AVX512VL") V32HF
-   (V8BF "TARGET_AVX512VL") (V16BF "TARGET_AVX512VL") V32BF])
+  [(V8HI "TARGET_AVX512VL") (V16HI "TARGET_AVX512VL") (V32HI "TARGET_EVEX512")
+   (V8HF "TARGET_AVX512VL") (V16HF "TARGET_AVX512VL") (V32HF "TARGET_EVEX512")
+   (V8BF "TARGET_AVX512VL") (V16BF "TARGET_AVX512VL") (V32BF "TARGET_EVEX512")])
 
 (define_mode_iterator VI2H_AVX512VL
   [(V8HI "TARGET_AVX512VL") (V16HI "TARGET_AVX512VL") V32HI
@@ -600,7 +602,7 @@
   [V32QI (V16QI "TARGET_AVX512VL") (V64QI "TARGET_AVX512F")])
 
 (define_mode_iterator VI8_AVX2_AVX512BW
-  [(V8DI "TARGET_AVX512BW") (V4DI "TARGET_AVX2") V2DI])
+  [(V8DI "TARGET_AVX512BW && TARGET_EVEX512") (V4DI "TARGET_AVX2") V2DI])
 
 (define_mode_iterator VI8_AVX2
   [(V4DI "TARGET_AVX2") V2DI])
@@ -624,11 +626,11 @@
 
 ;; ??? We should probably use TImode instead.
 (define_mode_iterator VIMAX_AVX2_AVX512BW
-  [(V4TI "TARGET_AVX512BW") (V2TI "TARGET_AVX2") V1TI])
+  [(V4TI "TARGET_AVX512BW && TARGET_EVEX512") (V2TI "TARGET_AVX2") V1TI])
 
 ;; Suppose TARGET_AVX512BW as baseline
 (define_mode_iterator VIMAX_AVX512VL
-  [V4TI (V2TI "TARGET_AVX512VL") (V1TI "TARGET_AVX512VL")])
+  [(V4TI "TARGET_EVEX512") (V2TI "TARGET_AVX512VL") (V1TI "TARGET_AVX512VL")])
 
 (define_mode_iterator VIMAX_AVX2
   [(V2TI "TARGET_AVX2") V1TI])
@@ -638,15 +640,15 @@
    (V16HI "TARGET_AVX2") V8HI])
 
 (define_mode_iterator VI12_AVX2_AVX512BW
-  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI
-   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI])
+  [(V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX2") V16QI
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX2") V8HI])
 
 (define_mode_iterator VI24_AVX2
   [(V16HI "TARGET_AVX2") V8HI
    (V8SI "TARGET_AVX2") V4SI])
 
 (define_mode_iterator VI124_AVX2_24_AVX512F_1_AVX512BW
-  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI
+  [(V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX2") V16QI
    (V32HI "TARGET_AVX512F && TARGET_EVEX512") (V16HI "TARGET_AVX2") V8HI
    (V16SI "TARGET_AVX512F && TARGET_EVEX512") (V8SI "TARGET_AVX2") V4SI])
 
@@ -656,13 +658,13 @@
    (V8SI "TARGET_AVX2") V4SI])
 
 (define_mode_iterator VI248_AVX512VL
-  [V32HI V16SI V8DI
+  [(V32HI "TARGET_EVEX512") (V16SI "TARGET_EVEX512") (V8DI "TARGET_EVEX512")
    (V16HI "TARGET_AVX512VL") (V8SI "TARGET_AVX512VL")
    (V4DI "TARGET_AVX512VL") (V8HI "TARGET_AVX512VL")
    (V4SI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")])
 
 (define_mode_iterator VI248_AVX512VLBW
-  [(V32HI "TARGET_AVX512BW")
+  [(V32HI "TARGET_AVX512BW && TARGET_EVEX512")
    (V16HI "TARGET_AVX512VL && TARGET_AVX512BW")
    (V8HI "TARGET_AVX512VL && TARGET_AVX512BW")
    (V16SI "TARGET_EVEX512") (V8SI "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL")
@@ -678,16 +680,16 @@
    (V4DI "TARGET_AVX2") V2DI])
 
 (define_mode_iterator VI248_AVX2_8_AVX512F_24_AVX512BW
-  [(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI
-   (V16SI "TARGET_AVX512BW") (V8SI "TARGET_AVX2") V4SI
+  [(V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX2") V8HI
+   (V16SI "TARGET_AVX512BW && TARGET_EVEX512") (V8SI "TARGET_AVX2") V4SI
    (V8DI "TARGET_AVX512F && TARGET_EVEX512") (V4DI "TARGET_AVX2") V2DI])
 
 (define_mode_iterator VI248_AVX512BW
-  [(V32HI "TARGET_AVX512BW") (V16SI "TARGET_EVEX512")
+  [(V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16SI "TARGET_EVEX512")
    (V8DI "TARGET_EVEX512")])
 
 (define_mode_iterator VI248_AVX512BW_AVX512VL
-  [(V32HI "TARGET_AVX512BW")
+  [(V32HI "TARGET_AVX512BW && TARGET_EVEX512")
    (V4DI "TARGET_AVX512VL") (V16SI "TARGET_EVEX512") (V8DI "TARGET_EVEX512")])
 
 ;; Suppose TARGET_AVX512VL as baseline
@@ -850,7 +852,8 @@
 (define_mode_iterator VI24_128 [V8HI V4SI])
 (define_mode_iterator VI248_128 [V8HI V4SI V2DI])
 (define_mode_iterator VI248_256 [V16HI V8SI V4DI])
-(define_mode_iterator VI248_512 [V32HI V16SI V8DI])
+(define_mode_iterator VI248_512
+  [(V32HI "TARGET_EVEX512") (V16SI "TARGET_EVEX512") (V8DI "TARGET_EVEX512")])
 (define_mode_iterator VI48_128 [V4SI V2DI])
 (define_mode_iterator VI148_512
   [(V64QI "TARGET_EVEX512") (V16SI "TARGET_EVEX512") (V8DI "TARGET_EVEX512")])
@@ -861,8 +864,8 @@
 (define_mode_iterator VI124_256 [V32QI V16HI V8SI])
 (define_mode_iterator VI124_256_AVX512F_AVX512BW
   [V32QI V16HI V8SI
-   (V64QI "TARGET_AVX512BW")
-   (V32HI "TARGET_AVX512BW")
+   (V64QI "TARGET_AVX512BW && TARGET_EVEX512")
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512")
    (V16SI "TARGET_AVX512F && TARGET_EVEX512")])
 (define_mode_iterator VI48_256 [V8SI V4DI])
 (define_mode_iterator VI48_512
@@ -870,11 +873,14 @@
 (define_mode_iterator VI4_256_8_512 [V8SI V8DI])
 (define_mode_iterator VI_AVX512BW
   [(V16SI "TARGET_EVEX512") (V8DI "TARGET_EVEX512")
-   (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512BW")])
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512")
+   (V64QI "TARGET_AVX512BW && TARGET_EVEX512")])
 (define_mode_iterator VIHFBF_AVX512BW
   [(V16SI "TARGET_EVEX512") (V8DI "TARGET_EVEX512")
-   (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512BW")
-   (V32HF "TARGET_AVX512BW") (V32BF "TARGET_AVX512BW")])
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512")
+   (V64QI "TARGET_AVX512BW && TARGET_EVEX512")
+   (V32HF "TARGET_AVX512BW && TARGET_EVEX512")
+   (V32BF "TARGET_AVX512BW && TARGET_EVEX512")])
 
 ;; Int-float size matches
 (define_mode_iterator VI2F_256_512 [V16HI V32HI V16HF V32HF V16BF V32BF])
@@ -1948,17 +1954,19 @@
 
 ;; All integer modes with AVX512BW/DQ.
 (define_mode_iterator SWI1248_AVX512BWDQ
-  [(QI "TARGET_AVX512DQ") HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
+  [(QI "TARGET_AVX512DQ") HI (SI "TARGET_AVX512BW")
+   (DI "TARGET_AVX512BW && TARGET_EVEX512")])
 
 ;; All integer modes with AVX512BW, where HImode operation
 ;; can be used instead of QImode.
 (define_mode_iterator SWI1248_AVX512BW
-  [QI HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
+  [QI HI (SI "TARGET_AVX512BW")
+   (DI "TARGET_AVX512BW && TARGET_EVEX512")])
 
 ;; All integer modes with AVX512BW/DQ, even HImode requires DQ.
 (define_mode_iterator SWI1248_AVX512BWDQ2
   [(QI "TARGET_AVX512DQ") (HI "TARGET_AVX512DQ")
-   (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")])
+   (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW && TARGET_EVEX512")])
 
 (define_expand "kmov"
   [(set (match_operand:SWI1248_AVX512BWDQ 0 "nonimmediate_operand")
@@ -2097,7 +2105,7 @@
         (zero_extend:DI
           (not:SI (match_operand:SI 1 "register_operand" "k"))))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "knotd\t{%1, %0|%0, %1}";
   [(set_attr "type" "msklog")
    (set_attr "prefix" "vex")
@@ -2107,7 +2115,7 @@
   [(set (match_operand:DI 0 "mask_reg_operand")
         (zero_extend:DI
           (not:SI (match_operand:SI 1 "mask_reg_operand"))))]
-  "TARGET_AVX512BW && reload_completed"
+  "TARGET_AVX512BW && TARGET_EVEX512 && reload_completed"
   [(parallel
      [(set (match_dup 0)
            (zero_extend:DI
@@ -2213,7 +2221,7 @@
             (const_int 32))
           (zero_extend:DI (match_operand:SI 2 "register_operand" "k"))))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "kunpckdq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "DI")])
 
@@ -3455,9 +3463,9 @@
    (V16HF "TARGET_AVX512FP16 && TARGET_AVX512VL")
    (V8SI "TARGET_AVX2") (V4DI "TARGET_AVX2")
    (V8SF "TARGET_AVX") (V4DF "TARGET_AVX")
-   (V64QI "TARGET_AVX512BW")
+   (V64QI "TARGET_AVX512BW && TARGET_EVEX512")
    (V32HF "TARGET_AVX512FP16 && TARGET_AVX512VL")
-   (V32HI "TARGET_AVX512BW")
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512")
    (V16SI "TARGET_AVX512F && TARGET_EVEX512")
    (V8DI "TARGET_AVX512F && TARGET_EVEX512")
    (V16SF "TARGET_AVX512F && TARGET_EVEX512")
@@ -12340,12 +12348,12 @@
 
 ;; Modes handled by vec_extract patterns.
 (define_mode_iterator VEC_EXTRACT_MODE
-  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX") V16QI
-   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX") V8HI
+  [(V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX") V16QI
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX") V8HI
    (V16SI "TARGET_AVX512F && TARGET_EVEX512") (V8SI "TARGET_AVX") V4SI
    (V8DI "TARGET_AVX512F && TARGET_EVEX512") (V4DI "TARGET_AVX") V2DI
-   (V32HF "TARGET_AVX512BW") (V16HF "TARGET_AVX") V8HF
-   (V32BF "TARGET_AVX512BW") (V16BF "TARGET_AVX") V8BF
+   (V32HF "TARGET_AVX512BW && TARGET_EVEX512") (V16HF "TARGET_AVX") V8HF
+   (V32BF "TARGET_AVX512BW && TARGET_EVEX512") (V16BF "TARGET_AVX") V8BF
    (V16SF "TARGET_AVX512F && TARGET_EVEX512") (V8SF "TARGET_AVX") V4SF
    (V8DF "TARGET_AVX512F && TARGET_EVEX512") (V4DF "TARGET_AVX") V2DF
    (V4TI "TARGET_AVX512F && TARGET_EVEX512") (V2TI "TARGET_AVX")])
@@ -14028,7 +14036,7 @@
                      (const_int 10) (const_int 11)
                      (const_int 12) (const_int 13)
                      (const_int 14) (const_int 15)])))]
-  "TARGET_AVX512BW && ix86_pre_reload_split ()"
+  "TARGET_AVX512BW && TARGET_EVEX512 && ix86_pre_reload_split ()"
   "#"
   "&& 1"
   [(set (match_dup 0)
@@ -14053,7 +14061,7 @@
                      (const_int 10) (const_int 11)
                      (const_int 12) (const_int 13)
                      (const_int 14) (const_int 15)])))]
-  "TARGET_AVX512BW && ix86_pre_reload_split ()"
+  "TARGET_AVX512BW && TARGET_EVEX512 && ix86_pre_reload_split ()"
   "#"
   "&& 1"
   [(set (match_dup 0)
@@ -14173,13 +14181,13 @@
   [(set (match_operand:V32QI 0 "nonimmediate_operand")
         (truncate:V32QI
           (match_operand:V32HI 1 "register_operand")))]
-  "TARGET_AVX512BW")
+  "TARGET_AVX512BW && TARGET_EVEX512")
 
 (define_insn "avx512bw_v32hiv32qi2"
   [(set (match_operand:V32QI 0 "nonimmediate_operand" "=v,m")
         (any_truncate:V32QI
             (match_operand:V32HI 1 "register_operand" "v,v")))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpmovwb\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssemov")
    (set_attr "memory" "none,store")
@@ -14225,7 +14233,7 @@
             (match_operand:V32HI 1 "register_operand" "v,v"))
           (match_operand:V32QI 2 "nonimm_or_0_operand" "0C,0")
           (match_operand:SI 3 "register_operand" "Yk,Yk")))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpmovwb\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}"
   [(set_attr "type" "ssemov")
    (set_attr "memory" "none,store")
@@ -14239,7 +14247,7 @@
             (match_operand:V32HI 1 "register_operand"))
           (match_dup 0)
           (match_operand:SI 2 "register_operand")))]
-  "TARGET_AVX512BW")
+  "TARGET_AVX512BW && TARGET_EVEX512")
 
 (define_mode_iterator PMOV_DST_MODE_2
   [V4SI V8HI (V16QI "TARGET_AVX512BW")])
@@ -16126,7 +16134,7 @@
    (match_operand:V64QI 1 "register_operand")
    (match_operand:V64QI 2 "nonimmediate_operand")
    (match_operand:V16SI 3 "nonimmediate_operand")]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
 {
   rtx t1 = gen_reg_rtx (V8DImode);
   rtx t2 = gen_reg_rtx (V16SImode);
@@ -17312,7 +17320,7 @@
    (V8DF "TARGET_AVX512F && TARGET_EVEX512")
    (V16SI "TARGET_AVX512F && TARGET_EVEX512")
    (V8DI "TARGET_AVX512F && TARGET_EVEX512")
-   (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512VBMI")
+   (V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V64QI "TARGET_AVX512VBMI")
    (V32HF "TARGET_AVX512FP16")])
 
 (define_expand "vec_perm"
@@ -18018,7 +18026,7 @@
            (const_string "*")))])
 
 (define_mode_iterator AVX512ZEXTMASK
-  [(DI "TARGET_AVX512BW") (SI "TARGET_AVX512BW") HI])
+  [(DI "TARGET_AVX512BW && TARGET_EVEX512") (SI "TARGET_AVX512BW") HI])
 
 (define_insn "_testm3"
   [(set (match_operand: 0 "register_operand" "=k")
@@ -18130,16 +18138,18 @@
            (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512F")
 
+(define_mode_iterator SWI24_MASK [HI (SI "TARGET_EVEX512")])
+
 (define_expand "vec_pack_trunc_"
   [(parallel
     [(set (match_operand: 0 "register_operand")
           (ior:
             (ashift:
               (zero_extend:
-                (match_operand:SWI24 2 "register_operand"))
+                (match_operand:SWI24_MASK 2 "register_operand"))
               (match_dup 3))
             (zero_extend:
-              (match_operand:SWI24 1 "register_operand"))))
+              (match_operand:SWI24_MASK 1 "register_operand"))))
      (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512BW"
 {
@@ -18267,7 +18277,7 @@
                      (const_int 60) (const_int 61)
                      (const_int 62) (const_int 63)])))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpacksswb\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "")
@@ -18336,7 +18346,7 @@
                      (const_int 14) (const_int 15)
                      (const_int 28) (const_int 29)
                      (const_int 30) (const_int 31)])))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpackssdw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "")
@@ -18398,7 +18408,7 @@
                      (const_int 61) (const_int 125)
                      (const_int 62) (const_int 126)
                      (const_int 63) (const_int 127)])))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpunpckhbw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "evex")
@@ -18494,7 +18504,7 @@
                      (const_int 53) (const_int 117)
                      (const_int 54) (const_int 118)
                      (const_int 55) (const_int 119)])))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpunpcklbw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "evex")
@@ -19677,7 +19687,7 @@
           [(match_operand:V32HI 1 "nonimmediate_operand" "vm")
            (match_operand:SI 2 "const_0_to_255_operand")]
           UNSPEC_PSHUFLW))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpshuflw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "evex")
@@ -19853,7 +19863,7 @@
           [(match_operand:V32HI 1 "nonimmediate_operand" "vm")
            (match_operand:SI 2 "const_0_to_255_operand")]
           UNSPEC_PSHUFHW))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpshufhw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "evex")
@@ -20735,7 +20745,7 @@
 (define_expand "vec_unpacks_lo_di"
   [(set (match_operand:SI 0 "register_operand")
         (subreg:SI (match_operand:DI 1 "register_operand") 0))]
-  "TARGET_AVX512BW")
+  "TARGET_AVX512BW && TARGET_EVEX512")
 
 (define_expand "vec_unpacku_hi_"
   [(match_operand: 0 "register_operand")
@@ -20774,12 +20784,15 @@
    (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512F")
 
+(define_mode_iterator SWI48x_MASK [SI (DI "TARGET_EVEX512")])
+
 (define_expand "vec_unpacks_hi_"
   [(parallel
-     [(set (subreg:SWI48x
+     [(set (subreg:SWI48x_MASK
              (match_operand: 0 "register_operand") 0)
-           (lshiftrt:SWI48x (match_operand:SWI48x 1 "register_operand")
-                            (match_dup 2)))
+           (lshiftrt:SWI48x_MASK
+             (match_operand:SWI48x_MASK 1 "register_operand")
+             (match_dup 2)))
      (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512BW"
   "operands[2] = GEN_INT (GET_MODE_BITSIZE (mode));")
@@ -21534,7 +21547,7 @@
                                     (const_int 1) (const_int 1)
                                     (const_int 1) (const_int 1)]))
             (const_int 1))))]
-  "TARGET_AVX512BW"
+  "TARGET_AVX512BW && TARGET_EVEX512"
   "vpmulhrsw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseimul")
    (set_attr "prefix" "evex")
@@ -22042,8 +22055,8 @@
 ;; Mode iterator to handle singularity w/ absence of V2DI and V4DI
 ;; modes for abs instruction on pre AVX-512 targets.
(define_mode_iterator VI1248_AVX512VL_AVX512BW - [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI - (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI + [(V64QI "TARGET_AVX512BW && TARGET_EVEX512") (V32QI "TARGET_AVX2") V16QI + (V32HI "TARGET_AVX512BW && TARGET_EVEX512") (V16HI "TARGET_AVX2") V8HI (V16SI "TARGET_AVX512F && TARGET_EVEX512") (V8SI "TARGET_AVX2") V4SI (V8DI "TARGET_AVX512F && TARGET_EVEX512") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")]) @@ -22702,7 +22715,7 @@ [(set (match_operand:V32HI 0 "register_operand" "=v") (any_extend:V32HI (match_operand:V32QI 1 "nonimmediate_operand" "vm")))] - "TARGET_AVX512BW" + "TARGET_AVX512BW && TARGET_EVEX512" "vpmovbw\t{%1, %0|%0, %1}" [(set_attr "type" "ssemov") (set_attr "prefix" "evex") @@ -22716,7 +22729,7 @@ (match_operand:V64QI 2 "const0_operand")) (match_parallel 3 "pmovzx_parallel" [(match_operand 4 "const_int_operand")])))] - "TARGET_AVX512BW" + "TARGET_AVX512BW && TARGET_EVEX512" "#" "&& reload_completed" [(set (match_dup 0) (zero_extend:V32HI (match_dup 1)))] @@ -22736,7 +22749,7 @@ (match_operand:V64QI 3 "const0_operand")) (match_parallel 4 "pmovzx_parallel" [(match_operand 5 "const_int_operand")])))] - "TARGET_AVX512BW" + "TARGET_AVX512BW && TARGET_EVEX512" "#" "&& reload_completed" [(set (match_dup 0) (zero_extend:V32HI (match_dup 1)))] @@ -22749,7 +22762,7 @@ [(set (match_operand:V32HI 0 "register_operand") (any_extend:V32HI (match_operand:V32QI 1 "nonimmediate_operand")))] - "TARGET_AVX512BW") + "TARGET_AVX512BW && TARGET_EVEX512") (define_insn "sse4_1_v8qiv8hi2" [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,Yw") @@ -26107,12 +26120,12 @@ (set_attr "mode" "OI")]) (define_mode_attr pbroadcast_evex_isa - [(V64QI "avx512bw") (V32QI "avx512bw") (V16QI "avx512bw") - (V32HI "avx512bw") (V16HI "avx512bw") (V8HI "avx512bw") + [(V64QI "avx512bw_512") (V32QI "avx512bw") (V16QI "avx512bw") + (V32HI "avx512bw_512") (V16HI "avx512bw") (V8HI "avx512bw") (V16SI "avx512f_512") (V8SI "avx512f") 
(V4SI "avx512f") (V8DI "avx512f_512") (V4DI "avx512f") (V2DI "avx512f") - (V32HF "avx512bw") (V16HF "avx512bw") (V8HF "avx512bw") - (V32BF "avx512bw") (V16BF "avx512bw") (V8BF "avx512bw")]) + (V32HF "avx512bw_512") (V16HF "avx512bw") (V8HF "avx512bw") + (V32BF "avx512bw_512") (V16BF "avx512bw") (V8BF "avx512bw")]) (define_insn "avx2_pbroadcast" [(set (match_operand:VIHFBF 0 "register_operand" "=x,v") @@ -26967,7 +26980,8 @@ (V4DI "TARGET_AVX512VL") (V4DF "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL") - (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX512BW && TARGET_AVX512VL") + (V32HI "TARGET_AVX512BW && TARGET_EVEX512") + (V16HI "TARGET_AVX512BW && TARGET_AVX512VL") (V8HI "TARGET_AVX512BW && TARGET_AVX512VL") (V64QI "TARGET_AVX512VBMI") (V32QI "TARGET_AVX512VBMI && TARGET_AVX512VL") (V16QI "TARGET_AVX512VBMI && TARGET_AVX512VL")]) @@ -26976,7 +26990,8 @@ [(V16SI "TARGET_EVEX512") (V8DI "TARGET_EVEX512") (V8SI "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL") - (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX512BW && TARGET_AVX512VL") + (V32HI "TARGET_AVX512BW && TARGET_EVEX512") + (V16HI "TARGET_AVX512BW && TARGET_AVX512VL") (V8HI "TARGET_AVX512BW && TARGET_AVX512VL") (V64QI "TARGET_AVX512VBMI") (V32QI "TARGET_AVX512VBMI && TARGET_AVX512VL") (V16QI "TARGET_AVX512VBMI && TARGET_AVX512VL")])