From patchwork Tue Aug 8 07:13:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 132510 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c44e:0:b0:3f2:4152:657d with SMTP id w14csp1934802vqr; Tue, 8 Aug 2023 00:16:22 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEWDRCKnrQvMpmGaoRVTpldIskRb1dzbN1mheivVeZECXwTGdK55lDBE4b+xEpxfZ2Gawg4 X-Received: by 2002:a17:907:2c4d:b0:99b:bf8d:b7e1 with SMTP id hf13-20020a1709072c4d00b0099bbf8db7e1mr9194219ejc.17.1691478982477; Tue, 08 Aug 2023 00:16:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691478982; cv=none; d=google.com; s=arc-20160816; b=d6Sl29KpMs+VfIUirkUoj6p+VWNNr7Ay+ngr+b1szeudnBiBbJPsKf6PTB7YHSFnzF FfIagFMYl2EMZyeXZMV6oE2fwqsF5bID8TKfKG0BCKTvW19PbzjmbNaD2w6fEFPC7+yb pV0zwYApgarqYclgkVfxHizuzlAgj8HAHI2KXVO6RIiPsCjutc++lp/PAGBBkShxrmi9 mIoGVO/liXf65AP0pCMA+dJ1XnOyX1IRRLJd4+DQ1PfpX79F5ltq6UmOAOzH3H3zACEX ZFqvfjA8CuzJ1AfFXS11xls2NEbQl2BWdcRz4MifsvvDMe0BPiqD3b/Bl+H2z3+F9WWX QNrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=5jYOnjfAmOn2jzTMbC/GIq8KS989ug9zIme2TOyt2ew=; fh=pQVlhs2+rlZ1Q5n//Pi5Wub1q9awKW2/7mZGTpWj3XA=; b=bxpRCoGQf6AAr0OMFhjvJ75b0yDgwhqNHQ7kouTy/Kizx0QdL25HmvSWgBISuWeOX0 bO7KOrGZHv2rliLI0+yCnrVDEgBzJC5lEAt6G7flPhPV28tUpCItb5gaH1GZPvtUiSKB 068DQDoblv3pMnXUHCAx69m0eZShIQY1ZoOAtkhH61rJojNakqCredpukqv1wI0tWBL6 6Y3mZ7qHuGc4iJc7ovDiuJMP6efIU10LTWrHy/pA5DSyYKT1gJR8lrnkNszc6022MYgS 1hTTO7pBaMywwlxBx0ja0gPFal2z+7CKsOQP58HpXlre+XcRYVfRVF2zobttIcg/81wz ZYpw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=XhB3ogZ2; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id ka11-20020a170907990b00b009939cd92a18si6729407ejc.73.2023.08.08.00.16.22 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Aug 2023 00:16:22 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=XhB3ogZ2; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 933BE3857701 for ; Tue, 8 Aug 2023 07:15:58 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 933BE3857701 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691478958; bh=5jYOnjfAmOn2jzTMbC/GIq8KS989ug9zIme2TOyt2ew=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=XhB3ogZ2TXVfsDV0exjgmWVd8UwEQLSzvoITQCJC9VnS5T5EoLknzfdnEGy/bToK3 oGGOesxca/lpYbCsB9a9BqbrcHfFw64Vamu92pC34r28e+7iAweHn+ZGamF9qbFTs3 eCpESTe9jjzjYOjio0mMY/n/sdWuPKAEPNnnNqdA= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id C3626385840B for ; Tue, 8 Aug 2023 07:13:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C3626385840B X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="434592322" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="434592322" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Aug 2023 00:13:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="845345923" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="845345923" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga002.fm.intel.com with ESMTP; 08 Aug 2023 00:13:15 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 768371005613; Tue, 8 Aug 2023 15:13:14 +0800 (CST) To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 1/3] Initial support for AVX10.1 Date: Tue, 8 Aug 2023 15:13:10 +0800 Message-Id: <20230808071312.1569559-2-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808071312.1569559-1-haochen.jiang@intel.com> References: <20230808071312.1569559-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773644265385457592 X-GMAIL-MSGID: 1773644265385457592 gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Add avx10_set and version and detect avx10.1. (cpu_indicator_init): Handle avx10.1-512. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AVX10_512BIT_SET): New. (OPTION_MASK_ISA2_AVX10_1_SET): Ditto. (OPTION_MASK_ISA2_AVX10_512BIT_UNSET): Ditto. (OPTION_MASK_ISA2_AVX10_1_UNSET): Ditto. (OPTION_MASK_ISA2_AVX2_UNSET): Modify for AVX10_1. (ix86_handle_option): Handle -mavx10.1, -mavx10.1-256 and -mavx10.1-512. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_AVX10_512BIT, FEATURE_AVX10_1 and FEATURE_AVX10_512BIT. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for AVX10_512BIT, AVX10_1 and AVX10_1_512. * config/i386/constraints.md (Yk): Add AVX10_1. (Yv): Ditto. (k): Ditto. * config/i386/cpuid.h (bit_AVX10): New. (bit_AVX10_256): Ditto. (bit_AVX10_512): Ditto. * config/i386/i386-c.cc (ix86_target_macros_internal): Define AVX10_512BIT and AVX10_1. * config/i386/i386-isa.def (AVX10_512BIT): Add DEF_PTA(AVX10_512BIT). (AVX10_1): Add DEF_PTA(AVX10_1). * config/i386/i386-options.cc (isa2_opts): Add -mavx10.1. (ix86_valid_target_attribute_inner_p): Handle avx10-512bit, avx10.1 and avx10.1-512. (ix86_option_override_internal): Enable AVX512{F,VL,BW,DQ,CD,BF16, FP16,VBMI,VBMI2,VNNI,IFMA,BITALG,VPOPCNTDQ} features for avx10.1-512. (ix86_valid_target_attribute_inner_p): Handle AVX10_1. * config/i386/i386.cc (ix86_get_ssemov): Add AVX10_1. (ix86_conditional_register_usage): Ditto. (ix86_hard_regno_mode_ok): Ditto. (ix86_rtx_costs): Ditto. * config/i386/i386.h (VALID_MASK_AVX10_MODE): New macro. * config/i386/i386.opt: Add option -mavx10.1, -mavx10.1-256 and -mavx10.1-512. * doc/extend.texi: Document avx10.1, avx10.1-256 and avx10.1-512. * doc/invoke.texi: Document -mavx10.1, -mavx10.1-256 and -mavx10.1-512. * doc/sourcebuild.texi: Document target avx10.1, avx10.1-256 and avx10.1-512. gcc/testsuite/ChangeLog: * g++.target/i386/mv33.C: New test. * gcc.target/i386/avx10_1-1.c: Ditto. * gcc.target/i386/avx10_1-2.c: Ditto. * gcc.target/i386/avx10_1-3.c: Ditto. * gcc.target/i386/avx10_1-4.c: Ditto. * gcc.target/i386/avx10_1-5.c: Ditto. * gcc.target/i386/avx10_1-6.c: Ditto. * gcc.target/i386/avx10_1-7.c: Ditto. * gcc.target/i386/avx10_1-8.c: Ditto. * gcc.target/i386/avx10_1-9.c: Ditto. * gcc.target/i386/avx10_1-10.c: Ditto. --- gcc/common/config/i386/cpuinfo.h | 36 +++++++++++++++ gcc/common/config/i386/i386-common.cc | 53 +++++++++++++++++++++- gcc/common/config/i386/i386-cpuinfo.h | 3 ++ gcc/common/config/i386/i386-isas.h | 5 ++ gcc/config/i386/constraints.md | 6 +-- gcc/config/i386/cpuid.h | 6 +++ gcc/config/i386/i386-c.cc | 4 ++ gcc/config/i386/i386-isa.def | 2 + gcc/config/i386/i386-options.cc | 26 ++++++++++- gcc/config/i386/i386.cc | 18 ++++++-- gcc/config/i386/i386.h | 3 ++ gcc/config/i386/i386.opt | 19 ++++++++ gcc/doc/extend.texi | 13 ++++++ gcc/doc/invoke.texi | 16 +++++-- gcc/doc/sourcebuild.texi | 9 ++++ gcc/testsuite/g++.target/i386/mv33.C | 30 ++++++++++++ gcc/testsuite/gcc.target/i386/avx10_1-1.c | 22 +++++++++ gcc/testsuite/gcc.target/i386/avx10_1-10.c | 13 ++++++ gcc/testsuite/gcc.target/i386/avx10_1-2.c | 13 ++++++ gcc/testsuite/gcc.target/i386/avx10_1-3.c | 13 ++++++ gcc/testsuite/gcc.target/i386/avx10_1-4.c | 13 ++++++ gcc/testsuite/gcc.target/i386/avx10_1-5.c | 13 ++++++ gcc/testsuite/gcc.target/i386/avx10_1-6.c | 13 ++++++ gcc/testsuite/gcc.target/i386/avx10_1-7.c | 13 ++++++ gcc/testsuite/gcc.target/i386/avx10_1-8.c | 4 ++ gcc/testsuite/gcc.target/i386/avx10_1-9.c | 13 ++++++ 26 files changed, 366 insertions(+), 13 deletions(-) create mode 100644 gcc/testsuite/g++.target/i386/mv33.C create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-10.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-2.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-3.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-4.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-5.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-6.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-7.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-8.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-9.c diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h index 30ef0d334ca..5abff83b4ca 100644 --- a/gcc/common/config/i386/cpuinfo.h +++ b/gcc/common/config/i386/cpuinfo.h @@ -688,6 +688,9 @@ get_available_features (struct __processor_model *cpu_model, int amx_usable = 0; /* Check if KL is usable. */ int has_kl = 0; + /* Record AVX10 version. */ + int avx10_set = 0; + int version = 0; if ((ecx & bit_OSXSAVE)) { /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and @@ -906,6 +909,9 @@ get_available_features (struct __processor_model *cpu_model, { if (eax & bit_AVX512BF16) set_feature (FEATURE_AVX512BF16); + /* AVX10 has the same XSTATE with AVX512. */ + if (edx & bit_AVX10) + avx10_set = 1; } if (amx_usable) { @@ -951,6 +957,24 @@ get_available_features (struct __processor_model *cpu_model, } } + /* Get Advanced Features at level 0x24 (eax = 0x24). */ + if (avx10_set && max_cpuid_level >= 0x24) + { + __cpuid (0x18, eax, ebx, ecx, edx); + version = ebx & 0xff; + if (ebx & bit_AVX10_256) + switch (version) + { + case 1: + set_feature (FEATURE_AVX10_1); + break; + default: + gcc_unreachable (); + } + if (ebx & bit_AVX10_512) + set_feature (FEATURE_AVX10_512BIT); + } + /* Check cpuid level of extended features. */ __cpuid (0x80000000, ext_level, ebx, ecx, edx); @@ -1155,6 +1179,18 @@ cpu_indicator_init (struct __processor_model *cpu_model, } } +#define SET_AVX10_512(A,B) \ + if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX10_##A)) \ + { \ + CHECK___builtin_cpu_supports (B); \ + set_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX10_##A##_512); \ + } + + if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX10_512BIT)) + SET_AVX10_512 (1, "avx10.1-512"); + +#undef SET_AVX10_512 + gcc_assert (cpu_model->__cpu_vendor < VENDOR_MAX); gcc_assert (cpu_model->__cpu_type < CPU_TYPE_MAX); gcc_assert (cpu_model->__cpu_subtype < CPU_SUBTYPE_MAX); diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc index 26005914079..6c3bebb1846 100644 --- a/gcc/common/config/i386/i386-common.cc +++ b/gcc/common/config/i386/i386-common.cc @@ -123,6 +123,8 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_SM3_SET OPTION_MASK_ISA2_SM3 #define OPTION_MASK_ISA2_SHA512_SET OPTION_MASK_ISA2_SHA512 #define OPTION_MASK_ISA2_SM4_SET OPTION_MASK_ISA2_SM4 +#define OPTION_MASK_ISA2_AVX10_512BIT_SET OPTION_MASK_ISA2_AVX10_512BIT +#define OPTION_MASK_ISA2_AVX10_1_SET OPTION_MASK_ISA2_AVX10_1 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same as -msse4.2. */ @@ -232,7 +234,8 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_AVX2_UNSET \ (OPTION_MASK_ISA2_AVXIFMA_UNSET | OPTION_MASK_ISA2_AVXVNNI_UNSET \ | OPTION_MASK_ISA2_AVXVNNIINT8_UNSET | OPTION_MASK_ISA2_AVXNECONVERT_UNSET \ - | OPTION_MASK_ISA2_AVXVNNIINT16_UNSET | OPTION_MASK_ISA2_AVX512F_UNSET) + | OPTION_MASK_ISA2_AVXVNNIINT16_UNSET | OPTION_MASK_ISA2_AVX512F_UNSET \ + | OPTION_MASK_ISA2_AVX10_1_UNSET) #define OPTION_MASK_ISA_AVX512F_UNSET \ (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX512CD_UNSET \ | OPTION_MASK_ISA_AVX512PF_UNSET | OPTION_MASK_ISA_AVX512ER_UNSET \ @@ -309,6 +312,8 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA2_SM3_UNSET OPTION_MASK_ISA2_SM3 #define OPTION_MASK_ISA2_SHA512_UNSET OPTION_MASK_ISA2_SHA512 #define OPTION_MASK_ISA2_SM4_UNSET OPTION_MASK_ISA2_SM4 +#define OPTION_MASK_ISA2_AVX10_512BIT_UNSET OPTION_MASK_ISA2_AVX10_512BIT +#define OPTION_MASK_ISA2_AVX10_1_UNSET OPTION_MASK_ISA2_AVX10_1 /* SSE4 includes both SSE4.1 and SSE4.2. -mno-sse4 should the same as -mno-sse4.1. */ @@ -1341,6 +1346,52 @@ ix86_handle_option (struct gcc_options *opts, } return true; + case OPT_mavx10_max_512bit: + if (value) + { + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_512BIT_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_512BIT_SET; + } + else + { + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_512BIT_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_512BIT_UNSET; + } + return true; + + case OPT_mavx10_1: + if (value) + { + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_1_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_SET; + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX2_SET; + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX2_SET; + } + else + { + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_UNSET; + } + return true; + + case OPT_mavx10_1_256: + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_1_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_SET; + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_512BIT_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_512BIT_SET; + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX2_SET; + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX2_SET; + return true; + + case OPT_mavx10_1_512: + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_1_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_SET; + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_512BIT_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_512BIT_SET; + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX2_SET; + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX2_SET; + return true; + case OPT_mfma: if (value) { diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h index 9153b4d0a54..8fbfb38baed 100644 --- a/gcc/common/config/i386/i386-cpuinfo.h +++ b/gcc/common/config/i386/i386-cpuinfo.h @@ -261,6 +261,9 @@ enum processor_features FEATURE_SM3, FEATURE_SHA512, FEATURE_SM4, + FEATURE_AVX10_512BIT, + FEATURE_AVX10_1, + FEATURE_AVX10_1_512, CPU_FEATURE_MAX }; diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h index 2297903a45e..35be0cc3f2a 100644 --- a/gcc/common/config/i386/i386-isas.h +++ b/gcc/common/config/i386/i386-isas.h @@ -191,4 +191,9 @@ ISA_NAMES_TABLE_START ISA_NAMES_TABLE_ENTRY("sm3", FEATURE_SM3, P_NONE, "-msm3") ISA_NAMES_TABLE_ENTRY("sha512", FEATURE_SHA512, P_NONE, "-msha512") ISA_NAMES_TABLE_ENTRY("sm4", FEATURE_SM4, P_NONE, "-msm4") + ISA_NAMES_TABLE_ENTRY("avx10-max-512bit", FEATURE_AVX10_512BIT, + P_NONE, "-mavx10-max-512bit") + ISA_NAMES_TABLE_ENTRY("avx10.1", FEATURE_AVX10_1, P_NONE, "-mavx10.1") + ISA_NAMES_TABLE_ENTRY("avx10.1-256", FEATURE_AVX10_1, P_NONE, NULL) + ISA_NAMES_TABLE_ENTRY("avx10.1-512", FEATURE_AVX10_1_512, P_NONE, NULL) ISA_NAMES_TABLE_END diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index fd490f39110..4be6bc4816a 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -78,10 +78,10 @@ "TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387 ? FP_SECOND_REG : NO_REGS" "Second from top of 80387 floating-point stack (@code{%st(1)}).") -(define_register_constraint "Yk" "TARGET_AVX512F ? MASK_REGS : NO_REGS" +(define_register_constraint "Yk" "(TARGET_AVX512F || TARGET_AVX10_1) ? MASK_REGS : NO_REGS" "@internal Any mask register that can be used as predicate, i.e. k1-k7.") -(define_register_constraint "k" "TARGET_AVX512F ? ALL_MASK_REGS : NO_REGS" +(define_register_constraint "k" "(TARGET_AVX512F || TARGET_AVX10_1) ? ALL_MASK_REGS : NO_REGS" "@internal Any mask register.") ;; Vector registers (also used for plain floating point nowadays). @@ -146,7 +146,7 @@ "@internal Lower SSE register when avoiding REX prefix and all SSE registers otherwise.") (define_register_constraint "Yv" - "TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS" + "(TARGET_AVX512VL || TARGET_AVX10_1) ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS" "@internal For AVX512VL, any EVEX encodable SSE register (@code{%xmm0-%xmm31}), otherwise any SSE register.") (define_register_constraint "Yw" diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h index 73c15480350..ca5551cefca 100644 --- a/gcc/config/i386/cpuid.h +++ b/gcc/config/i386/cpuid.h @@ -149,6 +149,7 @@ #define bit_AVXNECONVERT (1 << 5) #define bit_AVXVNNIINT16 (1 << 10) #define bit_PREFETCHI (1 << 14) +#define bit_AVX10 (1 << 19) /* Extended State Enumeration Sub-leaf (%eax == 0xd, %ecx == 1) */ #define bit_XSAVEOPT (1 << 0) @@ -159,6 +160,11 @@ /* %ebx */ #define bit_PTWRITE (1 << 4) +/* AVX10 sub leaf (%eax == 0x18) */ +/* %ebx */ +#define bit_AVX10_256 (1 << 17) +#define bit_AVX10_512 (1 << 18) + /* Keylocker leaf (%eax == 0x19) */ /* %ebx */ #define bit_AESKLE ( 1<<0 ) diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc index 257950582c2..caef5531593 100644 --- a/gcc/config/i386/i386-c.cc +++ b/gcc/config/i386/i386-c.cc @@ -692,6 +692,10 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag, def_or_undef (parse_in, "__SHA512__"); if (isa_flag2 & OPTION_MASK_ISA2_SM4) def_or_undef (parse_in, "__SM4__"); + if (isa_flag2 & OPTION_MASK_ISA2_AVX10_512BIT) + def_or_undef (parse_in, "__AVX10_512BIT__"); + if (isa_flag2 & OPTION_MASK_ISA2_AVX10_1) + def_or_undef (parse_in, "__AVX10_1__"); if (TARGET_IAMCU) { def_or_undef (parse_in, "__iamcu"); diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def index aeafcf870ac..f7d741746c3 100644 --- a/gcc/config/i386/i386-isa.def +++ b/gcc/config/i386/i386-isa.def @@ -121,3 +121,5 @@ DEF_PTA(AVXVNNIINT16) DEF_PTA(SM3) DEF_PTA(SHA512) DEF_PTA(SM4) +DEF_PTA(AVX10_512BIT) +DEF_PTA(AVX10_1) diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index 127ee24203c..b2281fbd4b5 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -243,7 +243,9 @@ static struct ix86_target_opts isa2_opts[] = { "-mavxvnniint16", OPTION_MASK_ISA2_AVXVNNIINT16 }, { "-msm3", OPTION_MASK_ISA2_SM3 }, { "-msha512", OPTION_MASK_ISA2_SHA512 }, - { "-msm4", OPTION_MASK_ISA2_SM4 } + { "-msm4", OPTION_MASK_ISA2_SM4 }, + { "-mavx10-max-512bit", OPTION_MASK_ISA2_AVX10_512BIT }, + { "-mavx10.1", OPTION_MASK_ISA2_AVX10_1 } }; static struct ix86_target_opts isa_opts[] = { @@ -983,7 +985,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], ix86_opt_ix86_no, ix86_opt_str, ix86_opt_enum, - ix86_opt_isa + ix86_opt_isa, }; static const struct @@ -1100,6 +1102,10 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], IX86_ATTR_ISA ("sm3", OPT_msm3), IX86_ATTR_ISA ("sha512", OPT_msha512), IX86_ATTR_ISA ("sm4", OPT_msm4), + IX86_ATTR_ISA ("avx10-max-512bit", OPT_mavx10_max_512bit), + IX86_ATTR_ISA ("avx10.1", OPT_mavx10_1), + IX86_ATTR_ISA ("avx10.1-256", OPT_mavx10_1_256), + IX86_ATTR_ISA ("avx10.1-512", OPT_mavx10_1_512), /* enum options */ IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_), @@ -2524,6 +2530,22 @@ ix86_option_override_internal (bool main_args_p, &= ~((OPTION_MASK_ISA_BMI | OPTION_MASK_ISA_BMI2 | OPTION_MASK_ISA_TBM) & ~opts->x_ix86_isa_flags_explicit); + /* Enable AVX512{F,VL,BW,DQ,CD,BF16,FP16,VBMI,VBMI2,VNNI,IFMA,BITALG, + VPOPCNTDQ} features for AVX10.1/512. */ + if (TARGET_AVX10_1_P (opts->x_ix86_isa_flags2) + && TARGET_AVX10_512BIT_P (opts->x_ix86_isa_flags2)) + { + opts->x_ix86_isa_flags + |= OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX512CD + | OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512BW + | OPTION_MASK_ISA_AVX512VL | OPTION_MASK_ISA_AVX512IFMA + | OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512VBMI2 + | OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VPOPCNTDQ + | OPTION_MASK_ISA_AVX512BITALG; + opts->x_ix86_isa_flags2 + |= OPTION_MASK_ISA2_AVX512FP16 | OPTION_MASK_ISA2_AVX512BF16; + } + /* Validate -mpreferred-stack-boundary= value or default it to PREFERRED_STACK_BOUNDARY_DEFAULT. */ ix86_preferred_stack_boundary = PREFERRED_STACK_BOUNDARY_DEFAULT; diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 5d57726e22c..e75614b993d 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -513,8 +513,8 @@ ix86_conditional_register_usage (void) if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387)) accessible_reg_set &= ~reg_class_contents[FLOAT_REGS]; - /* If AVX512F is disabled, disable the registers. */ - if (! TARGET_AVX512F) + /* If AVX512F and AVX10 is disabled, disable the registers. */ + if (!TARGET_AVX512F && !TARGET_AVX10_1) { for (i = FIRST_EXT_REX_SSE_REG; i <= LAST_EXT_REX_SSE_REG; i++) CLEAR_HARD_REG_BIT (accessible_reg_set, i); @@ -5490,6 +5490,7 @@ ix86_get_ssemov (rtx *operands, unsigned size, we can only use zmm register move without memory operand. */ if (evex_reg_p && !TARGET_AVX512VL + && !TARGET_AVX10_1 && GET_MODE_SIZE (mode) < 64) { /* NB: Even though ix86_hard_regno_mode_ok doesn't allow @@ -20259,7 +20260,8 @@ ix86_hard_regno_mode_ok (unsigned int regno, machine_mode mode) return ((TARGET_AVX512F && VALID_MASK_REG_MODE (mode)) || (TARGET_AVX512BW - && VALID_MASK_AVX512BW_MODE (mode))); + && VALID_MASK_AVX512BW_MODE (mode)) + || (TARGET_AVX10_1 && VALID_MASK_AVX10_MODE (mode))); } if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT) @@ -20294,6 +20296,13 @@ ix86_hard_regno_mode_ok (unsigned int regno, machine_mode mode) || VALID_AVX512VL_128_REG_MODE (mode))) return true; + /* AVX10_1 allows sse regs16+ for 256 bit modes. */ + if (TARGET_AVX10_1 + && (VALID_AVX256_REG_OR_OI_MODE (mode) + || VALID_AVX512VL_128_REG_MODE (mode) + || VALID_AVX512F_SCALAR_MODE (mode))) + return true; + /* xmm16-xmm31 are only available for AVX-512. */ if (EXT_REX_SSE_REGNO_P (regno)) return false; @@ -21584,7 +21593,8 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno, mask = XEXP (x, 2); /* This is masked instruction, assume the same cost, as nonmasked variant. */ - if (TARGET_AVX512F && register_operand (mask, GET_MODE (mask))) + if ((TARGET_AVX512F || TARGET_AVX10_1) + && register_operand (mask, GET_MODE (mask))) *total = rtx_cost (XEXP (x, 0), mode, outer_code, opno, speed); else *total = cost->sse_op; diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index ef342fcee9b..77b50913458 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1080,6 +1080,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); #define VALID_MASK_AVX512BW_MODE(MODE) ((MODE) == SImode || (MODE) == DImode) +#define VALID_MASK_AVX10_MODE(MODE) ((MODE) == SImode || (MODE) == HImode \ + || (MODE) == QImode) + #define VALID_FP_MODE_P(MODE) \ ((MODE) == SFmode || (MODE) == DFmode || (MODE) == XFmode \ || (MODE) == SCmode || (MODE) == DCmode || (MODE) == XCmode) diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index 1cc8563477a..0ce8e6204ff 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1298,3 +1298,22 @@ msm4 Target Mask(ISA2_SM4) Var(ix86_isa_flags2) Save Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and SM4 built-in functions and code generation. + +mavx10-max-512bit +Target Mask(ISA2_AVX10_512BIT) Var(ix86_isa_flags2) Save +Indicates 512 bit vector width support for AVX10. + +mavx10.1 +Target Mask(ISA2_AVX10_1) Var(ix86_isa_flags2) Save +Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, +and AVX10.1 built-in functions and code generation. + +mavx10.1-256 +Target RejectNegative +Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, +and AVX10.1 built-in functions and code generation. + +mavx10.1-512 +Target RejectNegative +Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, +and AVX10.1-512 built-in functions and code generation. diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 89c5b4ea2b2..08e8b3b761c 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -7184,6 +7184,19 @@ Enable/disable the generation of the SHA512 instructions. @itemx no-sm4 Enable/disable the generation of the SM4 instructions. +@cindex @code{target("avx10.1")} function attribute, x86 +@item avx10.1 +@itemx no-avx10.1 +Enable/disable the generation of the AVX10.1 instructions. + +@cindex @code{target("avx10.1-256")} function attribute, x86 +@item avx10.1-256 +Enable the generation of the AVX10.1 instructions. + +@cindex @code{target("avx10.1-512")} function attribute, x86 +@item avx10.1-512 +Enable the generation of the AVX10.1 512 bit instructions. + @cindex @code{target("cld")} function attribute, x86 @item cld @itemx no-cld diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 674f956f4b8..43b6210c3c8 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1436,6 +1436,7 @@ See RS/6000 and PowerPC Options. -mamx-tile -mamx-int8 -mamx-bf16 -muintr -mhreset -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 -mprefetchi -mraoint -mamx-complex -mavxvnniint16 -msm3 -msha512 -msm4 +-mavx10.1 -mavx10.1-256 -mavx10.1-512 -mcldemote -mms-bitfields -mno-align-stringops -minline-all-stringops -minline-stringops-dynamically -mstringop-strategy=@var{alg} -mkl -mwidekl @@ -33670,6 +33671,15 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}. @need 200 @opindex msm4 @itemx -msm4 +@need 200 +@opindex mavx10.1 +@itemx -mavx10.1 +@need 200 +@opindex mavx10.1-256 +@itemx -mavx10.1-256 +@need 200 +@opindex mavx10.1-512 +@itemx -mavx10.1-512 These switches enable the use of instructions in the MMX, SSE, AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA, AES, PCLMUL, CLFLUSHOPT, CLWB, FSGSBASE, PTWRITE, RDRND, F16C, FMA, PCONFIG, @@ -33680,9 +33690,9 @@ GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16, ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE, UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512-FP16, AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16, PREFETCHI, RAOINT, -AMX-COMPLEX, AVXVNNIINT16, SM3, SHA512, SM4 or CLDEMOTE extended instruction -sets. Each has a corresponding @option{-mno-} option to disable use of these -instructions. +AMX-COMPLEX, AVXVNNIINT16, SM3, SHA512, SM4, AVX10.1 or CLDEMOTE extended +instruction sets. Each has a corresponding @option{-mno-} option to disable +use of these instructions. These extensions are also available as built-in functions: see @ref{x86 Built-in Functions}, for details of the functions enabled and diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 1a78b3c1abb..cab8065cd8e 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -2484,6 +2484,15 @@ Target supports compiling @code{avx} instructions. @item avx_runtime Target supports the execution of @code{avx} instructions. +@item avx10.1 +Target supports the execution of @code{avx10.1} instructions. + +@item avx10.1-256 +Target supports the execution of @code{avx10.1} instructions. + +@item avx10.1-512 +Target supports the execution of @code{avx10.1-512} instructions. + @item avx2 Target supports compiling @code{avx2} instructions. diff --git a/gcc/testsuite/g++.target/i386/mv33.C b/gcc/testsuite/g++.target/i386/mv33.C new file mode 100644 index 00000000000..b50f13c5aa8 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/mv33.C @@ -0,0 +1,30 @@ +// Test that dispatching can choose the right multiversion +// for avx10.x-512 microarchitecture levels. + +// { dg-do run } +// { dg-require-ifunc "" } +// { dg-options "-O2" } + +#include + +int __attribute__ ((target("default"))) +foo () +{ + return 0; +} + +int __attribute__ ((target("avx10.1-512"))) foo () { + return 1; +} + +int main () +{ + int val = foo (); + + if (__builtin_cpu_supports ("avx10.1-512")) + assert (val == 1); + else + assert (val == 0); + + return 0; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-1.c new file mode 100644 index 00000000000..cfd9662bb13 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-1.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1" } */ + +#include + +void +f1 () +{ + register __m256d a __asm ("ymm17"); + register __m256d b __asm ("ymm16"); + a = _mm256_add_pd (a, b); + asm volatile ("" : "+v" (a)); +} + +void +f2 () +{ + register __m128d a __asm ("xmm17"); + register __m128d b __asm ("xmm16"); + a = _mm_add_pd (a, b); + asm volatile ("" : "+v" (a)); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-10.c b/gcc/testsuite/gcc.target/i386/avx10_1-10.c new file mode 100644 index 00000000000..9a5892d8df9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-10.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64" } */ +/* { dg-final { scan-assembler "%zmm" } } */ + +typedef double __m512d __attribute__ ((__vector_size__ (64), __may_alias__)); + +__attribute__ ((target ("avx10.1-512"))) __m512d +foo () +{ + __m512d a, b; + a = a + b; + return a; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-2.c b/gcc/testsuite/gcc.target/i386/avx10_1-2.c new file mode 100644 index 00000000000..0b3991dcf74 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-2.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64 -mavx10.1-512" } */ +/* { dg-final { scan-assembler "%zmm" } } */ + +typedef double __m512d __attribute__ ((__vector_size__ (64), __may_alias__)); + +__m512d +foo () +{ + __m512d a, b; + a = a + b; + return a; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-3.c b/gcc/testsuite/gcc.target/i386/avx10_1-3.c new file mode 100644 index 00000000000..3be988a1a62 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-3.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1" } */ + +#include + +int +foo (int c) +{ + register int a __asm ("k7") = c; + int b = foo (a); + asm volatile ("" : "+k" (b)); + return b; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-4.c b/gcc/testsuite/gcc.target/i386/avx10_1-4.c new file mode 100644 index 00000000000..68cbf197d61 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-4.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1-512" } */ + +#include + +long long +foo (long long c) +{ + register long long a __asm ("k7") = c; + long long b = foo (a); + asm volatile ("" : "+k" (b)); + return b; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-5.c b/gcc/testsuite/gcc.target/i386/avx10_1-5.c new file mode 100644 index 00000000000..5481ab2f386 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-5.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O0 -march=x86-64 -mavx10.1 -Wno-psabi" } */ +/* { dg-final { scan-assembler-not ".%zmm" } } */ + +typedef double __m512d __attribute__ ((__vector_size__ (64), __may_alias__)); + +__m512d +foo () +{ + __m512d a, b; + a = a + b; + return a; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-6.c b/gcc/testsuite/gcc.target/i386/avx10_1-6.c new file mode 100644 index 00000000000..827c80ce51e --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-6.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1" } */ + +#include + +long long +foo (long long c) +{ + register long long a __asm ("k7") = c; + long long b = foo (a); + asm volatile ("" : "+k" (b)); /* { dg-error "inconsistent operand constraints in an 'asm'" } */ + return b; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-7.c b/gcc/testsuite/gcc.target/i386/avx10_1-7.c new file mode 100644 index 00000000000..d8b8d97590b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-7.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64 -Wno-psabi" } */ +/* { dg-final { scan-assembler-not ".%zmm" } } */ + +typedef double __m512d __attribute__ ((__vector_size__ (64), __may_alias__)); + +__attribute__ ((target ("avx10.1"))) __m512d +foo () +{ + __m512d a, b; + a = a + b; + return a; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-8.c b/gcc/testsuite/gcc.target/i386/avx10_1-8.c new file mode 100644 index 00000000000..8dbd201b336 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-8.c @@ -0,0 +1,4 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1-256" } */ + +#include "avx10_1-1.c" diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-9.c b/gcc/testsuite/gcc.target/i386/avx10_1-9.c new file mode 100644 index 00000000000..00493098be7 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-9.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64 -Wno-psabi" } */ +/* { dg-final { scan-assembler-not ".%zmm" } } */ + +typedef double __m512d __attribute__ ((__vector_size__ (64), __may_alias__)); + +__attribute__ ((target ("avx10.1-256"))) __m512d +foo () +{ + __m512d a, b; + a = a + b; + return a; +} From patchwork Tue Aug 8 07:13:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 132509 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c44e:0:b0:3f2:4152:657d with SMTP id w14csp1934568vqr; Tue, 8 Aug 2023 00:15:46 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE1faw8iLC5eYbJuLK6PTbw/TbkIi+ktwIw9vOxfvLBM2DDa9AsVY2EE7T/JH7etHnjqRHG X-Received: by 2002:a17:906:53cd:b0:99c:602b:6a67 with SMTP id p13-20020a17090653cd00b0099c602b6a67mr10393197ejo.59.1691478945986; Tue, 08 Aug 2023 00:15:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691478945; cv=none; d=google.com; s=arc-20160816; b=qqmbvPXWbqRwhgua1bi7bw3GMBilSVtc6hG2JfuPZeFm6GG66ZJN9Iq+o1F8tUwm1I StJ2RIr0cRVWqBcTh0tr65INu1GINjU4QzJyQev37F+kH035PEuNf54sHxQDBpElonPD Va0dhunybbZZGguPnDZKmZd3W6mJB9c1MrHwTAcg8bnzCfBWV6D252useur2UCKQqNgO PswR5oVKqCs2iZkwd36P5mq3DljhBYnrfwRCvmizO2ZZbPlKYmzTdrh0z6sE9gDbedC/ b6U1B5vDxqLx57MdDI6DBDZJ9egw5vl+uMextsifnoVpAiMGii+XB0QewPTW7iingWjt /I+Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=ncDd64/56y8pzHzFFk2+noLvAppfIWQOoqXu4RvsB24=; fh=pQVlhs2+rlZ1Q5n//Pi5Wub1q9awKW2/7mZGTpWj3XA=; b=WwzrOHIuv5mJeDd3siKoY6MGGtVPQkn34sMqTij1d+n/mFKWRKJIaiZ1cNjN2V/UnJ rbu+I0g7vr2F273/xzC427p71ZVchTXBVqj8TFcNtuoRNlokviML63HYN/Oq/8pXUxqL tWl/4MxXJrMQXrXSudv0hE/O0hf+D1OC07q+9jd5biIBJY27Im7v5AcMObvRF5mtZs6U KzUjRmZ/QrCtCwKkZEGW+3R0xrvFLGW74Kmy+KBDelgmTQCrhF2EyH6WDrsUm2X2XKWF HsPD3qsDTKhza+L2VaX6llYxcWp8oFo1bv6gyHpguv/Lj/UTCbVsD4Rsvz9WTSlYURj7 2MJQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=k8qBF2MP; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id jo5-20020a170906f6c500b00992fef5cff9si7029561ejb.497.2023.08.08.00.15.45 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Aug 2023 00:15:45 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=k8qBF2MP; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9BBE23855580 for ; Tue, 8 Aug 2023 07:15:01 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 9BBE23855580 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691478901; bh=ncDd64/56y8pzHzFFk2+noLvAppfIWQOoqXu4RvsB24=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=k8qBF2MPzsICo1T6t94O1GMBCigvMxBNhBl4Krnha6F/bLm8zzXm4Na7r5PyHIHVq NsJ1uRX9vYXgFb0VFy33fcwn+UlracPCl1Rqf3qfBpnj2jZik/FF0AP/GbxSQmThPD wbEmJ/eBgC1a6SD397GXzbeSJlyyHQmJA8y5J0V0= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id 2A7473858C54 for ; Tue, 8 Aug 2023 07:13:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2A7473858C54 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="434592298" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="434592298" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Aug 2023 00:13:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="845345867" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="845345867" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga002.fm.intel.com with ESMTP; 08 Aug 2023 00:13:15 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 7AD5810054DE; Tue, 8 Aug 2023 15:13:14 +0800 (CST) To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 2/3] Emit a warning when disabling AVX512 with AVX10 enabled or disabling AVX10 with AVX512 enabled Date: Tue, 8 Aug 2023 15:13:11 +0800 Message-Id: <20230808071312.1569559-3-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808071312.1569559-1-haochen.jiang@intel.com> References: <20230808071312.1569559-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773644227300207328 X-GMAIL-MSGID: 1773644227300207328 gcc/ChangeLog: * config/i386/driver-i386.cc (host_detect_local_cpu): Do not append -mno-avx10.1 for -march=native. * config/i386/i386-options.cc (ix86_check_avx10): New function to check isa_flags and isa_flags_explicit to emit warning when AVX10 is enabled by "-m" option. (ix86_check_avx512): New function to check isa_flags and isa_flags_explicit to emit warning when AVX512 is enabled by "-m" option. (ix86_handle_option): Do not change the flags when warning is emitted. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_1-11.c: New test. * gcc.target/i386/avx10_1-12.c: Ditto. * gcc.target/i386/avx10_1-13.c: Ditto. * gcc.target/i386/avx10_1-14.c: Ditto. --- gcc/common/config/i386/i386-common.cc | 68 +++++++++++++++++----- gcc/config/i386/driver-i386.cc | 2 +- gcc/testsuite/gcc.target/i386/avx10_1-11.c | 5 ++ gcc/testsuite/gcc.target/i386/avx10_1-12.c | 13 +++++ gcc/testsuite/gcc.target/i386/avx10_1-13.c | 5 ++ gcc/testsuite/gcc.target/i386/avx10_1-14.c | 13 +++++ 6 files changed, 91 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-11.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-12.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-13.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-14.c diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc index 6c3bebb1846..ec94251dd4c 100644 --- a/gcc/common/config/i386/i386-common.cc +++ b/gcc/common/config/i386/i386-common.cc @@ -388,6 +388,46 @@ set_malign_value (const char **flag, unsigned value) *flag = r; } +/* Emit a warning when using -mno-avx512{f,vl,bw,dq,cd,bf16,fp16,vbmi,vbmi2, + vnni,ifma,bitalg,vpopcntdq} with -mavx10.1 and above. */ +static bool +ix86_check_avx10 (struct gcc_options *opts) +{ + if (opts->x_ix86_isa_flags2 & opts->x_ix86_isa_flags2_explicit + & OPTION_MASK_ISA2_AVX10_1) + { + warning (0, "%<-mno-avx512{f,vl,bw,dq,cd,bf16,fp16,vbmi,vbmi2,vnni,ifma," + "bitalg,vpopcntdq}%> are ignored with %<-mavx10.1%> and above"); + return false; + } + + return true; +} + +/* Emit a warning when using -mno-avx10.1 with -mavx512{f,vl,bw,dq,cd,bf16, + fp16,vbmi,vbmi2,vnni,ifma,bitalg,vpopcntdq}. */ +static bool +ix86_check_avx512 (struct gcc_options *opts) +{ + if ((opts->x_ix86_isa_flags & opts->x_ix86_isa_flags_explicit + & (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX512CD + | OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512BW + | OPTION_MASK_ISA_AVX512VL | OPTION_MASK_ISA_AVX512IFMA + | OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512VBMI2 + | OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VPOPCNTDQ + | OPTION_MASK_ISA_AVX512BITALG)) + || (opts->x_ix86_isa_flags2 & opts->x_ix86_isa_flags2_explicit + & (OPTION_MASK_ISA2_AVX512FP16 | OPTION_MASK_ISA2_AVX512BF16))) + { + warning (0, "%<-mno-avx10.1%> is ignored when using with " + "%<-mavx512{f,vl,bw,dq,cd,bf16,fp16,vbmi,vbmi2,vnni," + "ifma,bitalg,vpopcntdq}%>"); + return false; + } + + return true; +} + /* Implement TARGET_HANDLE_OPTION. */ bool @@ -609,7 +649,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512F_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512F_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_UNSET; @@ -624,7 +664,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512CD_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512CD_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512CD_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512CD_UNSET; @@ -898,7 +938,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512VBMI2_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VBMI2_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512VBMI2_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VBMI2_UNSET; @@ -913,7 +953,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512FP16_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512FP16_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX512FP16_UNSET; opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX512FP16_UNSET; @@ -926,7 +966,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512VNNI_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VNNI_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512VNNI_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VNNI_UNSET; @@ -940,7 +980,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET; opts->x_ix86_isa_flags_explicit @@ -954,7 +994,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512BITALG_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512BITALG_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512BITALG_UNSET; opts->x_ix86_isa_flags_explicit @@ -970,7 +1010,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512BW_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512BW_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX512BF16_UNSET; opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX512BF16_UNSET; @@ -1037,7 +1077,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512DQ_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512DQ_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512DQ_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512DQ_UNSET; @@ -1050,7 +1090,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512BW_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512BW_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512BW_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512BW_UNSET; @@ -1065,7 +1105,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512VL_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VL_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512VL_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VL_UNSET; @@ -1078,7 +1118,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512IFMA_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512IFMA_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512IFMA_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512IFMA_UNSET; @@ -1091,7 +1131,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512VBMI_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VBMI_SET; } - else + else if (ix86_check_avx10 (opts)) { opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_AVX512VBMI_UNSET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512VBMI_UNSET; @@ -1367,7 +1407,7 @@ ix86_handle_option (struct gcc_options *opts, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX2_SET; opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX2_SET; } - else + else if (ix86_check_avx512 (opts)) { opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_1_UNSET; opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_UNSET; diff --git a/gcc/config/i386/driver-i386.cc b/gcc/config/i386/driver-i386.cc index 08d0aed6183..227ace6ff83 100644 --- a/gcc/config/i386/driver-i386.cc +++ b/gcc/config/i386/driver-i386.cc @@ -854,7 +854,7 @@ const char *host_detect_local_cpu (int argc, const char **argv) options = concat (options, " ", isa_names_table[i].option, NULL); } - else + else if (isa_names_table[i].feature != FEATURE_AVX10_1) options = concat (options, neg_option, isa_names_table[i].option + 2, NULL); } diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-11.c b/gcc/testsuite/gcc.target/i386/avx10_1-11.c new file mode 100644 index 00000000000..10c8d781dd9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-11.c @@ -0,0 +1,5 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1 -mno-avx512f" } */ +/* { dg-warning "'-mno-avx512{f,vl,bw,dq,cd,bf16,fp16,vbmi,vbmi2,vnni,ifma,bitalg,vpopcntdq}' are ignored with '-mavx10.1' and above" "" { target *-*-* } 0 } */ + +#include "avx10_1-1.c" diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-12.c b/gcc/testsuite/gcc.target/i386/avx10_1-12.c new file mode 100644 index 00000000000..b79c92ad002 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-12.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2" } */ + +#include + +__attribute__ ((target ("avx10.1,no-avx512f"))) void +f1 () +{ /* { dg-warning "'-mno-avx512{f,vl,bw,dq,cd,bf16,fp16,vbmi,vbmi2,vnni,ifma,bitalg,vpopcntdq}' are ignored with '-mavx10.1' and above" } */ + register __m256d a __asm ("ymm17"); + register __m256d b __asm ("ymm16"); + a = _mm256_add_pd (a, b); + asm volatile ("" : "+v" (a)); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-13.c b/gcc/testsuite/gcc.target/i386/avx10_1-13.c new file mode 100644 index 00000000000..156d59f1d35 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-13.c @@ -0,0 +1,5 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -march=x86-64 -mavx512f -mno-avx10.1" } */ +/* { dg-warning "'-mno-avx10.1' is ignored when using with '-mavx512{f,vl,bw,dq,cd,bf16,fp16,vbmi,vbmi2,vnni,ifma,bitalg,vpopcntdq}'" "" { target *-*-* } 0 } */ + +#include "avx10_1-2.c" diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-14.c b/gcc/testsuite/gcc.target/i386/avx10_1-14.c new file mode 100644 index 00000000000..23d2ba8bc64 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-14.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64" } */ +/* { dg-final { scan-assembler "%zmm" } } */ + +typedef double __m512d __attribute__ ((__vector_size__ (64), __may_alias__)); + +__attribute__ ((target ("avx512f,no-avx10.1"))) __m512d +foo () +{ /* { dg-warning "'-mno-avx10.1' is ignored when using with '-mavx512{f,vl,bw,dq,cd,bf16,fp16,vbmi,vbmi2,vnni,ifma,bitalg,vpopcntdq}'" } */ + __m512d a, b; + a = a + b; + return a; +} From patchwork Tue Aug 8 07:13:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 132508 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c44e:0:b0:3f2:4152:657d with SMTP id w14csp1934051vqr; Tue, 8 Aug 2023 00:14:20 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFNIqzQfQyC2sCLHlwfD7323+21fPhQkyyKSUgOGnY8VJD60x6zuooBsfZiCx/sofdlLCDE X-Received: by 2002:a19:6457:0:b0:4fd:ddbc:1577 with SMTP id b23-20020a196457000000b004fdddbc1577mr8258190lfj.2.1691478860357; Tue, 08 Aug 2023 00:14:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691478860; cv=none; d=google.com; s=arc-20160816; b=oJD8ur6nIDspZR2ln1vlYn7uXJ0Zr+6FQzqZuXu0lL7VmiPMMHR3y/AKa4ecf3sxg4 9MSgtD2rn0d+XsAK+Pc0mvPDvdGuOZBJ279+zETDfGHME+9wvxjF71TIIFOrRe9On/Bk PW5tyuA/VnoiyYotw3JLOowhhYl+jXjcnre/yh80WXk+pOAYWjmoKoGrT1K9qmc2H8FI GNX92GdAOOLvQ/ypl5Lw6VHx5PwmkIYEKVDLPgTZU8n1zGDyoRsmeRWx28urC35YarNJ pbR0evth/UfgejmL6/EWynxU0usKExAs9ED5PSGkgtCGMD/hvsclQsPgSElhl4gPWt/N JEbw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=WQhZsfiSXOgI9Ad3ElsrAb7/oBLV+5nDgqjNWc2K8a4=; fh=pQVlhs2+rlZ1Q5n//Pi5Wub1q9awKW2/7mZGTpWj3XA=; b=kCHyhaySUnXfheziK3LYmYaqS9fOnlI5d0eiEFumfQGgMvJvCvcT++PKgeH09Gv8HW 2z3PyGmnRIwRXJ1MXSnVS9aTb38fhiq0vn1AaIqzL52goNqdwXV41SMSMfw1eN3+08ge UCrA1v+5+T8tNRB9Nh+FpVNZ4BjXub34dWYbR9OfTvLaFe2LOWFI3W0cHqO5ZTunh7wK zr2y57oxgRi9oevpyspBWsna90Vaj5yeG7qGX1WKtx2zTeeAjQYLqjlq90QKOMsGv2p/ mbUoNJyWXzPIwR9cyJJBLNRvEbU9tUio+x845a8plsXnxRMAcmTvdvpeStG6Y3Zn+1Us JTOA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ZtwrjsQC; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id c3-20020aa7c983000000b0052349404b09si348568edt.663.2023.08.08.00.14.19 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Aug 2023 00:14:20 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ZtwrjsQC; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2992D3858032 for ; Tue, 8 Aug 2023 07:14:04 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2992D3858032 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691478844; bh=WQhZsfiSXOgI9Ad3ElsrAb7/oBLV+5nDgqjNWc2K8a4=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=ZtwrjsQCtaCjq262SYM3prsrPZRYOgpS07N5yE3XNGgTOch3tc95eZsW5Q5YCfHVA modA9fgZCL0P4mXseUKye/2n24x4Cv87yh3ISInpnFPE8egQVqjDJd9Fu+ZxUcXCyI wXEhnm43BYJrpTETlLBiabv7JYXhE+pD9uymdWmQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id 024083858D33 for ; Tue, 8 Aug 2023 07:13:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 024083858D33 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="434592293" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="434592293" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Aug 2023 00:13:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="845345863" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="845345863" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga002.fm.intel.com with ESMTP; 08 Aug 2023 00:13:15 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 7DA8B1005188; Tue, 8 Aug 2023 15:13:14 +0800 (CST) To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 3/3] Emit a warning when AVX10 options conflict in vector width Date: Tue, 8 Aug 2023 15:13:12 +0800 Message-Id: <20230808071312.1569559-4-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808071312.1569559-1-haochen.jiang@intel.com> References: <20230808071312.1569559-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773644137538394109 X-GMAIL-MSGID: 1773644137538394109 gcc/ChangeLog: * config/i386/driver-i386.cc (host_detect_local_cpu): Do not append -mno-avx10-max-512bit for -march=native. * common/config/i386/i386-common.cc (ix86_check_avx10_vector_width): New function to check isa_flags to emit a warning when there is a conflict in AVX10 options for vector width. (ix86_handle_option): Add check for avx10.1-256 and avx10.1-512. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_1-15.c: New test. * gcc.target/i386/avx10_1-16.c: Ditto. * gcc.target/i386/avx10_1-17.c: Ditto. * gcc.target/i386/avx10_1-18.c: Ditto. --- gcc/common/config/i386/i386-common.cc | 20 ++++++++++++++++++++ gcc/config/i386/driver-i386.cc | 3 ++- gcc/config/i386/i386-options.cc | 2 +- gcc/testsuite/gcc.target/i386/avx10_1-15.c | 5 +++++ gcc/testsuite/gcc.target/i386/avx10_1-16.c | 5 +++++ gcc/testsuite/gcc.target/i386/avx10_1-17.c | 13 +++++++++++++ gcc/testsuite/gcc.target/i386/avx10_1-18.c | 13 +++++++++++++ 7 files changed, 59 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-15.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-16.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-17.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-18.c diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc index ec94251dd4c..db88befc9b8 100644 --- a/gcc/common/config/i386/i386-common.cc +++ b/gcc/common/config/i386/i386-common.cc @@ -428,6 +428,24 @@ ix86_check_avx512 (struct gcc_options *opts) return true; } +/* Emit a warning when there is a conflict vector width in AVX10 options. */ +static void +ix86_check_avx10_vector_width (struct gcc_options *opts, bool avx10_max_512) +{ + if (avx10_max_512) + { + if (((opts->x_ix86_isa_flags2 | ~OPTION_MASK_ISA2_AVX10_512BIT) + == ~OPTION_MASK_ISA2_AVX10_512BIT) + && (opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_AVX10_512BIT)) + warning (0, "The options used for AVX10 have conflict vector width, " + "using the latter 512 as vector width"); + } + else if (opts->x_ix86_isa_flags2 & opts->x_ix86_isa_flags2_explicit + & OPTION_MASK_ISA2_AVX10_512BIT) + warning (0, "The options used for AVX10 have conflict vector width, " + "using the latter 256 as vector width"); +} + /* Implement TARGET_HANDLE_OPTION. */ bool @@ -1415,6 +1433,7 @@ ix86_handle_option (struct gcc_options *opts, return true; case OPT_mavx10_1_256: + ix86_check_avx10_vector_width (opts, false); opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_1_SET; opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_SET; opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX10_512BIT_SET; @@ -1424,6 +1443,7 @@ ix86_handle_option (struct gcc_options *opts, return true; case OPT_mavx10_1_512: + ix86_check_avx10_vector_width (opts, true); opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_1_SET; opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX10_1_SET; opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX10_512BIT_SET; diff --git a/gcc/config/i386/driver-i386.cc b/gcc/config/i386/driver-i386.cc index 227ace6ff83..f4551a74e3a 100644 --- a/gcc/config/i386/driver-i386.cc +++ b/gcc/config/i386/driver-i386.cc @@ -854,7 +854,8 @@ const char *host_detect_local_cpu (int argc, const char **argv) options = concat (options, " ", isa_names_table[i].option, NULL); } - else if (isa_names_table[i].feature != FEATURE_AVX10_1) + else if ((isa_names_table[i].feature != FEATURE_AVX10_1) + && (isa_names_table[i].feature != FEATURE_AVX10_512BIT)) options = concat (options, neg_option, isa_names_table[i].option + 2, NULL); } diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index b2281fbd4b5..8f9b825b527 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -985,7 +985,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], ix86_opt_ix86_no, ix86_opt_str, ix86_opt_enum, - ix86_opt_isa, + ix86_opt_isa }; static const struct diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-15.c b/gcc/testsuite/gcc.target/i386/avx10_1-15.c new file mode 100644 index 00000000000..fd873c9694c --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-15.c @@ -0,0 +1,5 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1-512 -mavx10.1-256" } */ +/* { dg-warning "The options used for AVX10 have conflict vector width, using the latter 256 as vector width" "" { target *-*-* } 0 } */ + +#include "avx10_1-1.c" diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-16.c b/gcc/testsuite/gcc.target/i386/avx10_1-16.c new file mode 100644 index 00000000000..1e664ebd1f2 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-16.c @@ -0,0 +1,5 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=x86-64 -mavx10.1-256 -mavx10.1-512" } */ +/* { dg-warning "The options used for AVX10 have conflict vector width, using the latter 512 as vector width" "" { target *-*-* } 0 } */ + +#include "avx10_1-2.c" diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-17.c b/gcc/testsuite/gcc.target/i386/avx10_1-17.c new file mode 100644 index 00000000000..7dfff3aeeac --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-17.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2" } */ + +#include + +__attribute__ ((target ("avx10.1-512,avx10.1-256"))) void +f1 () +{ /* { dg-warning "The options used for AVX10 have conflict vector width, using the latter 256 as vector width" } */ + register __m256d a __asm ("ymm17"); + register __m256d b __asm ("ymm16"); + a = _mm256_add_pd (a, b); + asm volatile ("" : "+v" (a)); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-18.c b/gcc/testsuite/gcc.target/i386/avx10_1-18.c new file mode 100644 index 00000000000..955cca185fd --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-18.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64" } */ +/* { dg-final { scan-assembler "%zmm" } } */ + +typedef double __m512d __attribute__ ((__vector_size__ (64), __may_alias__)); + +__attribute__ ((target ("avx10.1-256,avx10.1-512"))) __m512d +foo () +{ /* { dg-warning "The options used for AVX10 have conflict vector width, using the latter 512 as vector width" } */ + __m512d a, b; + a = a + b; + return a; +} From patchwork Tue Aug 8 07:20:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 132515 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c44e:0:b0:3f2:4152:657d with SMTP id w14csp1937373vqr; Tue, 8 Aug 2023 00:23:15 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEkiosSbZGnMOl81Bj6aV+U1QJHWDW1hqFZ5qgMEobP7zVawozMEk9ichSpgT02kZWxZc4V X-Received: by 2002:aa7:dc12:0:b0:523:1004:1ca0 with SMTP id b18-20020aa7dc12000000b0052310041ca0mr11012428edu.5.1691479395500; Tue, 08 Aug 2023 00:23:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691479395; cv=none; d=google.com; s=arc-20160816; b=FKWno6u6SGRqVNUuEFWW1xiL739MdMw3k3X9MXYQCGldEXMeBJtqe/NRjeRgV1Obte pa7IwKmjRTJOeI0zQkphHB8SRiTohLv+KmThkprqEOW/FWEBJp7HW2lIkm2znXy4bp98 xT83nNqfUTX38CEieTWMM++QJux5Fk3kgZpY9YwnEAJcRH3xQjf0AN78ZTl5ns9oa4HC uZ/zJGlMtIkVjDbrvpCDbT4yCiEHG/Qeothdnf9+P1bUogWDJclPW4Kq/2nF7ERXFPoV myXzX/muydwx+41gaVAF7H48UXTCOUrS2r6zN50lIUi0BSlfSe0HG9AbJSidtwzmzwMu iFRA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=rc0gbfgRjk/1exb7yAhUgbIAC+WENmaJYCU86v1LQMs=; fh=pQVlhs2+rlZ1Q5n//Pi5Wub1q9awKW2/7mZGTpWj3XA=; b=sFdP/gN4myDaOibgFhxMF1m/jjW/GmFSQnhnBIAoEPcRn9K0DcZTubfTbuQGT/hreN iMXpvV8xK3ZaYndUN4PmMOf0vvpFenqhn3JY/imZhB87rg9Gxg54SJkFpIUhxda0yhyO GV7/cvvRhwfo4QaJ9zGFFZX1AQbt+Kpd2KoZaZF7d8c3BDFR7MKvqVyxtFluO8Ca2qlq GZKHF2moS3w7D3nsGkMlV5G1n69AGYzXXyUQASP/QWoUtfgq6bmxyH38RF9iDRVZpjEa U4zUzyXPMoanYjma6H85qPif9tuSwL4SREsmwWYAqDxHVp/eA1Mgn9WNfeC2ascLrxHn O2BQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="J85/5JVl"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id n19-20020aa7c453000000b0052322eb7739si5464447edr.33.2023.08.08.00.23.15 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Aug 2023 00:23:15 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="J85/5JVl"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3E1333856DD4 for ; Tue, 8 Aug 2023 07:22:01 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3E1333856DD4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691479321; bh=rc0gbfgRjk/1exb7yAhUgbIAC+WENmaJYCU86v1LQMs=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=J85/5JVlaHA9OWhvUTppGs//eQJLjTjFWKvGNvjK9QLXxB65Gv/xZAU0aHFDM3q/I sS6cocDUA0WHNOdcyvjuOsPUOogwEqiqW/JhrOAKIfy6dnIidPaHm21tietbCjBFsF XCnTFTgzMb/TT7wg9yKBCPfQYLSsI+Yx9ORIQQb0= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id EC1053857007 for ; Tue, 8 Aug 2023 07:20:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EC1053857007 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="457126272" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="457126272" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Aug 2023 00:20:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="796615231" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="796615231" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga008.fm.intel.com with ESMTP; 08 Aug 2023 00:20:32 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id EC3F81005608; Tue, 8 Aug 2023 15:20:31 +0800 (CST) To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 4/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins Date: Tue, 8 Aug 2023 15:20:31 +0800 Message-Id: <20230808072031.1570222-1-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808071312.1569559-1-haochen.jiang@intel.com> References: <20230808071312.1569559-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773644548037755749 X-GMAIL-MSGID: 1773644698834452260 gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_1-abs-copysign-1.c: New test. * gcc.target/i386/avx10_1-vandpd-1.c: Ditto. * gcc.target/i386/avx10_1-vandps-1.c: Ditto. * gcc.target/i386/avx10_1-vcvtps2qq-1.c: Ditto. * gcc.target/i386/avx10_1-vcvtps2uqq-1.c: Ditto. * gcc.target/i386/avx10_1-vcvtqq2pd-1.c: Ditto. * gcc.target/i386/avx10_1-vcvtqq2ps-1.c: Ditto. * gcc.target/i386/avx10_1-vcvtuqq2pd-1.c: Ditto. * gcc.target/i386/avx10_1-vcvtuqq2ps-1.c: Ditto. * gcc.target/i386/avx10_1-vorpd-1.c: Ditto. * gcc.target/i386/avx10_1-vorps-1.c: Ditto. * gcc.target/i386/avx10_1-vpmovd2m-1.c: Ditto. * gcc.target/i386/avx10_1-vpmovm2d-1.c: Ditto. * gcc.target/i386/avx10_1-vpmovm2q-1.c: Ditto. * gcc.target/i386/avx10_1-vpmovq2m-1.c: Ditto. * gcc.target/i386/avx10_1-vxorpd-1.c: Ditto. * gcc.target/i386/avx10_1-vxorps-1.c: Ditto. --- .../gcc.target/i386/avx10_1-abs-copysign-1.c | 69 +++++++++++++++++++ .../gcc.target/i386/avx10_1-vandpd-1.c | 21 ++++++ .../gcc.target/i386/avx10_1-vandps-1.c | 21 ++++++ .../gcc.target/i386/avx10_1-vcvtps2qq-1.c | 28 ++++++++ .../gcc.target/i386/avx10_1-vcvtps2uqq-1.c | 27 ++++++++ .../gcc.target/i386/avx10_1-vcvtqq2pd-1.c | 27 ++++++++ .../gcc.target/i386/avx10_1-vcvtqq2ps-1.c | 26 +++++++ .../gcc.target/i386/avx10_1-vcvtuqq2pd-1.c | 27 ++++++++ .../gcc.target/i386/avx10_1-vcvtuqq2ps-1.c | 27 ++++++++ .../gcc.target/i386/avx10_1-vorpd-1.c | 22 ++++++ .../gcc.target/i386/avx10_1-vorps-1.c | 22 ++++++ .../gcc.target/i386/avx10_1-vpmovd2m-1.c | 17 +++++ .../gcc.target/i386/avx10_1-vpmovm2d-1.c | 17 +++++ .../gcc.target/i386/avx10_1-vpmovm2q-1.c | 17 +++++ .../gcc.target/i386/avx10_1-vpmovq2m-1.c | 17 +++++ .../gcc.target/i386/avx10_1-vxorpd-1.c | 23 +++++++ .../gcc.target/i386/avx10_1-vxorps-1.c | 22 ++++++ 17 files changed, 430 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-abs-copysign-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vandpd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vandps-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2qq-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2uqq-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2pd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2ps-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2pd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2ps-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vorpd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vorps-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpmovd2m-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2d-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2q-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpmovq2m-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vxorpd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vxorps-1.c diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-abs-copysign-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-abs-copysign-1.c new file mode 100644 index 00000000000..e9e45e44051 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-abs-copysign-1.c @@ -0,0 +1,69 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-Ofast -mavx10.1" } */ + +void +f1 (float x) +{ + register float a __asm ("xmm16"); + a = x; + asm volatile ("" : "+v" (a)); + a = __builtin_fabsf (a); + asm volatile ("" : "+v" (a)); +} +/* +void +f2 (float x, float y) +{ + register float a __asm ("xmm16"), b __asm ("xmm17"); + a = x; + b = y; + asm volatile ("" : "+v" (a), "+v" (b)); + a = __builtin_copysignf (a, b); + asm volatile ("" : "+v" (a)); +} +*/ +void +f3 (float x) +{ + register float a __asm ("xmm16"); + a = x; + asm volatile ("" : "+v" (a)); + a = -a; + asm volatile ("" : "+v" (a)); +} + +void +f4 (double x) +{ + register double a __asm ("xmm18"); + a = x; + asm volatile ("" : "+v" (a)); + a = __builtin_fabs (a); + asm volatile ("" : "+v" (a)); +} +/* +void +f5 (double x, double y) +{ + register double a __asm ("xmm18"), b __asm ("xmm19"); + a = x; + b = y; + asm volatile ("" : "+v" (a), "+v" (b)); + a = __builtin_copysign (a, b); + asm volatile ("" : "+v" (a)); +} +*/ +void +f6 (double x) +{ + register double a __asm ("xmm18"); + a = x; + asm volatile ("" : "+v" (a)); + a = -a; + asm volatile ("" : "+v" (a)); +} + +/* { dg-final { scan-assembler "vandps\[^\n\r\]*xmm16" } } */ +/* { dg-final { scan-assembler "vxorps\[^\n\r\]*xmm16" } } */ +/* { dg-final { scan-assembler "vandpd\[^\n\r\]*xmm18" } } */ +/* { dg-final { scan-assembler "vxorpd\[^\n\r\]*xmm18" } } */ diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vandpd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vandpd-1.c new file mode 100644 index 00000000000..3a765479f6d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vandpd-1.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vandpd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vandpd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vandpd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vandpd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256d y; +volatile __m128d x; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + y = _mm256_mask_and_pd (y, m, y, y); + y = _mm256_maskz_and_pd (m, y, y); + x = _mm_mask_and_pd (x, m, x, x); + x = _mm_maskz_and_pd (m, x, x); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vandps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vandps-1.c new file mode 100644 index 00000000000..ed785af5f30 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vandps-1.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vandps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vandps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vandps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vandps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256 y; +volatile __m128 x; +volatile __mmask8 m2; + +void extern +avx10_1_test (void) +{ + y = _mm256_mask_and_ps (y, m2, y, y); + y = _mm256_maskz_and_ps (m2, y, y); + x = _mm_mask_and_ps (x, m2, x, x); + x = _mm_maskz_and_ps (m2, x, x); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2qq-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2qq-1.c new file mode 100644 index 00000000000..dad6dbe778d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2qq-1.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vcvtps2qq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2qq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2qq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2qq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2qq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2qq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x1; +volatile __m128i x2; +volatile __m256 z1; +volatile __m128 z2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + x1 = _mm256_cvtps_epi64 (z2); + x1 = _mm256_mask_cvtps_epi64 (x1, m, z2); + x1 = _mm256_maskz_cvtps_epi64 (m, z2); + x2 = _mm_cvtps_epi64 (z2); + x2 = _mm_mask_cvtps_epi64 (x2, m, z2); + x2 = _mm_maskz_cvtps_epi64 (m, z2); +} + diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2uqq-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2uqq-1.c new file mode 100644 index 00000000000..24de26bd5e9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtps2uqq-1.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vcvtps2uqq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2uqq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2uqq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2uqq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2uqq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtps2uqq\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x1; +volatile __m128i x2; +volatile __m256 z1; +volatile __m128 z2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + x1 = _mm256_cvtps_epu64 (z2); + x1 = _mm256_mask_cvtps_epu64 (x1, m, z2); + x1 = _mm256_maskz_cvtps_epu64 (m, z2); + x2 = _mm_cvtps_epu64 (z2); + x2 = _mm_mask_cvtps_epu64 (x2, m, z2); + x2 = _mm_maskz_cvtps_epu64 (m, z2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2pd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2pd-1.c new file mode 100644 index 00000000000..5a2472292b5 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2pd-1.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vcvtqq2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i s1; +volatile __m128i s2; +volatile __m256d res1; +volatile __m128d res2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + res1 = _mm256_cvtepi64_pd (s1); + res1 = _mm256_mask_cvtepi64_pd (res1, m, s1); + res1 = _mm256_maskz_cvtepi64_pd (m, s1); + res2 = _mm_cvtepi64_pd (s2); + res2 = _mm_mask_cvtepi64_pd (res2, m, s2); + res2 = _mm_maskz_cvtepi64_pd (m, s2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2ps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2ps-1.c new file mode 100644 index 00000000000..7d735eb4c9c --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtqq2ps-1.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vcvtqq2psx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2psx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2psx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2psy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2psy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtqq2psy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i s1; +volatile __m128i s2; +volatile __m128 res; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + res = _mm256_cvtepi64_ps (s1); + res = _mm256_mask_cvtepi64_ps (res, m, s1); + res = _mm256_maskz_cvtepi64_ps (m, s1); + res = _mm_cvtepi64_ps (s2); + res = _mm_mask_cvtepi64_ps (res, m, s2); + res = _mm_maskz_cvtepi64_ps (m, s2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2pd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2pd-1.c new file mode 100644 index 00000000000..ab433a2ecde --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2pd-1.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vcvtuqq2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i s1; +volatile __m128i s2; +volatile __m256d res1; +volatile __m128d res2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + res1 = _mm256_cvtepu64_pd (s1); + res1 = _mm256_mask_cvtepu64_pd (res1, m, s1); + res1 = _mm256_maskz_cvtepu64_pd (m, s1); + res2 = _mm_cvtepu64_pd (s2); + res2 = _mm_mask_cvtepu64_pd (res2, m, s2); + res2 = _mm_maskz_cvtepu64_pd (m, s2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2ps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2ps-1.c new file mode 100644 index 00000000000..ac9e788e4c9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vcvtuqq2ps-1.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vcvtuqq2psx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2psx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2psx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2psy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2psy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vcvtuqq2psy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i s1; +volatile __m128i s2; +volatile __m256 res1; +volatile __m128 res2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + res2 = _mm256_cvtepu64_ps (s1); + res2 = _mm256_mask_cvtepu64_ps (res2, m, s1); + res2 = _mm256_maskz_cvtepu64_ps (m, s1); + res2 = _mm_cvtepu64_ps (s2); + res2 = _mm_mask_cvtepu64_ps (res2, m, s2); + res2 = _mm_maskz_cvtepu64_ps (m, s2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vorpd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vorpd-1.c new file mode 100644 index 00000000000..d2367d136a8 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vorpd-1.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vorpd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vorpd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vorpd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vorpd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256d y; +volatile __m128d x; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + y = _mm256_mask_or_pd (y, m, y, y); + y = _mm256_maskz_or_pd (m, y, y); + + x = _mm_mask_or_pd (x, m, x, x); + x = _mm_maskz_or_pd (m, x, x); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vorps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vorps-1.c new file mode 100644 index 00000000000..2ba919ed2e2 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vorps-1.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vorps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vorps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vorps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vorps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256 y; +volatile __m128 x; +volatile __mmask8 n; + +void extern +avx10_1_test (void) +{ + y = _mm256_mask_or_ps (y, n, y, y); + y = _mm256_maskz_or_ps (n, y, y); + + x = _mm_mask_or_ps (x, n, x, x); + x = _mm_maskz_or_ps (n, x, x); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpmovd2m-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovd2m-1.c new file mode 100644 index 00000000000..68f1a9485ed --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovd2m-1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vpmovd2m\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vpmovd2m\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x256; +volatile __m128i x128; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + m = _mm_movepi32_mask (x128); + m = _mm256_movepi32_mask (x256); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2d-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2d-1.c new file mode 100644 index 00000000000..89ac3bd49ed --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2d-1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vpmovm2d\[ \\t\]+\[^\{\n\]*%k\[0-7\]\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vpmovm2d\[ \\t\]+\[^\{\n\]*%k\[0-7\]\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x256; +volatile __m128i x128; +volatile __mmask8 m8; + +void extern +avx10_1_test (void) +{ + x128 = _mm_movm_epi32 (m8); + x256 = _mm256_movm_epi32 (m8); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2q-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2q-1.c new file mode 100644 index 00000000000..b5a3298c4ab --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovm2q-1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vpmovm2q\[ \\t\]+\[^\{\n\]*%k\[0-7\]\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vpmovm2q\[ \\t\]+\[^\{\n\]*%k\[0-7\]\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x256; +volatile __m128i x128; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + x128 = _mm_movm_epi64 (m); + x256 = _mm256_movm_epi64 (m); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpmovq2m-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovq2m-1.c new file mode 100644 index 00000000000..2eb1f81a7ed --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpmovq2m-1.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vpmovq2m\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vpmovq2m\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x256; +volatile __m128i x128; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + m = _mm_movepi64_mask (x128); + m = _mm256_movepi64_mask (x256); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vxorpd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vxorpd-1.c new file mode 100644 index 00000000000..062acc9b011 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vxorpd-1.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vxorpd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vxorpd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vxorpd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vxorpd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256d y; +volatile __m128d x; +volatile __mmask8 m; + + +void extern +avx10_1_test (void) +{ + y = _mm256_mask_xor_pd (y, m, y, y); + y = _mm256_maskz_xor_pd (m, y, y); + + x = _mm_mask_xor_pd (x, m, x, x); + x = _mm_maskz_xor_pd (m, x, x); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vxorps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vxorps-1.c new file mode 100644 index 00000000000..04473ce0468 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vxorps-1.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vxorps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vxorps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vxorps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vxorps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256 y; +volatile __m128 x; +volatile __mmask8 n; + +void extern +avx10_1_test (void) +{ + y = _mm256_mask_xor_ps (y, n, y, y); + y = _mm256_maskz_xor_ps (n, y, y); + + x = _mm_mask_xor_ps (x, n, x, x); + x = _mm_maskz_xor_ps (n, x, x); +} From patchwork Tue Aug 8 07:20:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 132514 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c44e:0:b0:3f2:4152:657d with SMTP id w14csp1937275vqr; Tue, 8 Aug 2023 00:23:02 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHMj0kkZfID1KBnns/62fWxXChJJufsEbJJgCzxteGko8rMwhZKrCmSccPHNiedlmOi8pVR X-Received: by 2002:a17:906:7490:b0:957:2e48:5657 with SMTP id e16-20020a170906749000b009572e485657mr10602453ejl.68.1691479381984; Tue, 08 Aug 2023 00:23:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691479381; cv=none; d=google.com; s=arc-20160816; b=xNXP4Hheu5i6/gcKGEFWbgosETW4mHEM0cjCQ1mf79T84l0DzjqDWOtVPWYbi7bCIP MrxtP+RX60MSTJw5mzeod0xlphBpOh77joSksJ2FfLpaiAsAxM9XrrLgXkNehK2fy+x1 9p+aqXI5/1gSGfL6nR4+cyKkR0fEWaE1DB1QxiNW7QhKO5820nTDBmyOtl1AYe4AsbIv Kxg+Nsmj+jgoZ+44b5TK8HKqwakw0GaW2jEfCPYRjmJ4oxlc+XEOe8pNoNxvh4ylGS67 9yPROAvHcPoL3Flf3auFhUNTCI24T9yYEfSIYkRx/RMu/tOcoPtJUA29cmpSJuGznUgb hEcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=ZaYEbZWHc2RnspdcFHovkwscdfvZ1n1ej/6nQrmu0zE=; fh=pQVlhs2+rlZ1Q5n//Pi5Wub1q9awKW2/7mZGTpWj3XA=; b=ZYrPSl5yxVGCE2DviMre8fSSSFSUX70wmDPMBHkqsojfBcxxjR1sp8ACtrx4JOvNJt TZqwtemV3A1bZgeXLTuzsFGar27Cbj/mtOfpLZyXJi5B8NI4ubJ0MV2/nzmxjkxrE5+w zU3NrK4DwjdZYnhrwB167C6NzHQnmiuZAv/99NT0dZ21CqDt0fNvzPIf6B9U+uHJwIL6 pH7zWpanBlKZzYIfBf/INhsfNF+YaNCOnlBftniapacwbZKMLoErA42hebuV8pUOPgGK SVL/Slq3438XPMhACV49ireE1Inb1IwTuOXN+XXiAYHE+0FdvjGoTAF6+AK8LDAifzZq vGhw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ujC3Mw32; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id g13-20020a170906394d00b00997dfeb04a1si7173471eje.70.2023.08.08.00.23.01 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Aug 2023 00:23:01 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ujC3Mw32; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 166943853D32 for ; Tue, 8 Aug 2023 07:21:51 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 166943853D32 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691479311; bh=ZaYEbZWHc2RnspdcFHovkwscdfvZ1n1ej/6nQrmu0zE=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=ujC3Mw32AH6EO/BsiPh9gBvmA49Vs9AEWUNzfjWuN88NgfJurMopAw48Fj9+cv21g pbvS7xBaWE+MSWZBRj/CKM8cNHtOBkgRxnxfIZTNDniFfspaS7szAoih/kFxTIdAQd hFifTl8nxSDlkM5kYUjpBFGrkmLVaarjzvbgo4RM= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id 674583856DE6 for ; Tue, 8 Aug 2023 07:20:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 674583856DE6 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="457126298" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="457126298" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Aug 2023 00:20:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="796615272" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="796615272" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga008.fm.intel.com with ESMTP; 08 Aug 2023 00:20:46 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 2E8821005608; Tue, 8 Aug 2023 15:20:46 +0800 (CST) To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 5/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins Date: Tue, 8 Aug 2023 15:20:46 +0800 Message-Id: <20230808072046.1570283-1-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808071312.1569559-1-haochen.jiang@intel.com> References: <20230808071312.1569559-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773644548037755749 X-GMAIL-MSGID: 1773644684141688255 gcc/ChangeLog: * config/i386/avx512vldqintrin.h: Remove target attribute. * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_AVX10_1. * config/i386/sse.md (VF_AVX512VLDQ_AVX10_1): New. (VFH_AVX512VLDQ_AVX10_1): Ditto. (VF1_AVX512VLDQ_AVX10_1): Ditto. (reducep): Change iterator to VFH_AVX512VLDQ_AVX10_1. Remove target check. (vec_pack_float_): Change iterator to VI8_AVX512VLDQ_AVX10_1. Remove target check. (vec_unpack_fix_trunc_lo_): Change iterator to VF1_AVX512VLDQ_AVX10_1. Remove target check. (vec_unpack_fix_trunc_hi_): Ditto. (VI48F_256_DQVL_AVX10_1): Rename from VI48F_256_DQ. (avx512vl_vextractf128): Change iterator to VI48F_256_DQVL_AVX10_1. Remove target check. (vec_extract_hi__mask): Add TARGET_AVX10_1. (vec_extract_hi_): Ditto. (avx512vl_vinsert): Ditto. (vec_set_lo_): Ditto. (vec_set_hi_): Ditto. (avx512dq_rangep): Change iterator to VF_AVX512VLDQ_AVX10_1. Remove target check. (avx512dq_fpclass): Change iterator to VFH_AVX512VLDQ_AVX10_1. Remove target check. * config/i386/subst.md (mask_avx512dq_condition): Add TARGET_AVX10_1. (mask_scalar_merge): Ditto. --- gcc/config/i386/avx512vldqintrin.h | 11 ---- gcc/config/i386/i386-builtin.def | 32 +++++----- gcc/config/i386/sse.md | 94 ++++++++++++++++++------------ gcc/config/i386/subst.md | 4 +- 4 files changed, 76 insertions(+), 65 deletions(-) diff --git a/gcc/config/i386/avx512vldqintrin.h b/gcc/config/i386/avx512vldqintrin.h index a8d14a4efc9..1fbf93a0b52 100644 --- a/gcc/config/i386/avx512vldqintrin.h +++ b/gcc/config/i386/avx512vldqintrin.h @@ -1331,12 +1331,6 @@ _mm256_movepi64_mask (__m256i __A) return (__mmask8) __builtin_ia32_cvtq2mask256 ((__v4di) __A); } -#if !defined(__AVX512VL__) || !defined(__AVX512DQ__) -#pragma GCC push_options -#pragma GCC target("avx512vl,avx512dq") -#define __DISABLE_AVX512VLDQ__ -#endif /* __AVX512VLDQ__ */ - #ifdef __OPTIMIZE__ extern __inline __m128d __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) @@ -2008,9 +2002,4 @@ _mm256_maskz_insertf64x2 (__mmask8 __U, __m256d __A, __m128d __B, #endif -#ifdef __DISABLE_AVX512VLDQ__ -#undef __DISABLE_AVX512VLDQ__ -#pragma GCC pop_options -#endif /* __DISABLE_AVX512VLDQ__ */ - #endif /* _AVX512VLDQINTRIN_H_INCLUDED */ diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index aa0a29caa9f..34768552e78 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -1782,8 +1782,8 @@ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vec_dup_gprv2di_mask, "__b BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vec_dupv8sf_mask, "__builtin_ia32_broadcastss256_mask", IX86_BUILTIN_BROADCASTSS256, UNKNOWN, (int) V8SF_FTYPE_V4SF_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vec_dupv4sf_mask, "__builtin_ia32_broadcastss128_mask", IX86_BUILTIN_BROADCASTSS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vec_dupv4df_mask, "__builtin_ia32_broadcastsd256_mask", IX86_BUILTIN_BROADCASTSD256, UNKNOWN, (int) V4DF_FTYPE_V2DF_V4DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v4df, "__builtin_ia32_extractf64x2_256_mask", IX86_BUILTIN_EXTRACTF64X2_256, UNKNOWN, (int) V2DF_FTYPE_V4DF_INT_V2DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v4di, "__builtin_ia32_extracti64x2_256_mask", IX86_BUILTIN_EXTRACTI64X2_256, UNKNOWN, (int) V2DI_FTYPE_V4DI_INT_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512vl_vextractf128v4df, "__builtin_ia32_extractf64x2_256_mask", IX86_BUILTIN_EXTRACTF64X2_256, UNKNOWN, (int) V2DF_FTYPE_V4DF_INT_V2DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512vl_vextractf128v4di, "__builtin_ia32_extracti64x2_256_mask", IX86_BUILTIN_EXTRACTI64X2_256, UNKNOWN, (int) V2DI_FTYPE_V4DI_INT_V2DI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vinsertv8sf, "__builtin_ia32_insertf32x4_256_mask", IX86_BUILTIN_INSERTF32X4_256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V4SF_INT_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vinsertv8si, "__builtin_ia32_inserti32x4_256_mask", IX86_BUILTIN_INSERTI32X4_256, UNKNOWN, (int) V8SI_FTYPE_V8SI_V4SI_INT_V8SI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx2_sign_extendv16qiv16hi2_mask, "__builtin_ia32_pmovsxbw256_mask", IX86_BUILTIN_PMOVSXBW256_MASK, UNKNOWN, (int) V16HI_FTYPE_V16QI_V16HI_UHI) @@ -1810,10 +1810,10 @@ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx2_zero_extendv4hiv4di2_mask, "__ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_sse4_1_zero_extendv2hiv2di2_mask, "__builtin_ia32_pmovzxwq128_mask", IX86_BUILTIN_PMOVZXWQ128_MASK, UNKNOWN, (int) V2DI_FTYPE_V8HI_V2DI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx2_zero_extendv4siv4di2_mask, "__builtin_ia32_pmovzxdq256_mask", IX86_BUILTIN_PMOVZXDQ256_MASK, UNKNOWN, (int) V4DI_FTYPE_V4SI_V4DI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_sse4_1_zero_extendv2siv2di2_mask, "__builtin_ia32_pmovzxdq128_mask", IX86_BUILTIN_PMOVZXDQ128_MASK, UNKNOWN, (int) V2DI_FTYPE_V4SI_V2DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_reducepv4df_mask, "__builtin_ia32_reducepd256_mask", IX86_BUILTIN_REDUCEPD256_MASK, UNKNOWN, (int) V4DF_FTYPE_V4DF_INT_V4DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_reducepv2df_mask, "__builtin_ia32_reducepd128_mask", IX86_BUILTIN_REDUCEPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_INT_V2DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_reducepv8sf_mask, "__builtin_ia32_reduceps256_mask", IX86_BUILTIN_REDUCEPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_INT_V8SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_reducepv4sf_mask, "__builtin_ia32_reduceps128_mask", IX86_BUILTIN_REDUCEPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_INT_V4SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv4df_mask, "__builtin_ia32_reducepd256_mask", IX86_BUILTIN_REDUCEPD256_MASK, UNKNOWN, (int) V4DF_FTYPE_V4DF_INT_V4DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv2df_mask, "__builtin_ia32_reducepd128_mask", IX86_BUILTIN_REDUCEPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_INT_V2DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv8sf_mask, "__builtin_ia32_reduceps256_mask", IX86_BUILTIN_REDUCEPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_INT_V8SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv4sf_mask, "__builtin_ia32_reduceps128_mask", IX86_BUILTIN_REDUCEPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_INT_V4SF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv2df_mask, "__builtin_ia32_reducesd_mask", IX86_BUILTIN_REDUCESD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv4sf_mask, "__builtin_ia32_reducess_mask", IX86_BUILTIN_REDUCESS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_permvarv16hi_mask, "__builtin_ia32_permvarhi256_mask", IX86_BUILTIN_VPERMVARHI256_MASK, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI) @@ -1908,10 +1908,10 @@ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_ss_truncatev2div2si2_mask, BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_ss_truncatev4div4si2_mask, "__builtin_ia32_pmovsqd256_mask", IX86_BUILTIN_PMOVSQD256, UNKNOWN, (int) V4SI_FTYPE_V4DI_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_us_truncatev2div2si2_mask, "__builtin_ia32_pmovusqd128_mask", IX86_BUILTIN_PMOVUSQD128, UNKNOWN, (int) V4SI_FTYPE_V2DI_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_us_truncatev4div4si2_mask, "__builtin_ia32_pmovusqd256_mask", IX86_BUILTIN_PMOVUSQD256, UNKNOWN, (int) V4SI_FTYPE_V4DI_V4SI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_rangepv4df_mask, "__builtin_ia32_rangepd256_mask", IX86_BUILTIN_RANGEPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_INT_V4DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_rangepv2df_mask, "__builtin_ia32_rangepd128_mask", IX86_BUILTIN_RANGEPD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_rangepv8sf_mask, "__builtin_ia32_rangeps256_mask", IX86_BUILTIN_RANGEPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_INT_V8SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_rangepv4sf_mask, "__builtin_ia32_rangeps128_mask", IX86_BUILTIN_RANGEPS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangepv4df_mask, "__builtin_ia32_rangepd256_mask", IX86_BUILTIN_RANGEPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_INT_V4DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangepv2df_mask, "__builtin_ia32_rangepd128_mask", IX86_BUILTIN_RANGEPD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangepv8sf_mask, "__builtin_ia32_rangeps256_mask", IX86_BUILTIN_RANGEPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_INT_V8SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangepv4sf_mask, "__builtin_ia32_rangeps128_mask", IX86_BUILTIN_RANGEPS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_getexpv8sf_mask, "__builtin_ia32_getexpps256_mask", IX86_BUILTIN_GETEXPPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_getexpv4df_mask, "__builtin_ia32_getexppd256_mask", IX86_BUILTIN_GETEXPPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_getexpv4sf_mask, "__builtin_ia32_getexpps128_mask", IX86_BUILTIN_GETEXPPS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_UQI) @@ -2076,8 +2076,8 @@ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_fmsubadd_v4df_mask3, "__bu BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_fmsubadd_v2df_mask3, "__builtin_ia32_vfmsubaddpd128_mask3", IX86_BUILTIN_VFMSUBADDPD128_MASK3, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_fmsubadd_v8sf_mask3, "__builtin_ia32_vfmsubaddps256_mask3", IX86_BUILTIN_VFMSUBADDPS256_MASK3, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_fmsubadd_v4sf_mask3, "__builtin_ia32_vfmsubaddps128_mask3", IX86_BUILTIN_VFMSUBADDPS128_MASK3, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vinsertv4df, "__builtin_ia32_insertf64x2_256_mask", IX86_BUILTIN_INSERTF64X2_256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V2DF_INT_V4DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vinsertv4di, "__builtin_ia32_inserti64x2_256_mask", IX86_BUILTIN_INSERTI64X2_256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V2DI_INT_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512vl_vinsertv4df, "__builtin_ia32_insertf64x2_256_mask", IX86_BUILTIN_INSERTF64X2_256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V2DF_INT_V4DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512vl_vinsertv4di, "__builtin_ia32_inserti64x2_256_mask", IX86_BUILTIN_INSERTI64X2_256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V2DI_INT_V4DI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_ashrvv16hi_mask, "__builtin_ia32_psrav16hi_mask", IX86_BUILTIN_PSRAVV16HI, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_ashrvv8hi_mask, "__builtin_ia32_psrav8hi_mask", IX86_BUILTIN_PSRAVV8HI, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512bw_pmaddubsw512v16hi_mask, "__builtin_ia32_pmaddubsw256_mask", IX86_BUILTIN_PMADDUBSW256_MASK, UNKNOWN, (int) V16HI_FTYPE_V32QI_V32QI_V16HI_UHI) @@ -2184,11 +2184,11 @@ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rorvv4si_mask, "__builtin_ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rolvv4si_mask, "__builtin_ia32_prolvd128_mask", IX86_BUILTIN_PROLVD128, UNKNOWN, (int) V4SI_FTYPE_V4SI_V4SI_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rorv4si_mask, "__builtin_ia32_prord128_mask", IX86_BUILTIN_PRORD128, UNKNOWN, (int) V4SI_FTYPE_V4SI_INT_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rolv4si_mask, "__builtin_ia32_prold128_mask", IX86_BUILTIN_PROLD128, UNKNOWN, (int) V4SI_FTYPE_V4SI_INT_V4SI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_fpclassv4df_mask, "__builtin_ia32_fpclasspd256_mask", IX86_BUILTIN_FPCLASSPD256, UNKNOWN, (int) QI_FTYPE_V4DF_INT_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_fpclassv2df_mask, "__builtin_ia32_fpclasspd128_mask", IX86_BUILTIN_FPCLASSPD128, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv4df_mask, "__builtin_ia32_fpclasspd256_mask", IX86_BUILTIN_FPCLASSPD256, UNKNOWN, (int) QI_FTYPE_V4DF_INT_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv2df_mask, "__builtin_ia32_fpclasspd128_mask", IX86_BUILTIN_FPCLASSPD128, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_vmfpclassv2df_mask, "__builtin_ia32_fpclasssd_mask", IX86_BUILTIN_FPCLASSSD_MASK, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_fpclassv8sf_mask, "__builtin_ia32_fpclassps256_mask", IX86_BUILTIN_FPCLASSPS256, UNKNOWN, (int) QI_FTYPE_V8SF_INT_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_fpclassv4sf_mask, "__builtin_ia32_fpclassps128_mask", IX86_BUILTIN_FPCLASSPS128, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv8sf_mask, "__builtin_ia32_fpclassps256_mask", IX86_BUILTIN_FPCLASSPS256, UNKNOWN, (int) QI_FTYPE_V8SF_INT_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv4sf_mask, "__builtin_ia32_fpclassps128_mask", IX86_BUILTIN_FPCLASSPS128, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_vmfpclassv4sf_mask, "__builtin_ia32_fpclassss_mask", IX86_BUILTIN_FPCLASSSS_MASK, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtb2maskv16qi, "__builtin_ia32_cvtb2mask128", IX86_BUILTIN_CVTB2MASK128, UNKNOWN, (int) UHI_FTYPE_V16QI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtb2maskv32qi, "__builtin_ia32_cvtb2mask256", IX86_BUILTIN_CVTB2MASK256, UNKNOWN, (int) USI_FTYPE_V32QI) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 9003776ee01..6784a8c5369 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -467,6 +467,14 @@ [V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL") V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")]) +(define_mode_iterator VF_AVX512VLDQ_AVX10_1 + [(V16SF "TARGET_AVX512DQ") + (V8SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V4SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V8DF "TARGET_AVX512DQ") + (V4DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V2DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) + ;; AVX512ER SF plus 128- and 256-bit SF vector modes (define_mode_iterator VF1_AVX512ER_128_256 [(V16SF "TARGET_AVX512ER") (V8SF "TARGET_AVX") V4SF]) @@ -478,6 +486,17 @@ V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL") V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")]) +(define_mode_iterator VFH_AVX512VLDQ_AVX10_1 + [(V32HF "TARGET_AVX512FP16") + (V16HF "TARGET_AVX512FP16 && TARGET_AVX512VL") + (V8HF "TARGET_AVX512FP16 && TARGET_AVX512VL") + (V16SF "TARGET_AVX512DQ") + (V8SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V4SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V8DF "TARGET_AVX512DQ") + (V4DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V2DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) + (define_mode_iterator VF2_AVX512VLDQ_AVX10_1 [(V8DF "TARGET_AVX512DQ") (V4DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") @@ -486,6 +505,11 @@ (define_mode_iterator VF1_AVX512VL [V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")]) +(define_mode_iterator VF1_AVX512VLDQ_AVX10_1 + [(V16SF "TARGET_AVX512DQ") + (V8SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V4SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) + (define_mode_iterator VF_AVX512FP16 [V32HF V16HF V8HF]) @@ -3520,12 +3544,12 @@ }) (define_insn "reducep" - [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v") - (unspec:VFH_AVX512VL - [(match_operand:VFH_AVX512VL 1 "" "") + [(set (match_operand:VFH_AVX512VLDQ_AVX10_1 0 "register_operand" "=v") + (unspec:VFH_AVX512VLDQ_AVX10_1 + [(match_operand:VFH_AVX512VLDQ_AVX10_1 1 "" "") (match_operand:SI 2 "const_0_to_255_operand")] UNSPEC_REDUCE))] - "TARGET_AVX512DQ || (VALID_AVX512FP16_REG_MODE (mode))" + "" "vreduce\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sse") (set_attr "prefix" "evex") @@ -8514,9 +8538,9 @@ (define_expand "vec_pack_float_" [(match_operand: 0 "register_operand") (any_float: - (match_operand:VI8_AVX512VL 1 "register_operand")) - (match_operand:VI8_AVX512VL 2 "register_operand")] - "TARGET_AVX512DQ" + (match_operand:VI8_AVX512VLDQ_AVX10_1 1 "register_operand")) + (match_operand:VI8_AVX512VLDQ_AVX10_1 2 "register_operand")] + "" { rtx r1 = gen_reg_rtx (mode); rtx r2 = gen_reg_rtx (mode); @@ -8975,8 +8999,8 @@ (define_expand "vec_unpack_fix_trunc_lo_" [(match_operand: 0 "register_operand") (any_fix: - (match_operand:VF1_AVX512VL 1 "register_operand"))] - "TARGET_AVX512DQ" + (match_operand:VF1_AVX512VLDQ_AVX10_1 1 "register_operand"))] + "" { rtx tem = operands[1]; rtx (*gen) (rtx, rtx); @@ -8998,8 +9022,8 @@ (define_expand "vec_unpack_fix_trunc_hi_" [(match_operand: 0 "register_operand") (any_fix: - (match_operand:VF1_AVX512VL 1 "register_operand"))] - "TARGET_AVX512DQ" + (match_operand:VF1_AVX512VLDQ_AVX10_1 1 "register_operand"))] + "" { rtx tem; rtx (*gen) (rtx, rtx); @@ -11812,16 +11836,19 @@ (set_attr "prefix" "evex") (set_attr "mode" "")]) -(define_mode_iterator VI48F_256_DQ - [V8SI V8SF (V4DI "TARGET_AVX512DQ") (V4DF "TARGET_AVX512DQ")]) +(define_mode_iterator VI48F_256_DQVL_AVX10_1 + [(V8SI "TARGET_AVX512VL") + (V8SF "TARGET_AVX512VL") + (V4DI "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V4DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) (define_expand "avx512vl_vextractf128" [(match_operand: 0 "nonimmediate_operand") - (match_operand:VI48F_256_DQ 1 "register_operand") + (match_operand:VI48F_256_DQVL_AVX10_1 1 "register_operand") (match_operand:SI 2 "const_0_to_1_operand") (match_operand: 3 "nonimm_or_0_operand") (match_operand:QI 4 "register_operand")] - "TARGET_AVX512VL" + "" { rtx (*insn)(rtx, rtx, rtx, rtx); rtx dest = operands[0]; @@ -11960,8 +11987,7 @@ (parallel [(const_int 0) (const_int 1)])) (match_operand: 2 "nonimm_or_0_operand" "0C,0") (match_operand:QI 3 "register_operand" "Yk,Yk")))] - "TARGET_AVX512DQ - && TARGET_AVX512VL + "((TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1) && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))" "vextract64x2\t{$0x0, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x0}" [(set_attr "type" "sselog1") @@ -11997,8 +12023,7 @@ (parallel [(const_int 2) (const_int 3)])) (match_operand: 2 "nonimm_or_0_operand" "0C,0") (match_operand:QI 3 "register_operand" "Yk,Yk")))] - "TARGET_AVX512DQ - && TARGET_AVX512VL + "((TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1) && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))" "vextract64x2\t{$0x1, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x1}" [(set_attr "type" "sselog1") @@ -12013,13 +12038,10 @@ (parallel [(const_int 2) (const_int 3)])))] "TARGET_AVX" { - if (TARGET_AVX512VL) - { - if (TARGET_AVX512DQ) - return "vextract64x2\t{$0x1, %1, %0|%0, %1, 0x1}"; - else - return "vextract32x4\t{$0x1, %1, %0|%0, %1, 0x1}"; - } + if ((TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1) + return "vextract64x2\t{$0x1, %1, %0|%0, %1, 0x1}"; + else if (TARGET_AVX512VL) + return "vextract32x4\t{$0x1, %1, %0|%0, %1, 0x1}"; else return "vextract\t{$0x1, %1, %0|%0, %1, 0x1}"; } @@ -27201,7 +27223,7 @@ (match_operand:SI 3 "const_0_to_1_operand") (match_operand:VI48F_256 4 "register_operand") (match_operand: 5 "register_operand")] - "TARGET_AVX512VL" + "TARGET_AVX512VL || TARGET_AVX10_1" { rtx (*insn)(rtx, rtx, rtx, rtx, rtx); @@ -27256,7 +27278,7 @@ (parallel [(const_int 2) (const_int 3)]))))] "TARGET_AVX && " { - if (TARGET_AVX512DQ) + if ((TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1) return "vinsert64x2\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"; else if (TARGET_AVX512VL) return "vinsert32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"; @@ -27278,7 +27300,7 @@ (match_operand: 2 "nonimmediate_operand" "vm")))] "TARGET_AVX && " { - if (TARGET_AVX512DQ) + if ((TARGET_AVX512DQ && TARGET_AVX512VL)|| TARGET_AVX10_1) return "vinsert64x2\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"; else if (TARGET_AVX512VL) return "vinsert32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"; @@ -28549,13 +28571,13 @@ "operands[2] = CONST0_RTX (mode);") (define_insn "avx512dq_rangep" - [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v") - (unspec:VF_AVX512VL - [(match_operand:VF_AVX512VL 1 "register_operand" "v") - (match_operand:VF_AVX512VL 2 "" "") + [(set (match_operand:VF_AVX512VLDQ_AVX10_1 0 "register_operand" "=v") + (unspec:VF_AVX512VLDQ_AVX10_1 + [(match_operand:VF_AVX512VLDQ_AVX10_1 1 "register_operand" "v") + (match_operand:VF_AVX512VLDQ_AVX10_1 2 "" "") (match_operand:SI 3 "const_0_to_15_operand")] UNSPEC_RANGE))] - "TARGET_AVX512DQ && " + "" { if (TARGET_DEST_FALSE_DEP_FOR_GLC && @@ -28594,10 +28616,10 @@ (define_insn "avx512dq_fpclass" [(set (match_operand: 0 "register_operand" "=k") (unspec: - [(match_operand:VFH_AVX512VL 1 "vector_operand" "vm") + [(match_operand:VFH_AVX512VLDQ_AVX10_1 1 "vector_operand" "vm") (match_operand 2 "const_0_to_255_operand")] UNSPEC_FPCLASS))] - "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(mode)" + "" "vfpclass\t{%2, %1, %0|%0, %1, %2}"; [(set_attr "type" "sse") (set_attr "length_immediate" "1") diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md index 59c4b395a9d..fe923458ab8 100644 --- a/gcc/config/i386/subst.md +++ b/gcc/config/i386/subst.md @@ -65,7 +65,7 @@ || TARGET_AVX10_1)") (define_subst_attr "mask_avx512vl_condition" "mask" "1" "(TARGET_AVX512VL || TARGET_AVX10_1)") (define_subst_attr "mask_avx512bw_condition" "mask" "1" "TARGET_AVX512BW") -(define_subst_attr "mask_avx512dq_condition" "mask" "1" "TARGET_AVX512DQ") +(define_subst_attr "mask_avx512dq_condition" "mask" "1" "(TARGET_AVX512DQ || TARGET_AVX10_1)") (define_subst_attr "mask_prefix" "mask" "vex" "evex") (define_subst_attr "mask_prefix2" "mask" "maybe_vex" "evex") (define_subst_attr "mask_prefix3" "mask" "orig,vex" "evex,evex") @@ -120,7 +120,7 @@ (define_subst "mask_scalar_merge" [(set (match_operand:SUBST_S 0) (match_operand:SUBST_S 1))] - "TARGET_AVX512F" + "TARGET_AVX512F || TARGET_AVX10_1" [(set (match_dup 0) (and:SUBST_S (match_dup 1) From patchwork Tue Aug 8 07:20:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 132516 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c44e:0:b0:3f2:4152:657d with SMTP id w14csp1937670vqr; Tue, 8 Aug 2023 00:24:03 -0700 (PDT) X-Google-Smtp-Source: AGHT+IF7BnQuarcv95aJ1Og0zdSPB9+MSox01t/CmBjXvoRZVBqk5M6+csNhtWcFIaPzWwszR+r2 X-Received: by 2002:aa7:d9d0:0:b0:523:3f45:5678 with SMTP id v16-20020aa7d9d0000000b005233f455678mr2474369eds.31.1691479443166; Tue, 08 Aug 2023 00:24:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691479443; cv=none; d=google.com; s=arc-20160816; b=YR62mxlrbBidwt44jj5/3AbAs7IoUtWnxD7A4q+H0iF44clNJVyTi5R/27ZaJoFWm3 dZJ5lqsPbor60scKr34BzoQ91cAV3jlD9MU9UUmVA/wUYr8+5MAJD4G7+78IsAqhi5t5 XmWMCltT9eTKIzh0na0Lw6vupx6OB54YbHUywxxzQh5UWyd0UGD1pM5JLSRO6bJC2ysT h0F7yOH59K/tbrzv75udG+6Ay3LScriTfsgs07ruVRREXX88JiAjRFoEeZ95mGWJSwm4 V2Mjk2fVw7v3lUiXLXNL6jhNujV+6jwmG+70yOQ0hVYHo2fMOSCCNyrXu36rCQSqGs62 vDfQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=yvhgCpGNnjo9bYXowlWwzpxIWA6dkyGS9/A3MfAxs+Q=; fh=pQVlhs2+rlZ1Q5n//Pi5Wub1q9awKW2/7mZGTpWj3XA=; b=tBlnVQA53yeCA8KliPqkBGTXn3hJVYK70XF4a1OQqwAaXvquFIn6FUI/Ukezq/tfL5 7XuBE2MzD/SlsUYPRdahWnc5E2vbc/PchaQORCcDWZVcKVdBPaMyaKuy8618EZ0Ftv07 K1Jh7hZVFrqxGDctdtEs2yEZUVr3Ym4YcYIpoiPPW921cwC5UflZRj5w/6uy9op89jeQ 11VVyY+QzivK7TbveeqQyALiU1AM9UNAA9Mhvg2Tw1EOsZAdQ3zizhk3A26/mtD1vuT6 ZAtJt1L90CawwuItdZVIRLGxwP7P6lJZKvbid8yCx4nAZV+xdNX59ZooWlwvY7ZrbkRK ul0w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=YYXItsAR; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id d22-20020aa7d696000000b005225516c0b7si7156923edr.672.2023.08.08.00.24.02 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Aug 2023 00:24:03 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=YYXItsAR; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0F2023858025 for ; Tue, 8 Aug 2023 07:22:50 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0F2023858025 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691479370; bh=yvhgCpGNnjo9bYXowlWwzpxIWA6dkyGS9/A3MfAxs+Q=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=YYXItsAR7jfUe6LwsVEJjQ7MyOQP3gztw9LOKsTye+qE+Llitfk5PiNU+rPd6SMKx /CML343WJw7KLifIiscVieyVJNIyy9UrQw4C/zzswUytXFvKsV2yQS9yrPSxEgkZOE nItT0XoWxtmlkARTe/jN/1GEmMp68m5/bCLT4L8M= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.20]) by sourceware.org (Postfix) with ESMTPS id 89A2B385B81D for ; Tue, 8 Aug 2023 07:21:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 89A2B385B81D X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="360834443" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="360834443" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Aug 2023 00:21:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="977746926" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="977746926" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga006.fm.intel.com with ESMTP; 08 Aug 2023 00:21:00 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 7CDB11005608; Tue, 8 Aug 2023 15:20:59 +0800 (CST) To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 6/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins Date: Tue, 8 Aug 2023 15:20:59 +0800 Message-Id: <20230808072059.1570341-1-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808071312.1569559-1-haochen.jiang@intel.com> References: <20230808071312.1569559-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773644548037755749 X-GMAIL-MSGID: 1773644748726874737 gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_1-vextractf64x2-1.c: New test. * gcc.target/i386/avx10_1-vextracti64x2-1.c: Ditto. * gcc.target/i386/avx10_1-vfpclasspd-1.c: Ditto. * gcc.target/i386/avx10_1-vfpclassps-1.c: Ditto. * gcc.target/i386/avx10_1-vinsertf64x2-1.c: Ditto. * gcc.target/i386/avx10_1-vinserti64x2-1.c: Ditto. * gcc.target/i386/avx10_1-vrangepd-1.c: Ditto. * gcc.target/i386/avx10_1-vrangeps-1.c: Ditto. * gcc.target/i386/avx10_1-vreducepd-1.c: Ditto. * gcc.target/i386/avx10_1-vreduceps-1.c: Ditto. --- .../gcc.target/i386/avx10_1-vextractf64x2-1.c | 18 ++++++++++++ .../gcc.target/i386/avx10_1-vextracti64x2-1.c | 19 ++++++++++++ .../gcc.target/i386/avx10_1-vfpclasspd-1.c | 21 ++++++++++++++ .../gcc.target/i386/avx10_1-vfpclassps-1.c | 21 ++++++++++++++ .../gcc.target/i386/avx10_1-vinsertf64x2-1.c | 18 ++++++++++++ .../gcc.target/i386/avx10_1-vinserti64x2-1.c | 18 ++++++++++++ .../gcc.target/i386/avx10_1-vrangepd-1.c | 27 +++++++++++++++++ .../gcc.target/i386/avx10_1-vrangeps-1.c | 27 +++++++++++++++++ .../gcc.target/i386/avx10_1-vreducepd-1.c | 29 +++++++++++++++++++ .../gcc.target/i386/avx10_1-vreduceps-1.c | 29 +++++++++++++++++++ 10 files changed, 227 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vextractf64x2-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vextracti64x2-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vfpclasspd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vfpclassps-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vinsertf64x2-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vinserti64x2-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vrangepd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vrangeps-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vreducepd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vreduceps-1.c diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vextractf64x2-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vextractf64x2-1.c new file mode 100644 index 00000000000..4c7e54dc198 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vextractf64x2-1.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vextractf64x2\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+.{7}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vextractf64x2\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+.{7}\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vextractf64x2\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+.{7}\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256d x; +volatile __m128d y; + +void extern +avx10_1_test (void) +{ + y = _mm256_extractf64x2_pd (x, 1); + y = _mm256_mask_extractf64x2_pd (y, 2, x, 1); + y = _mm256_maskz_extractf64x2_pd (2, x, 1); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vextracti64x2-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vextracti64x2-1.c new file mode 100644 index 00000000000..c0bd7700d52 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vextracti64x2-1.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vextracti64x2\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+.{7}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vextracti64x2\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+.{7}\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vextracti64x2\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+.{7}\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x; +volatile __m128i y; + +void extern +avx10_1_test (void) +{ + y = _mm256_extracti64x2_epi64 (x, 1); + y = _mm256_mask_extracti64x2_epi64 (y, 2, x, 1); + y = _mm256_maskz_extracti64x2_epi64 (2, x, 1); +} + diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasspd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasspd-1.c new file mode 100644 index 00000000000..806ba800023 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasspd-1.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vfpclasspdy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclasspdx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclasspdy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclasspdx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256d x256; +volatile __m128d x128; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + m = _mm256_fpclass_pd_mask (x256, 13); + m = _mm_fpclass_pd_mask (x128, 13); + m = _mm256_mask_fpclass_pd_mask (2, x256, 13); + m = _mm_mask_fpclass_pd_mask (2, x128, 13); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassps-1.c new file mode 100644 index 00000000000..174903c7676 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassps-1.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vfpclasspsy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclasspsx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclasspsy\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclasspsx\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256 x256; +volatile __m128 x128; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + m = _mm256_fpclass_ps_mask (x256, 13); + m = _mm_fpclass_ps_mask (x128, 13); + m = _mm256_mask_fpclass_ps_mask (2, x256, 13); + m = _mm_mask_fpclass_ps_mask (2, x128, 13); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vinsertf64x2-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vinsertf64x2-1.c new file mode 100644 index 00000000000..5a196844e76 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vinsertf64x2-1.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vinsertf64x2\[^\n\]*ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vinsertf64x2\[^\n\]*ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vinsertf64x2\[^\n\]*ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256d x; +volatile __m128d y; + +void extern +avx10_1_test (void) +{ + x = _mm256_insertf64x2 (x, y, 1); + x = _mm256_mask_insertf64x2 (x, 2, x, y, 1); + x = _mm256_maskz_insertf64x2 (2, x, y, 1); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vinserti64x2-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vinserti64x2-1.c new file mode 100644 index 00000000000..69ee06f0f08 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vinserti64x2-1.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vinserti64x2\[^\n\]*ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vinserti64x2\[^\n\]*ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vinserti64x2\[^\n\]*ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256i x; +volatile __m128i y; + +void extern +avx10_1_test (void) +{ + x = _mm256_inserti64x2 (x, y, 1); + x = _mm256_mask_inserti64x2 (x, 2, x, y, 1); + x = _mm256_maskz_inserti64x2 (2, x, y, 1); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vrangepd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vrangepd-1.c new file mode 100644 index 00000000000..995b6de64ae --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vrangepd-1.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vrangepd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangepd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangepd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangepd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangepd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangepd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256d y; +volatile __m128d x; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + y = _mm256_range_pd (y, y, 15); + x = _mm_range_pd (x, x, 15); + + y = _mm256_mask_range_pd (y, m, y, y, 15); + x = _mm_mask_range_pd (x, m, x, x, 15); + + y = _mm256_maskz_range_pd (m, y, y, 15); + x = _mm_maskz_range_pd (m, x, x, 15); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vrangeps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vrangeps-1.c new file mode 100644 index 00000000000..faf844a9ae1 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vrangeps-1.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vrangeps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangeps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangeps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangeps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangeps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangeps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +volatile __m256 y; +volatile __m128 x; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + y = _mm256_range_ps (y, y, 15); + x = _mm_range_ps (x, x, 15); + + y = _mm256_mask_range_ps (y, m, y, y, 15); + x = _mm_mask_range_ps (x, m, x, x, 15); + + y = _mm256_maskz_range_ps (m, y, y, 15); + x = _mm_maskz_range_ps (m, x, x, 15); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vreducepd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vreducepd-1.c new file mode 100644 index 00000000000..76bcec0d2f6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vreducepd-1.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vreducepd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducepd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducepd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducepd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducepd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducepd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +#define IMM 123 + +volatile __m256d x1; +volatile __m128d x2; +volatile __mmask8 m; + +void extern +avx156p_test (void) +{ + x1 = _mm256_reduce_pd (x1, IMM); + x2 = _mm_reduce_pd (x2, IMM); + + x1 = _mm256_mask_reduce_pd (x1, m, x1, IMM); + x2 = _mm_mask_reduce_pd (x2, m, x2, IMM); + + x1 = _mm256_maskz_reduce_pd (m, x1, IMM); + x2 = _mm_maskz_reduce_pd (m, x2, IMM); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vreduceps-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vreduceps-1.c new file mode 100644 index 00000000000..9d3aeb362fc --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vreduceps-1.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vreduceps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreduceps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreduceps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreduceps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreduceps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreduceps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +#define IMM 123 + +volatile __m256 x1; +volatile __m128 x2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + x1 = _mm256_reduce_ps (x1, IMM); + x2 = _mm_reduce_ps (x2, IMM); + + x1 = _mm256_mask_reduce_ps (x1, m, x1, IMM); + x2 = _mm_mask_reduce_ps (x2, m, x2, IMM); + + x1 = _mm256_maskz_reduce_ps (m, x1, IMM); + x2 = _mm_maskz_reduce_ps (m, x2, IMM); +}