From patchwork Thu Aug 17 06:55:08 2023
From: Haochen Jiang <haochen.jiang@intel.com>
X-Patchwork-Id: 135853
To: gcc-patches@gcc.gnu.org
Cc: hongtao.liu@intel.com, ubizjak@gmail.com
Subject: [PATCH 1/2] Support AVX10.1 for AVX512DQ intrins
Date: Thu, 17 Aug 2023 14:55:08 +0800
Message-Id: <20230817065509.130068-2-haochen.jiang@intel.com>
In-Reply-To: <20230817065509.130068-1-haochen.jiang@intel.com>
References: <20230817065509.130068-1-haochen.jiang@intel.com>
List-Id: Gcc-patches mailing list
gcc/ChangeLog:

	* config.gcc: Add avx512dqavx10_1intrin.h.
	* config/i386/avx512dqintrin.h: Move avx10_1 related intrins
	to new intrin file.
	* config/i386/i386-builtin.def (BDESC): Add
	OPTION_MASK_ISA2_AVX10_1.
	* config/i386/i386.md (x64_avx512dq): Rename to
	x64_avx10_1_or_avx512dq.  Add TARGET_AVX10_1.
	(*movqi_internal): Add TARGET_AVX10_1.
	* config/i386/immintrin.h: Add avx512dqavx10_1intrin.h.
	* config/i386/sse.md (SWI1248_AVX512BWDQ): Add TARGET_AVX10_1
	and TARGET_AVX512F.
	(SWI1248_AVX512BW): Ditto.
	(SWI1248_AVX512BWDQ2): Ditto.
	(kmov): Remove TARGET_AVX512F check.
	(k): Remove TARGET_AVX512F check.  Add TARGET_AVX10_1.
	(kandn): Ditto.
	(kxnor): Ditto.
	(knot): Ditto.
	(kadd): Remove TARGET_AVX512F check.
	(k): Ditto.
	(ktest): Ditto.
	(kortest): Ditto.
	(reduces): Add TARGET_AVX10_1.
	(pinsr_evex_isa): Change avx512dq to avx10_1_or_avx512dq.
	(*vec_extractv4si): Ditto.
	(*vec_extractv4si_zext): Ditto.
	(*vec_concatv2si_sse4_1): Ditto.
	(*vec_extractv2di_1): Change x64_avx512dq to
	x64_avx10_1_or_avx512dq.
	(vec_concatv2di): Ditto.
	(avx512dq_ranges): Add TARGET_AVX10_1.
	(avx512dq_vmfpclass): Ditto.
	* config/i386/subst.md (mask_scalar): Ditto.
	(round_saeonly_scalar): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/sse-26.c: Skip avx512dqavx10_1intrin.h.
---
 gcc/config.gcc                          |   9 +-
 gcc/config/i386/avx512dqavx10_1intrin.h | 634 ++++++++++++++++++++++++
 gcc/config/i386/avx512dqintrin.h        | 602 ----------------------
 gcc/config/i386/i386-builtin.def        |  50 +-
 gcc/config/i386/i386.md                 |   8 +-
 gcc/config/i386/immintrin.h             |   2 +
 gcc/config/i386/sse.md                  |  63 +--
 gcc/config/i386/subst.md                |   4 +-
 gcc/testsuite/gcc.target/i386/sse-26.c  |   1 +
 9 files changed, 706 insertions(+), 667 deletions(-)
 create mode 100644 gcc/config/i386/avx512dqavx10_1intrin.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 415e0e1ebc5..9b1be5350cd 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -415,10 +415,11 @@ i[34567]86-*-* | x86_64-*-*)
 	adxintrin.h fxsrintrin.h xsaveintrin.h xsaveoptintrin.h
 	avx512cdintrin.h avx512erintrin.h avx512pfintrin.h
 	shaintrin.h clflushoptintrin.h xsavecintrin.h
-	xsavesintrin.h avx512dqintrin.h avx512bwintrin.h
-	avx512vlintrin.h avx512vlbwintrin.h avx512vldqintrin.h
-	avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
-	avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
+	xsavesintrin.h avx512dqintrin.h avx512dqavx10_1intrin.h
+	avx512bwintrin.h avx512vlintrin.h avx512vlbwintrin.h
+	avx512vldqintrin.h avx512ifmaintrin.h avx512ifmavlintrin.h
+	avx512vbmiintrin.h avx512vbmivlintrin.h
+	avx5124fmapsintrin.h avx5124vnniwintrin.h
 	avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
 	clzerointrin.h pkuintrin.h sgxintrin.h cetintrin.h
 	gfniintrin.h cet.h avx512vbmi2intrin.h
diff --git a/gcc/config/i386/avx512dqavx10_1intrin.h b/gcc/config/i386/avx512dqavx10_1intrin.h
new file mode 100644
index 00000000000..4621f24863b
--- /dev/null
+++ b/gcc/config/i386/avx512dqavx10_1intrin.h
@@ -0,0 +1,634 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+ + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +#ifndef _IMMINTRIN_H_INCLUDED +#error "Never use directly; include instead." +#endif + +#ifndef _AVX512DQAVX10_1INTRIN_H_INCLUDED +#define _AVX512DQAVX10_1INTRIN_H_INCLUDED + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_ktest_mask8_u8 (__mmask8 __A, __mmask8 __B, unsigned char *__CF) +{ + *__CF = (unsigned char) __builtin_ia32_ktestcqi (__A, __B); + return (unsigned char) __builtin_ia32_ktestzqi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_ktestz_mask8_u8 (__mmask8 __A, __mmask8 __B) +{ + return (unsigned char) __builtin_ia32_ktestzqi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_ktestc_mask8_u8 (__mmask8 __A, __mmask8 __B) +{ + return (unsigned char) __builtin_ia32_ktestcqi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_ktest_mask16_u8 (__mmask16 __A, __mmask16 __B, unsigned char *__CF) +{ + *__CF = (unsigned char) __builtin_ia32_ktestchi (__A, __B); + return (unsigned char) __builtin_ia32_ktestzhi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_ktestz_mask16_u8 (__mmask16 __A, __mmask16 
__B) +{ + return (unsigned char) __builtin_ia32_ktestzhi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_ktestc_mask16_u8 (__mmask16 __A, __mmask16 __B) +{ + return (unsigned char) __builtin_ia32_ktestchi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kortest_mask8_u8 (__mmask8 __A, __mmask8 __B, unsigned char *__CF) +{ + *__CF = (unsigned char) __builtin_ia32_kortestcqi (__A, __B); + return (unsigned char) __builtin_ia32_kortestzqi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kortestz_mask8_u8 (__mmask8 __A, __mmask8 __B) +{ + return (unsigned char) __builtin_ia32_kortestzqi (__A, __B); +} + +extern __inline unsigned char +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kortestc_mask8_u8 (__mmask8 __A, __mmask8 __B) +{ + return (unsigned char) __builtin_ia32_kortestcqi (__A, __B); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kadd_mask8 (__mmask8 __A, __mmask8 __B) +{ + return (__mmask8) __builtin_ia32_kaddqi ((__mmask8) __A, (__mmask8) __B); +} + +extern __inline __mmask16 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kadd_mask16 (__mmask16 __A, __mmask16 __B) +{ + return (__mmask16) __builtin_ia32_kaddhi ((__mmask16) __A, (__mmask16) __B); +} + +extern __inline unsigned int +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_cvtmask8_u32 (__mmask8 __A) +{ + return (unsigned int) __builtin_ia32_kmovb ((__mmask8 ) __A); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_cvtu32_mask8 (unsigned int __A) +{ + return (__mmask8) __builtin_ia32_kmovb ((__mmask8) __A); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_load_mask8 
(__mmask8 *__A) +{ + return (__mmask8) __builtin_ia32_kmovb (*(__mmask8 *) __A); +} + +extern __inline void +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_store_mask8 (__mmask8 *__A, __mmask8 __B) +{ + *(__mmask8 *) __A = __builtin_ia32_kmovb (__B); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_knot_mask8 (__mmask8 __A) +{ + return (__mmask8) __builtin_ia32_knotqi ((__mmask8) __A); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kor_mask8 (__mmask8 __A, __mmask8 __B) +{ + return (__mmask8) __builtin_ia32_korqi ((__mmask8) __A, (__mmask8) __B); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kxnor_mask8 (__mmask8 __A, __mmask8 __B) +{ + return (__mmask8) __builtin_ia32_kxnorqi ((__mmask8) __A, (__mmask8) __B); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kxor_mask8 (__mmask8 __A, __mmask8 __B) +{ + return (__mmask8) __builtin_ia32_kxorqi ((__mmask8) __A, (__mmask8) __B); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kand_mask8 (__mmask8 __A, __mmask8 __B) +{ + return (__mmask8) __builtin_ia32_kandqi ((__mmask8) __A, (__mmask8) __B); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kandn_mask8 (__mmask8 __A, __mmask8 __B) +{ + return (__mmask8) __builtin_ia32_kandnqi ((__mmask8) __A, (__mmask8) __B); +} + +#ifdef __OPTIMIZE__ +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kshiftli_mask8 (__mmask8 __A, unsigned int __B) +{ + return (__mmask8) __builtin_ia32_kshiftliqi ((__mmask8) __A, (__mmask8) __B); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_kshiftri_mask8 (__mmask8 __A, unsigned int __B) +{ + return (__mmask8) 
__builtin_ia32_kshiftriqi ((__mmask8) __A, (__mmask8) __B); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_reduce_sd (__m128d __A, __m128d __B, int __C) +{ + return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) _mm_setzero_pd (), + (__mmask8) -1); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_reduce_round_sd (__m128d __A, __m128d __B, int __C, const int __R) +{ + return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) + _mm_setzero_pd (), + (__mmask8) -1, __R); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_reduce_sd (__m128d __W, __mmask8 __U, __m128d __A, + __m128d __B, int __C) +{ + return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) __W, + (__mmask8) __U); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_reduce_round_sd (__m128d __W, __mmask8 __U, __m128d __A, + __m128d __B, int __C, const int __R) +{ + return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) __W, + __U, __R); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_reduce_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C) +{ + return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) _mm_setzero_pd (), + (__mmask8) __U); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_reduce_round_sd (__mmask8 __U, __m128d __A, __m128d __B, + int __C, const int __R) +{ + return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) + _mm_setzero_pd (), + __U, __R); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, 
__always_inline__, __artificial__)) +_mm_reduce_ss (__m128 __A, __m128 __B, int __C) +{ + return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) _mm_setzero_ps (), + (__mmask8) -1); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_reduce_round_ss (__m128 __A, __m128 __B, int __C, const int __R) +{ + return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) + _mm_setzero_ps (), + (__mmask8) -1, __R); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_reduce_ss (__m128 __W, __mmask8 __U, __m128 __A, + __m128 __B, int __C) +{ + return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) __W, + (__mmask8) __U); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_reduce_round_ss (__m128 __W, __mmask8 __U, __m128 __A, + __m128 __B, int __C, const int __R) +{ + return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) __W, + __U, __R); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_reduce_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C) +{ + return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) _mm_setzero_ps (), + (__mmask8) __U); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_reduce_round_ss (__mmask8 __U, __m128 __A, __m128 __B, + int __C, const int __R) +{ + return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) + _mm_setzero_ps (), + __U, __R); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_range_sd (__m128d __A, __m128d __B, int __C) +{ + return (__m128d) __builtin_ia32_rangesd128_mask_round 
((__v2df) __A, + (__v2df) __B, __C, + (__v2df) + _mm_setzero_pd (), + (__mmask8) -1, + _MM_FROUND_CUR_DIRECTION); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_range_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B, int __C) +{ + return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) __W, + (__mmask8) __U, + _MM_FROUND_CUR_DIRECTION); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_range_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C) +{ + return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) + _mm_setzero_pd (), + (__mmask8) __U, + _MM_FROUND_CUR_DIRECTION); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_range_ss (__m128 __A, __m128 __B, int __C) +{ + return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) + _mm_setzero_ps (), + (__mmask8) -1, + _MM_FROUND_CUR_DIRECTION); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_range_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B, int __C) +{ + return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) __W, + (__mmask8) __U, + _MM_FROUND_CUR_DIRECTION); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_range_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C) +{ + return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) + _mm_setzero_ps (), + (__mmask8) __U, + _MM_FROUND_CUR_DIRECTION); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_range_round_sd (__m128d __A, __m128d __B, int __C, const int __R) +{ + return (__m128d) 
__builtin_ia32_rangesd128_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) + _mm_setzero_pd (), + (__mmask8) -1, __R); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_range_round_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B, + int __C, const int __R) +{ + return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) __W, + (__mmask8) __U, __R); +} + +extern __inline __m128d +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_range_round_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C, + const int __R) +{ + return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, + (__v2df) __B, __C, + (__v2df) + _mm_setzero_pd (), + (__mmask8) __U, __R); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_range_round_ss (__m128 __A, __m128 __B, int __C, const int __R) +{ + return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) + _mm_setzero_ps (), + (__mmask8) -1, __R); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_range_round_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B, + int __C, const int __R) +{ + return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) __W, + (__mmask8) __U, __R); +} + +extern __inline __m128 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_maskz_range_round_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C, + const int __R) +{ + return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, + (__v4sf) __B, __C, + (__v4sf) + _mm_setzero_ps (), + (__mmask8) __U, __R); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_fpclass_ss_mask (__m128 __A, const int __imm) +{ + return (__mmask8) 
__builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm, + (__mmask8) -1); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_fpclass_sd_mask (__m128d __A, const int __imm) +{ + return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm, + (__mmask8) -1); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_fpclass_ss_mask (__mmask8 __U, __m128 __A, const int __imm) +{ + return (__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm, __U); +} + +extern __inline __mmask8 +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm) +{ + return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm, __U); +} + +#else +#define _kshiftli_mask8(X, Y) \ + ((__mmask8) __builtin_ia32_kshiftliqi ((__mmask8)(X), (__mmask8)(Y))) + +#define _kshiftri_mask8(X, Y) \ + ((__mmask8) __builtin_ia32_kshiftriqi ((__mmask8)(X), (__mmask8)(Y))) + +#define _mm_range_sd(A, B, C) \ + ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ + (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ + (__mmask8) -1, _MM_FROUND_CUR_DIRECTION)) + +#define _mm_mask_range_sd(W, U, A, B, C) \ + ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ + (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), \ + (__mmask8)(U), _MM_FROUND_CUR_DIRECTION)) + +#define _mm_maskz_range_sd(U, A, B, C) \ + ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ + (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ + (__mmask8)(U), _MM_FROUND_CUR_DIRECTION)) + +#define _mm_range_ss(A, B, C) \ + ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \ + (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ + (__mmask8) -1, _MM_FROUND_CUR_DIRECTION)) + +#define _mm_mask_range_ss(W, U, A, B, C) \ + ((__m128) 
__builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), \
+  (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_maskz_range_ss(U, A, B, C) \
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+  (__mmask8)(U), _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_range_round_sd(A, B, C, R) \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+  (__mmask8) -1, (R)))
+
+#define _mm_mask_range_round_sd(W, U, A, B, C, R) \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), \
+  (__mmask8)(U), (R)))
+
+#define _mm_maskz_range_round_sd(U, A, B, C, R) \
+  ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+  (__mmask8)(U), (R)))
+
+#define _mm_range_round_ss(A, B, C, R) \
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+  (__mmask8) -1, (R)))
+
+#define _mm_mask_range_round_ss(W, U, A, B, C, R) \
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), \
+  (__mmask8)(U), (R)))
+
+#define _mm_maskz_range_round_ss(U, A, B, C, R) \
+  ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+  (__mmask8)(U), (R)))
+
+#define _mm_fpclass_ss_mask(X, C) \
+  ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X), \
+  (int) (C), (__mmask8) (-1)))
+
+#define _mm_fpclass_sd_mask(X, C) \
+  ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X), \
+  (int) (C), (__mmask8) (-1)))
+
+#define _mm_mask_fpclass_ss_mask(U, X, C) \
+  ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X), \
+  (int) (C), (__mmask8) (U)))
+
+#define _mm_mask_fpclass_sd_mask(U, X, C) \
+  ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X), \
+  (int) (C), (__mmask8) (U)))
+
+#define _mm_reduce_sd(A, B, C) \
+  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+  (__mmask8)-1))
+
+#define _mm_mask_reduce_sd(W, U, A, B, C) \
+  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), (__mmask8)(U)))
+
+#define _mm_maskz_reduce_sd(U, A, B, C) \
+  ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+  (__mmask8)(U)))
+
+#define _mm_reduce_round_sd(A, B, C, R) \
+  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+  (__mmask8)-1, (int)(R)))
+
+#define _mm_mask_reduce_round_sd(W, U, A, B, C, R) \
+  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), \
+  (__mmask8)(U), (int)(R)))
+
+#define _mm_maskz_reduce_round_sd(U, A, B, C, R) \
+  ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \
+  (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+  (__mmask8)(U), (int)(R)))
+
+#define _mm_reduce_ss(A, B, C) \
+  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+  (__mmask8)-1))
+
+#define _mm_mask_reduce_ss(W, U, A, B, C) \
+  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), (__mmask8)(U)))
+
+#define _mm_maskz_reduce_ss(U, A, B, C) \
+  ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+  (__mmask8)(U)))
+
+#define _mm_reduce_round_ss(A, B, C, R) \
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+  (__mmask8)-1, (int)(R)))
+
+#define _mm_mask_reduce_round_ss(W, U, A, B, C, R) \
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), \
+  (__mmask8)(U), (int)(R)))
+
+#define _mm_maskz_reduce_round_ss(U, A, B, C, R) \
+  ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A), \
+  (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+  (__mmask8)(U), (int)(R)))
+
+#endif
+
+#endif /* _AVX512DQAVX10_1INTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/avx512dqintrin.h b/gcc/config/i386/avx512dqintrin.h
index 93900a0b5c7..64321e47131 100644
--- a/gcc/config/i386/avx512dqintrin.h
+++ b/gcc/config/i386/avx512dqintrin.h
@@ -34,156 +34,6 @@
 #define __DISABLE_AVX512DQ__
 #endif /* __AVX512DQ__ */
 
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktest_mask8_u8 (__mmask8 __A, __mmask8 __B, unsigned char *__CF)
-{
-  *__CF = (unsigned char) __builtin_ia32_ktestcqi (__A, __B);
-  return (unsigned char) __builtin_ia32_ktestzqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestz_mask8_u8 (__mmask8 __A, __mmask8 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestzqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestc_mask8_u8 (__mmask8 __A, __mmask8 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestcqi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktest_mask16_u8 (__mmask16 __A, __mmask16 __B, unsigned char *__CF)
-{
-  *__CF = (unsigned char) __builtin_ia32_ktestchi (__A, __B);
-  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, 
__artificial__)) -_ktestz_mask16_u8 (__mmask16 __A, __mmask16 __B) -{ - return (unsigned char) __builtin_ia32_ktestzhi (__A, __B); -} - -extern __inline unsigned char -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_ktestc_mask16_u8 (__mmask16 __A, __mmask16 __B) -{ - return (unsigned char) __builtin_ia32_ktestchi (__A, __B); -} - -extern __inline unsigned char -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kortest_mask8_u8 (__mmask8 __A, __mmask8 __B, unsigned char *__CF) -{ - *__CF = (unsigned char) __builtin_ia32_kortestcqi (__A, __B); - return (unsigned char) __builtin_ia32_kortestzqi (__A, __B); -} - -extern __inline unsigned char -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kortestz_mask8_u8 (__mmask8 __A, __mmask8 __B) -{ - return (unsigned char) __builtin_ia32_kortestzqi (__A, __B); -} - -extern __inline unsigned char -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kortestc_mask8_u8 (__mmask8 __A, __mmask8 __B) -{ - return (unsigned char) __builtin_ia32_kortestcqi (__A, __B); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kadd_mask8 (__mmask8 __A, __mmask8 __B) -{ - return (__mmask8) __builtin_ia32_kaddqi ((__mmask8) __A, (__mmask8) __B); -} - -extern __inline __mmask16 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kadd_mask16 (__mmask16 __A, __mmask16 __B) -{ - return (__mmask16) __builtin_ia32_kaddhi ((__mmask16) __A, (__mmask16) __B); -} - -extern __inline unsigned int -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_cvtmask8_u32 (__mmask8 __A) -{ - return (unsigned int) __builtin_ia32_kmovb ((__mmask8 ) __A); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_cvtu32_mask8 (unsigned int __A) -{ - return (__mmask8) __builtin_ia32_kmovb ((__mmask8) __A); -} - -extern __inline __mmask8 -__attribute__ 
((__gnu_inline__, __always_inline__, __artificial__)) -_load_mask8 (__mmask8 *__A) -{ - return (__mmask8) __builtin_ia32_kmovb (*(__mmask8 *) __A); -} - -extern __inline void -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_store_mask8 (__mmask8 *__A, __mmask8 __B) -{ - *(__mmask8 *) __A = __builtin_ia32_kmovb (__B); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_knot_mask8 (__mmask8 __A) -{ - return (__mmask8) __builtin_ia32_knotqi ((__mmask8) __A); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kor_mask8 (__mmask8 __A, __mmask8 __B) -{ - return (__mmask8) __builtin_ia32_korqi ((__mmask8) __A, (__mmask8) __B); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kxnor_mask8 (__mmask8 __A, __mmask8 __B) -{ - return (__mmask8) __builtin_ia32_kxnorqi ((__mmask8) __A, (__mmask8) __B); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kxor_mask8 (__mmask8 __A, __mmask8 __B) -{ - return (__mmask8) __builtin_ia32_kxorqi ((__mmask8) __A, (__mmask8) __B); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kand_mask8 (__mmask8 __A, __mmask8 __B) -{ - return (__mmask8) __builtin_ia32_kandqi ((__mmask8) __A, (__mmask8) __B); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kandn_mask8 (__mmask8 __A, __mmask8 __B) -{ - return (__mmask8) __builtin_ia32_kandnqi ((__mmask8) __A, (__mmask8) __B); -} - extern __inline __m512d __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm512_broadcast_f64x2 (__m128d __A) @@ -1070,20 +920,6 @@ _mm512_maskz_cvtepu64_pd (__mmask8 __U, __m512i __A) } #ifdef __OPTIMIZE__ -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kshiftli_mask8 (__mmask8 __A, 
unsigned int __B) -{ - return (__mmask8) __builtin_ia32_kshiftliqi ((__mmask8) __A, (__mmask8) __B); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_kshiftri_mask8 (__mmask8 __A, unsigned int __B) -{ - return (__mmask8) __builtin_ia32_kshiftriqi ((__mmask8) __A, (__mmask8) __B); -} - extern __inline __m512d __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm512_range_pd (__m512d __A, __m512d __B, int __C) @@ -1156,305 +992,6 @@ _mm512_maskz_range_ps (__mmask16 __U, __m512 __A, __m512 __B, int __C) _MM_FROUND_CUR_DIRECTION); } -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_reduce_sd (__m128d __A, __m128d __B, int __C) -{ - return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) _mm_setzero_pd (), - (__mmask8) -1); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_reduce_round_sd (__m128d __A, __m128d __B, int __C, const int __R) -{ - return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) - _mm_setzero_pd (), - (__mmask8) -1, __R); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_reduce_sd (__m128d __W, __mmask8 __U, __m128d __A, - __m128d __B, int __C) -{ - return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) __W, - (__mmask8) __U); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_reduce_round_sd (__m128d __W, __mmask8 __U, __m128d __A, - __m128d __B, int __C, const int __R) -{ - return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) __W, - __U, __R); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_reduce_sd (__mmask8 __U, __m128d __A, __m128d __B, 
int __C) -{ - return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) _mm_setzero_pd (), - (__mmask8) __U); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_reduce_round_sd (__mmask8 __U, __m128d __A, __m128d __B, - int __C, const int __R) -{ - return (__m128d) __builtin_ia32_reducesd_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) - _mm_setzero_pd (), - __U, __R); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_reduce_ss (__m128 __A, __m128 __B, int __C) -{ - return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) _mm_setzero_ps (), - (__mmask8) -1); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_reduce_round_ss (__m128 __A, __m128 __B, int __C, const int __R) -{ - return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) - _mm_setzero_ps (), - (__mmask8) -1, __R); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_reduce_ss (__m128 __W, __mmask8 __U, __m128 __A, - __m128 __B, int __C) -{ - return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) __W, - (__mmask8) __U); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_reduce_round_ss (__m128 __W, __mmask8 __U, __m128 __A, - __m128 __B, int __C, const int __R) -{ - return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) __W, - __U, __R); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_reduce_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C) -{ - return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) _mm_setzero_ps (), - (__mmask8) __U); -} 
- -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_reduce_round_ss (__mmask8 __U, __m128 __A, __m128 __B, - int __C, const int __R) -{ - return (__m128) __builtin_ia32_reducess_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) - _mm_setzero_ps (), - __U, __R); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_range_sd (__m128d __A, __m128d __B, int __C) -{ - return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) - _mm_setzero_pd (), - (__mmask8) -1, - _MM_FROUND_CUR_DIRECTION); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_range_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B, int __C) -{ - return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) __W, - (__mmask8) __U, - _MM_FROUND_CUR_DIRECTION); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_range_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C) -{ - return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) - _mm_setzero_pd (), - (__mmask8) __U, - _MM_FROUND_CUR_DIRECTION); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_range_ss (__m128 __A, __m128 __B, int __C) -{ - return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) - _mm_setzero_ps (), - (__mmask8) -1, - _MM_FROUND_CUR_DIRECTION); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_range_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B, int __C) -{ - return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) __W, - (__mmask8) __U, - _MM_FROUND_CUR_DIRECTION); -} - - -extern 
__inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_range_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C) -{ - return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) - _mm_setzero_ps (), - (__mmask8) __U, - _MM_FROUND_CUR_DIRECTION); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_range_round_sd (__m128d __A, __m128d __B, int __C, const int __R) -{ - return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) - _mm_setzero_pd (), - (__mmask8) -1, __R); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_range_round_sd (__m128d __W, __mmask8 __U, __m128d __A, __m128d __B, - int __C, const int __R) -{ - return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) __W, - (__mmask8) __U, __R); -} - -extern __inline __m128d -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_range_round_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C, - const int __R) -{ - return (__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df) __A, - (__v2df) __B, __C, - (__v2df) - _mm_setzero_pd (), - (__mmask8) __U, __R); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_range_round_ss (__m128 __A, __m128 __B, int __C, const int __R) -{ - return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) - _mm_setzero_ps (), - (__mmask8) -1, __R); -} - -extern __inline __m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_range_round_ss (__m128 __W, __mmask8 __U, __m128 __A, __m128 __B, - int __C, const int __R) -{ - return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) __W, - (__mmask8) __U, __R); -} - -extern __inline 
__m128 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_maskz_range_round_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C, - const int __R) -{ - return (__m128) __builtin_ia32_rangess128_mask_round ((__v4sf) __A, - (__v4sf) __B, __C, - (__v4sf) - _mm_setzero_ps (), - (__mmask8) __U, __R); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fpclass_ss_mask (__m128 __A, const int __imm) -{ - return (__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm, - (__mmask8) -1); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_fpclass_sd_mask (__m128d __A, const int __imm) -{ - return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm, - (__mmask8) -1); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_fpclass_ss_mask (__mmask8 __U, __m128 __A, const int __imm) -{ - return (__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) __A, __imm, __U); -} - -extern __inline __mmask8 -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mask_fpclass_sd_mask (__mmask8 __U, __m128d __A, const int __imm) -{ - return (__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) __A, __imm, __U); -} - extern __inline __m512i __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm512_cvtt_roundpd_epi64 (__m512d __A, const int __R) @@ -2395,72 +1932,6 @@ _mm512_fpclass_ps_mask (__m512 __A, const int __imm) } #else -#define _kshiftli_mask8(X, Y) \ - ((__mmask8) __builtin_ia32_kshiftliqi ((__mmask8)(X), (__mmask8)(Y))) - -#define _kshiftri_mask8(X, Y) \ - ((__mmask8) __builtin_ia32_kshiftriqi ((__mmask8)(X), (__mmask8)(Y))) - -#define _mm_range_sd(A, B, C) \ - ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ - (__mmask8) -1, _MM_FROUND_CUR_DIRECTION)) - -#define 
_mm_mask_range_sd(W, U, A, B, C) \ - ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), \ - (__mmask8)(U), _MM_FROUND_CUR_DIRECTION)) - -#define _mm_maskz_range_sd(U, A, B, C) \ - ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ - (__mmask8)(U), _MM_FROUND_CUR_DIRECTION)) - -#define _mm_range_ss(A, B, C) \ - ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ - (__mmask8) -1, _MM_FROUND_CUR_DIRECTION)) - -#define _mm_mask_range_ss(W, U, A, B, C) \ - ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), \ - (__mmask8)(U), _MM_FROUND_CUR_DIRECTION)) - -#define _mm_maskz_range_ss(U, A, B, C) \ - ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ - (__mmask8)(U), _MM_FROUND_CUR_DIRECTION)) - -#define _mm_range_round_sd(A, B, C, R) \ - ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ - (__mmask8) -1, (R))) - -#define _mm_mask_range_round_sd(W, U, A, B, C, R) \ - ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), \ - (__mmask8)(U), (R))) - -#define _mm_maskz_range_round_sd(U, A, B, C, R) \ - ((__m128d) __builtin_ia32_rangesd128_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ - (__mmask8)(U), (R))) - -#define _mm_range_round_ss(A, B, C, R) \ - ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ - (__mmask8) -1, (R))) - -#define _mm_mask_range_round_ss(W, U, A, B, C, R) \ - ((__m128) 
__builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), \ - (__mmask8)(U), (R))) - -#define _mm_maskz_range_round_ss(U, A, B, C, R) \ - ((__m128) __builtin_ia32_rangess128_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ - (__mmask8)(U), (R))) - #define _mm512_cvtt_roundpd_epi64(A, B) \ ((__m512i)__builtin_ia32_cvttpd2qq512_mask ((A), (__v8di) \ _mm512_setzero_si512 (), \ @@ -2792,22 +2263,6 @@ _mm512_fpclass_ps_mask (__m512 __A, const int __imm) (__v16si)(__m512i)_mm512_setzero_si512 (),\ (__mmask16)(U))) -#define _mm_fpclass_ss_mask(X, C) \ - ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X), \ - (int) (C), (__mmask8) (-1))) \ - -#define _mm_fpclass_sd_mask(X, C) \ - ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X), \ - (int) (C), (__mmask8) (-1))) \ - -#define _mm_mask_fpclass_ss_mask(X, C, U) \ - ((__mmask8) __builtin_ia32_fpclassss_mask ((__v4sf) (__m128) (X), \ - (int) (C), (__mmask8) (U))) - -#define _mm_mask_fpclass_sd_mask(X, C, U) \ - ((__mmask8) __builtin_ia32_fpclasssd_mask ((__v2df) (__m128d) (X), \ - (int) (C), (__mmask8) (U))) - #define _mm512_mask_fpclass_pd_mask(u, X, C) \ ((__mmask8) __builtin_ia32_fpclasspd512_mask ((__v8df) (__m512d) (X), \ (int) (C), (__mmask8)(u))) @@ -2824,63 +2279,6 @@ _mm512_fpclass_ps_mask (__m512 __A, const int __imm) ((__mmask16) __builtin_ia32_fpclassps512_mask ((__v16sf) (__m512) (x),\ (int) (c),(__mmask16)-1)) -#define _mm_reduce_sd(A, B, C) \ - ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ - (__mmask8)-1)) - -#define _mm_mask_reduce_sd(W, U, A, B, C) \ - ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), (__mmask8)(U))) - -#define _mm_maskz_reduce_sd(U, A, B, C) \ - ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \ 
- (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ - (__mmask8)(U))) - -#define _mm_reduce_round_sd(A, B, C, R) \ - ((__m128d) __builtin_ia32_reducesd_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__mmask8)(U), (int)(R))) - -#define _mm_mask_reduce_round_sd(W, U, A, B, C, R) \ - ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), \ - (__mmask8)(U), (int)(R))) - -#define _mm_maskz_reduce_round_sd(U, A, B, C, R) \ - ((__m128d) __builtin_ia32_reducesd_mask_round ((__v2df)(__m128d)(A), \ - (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \ - (__mmask8)(U), (int)(R))) - -#define _mm_reduce_ss(A, B, C) \ - ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ - (__mmask8)-1)) - -#define _mm_mask_reduce_ss(W, U, A, B, C) \ - ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), (__mmask8)(U))) - -#define _mm_maskz_reduce_ss(U, A, B, C) \ - ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ - (__mmask8)(U))) - -#define _mm_reduce_round_ss(A, B, C, R) \ - ((__m128) __builtin_ia32_reducess_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__mmask8)(U), (int)(R))) - -#define _mm_mask_reduce_round_ss(W, U, A, B, C, R) \ - ((__m128) __builtin_ia32_reducess_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), \ - (__mmask8)(U), (int)(R))) - -#define _mm_maskz_reduce_round_ss(U, A, B, C, R) \ - ((__m128) __builtin_ia32_reducesd_mask_round ((__v4sf)(__m128)(A), \ - (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \ - (__mmask8)(U), (int)(R))) - - #endif #ifdef __DISABLE_AVX512DQ__ diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index 34768552e78..7bbe9b2bb01 100644 
--- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -1587,40 +1587,40 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_roundpd_vec_pack_sfix512, "_ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_roundpd_vec_pack_sfix512, "__builtin_ia32_ceilpd_vec_pack_sfix512", IX86_BUILTIN_CEILPD_VEC_PACK_SFIX512, (enum rtx_code) ROUND_CEIL, (int) V16SI_FTYPE_V8DF_V8DF_ROUND) /* Mask arithmetic operations */ -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kashiftqi, "__builtin_ia32_kshiftliqi", IX86_BUILTIN_KSHIFTLI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kashiftqi, "__builtin_ia32_kshiftliqi", IX86_BUILTIN_KSHIFTLI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kashifthi, "__builtin_ia32_kshiftlihi", IX86_BUILTIN_KSHIFTLI16, UNKNOWN, (int) UHI_FTYPE_UHI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kashiftsi, "__builtin_ia32_kshiftlisi", IX86_BUILTIN_KSHIFTLI32, UNKNOWN, (int) USI_FTYPE_USI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kashiftdi, "__builtin_ia32_kshiftlidi", IX86_BUILTIN_KSHIFTLI64, UNKNOWN, (int) UDI_FTYPE_UDI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_klshiftrtqi, "__builtin_ia32_kshiftriqi", IX86_BUILTIN_KSHIFTRI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_klshiftrtqi, "__builtin_ia32_kshiftriqi", IX86_BUILTIN_KSHIFTRI8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI_CONST) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_klshiftrthi, "__builtin_ia32_kshiftrihi", IX86_BUILTIN_KSHIFTRI16, UNKNOWN, (int) UHI_FTYPE_UHI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_klshiftrtsi, "__builtin_ia32_kshiftrisi", IX86_BUILTIN_KSHIFTRI32, UNKNOWN, (int) USI_FTYPE_USI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_klshiftrtdi, "__builtin_ia32_kshiftridi", IX86_BUILTIN_KSHIFTRI64, UNKNOWN, (int) UDI_FTYPE_UDI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, 
CODE_FOR_kandqi, "__builtin_ia32_kandqi", IX86_BUILTIN_KAND8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kandqi, "__builtin_ia32_kandqi", IX86_BUILTIN_KAND8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kandhi, "__builtin_ia32_kandhi", IX86_BUILTIN_KAND16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kandsi, "__builtin_ia32_kandsi", IX86_BUILTIN_KAND32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kanddi, "__builtin_ia32_kanddi", IX86_BUILTIN_KAND64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kandnqi, "__builtin_ia32_kandnqi", IX86_BUILTIN_KANDN8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kandnqi, "__builtin_ia32_kandnqi", IX86_BUILTIN_KANDN8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kandnhi, "__builtin_ia32_kandnhi", IX86_BUILTIN_KANDN16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kandnsi, "__builtin_ia32_kandnsi", IX86_BUILTIN_KANDN32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kandndi, "__builtin_ia32_kandndi", IX86_BUILTIN_KANDN64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_knotqi, "__builtin_ia32_knotqi", IX86_BUILTIN_KNOT8, UNKNOWN, (int) UQI_FTYPE_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_knotqi, "__builtin_ia32_knotqi", IX86_BUILTIN_KNOT8, UNKNOWN, (int) UQI_FTYPE_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_knothi, "__builtin_ia32_knothi", IX86_BUILTIN_KNOT16, UNKNOWN, (int) UHI_FTYPE_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_knotsi, "__builtin_ia32_knotsi", IX86_BUILTIN_KNOT32, UNKNOWN, (int) USI_FTYPE_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_knotdi, "__builtin_ia32_knotdi", IX86_BUILTIN_KNOT64, UNKNOWN, 
(int) UDI_FTYPE_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kiorqi, "__builtin_ia32_korqi", IX86_BUILTIN_KOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kiorqi, "__builtin_ia32_korqi", IX86_BUILTIN_KOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kiorhi, "__builtin_ia32_korhi", IX86_BUILTIN_KOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kiorsi, "__builtin_ia32_korsi", IX86_BUILTIN_KOR32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kiordi, "__builtin_ia32_kordi", IX86_BUILTIN_KOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktestqi, "__builtin_ia32_ktestcqi", IX86_BUILTIN_KTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktestqi, "__builtin_ia32_ktestzqi", IX86_BUILTIN_KTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktesthi, "__builtin_ia32_ktestchi", IX86_BUILTIN_KTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_ktesthi, "__builtin_ia32_ktestzhi", IX86_BUILTIN_KTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktestqi, "__builtin_ia32_ktestcqi", IX86_BUILTIN_KTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktestqi, "__builtin_ia32_ktestzqi", IX86_BUILTIN_KTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktesthi, "__builtin_ia32_ktestchi", IX86_BUILTIN_KTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_ktesthi, "__builtin_ia32_ktestzhi", IX86_BUILTIN_KTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestsi, "__builtin_ia32_ktestcsi", 
IX86_BUILTIN_KTESTC32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestsi, "__builtin_ia32_ktestzsi", IX86_BUILTIN_KTESTZ32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestdi, "__builtin_ia32_ktestcdi", IX86_BUILTIN_KTESTC64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_ktestdi, "__builtin_ia32_ktestzdi", IX86_BUILTIN_KTESTZ64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kortestqi, "__builtin_ia32_kortestcqi", IX86_BUILTIN_KORTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kortestqi, "__builtin_ia32_kortestzqi", IX86_BUILTIN_KORTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kortestqi, "__builtin_ia32_kortestcqi", IX86_BUILTIN_KORTESTC8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kortestqi, "__builtin_ia32_kortestzqi", IX86_BUILTIN_KORTESTZ8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kortesthi, "__builtin_ia32_kortestchi", IX86_BUILTIN_KORTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kortesthi, "__builtin_ia32_kortestzhi", IX86_BUILTIN_KORTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kortestsi, "__builtin_ia32_kortestcsi", IX86_BUILTIN_KORTESTC32, UNKNOWN, (int) USI_FTYPE_USI_USI) @@ -1629,20 +1629,20 @@ BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kortestdi, "__builtin_ia32_kortestc BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kortestdi, "__builtin_ia32_kortestzdi", IX86_BUILTIN_KORTESTZ64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kunpckhi, "__builtin_ia32_kunpckhi", IX86_BUILTIN_KUNPCKBW, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kxnorqi, "__builtin_ia32_kxnorqi", 
IX86_BUILTIN_KXNOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kxnorqi, "__builtin_ia32_kxnorqi", IX86_BUILTIN_KXNOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kxnorhi, "__builtin_ia32_kxnorhi", IX86_BUILTIN_KXNOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxnorsi, "__builtin_ia32_kxnorsi", IX86_BUILTIN_KXNOR32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxnordi, "__builtin_ia32_kxnordi", IX86_BUILTIN_KXNOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kxorqi, "__builtin_ia32_kxorqi", IX86_BUILTIN_KXOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kxorqi, "__builtin_ia32_kxorqi", IX86_BUILTIN_KXOR8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kxorhi, "__builtin_ia32_kxorhi", IX86_BUILTIN_KXOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxorsi, "__builtin_ia32_kxorsi", IX86_BUILTIN_KXOR32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kxordi, "__builtin_ia32_kxordi", IX86_BUILTIN_KXOR64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kmovb, "__builtin_ia32_kmovb", IX86_BUILTIN_KMOV8, UNKNOWN, (int) UQI_FTYPE_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kmovb, "__builtin_ia32_kmovb", IX86_BUILTIN_KMOV8, UNKNOWN, (int) UQI_FTYPE_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_kmovw, "__builtin_ia32_kmovw", IX86_BUILTIN_KMOV16, UNKNOWN, (int) UHI_FTYPE_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kmovd, "__builtin_ia32_kmovd", IX86_BUILTIN_KMOV32, UNKNOWN, (int) USI_FTYPE_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kmovq, "__builtin_ia32_kmovq", IX86_BUILTIN_KMOV64, UNKNOWN, (int) UDI_FTYPE_UDI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 
0, CODE_FOR_kaddqi, "__builtin_ia32_kaddqi", IX86_BUILTIN_KADD8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_kaddhi, "__builtin_ia32_kaddhi", IX86_BUILTIN_KADD16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kaddqi, "__builtin_ia32_kaddqi", IX86_BUILTIN_KADD8, UNKNOWN, (int) UQI_FTYPE_UQI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_kaddhi, "__builtin_ia32_kaddhi", IX86_BUILTIN_KADD16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kaddsi, "__builtin_ia32_kaddsi", IX86_BUILTIN_KADD32, UNKNOWN, (int) USI_FTYPE_USI_USI) BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_kadddi, "__builtin_ia32_kadddi", IX86_BUILTIN_KADD64, UNKNOWN, (int) UDI_FTYPE_UDI_UDI) @@ -1814,8 +1814,8 @@ BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv2df_mask, "__builtin_ia32_reducepd128_mask", IX86_BUILTIN_REDUCEPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_INT_V2DF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv8sf_mask, "__builtin_ia32_reduceps256_mask", IX86_BUILTIN_REDUCEPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_INT_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducepv4sf_mask, "__builtin_ia32_reduceps128_mask", IX86_BUILTIN_REDUCEPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_INT_V4SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv2df_mask, "__builtin_ia32_reducesd_mask", IX86_BUILTIN_REDUCESD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv4sf_mask, "__builtin_ia32_reducess_mask", IX86_BUILTIN_REDUCESS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, 
OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv2df_mask, "__builtin_ia32_reducesd_mask", IX86_BUILTIN_REDUCESD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv4sf_mask, "__builtin_ia32_reducess_mask", IX86_BUILTIN_REDUCESS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_permvarv16hi_mask, "__builtin_ia32_permvarhi256_mask", IX86_BUILTIN_VPERMVARHI256_MASK, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_permvarv8hi_mask, "__builtin_ia32_permvarhi128_mask", IX86_BUILTIN_VPERMVARHI128_MASK, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vpermt2varv16hi3_mask, "__builtin_ia32_vpermt2varhi256_mask", IX86_BUILTIN_VPERMT2VARHI256, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI) @@ -2186,10 +2186,10 @@ BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rorv4si_mask, "__builtin_i BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_rolv4si_mask, "__builtin_ia32_prold128_mask", IX86_BUILTIN_PROLD128, UNKNOWN, (int) V4SI_FTYPE_V4SI_INT_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv4df_mask, "__builtin_ia32_fpclasspd256_mask", IX86_BUILTIN_FPCLASSPD256, UNKNOWN, (int) QI_FTYPE_V4DF_INT_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv2df_mask, "__builtin_ia32_fpclasspd128_mask", IX86_BUILTIN_FPCLASSPD128, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_vmfpclassv2df_mask, "__builtin_ia32_fpclasssd_mask", IX86_BUILTIN_FPCLASSSD_MASK, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, 
CODE_FOR_avx512dq_vmfpclassv2df_mask, "__builtin_ia32_fpclasssd_mask", IX86_BUILTIN_FPCLASSSD_MASK, UNKNOWN, (int) QI_FTYPE_V2DF_INT_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv8sf_mask, "__builtin_ia32_fpclassps256_mask", IX86_BUILTIN_FPCLASSPS256, UNKNOWN, (int) QI_FTYPE_V8SF_INT_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fpclassv4sf_mask, "__builtin_ia32_fpclassps128_mask", IX86_BUILTIN_FPCLASSPS128, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_vmfpclassv4sf_mask, "__builtin_ia32_fpclassss_mask", IX86_BUILTIN_FPCLASSSS_MASK, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_vmfpclassv4sf_mask, "__builtin_ia32_fpclassss_mask", IX86_BUILTIN_FPCLASSSS_MASK, UNKNOWN, (int) QI_FTYPE_V4SF_INT_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtb2maskv16qi, "__builtin_ia32_cvtb2mask128", IX86_BUILTIN_CVTB2MASK128, UNKNOWN, (int) UHI_FTYPE_V16QI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtb2maskv32qi, "__builtin_ia32_cvtb2mask256", IX86_BUILTIN_CVTB2MASK256, UNKNOWN, (int) USI_FTYPE_V32QI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_cvtw2maskv8hi, "__builtin_ia32_cvtw2mask128", IX86_BUILTIN_CVTW2MASK128, UNKNOWN, (int) UQI_FTYPE_V8HI) @@ -3209,10 +3209,10 @@ BDESC (OPTION_MASK_ISA_AVX512ER, 0, CODE_FOR_avx512er_vmrsqrt28v4sf_mask_round, /* AVX512DQ. 
*/ BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducepv8df_mask_round, "__builtin_ia32_reducepd512_mask_round", IX86_BUILTIN_REDUCEPD512_MASK_ROUND, UNKNOWN, (int) V8DF_FTYPE_V8DF_INT_V8DF_UQI_INT) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducepv16sf_mask_round, "__builtin_ia32_reduceps512_mask_round", IX86_BUILTIN_REDUCEPS512_MASK_ROUND, UNKNOWN, (int) V16SF_FTYPE_V16SF_INT_V16SF_UHI_INT) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv2df_mask_round, "__builtin_ia32_reducesd_mask_round", IX86_BUILTIN_REDUCESD128_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_reducesv4sf_mask_round, "__builtin_ia32_reducess_mask_round", IX86_BUILTIN_REDUCESS128_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_rangesv2df_mask_round, "__builtin_ia32_rangesd128_mask_round", IX86_BUILTIN_RANGESD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT) -BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_rangesv4sf_mask_round, "__builtin_ia32_rangess128_mask_round", IX86_BUILTIN_RANGESS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv2df_mask_round, "__builtin_ia32_reducesd_mask_round", IX86_BUILTIN_REDUCESD128_MASK_ROUND, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_reducesv4sf_mask_round, "__builtin_ia32_reducess_mask_round", IX86_BUILTIN_REDUCESS128_MASK_ROUND, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangesv2df_mask_round, "__builtin_ia32_rangesd128_mask_round", IX86_BUILTIN_RANGESD128, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT) +BDESC (OPTION_MASK_ISA_AVX512DQ, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_rangesv4sf_mask_round, "__builtin_ia32_rangess128_mask_round", 
IX86_BUILTIN_RANGESS128, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_fix_notruncv8dfv8di2_mask_round, "__builtin_ia32_cvtpd2qq512_mask", IX86_BUILTIN_CVTPD2QQ512, UNKNOWN, (int) V8DI_FTYPE_V8DF_V8DI_QI_INT) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_avx512dq_cvtps2qqv8di_mask_round, "__builtin_ia32_cvtps2qq512_mask", IX86_BUILTIN_CVTPS2QQ512, UNKNOWN, (int) V8DI_FTYPE_V8SF_V8DI_QI_INT) BDESC (OPTION_MASK_ISA_AVX512DQ, 0, CODE_FOR_fixuns_notruncv8dfv8di2_mask_round, "__builtin_ia32_cvtpd2uqq512_mask", IX86_BUILTIN_CVTPD2UQQ512, UNKNOWN, (int) V8DI_FTYPE_V8DF_V8DI_QI_INT) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 108f4af8552..3f1ce3dae21 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -533,7 +533,7 @@ ;; Used to control the "enabled" attribute on a per-instruction basis. (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx, - x64_avx,x64_avx512bw,x64_avx512dq,aes, + x64_avx,x64_avx512bw,x64_avx10_1_or_avx512dq,aes, sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx, avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f, avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl, @@ -875,8 +875,8 @@ (symbol_ref "TARGET_64BIT && TARGET_AVX") (eq_attr "isa" "x64_avx512bw") (symbol_ref "TARGET_64BIT && TARGET_AVX512BW") - (eq_attr "isa" "x64_avx512dq") - (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ") + (eq_attr "isa" "x64_avx10_1_or_avx512dq") + (symbol_ref "TARGET_64BIT && (TARGET_AVX512DQ || TARGET_AVX10_1)") (eq_attr "isa" "aes") (symbol_ref "TARGET_AES") (eq_attr "isa" "sse_noavx") (symbol_ref "TARGET_SSE && !TARGET_AVX") @@ -3114,7 +3114,7 @@ (eq_attr "alternative" "8") (const_string "QI") (and (eq_attr "alternative" "9,10,11,14") - (not (match_test "TARGET_AVX512DQ"))) + (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1"))) (const_string "HI") (eq_attr "type" "imovx") (const_string "SI") diff --git a/gcc/config/i386/immintrin.h 
b/gcc/config/i386/immintrin.h index 29b4dbbda24..83ff756ac76 100644 --- a/gcc/config/i386/immintrin.h +++ b/gcc/config/i386/immintrin.h @@ -64,6 +64,8 @@ #include +#include + #include #include diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 6784a8c5369..d4a5bca932f 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -1972,23 +1972,25 @@ ;; All integer modes with AVX512BW/DQ. (define_mode_iterator SWI1248_AVX512BWDQ - [(QI "TARGET_AVX512DQ") HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")]) + [(QI "TARGET_AVX512DQ || TARGET_AVX10_1") (HI "TARGET_AVX512F") + (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")]) ;; All integer modes with AVX512BW, where HImode operation ;; can be used instead of QImode. (define_mode_iterator SWI1248_AVX512BW - [QI HI (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")]) + [(QI "TARGET_AVX512F || TARGET_AVX10_1") (HI "TARGET_AVX512F") + (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")]) ;; All integer modes with AVX512BW/DQ, even HImode requires DQ. 
(define_mode_iterator SWI1248_AVX512BWDQ2 - [(QI "TARGET_AVX512DQ") (HI "TARGET_AVX512DQ") + [(QI "TARGET_AVX512DQ || TARGET_AVX10_1") + (HI "TARGET_AVX512DQ || TARGET_AVX10_1") (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW")]) (define_expand "kmov" [(set (match_operand:SWI1248_AVX512BWDQ 0 "nonimmediate_operand") (match_operand:SWI1248_AVX512BWDQ 1 "nonimmediate_operand"))] - "TARGET_AVX512F - && !(MEM_P (operands[0]) && MEM_P (operands[1]))") + "!(MEM_P (operands[0]) && MEM_P (operands[1]))") (define_insn "k" [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k") @@ -1996,7 +1998,7 @@ (match_operand:SWI1248_AVX512BW 1 "register_operand" "k") (match_operand:SWI1248_AVX512BW 2 "register_operand" "k"))) (unspec [(const_int 0)] UNSPEC_MASKOP)] - "TARGET_AVX512F" + "" { if (get_attr_mode (insn) == MODE_HI) return "kw\t{%2, %1, %0|%0, %1, %2}"; @@ -2007,7 +2009,7 @@ (set_attr "prefix" "vex") (set (attr "mode") (cond [(and (match_test "mode == QImode") - (not (match_test "TARGET_AVX512DQ"))) + (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1"))) (const_string "HI") ] (const_string "")))]) @@ -2018,7 +2020,7 @@ (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand") (match_operand:SWI1248_AVX512BW 2 "mask_reg_operand"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_AVX512F && reload_completed" + "reload_completed" [(parallel [(set (match_dup 0) (any_logic:SWI1248_AVX512BW (match_dup 1) (match_dup 2))) @@ -2031,7 +2033,7 @@ (match_operand:SWI1248_AVX512BW 1 "register_operand" "k")) (match_operand:SWI1248_AVX512BW 2 "register_operand" "k"))) (unspec [(const_int 0)] UNSPEC_MASKOP)] - "TARGET_AVX512F" + "" { if (get_attr_mode (insn) == MODE_HI) return "kandnw\t{%2, %1, %0|%0, %1, %2}"; @@ -2042,7 +2044,7 @@ (set_attr "prefix" "vex") (set (attr "mode") (cond [(and (match_test "mode == QImode") - (not (match_test "TARGET_AVX512DQ"))) + (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1"))) (const_string "HI") ] (const_string "")))]) @@ -2054,7 +2056,7 @@ 
(match_operand:SWI1248_AVX512BW 1 "mask_reg_operand")) (match_operand:SWI1248_AVX512BW 2 "mask_reg_operand"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_AVX512F && reload_completed" + "reload_completed" [(parallel [(set (match_dup 0) (and:SWI1248_AVX512BW @@ -2069,7 +2071,7 @@ (match_operand:SWI1248_AVX512BW 1 "register_operand" "k") (match_operand:SWI1248_AVX512BW 2 "register_operand" "k")))) (unspec [(const_int 0)] UNSPEC_MASKOP)] - "TARGET_AVX512F" + "" { if (get_attr_mode (insn) == MODE_HI) return "kxnorw\t{%2, %1, %0|%0, %1, %2}"; @@ -2080,7 +2082,7 @@ (set_attr "prefix" "vex") (set (attr "mode") (cond [(and (match_test "mode == QImode") - (not (match_test "TARGET_AVX512DQ"))) + (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1"))) (const_string "HI") ] (const_string "")))]) @@ -2090,7 +2092,7 @@ (not:SWI1248_AVX512BW (match_operand:SWI1248_AVX512BW 1 "register_operand" "k"))) (unspec [(const_int 0)] UNSPEC_MASKOP)] - "TARGET_AVX512F" + "" { if (get_attr_mode (insn) == MODE_HI) return "knotw\t{%1, %0|%0, %1}"; @@ -2101,7 +2103,7 @@ (set_attr "prefix" "vex") (set (attr "mode") (cond [(and (match_test "mode == QImode") - (not (match_test "TARGET_AVX512DQ"))) + (not (match_test "TARGET_AVX512DQ || TARGET_AVX10_1"))) (const_string "HI") ] (const_string "")))]) @@ -2110,7 +2112,7 @@ [(set (match_operand:SWI1248_AVX512BW 0 "mask_reg_operand") (not:SWI1248_AVX512BW (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand")))] - "TARGET_AVX512F && reload_completed" + "reload_completed" [(parallel [(set (match_dup 0) (not:SWI1248_AVX512BW (match_dup 1))) @@ -2144,7 +2146,7 @@ (match_operand:SWI1248_AVX512BWDQ2 1 "register_operand" "k") (match_operand:SWI1248_AVX512BWDQ2 2 "register_operand" "k"))) (unspec [(const_int 0)] UNSPEC_MASKOP)] - "TARGET_AVX512F" + "" "kadd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "msklog") (set_attr "prefix" "vex") @@ -2159,7 +2161,7 @@ (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k") (match_operand 2 "const_0_to_255_operand"))) 
(unspec [(const_int 0)] UNSPEC_MASKOP)] - "TARGET_AVX512F" + "" "k\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "msklog") (set_attr "prefix" "vex") @@ -2171,7 +2173,7 @@ (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand") (match_operand 2 "const_int_operand"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_AVX512F && reload_completed" + "reload_completed" [(parallel [(set (match_dup 0) (any_lshift:SWI1248_AVX512BW @@ -2185,7 +2187,7 @@ [(match_operand:SWI1248_AVX512BWDQ2 0 "register_operand" "k") (match_operand:SWI1248_AVX512BWDQ2 1 "register_operand" "k")] UNSPEC_KTEST))] - "TARGET_AVX512F" + "" "ktest\t{%1, %0|%0, %1}" [(set_attr "mode" "") (set_attr "type" "msklog") @@ -2197,7 +2199,7 @@ [(match_operand:SWI1248_AVX512BWDQ 0 "register_operand" "k") (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k")] UNSPEC_KORTEST))] - "TARGET_AVX512F" + "" "kortest\t{%1, %0|%0, %1}" [(set_attr "mode" "") (set_attr "type" "msklog") @@ -3565,7 +3567,8 @@ UNSPEC_REDUCE) (match_dup 1) (const_int 1)))] - "TARGET_AVX512DQ || (VALID_AVX512FP16_REG_MODE (mode))" + "TARGET_AVX512DQ || (VALID_AVX512FP16_REG_MODE (mode)) + || TARGET_AVX10_1" "vreduce\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "sse") (set_attr "prefix" "evex") @@ -18897,7 +18900,7 @@ (define_mode_attr pinsr_evex_isa [(V16QI "avx512bw") (V8HI "avx512bw") (V8HF "avx512bw") - (V8BF "avx512bw") (V4SI "avx512dq") (V2DI "avx512dq")]) + (V8BF "avx512bw") (V4SI "avx10_1_or_avx512dq") (V2DI "avx10_1_or_avx512dq")]) ;; sse4_1_pinsrd must come before sse2_loadld since it is preferred. 
(define_insn "_pinsr" @@ -20276,7 +20279,7 @@ gcc_unreachable (); } } - [(set_attr "isa" "*,avx512dq,noavx,noavx,avx") + [(set_attr "isa" "*,avx10_1_or_avx512dq,noavx,noavx,avx") (set_attr "type" "sselog1,sselog1,sseishft1,sseishft1,sseishft1") (set (attr "prefix_extra") (if_then_else (eq_attr "alternative" "0,1") @@ -20294,7 +20297,7 @@ (parallel [(match_operand:SI 2 "const_0_to_3_operand")]))))] "TARGET_64BIT && TARGET_SSE4_1" "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}" - [(set_attr "isa" "*,avx512dq") + [(set_attr "isa" "*,avx10_1_or_avx512dq") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") @@ -20343,7 +20346,7 @@ (cond [(eq_attr "alternative" "0") (const_string "x64_sse4") (eq_attr "alternative" "1") - (const_string "x64_avx512dq") + (const_string "x64_avx10_1_or_avx512dq") (eq_attr "alternative" "3") (const_string "sse2_noavx") (eq_attr "alternative" "4") @@ -20509,7 +20512,7 @@ %vmovd\t{%1, %0|%0, %1} punpckldq\t{%2, %0|%0, %2} movd\t{%1, %0|%0, %1}" - [(set_attr "isa" "noavx,noavx,avx,avx512dq,noavx,noavx,avx,*,*,*") + [(set_attr "isa" "noavx,noavx,avx,avx10_1_or_avx512dq,noavx,noavx,avx,*,*,*") (set (attr "mmx_isa") (if_then_else (eq_attr "alternative" "8,9") (const_string "native") @@ -20665,7 +20668,7 @@ (eq_attr "alternative" "2") (const_string "x64_avx") (eq_attr "alternative" "3") - (const_string "x64_avx512dq") + (const_string "x64_avx10_1_or_avx512dq") (eq_attr "alternative" "4") (const_string "sse2_noavx") (eq_attr "alternative" "5,8") @@ -28600,7 +28603,7 @@ UNSPEC_RANGE) (match_dup 1) (const_int 1)))] - "TARGET_AVX512DQ" + "TARGET_AVX512DQ || TARGET_AVX10_1" { if (TARGET_DEST_FALSE_DEP_FOR_GLC && @@ -28634,7 +28637,7 @@ (match_operand 2 "const_0_to_255_operand")] UNSPEC_FPCLASS) (const_int 1)))] - "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(mode)" + "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(mode) || TARGET_AVX10_1" "vfpclass\t{%2, %1, %0|%0, %1, %2}"; [(set_attr "type" "sse") (set_attr 
"length_immediate" "1") diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md index fe923458ab8..a8b3081df70 100644 --- a/gcc/config/i386/subst.md +++ b/gcc/config/i386/subst.md @@ -353,7 +353,7 @@ (match_operand:SUBST_V 1) (match_operand:SUBST_V 2) (const_int 1)))] - "TARGET_AVX512F" + "TARGET_AVX512F || TARGET_AVX10_1" [(set (match_dup 0) (vec_merge:SUBST_V (vec_merge:SUBST_V @@ -460,7 +460,7 @@ (match_operand:SUBST_V 1) (match_operand:SUBST_V 2) (const_int 1)))] - "TARGET_AVX512F" + "TARGET_AVX512F || TARGET_AVX10_1" [(set (match_dup 0) (unspec:SUBST_V [ (vec_merge:SUBST_V diff --git a/gcc/testsuite/gcc.target/i386/sse-26.c b/gcc/testsuite/gcc.target/i386/sse-26.c index 89db33b8b8c..d67b6056954 100644 --- a/gcc/testsuite/gcc.target/i386/sse-26.c +++ b/gcc/testsuite/gcc.target/i386/sse-26.c @@ -7,5 +7,6 @@ intrinsics. */ #define _AVX512VLDQINTRIN_H_INCLUDED +#define _AVX512DQAVX10_1INTRIN_H_INCLUDED #include "sse-13.c"

From patchwork Thu Aug 17 06:55:09 2023
X-Patchwork-Submitter: "Jiang, Haochen"
X-Patchwork-Id: 135852
To: gcc-patches@gcc.gnu.org
Cc: hongtao.liu@intel.com, ubizjak@gmail.com
Subject: [PATCH 2/2] Support AVX10.1 for AVX512DQ intrins
Date: Thu, 17 Aug 2023 14:55:09 +0800
Message-Id: <20230817065509.130068-3-haochen.jiang@intel.com>
In-Reply-To: <20230817065509.130068-1-haochen.jiang@intel.com>
References: <20230817065509.130068-1-haochen.jiang@intel.com>
From: Haochen Jiang

gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_1-kaddb-1.c: New test. * gcc.target/i386/avx10_1-kaddw-1.c: Ditto. * gcc.target/i386/avx10_1-kandb-1.c: Ditto. * gcc.target/i386/avx10_1-kandnb-1.c: Ditto.
* gcc.target/i386/avx10_1-kmovb-1.c: Ditto. * gcc.target/i386/avx10_1-kmovb-2.c: Ditto. * gcc.target/i386/avx10_1-kmovb-3.c: Ditto. * gcc.target/i386/avx10_1-kmovb-4.c: Ditto. * gcc.target/i386/avx10_1-knotb-1.c: Ditto. * gcc.target/i386/avx10_1-korb-1.c: Ditto. * gcc.target/i386/avx10_1-kortestb-1.c: Ditto. * gcc.target/i386/avx10_1-kshiftlb-1.c: Ditto. * gcc.target/i386/avx10_1-kshiftrb-1.c: Ditto. * gcc.target/i386/avx10_1-ktestb-1.c: Ditto. * gcc.target/i386/avx10_1-ktestw-1.c: Ditto. * gcc.target/i386/avx10_1-kxnorb-1.c: Ditto. * gcc.target/i386/avx10_1-kxorb-1.c: Ditto. * gcc.target/i386/avx10_1-vfpclasssd-1.c: New test. * gcc.target/i386/avx10_1-vfpclassss-1.c: Ditto. * gcc.target/i386/avx10_1-vpextr-1.c: Ditto. * gcc.target/i386/avx10_1-vpinsr-1.c: Ditto. * gcc.target/i386/avx10_1-vrangesd-1.c: Ditto. * gcc.target/i386/avx10_1-vrangess-1.c: Ditto. * gcc.target/i386/avx10_1-vreducesd-1.c: Ditto. * gcc.target/i386/avx10_1-vreducess-1.c: Ditto. --- .../gcc.target/i386/avx10_1-kaddb-1.c | 12 +++++ .../gcc.target/i386/avx10_1-kaddw-1.c | 12 +++++ .../gcc.target/i386/avx10_1-kandb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-kandnb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-kmovb-1.c | 15 ++++++ .../gcc.target/i386/avx10_1-kmovb-2.c | 16 ++++++ .../gcc.target/i386/avx10_1-kmovb-3.c | 17 ++++++ .../gcc.target/i386/avx10_1-kmovb-4.c | 15 ++++++ .../gcc.target/i386/avx10_1-knotb-1.c | 15 ++++++ .../gcc.target/i386/avx10_1-korb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-kortestb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-kshiftlb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-kshiftrb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-ktestb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-ktestw-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-kxnorb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-kxorb-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-vfpclasssd-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-vfpclassss-1.c | 16 ++++++ .../gcc.target/i386/avx10_1-vpextr-1.c | 53 
+++++++++++++++++++ .../gcc.target/i386/avx10_1-vpinsr-1.c | 33 ++++++++++++ .../gcc.target/i386/avx10_1-vrangesd-1.c | 26 +++++++++ .../gcc.target/i386/avx10_1-vrangess-1.c | 25 +++++++++ .../gcc.target/i386/avx10_1-vreducesd-1.c | 31 +++++++++++ .../gcc.target/i386/avx10_1-vreducess-1.c | 30 +++++++++++ 25 files changed, 492 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c create mode 100644 gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c create mode 
100644 gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c new file mode 100644 index 00000000000..6da7b497722 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kaddb-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kaddb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + __mmask8 k = _kadd_mask8 (11, 12); + asm volatile ("" : "+k" (k)); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c new file mode 100644 index 00000000000..033b7005d71 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kaddw-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kaddw\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + __mmask16 k = _kadd_mask16 (11, 12); + asm volatile ("" : "+k" (k)); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c new file mode 100644 index 00000000000..5510a982c97 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kandb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kandb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2, k3; + + __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) ); + __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) ); + + k3 = _kand_mask8 (k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c new file mode 100644 index 00000000000..e57078074e0 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kandnb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile 
} */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kandnb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2, k3; + + __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) ); + __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) ); + + k3 = _kandn_mask8 (k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c new file mode 100644 index 00000000000..15b9d9a5daa --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-1.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include +volatile __mmask8 k1; + +void +avx10_1_test () +{ + __mmask8 k = _cvtu32_mask8 (11); + + asm volatile ("" : "+k" (k)); + k1 = k; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c new file mode 100644 index 00000000000..e4f73f0870e --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-2.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include +volatile __mmask8 k1; + +void +avx10_1_test () +{ + __mmask8 k0 = 11; + __mmask8 k = _load_mask8 (&k0); + + asm volatile ("" : "+k" (k)); + k1 = k; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c new file mode 100644 index 00000000000..47d4d1aafe2 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-3.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include +volatile __mmask8 k1 = 11; + +void +avx10_1_test () +{ + __mmask8 k0, k; + + _store_mask8 
(&k, k1); + + asm volatile ("" : "+k" (k)); + k0 = k; +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c new file mode 100644 index 00000000000..79effebdfc7 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kmovb-4.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kmovb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include +volatile unsigned int i; + +void +avx10_1_test () +{ + __mmask8 k = 11; + + asm volatile ("" : "+k" (k)); + i = _cvtmask8_u32 (k); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c new file mode 100644 index 00000000000..4a353bcd921 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-knotb-1.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "knotb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2; + + __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (45) ); + + k2 = _knot_mask8 (k1); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c new file mode 100644 index 00000000000..c912bec1482 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-korb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "korb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2, k3; + + __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) ); + __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) ); + + k3 = _kor_mask8 (k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c new file mode 100644 index 00000000000..9c8783f0bc5 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/i386/avx10_1-kortestb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O0 -mavx10.1" } */ +/* { dg-final { scan-assembler-times "kortestb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 2 } } */ + +#include + +void +avx10_1_test () { + volatile __mmask8 k1; + __mmask8 k2; + + volatile unsigned char r __attribute__((unused)); + + r = _kortestc_mask8_u8(k1, k2); + r = _kortestz_mask8_u8(k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c new file mode 100644 index 00000000000..54e8cfd98a9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftlb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kshiftlb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2; + unsigned int i = 5; + + __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) ); + + k2 = _kshiftli_mask8 (k1, i); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c new file mode 100644 index 00000000000..625007fded0 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kshiftrb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kshiftrb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2; + unsigned int i = 5; + + __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) ); + + k2 = _kshiftri_mask8 (k1, i); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c new file mode 100644 index 00000000000..5f4fe298bd6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-ktestb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O0 -mavx10.1" } */ +/* { dg-final { scan-assembler-times 
"ktestb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 2 } } */ + +#include + +void +avx10_1_test () { + volatile __mmask8 k1; + __mmask8 k2; + + volatile unsigned char r __attribute__((unused)); + + r = _ktestc_mask8_u8(k1, k2); + r = _ktestz_mask8_u8(k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c new file mode 100644 index 00000000000..c606abfb12b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-ktestw-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O0 -mavx10.1" } */ +/* { dg-final { scan-assembler-times "ktestw\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 2 } } */ + +#include + +void +avx10_1_test () { + volatile __mmask16 k1; + __mmask16 k2; + + volatile unsigned char r __attribute__((unused)); + + r = _ktestc_mask16_u8(k1, k2); + r = _ktestz_mask16_u8(k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c new file mode 100644 index 00000000000..3abe56974bf --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kxnorb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kxnorb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2, k3; + + __asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) ); + __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) ); + + k3 = _kxnor_mask8 (k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c new file mode 100644 index 00000000000..a39604f038b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-kxorb-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "kxorb\[ \\t\]+\[^\{\n\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +void +avx10_1_test () +{ + volatile __mmask8 k1, k2, k3; + + 
__asm__( "kmovb %1, %0" : "=k" (k1) : "r" (1) ); + __asm__( "kmovb %1, %0" : "=k" (k2) : "r" (2) ); + + k3 = _kxor_mask8 (k1, k2); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c new file mode 100644 index 00000000000..dbfbe421889 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclasssd-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vfpclasssd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclasssd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include <immintrin.h> + +volatile __m128d x128; +volatile __mmask8 m8; + +void extern +avx10_1_test (void) +{ + m8 = _mm_fpclass_sd_mask (x128, 13); + m8 = _mm_mask_fpclass_sd_mask (m8, x128, 13); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c new file mode 100644 index 00000000000..20fd6d3c87b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vfpclassss-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vfpclassss\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vfpclassss\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n^k\]*%k\[0-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include <immintrin.h> + +volatile __m128 x128; +volatile __mmask8 m8; + +void extern +avx10_1_test (void) +{ + m8 = _mm_fpclass_ss_mask (x128, 13); + m8 = _mm_mask_fpclass_ss_mask (m8, x128, 13); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c new file mode 100644 index 00000000000..32fa2efa696 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpextr-1.c @@ -0,0 +1,53 @@ +/* { dg-do compile { target { ! 
ia32 } } } */ +/* { dg-options "-O2 -mavx10.1" } */ + +typedef int v4si __attribute__((vector_size (16))); +typedef long long v2di __attribute__((vector_size (16))); + +unsigned int +f1 (v4si a) +{ + register v4si c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + v4si d = c; + return ((unsigned int *) &d)[3]; +} + +unsigned long long +f2 (v2di a) +{ + register v2di c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + v2di d = c; + return ((unsigned long long *) &d)[1]; +} + +unsigned long long +f3 (v4si a) +{ + register v4si c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + v4si d = c; + return ((unsigned int *) &d)[3]; +} + +void +f4 (v4si a, unsigned int *p) +{ + register v4si c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + v4si d = c; + *p = ((unsigned int *) &d)[3]; +} + +void +f5 (v2di a, unsigned long long *p) +{ + register v2di c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + v2di d = c; + *p = ((unsigned long long *) &d)[1]; +} + +/* { dg-final { scan-assembler-times "vpextrd\[^\n\r]*xmm16" 3 } } */ +/* { dg-final { scan-assembler-times "vpextrq\[^\n\r]*xmm16" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c new file mode 100644 index 00000000000..e473ddb64fc --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vpinsr-1.c @@ -0,0 +1,33 @@ +/* { dg-do compile { target { ! 
ia32 } } } */ +/* { dg-options "-O2 -mavx10.1" } */ + +typedef int v4si __attribute__((vector_size (16))); +typedef long long v2di __attribute__((vector_size (16))); + +v4si +f1 (v4si a, int b) +{ + register v4si c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + v4si d = c; + ((int *) &d)[3] = b; + c = d; + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler "vpinsrd\[^\n\r]*xmm16" } } */ + +v2di +f2 (v2di a, long long b) +{ + register v2di c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + v2di d = c; + ((long long *) &d)[1] = b; + c = d; + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler "vpinsrq\[^\n\r]*xmm16" } } */ diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c new file mode 100644 index 00000000000..4a388643a52 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vrangesd-1.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\$\n\]*\\$\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + + +#include <immintrin.h> + +volatile __m128d x1, x2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + x1 = _mm_range_sd (x1, x2, 3); + x1 = _mm_mask_range_sd (x1, m, x1, x2, 3); + x1 = _mm_maskz_range_sd (m, x1, x2, 3); + 
+ x1 = _mm_range_round_sd (x1, x2, 3, _MM_FROUND_NO_EXC); + x1 = _mm_mask_range_round_sd (x1, m, x1, x2, 3, _MM_FROUND_NO_EXC); + x1 = _mm_maskz_range_round_sd (m, x1, x2, 3, _MM_FROUND_NO_EXC); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c new file mode 100644 index 00000000000..f704ab95056 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vrangess-1.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\$\n\]*\\$\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vrangess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include <immintrin.h> + +volatile __m128 x1, x2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + x1 = _mm_range_ss (x1, x2, 1); + x1 = _mm_mask_range_ss (x1, m, x1, x2, 1); + x1 = _mm_maskz_range_ss (m, x1, x2, 1); + + x1 = _mm_range_round_ss (x1, x2, 1, _MM_FROUND_NO_EXC); + x1 = _mm_mask_range_round_ss (x1, m, x1, x2, 1, _MM_FROUND_NO_EXC); + x1 = _mm_maskz_range_round_ss (m, x1, x2, 1, _MM_FROUND_NO_EXC); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c new file mode 100644 index 00000000000..5953466c372 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vreducesd-1.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options 
"-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ + +/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + + +#include <immintrin.h> + +#define IMM 123 + +volatile __m128d x1, x2, xx1, xx2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + xx1 = _mm_reduce_round_sd (xx1, xx2, IMM, _MM_FROUND_NO_EXC); + x1 = _mm_reduce_sd (x1, x2, IMM); + + xx1 = _mm_mask_reduce_round_sd(xx1, m, xx1, xx2, IMM, _MM_FROUND_NO_EXC); + x1 = _mm_mask_reduce_sd(x1, m, x1, x2, IMM); + + xx1 = _mm_maskz_reduce_round_sd(m, xx1, xx2, IMM, _MM_FROUND_NO_EXC); + x1 = _mm_maskz_reduce_sd(m, x1, x2, IMM); +} diff --git a/gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c b/gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c new file mode 100644 index 00000000000..edd7ec07923 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx10_1-vreducess-1.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx10.1 -O2" } */ +/* { dg-final { scan-assembler-times "vreducess\[ 
\\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ + +/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */ + +#include <immintrin.h> + +#define IMM 123 + +volatile __m128 x1, x2, xx1, xx2; +volatile __mmask8 m; + +void extern +avx10_1_test (void) +{ + xx1 = _mm_reduce_round_ss (xx1, xx2, IMM, _MM_FROUND_NO_EXC); + x1 = _mm_reduce_ss (x1, x2, IMM); + + xx1 = _mm_mask_reduce_round_ss (xx1, m, xx1, xx2, IMM, _MM_FROUND_NO_EXC); + x1 = _mm_mask_reduce_ss (x1, m, x1, x2, IMM); + + xx1 = _mm_maskz_reduce_round_ss (m, xx1, xx2, IMM, _MM_FROUND_NO_EXC); + x1 = _mm_maskz_reduce_ss (m, x1, x2, IMM); +}