From patchwork Mon Jun 5 01:26:35 2023
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 103053
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH] [x86] Add missing vec_pack/unpacks patterns for _Float16 <-> int/float conversion.
Date: Mon, 5 Jun 2023 09:26:35 +0800
Message-Id: <20230605012635.2292889-1-hongtao.liu@intel.com>

This patch only supports the vec_pack/unpacks optabs for vector modes whose
length is >= 128 bits.
For 32/64-bit vectors, the conversions are mostly handled by the BB vectorizer
with truncmn2/extendmn2/fix{,uns}_truncmn2.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

	* config/i386/sse.md (vec_pack_float_): New expander.
	(vec_unpack_fix_trunc_lo_): Ditto.
	(vec_unpack_fix_trunc_hi_): Ditto.
	(vec_unpacks_lo_): Ditto.
	(vec_unpacks_hi_): Ditto.
	(sse_movlhps_): New define_insn.
	(ssse3_palignr_perm): Extend to V_128H.
	(V_128H): New mode iterator.
	(ssepackPHmode): New mode attribute.
	(vunpck_extract_mode): Ditto.
	(vpckfloat_concat_mode): Extend to VxSI/VxSF for _Float16.
	(vpckfloat_temp_mode): Ditto.
	(vpckfloat_op_mode): Ditto.
	(vunpckfixt_mode): Extend to VxHF.
	(vunpckfixt_model): Ditto.
	(vunpckfixt_extract_mode): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/vec_pack_fp16-1.c: New test.
	* gcc.target/i386/vec_pack_fp16-2.c: New test.
	* gcc.target/i386/vec_pack_fp16-3.c: New test.
---
 gcc/config/i386/sse.md                        | 216 +++++++++++++++++-
 .../gcc.target/i386/vec_pack_fp16-1.c         |  34 +++
 .../gcc.target/i386/vec_pack_fp16-2.c         |   9 +
 .../gcc.target/i386/vec_pack_fp16-3.c         |   8 +
 4 files changed, 258 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/vec_pack_fp16-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vec_pack_fp16-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vec_pack_fp16-3.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a92f50e96b5..1eb2dd077ff 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -291,6 +291,9 @@ (define_mode_iterator V
 (define_mode_iterator V_128
   [V16QI V8HI V4SI V2DI V4SF (V2DF "TARGET_SSE2")])
 
+(define_mode_iterator V_128H
+  [V16QI V8HI V8HF V8BF V4SI V2DI V4SF (V2DF "TARGET_SSE2")])
+
 ;; All 256bit vector modes
 (define_mode_iterator V_256
   [V32QI V16HI V8SI V4DI V8SF V4DF])
@@ -1076,6 +1079,12 @@ (define_mode_attr ssePHmodelower
    (V8DI "v8hf") (V4DI "v4hf") (V2DI "v2hf")
    (V8DF "v8hf") (V16SF "v16hf") (V8SF "v8hf")])
 
+
+;; Mapping of vector modes to packed vector hf modes of same sized.
+(define_mode_attr ssepackPHmode
+  [(V16SI "V32HF") (V8SI "V16HF") (V4SI "V8HF")
+   (V16SF "V32HF") (V8SF "V16HF") (V4SF "V8HF")])
+
 ;; Mapping of vector modes to packed single mode of the same size
 (define_mode_attr ssePSmode
   [(V16SI "V16SF") (V8DF "V16SF")
@@ -6918,6 +6927,61 @@ (define_mode_attr qq2phsuff
    (V16SF "") (V8SF "{y}") (V4SF "{x}")
    (V8DF "{z}") (V4DF "{y}") (V2DF "{x}")])
 
+(define_mode_attr vunpck_extract_mode
+  [(V32HF "v32hf") (V16HF "v16hf") (V8HF "v16hf")])
+
+(define_expand "vec_unpacks_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VF_AVX512FP16VL 1 "register_operand")]
+  "TARGET_AVX512FP16"
+{
+  rtx tem = operands[1];
+  rtx (*gen) (rtx, rtx);
+  if (mode != V8HFmode)
+    {
+      tem = gen_reg_rtx (mode);
+      emit_insn (gen_vec_extract_lo_ (tem,
+                                      operands[1]));
+      gen = gen_extend2;
+    }
+  else
+    gen = gen_avx512fp16_float_extend_phv4sf2;
+
+  emit_insn (gen (operands[0], tem));
+  DONE;
+})
+
+(define_expand "vec_unpacks_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VF_AVX512FP16VL 1 "register_operand")]
+  "TARGET_AVX512FP16"
+{
+  rtx tem = operands[1];
+  rtx (*gen) (rtx, rtx);
+  if (mode != V8HFmode)
+    {
+      tem = gen_reg_rtx (mode);
+      emit_insn (gen_vec_extract_hi_ (tem,
+                                      operands[1]));
+      gen = gen_extend2;
+    }
+  else
+    {
+      tem = gen_reg_rtx (V8HFmode);
+      rtvec tmp = rtvec_alloc (8);
+      for (int i = 0; i != 8; i++)
+        RTVEC_ELT (tmp, i) = GEN_INT((i+4)%8);
+
+      rtx selector = gen_rtx_PARALLEL (VOIDmode, tmp);
+      emit_move_insn (tem,
+                      gen_rtx_VEC_SELECT (V8HFmode, operands[1], selector));
+      gen = gen_avx512fp16_float_extend_phv4sf2;
+    }
+
+  emit_insn (gen (operands[0], tem));
+  DONE;
+})
+
 (define_insn "avx512fp16_vcvtph2_"
   [(set (match_operand:VI248_AVX512VL 0 "register_operand" "=v")
         (unspec:VI248_AVX512VL
@@ -8314,11 +8378,17 @@ (define_expand "floatv2div2sf2"
 })
 
 (define_mode_attr vpckfloat_concat_mode
-  [(V8DI "v16sf") (V4DI "v8sf") (V2DI "v8sf")])
+  [(V8DI "v16sf") (V4DI "v8sf") (V2DI "v8sf")
+   (V16SI "v32hf") (V8SI "v16hf") (V4SI "v16hf")
+   (V16SF "v32hf") (V8SF "v16hf") (V4SF "v16hf")])
 (define_mode_attr vpckfloat_temp_mode
-  [(V8DI "V8SF") (V4DI "V4SF") (V2DI "V4SF")])
+  [(V8DI "V8SF") (V4DI "V4SF") (V2DI "V4SF")
+   (V16SI "V16HF") (V8SI "V8HF") (V4SI "V8HF")
+   (V16SF "V16HF") (V8SF "V8HF") (V4SF "V8HF")])
 (define_mode_attr vpckfloat_op_mode
-  [(V8DI "v8sf") (V4DI "v4sf") (V2DI "v2sf")])
+  [(V8DI "v8sf") (V4DI "v4sf") (V2DI "v2sf")
+   (V16SI "v16hf") (V8SI "v8hf") (V4SI "v4hf")
+   (V16SF "v16hf") (V8SF "v8hf") (V4SF "v4hf")])
 
 (define_expand "vec_pack_float_"
   [(match_operand: 0 "register_operand")
@@ -8345,6 +8415,31 @@ (define_expand "vec_pack_float_"
   DONE;
 })
 
+(define_expand "vec_pack_float_"
+  [(match_operand: 0 "register_operand")
+   (any_float:
+     (match_operand:VI4_AVX512VL 1 "register_operand"))
+   (match_operand:VI4_AVX512VL 2 "register_operand")]
+  "TARGET_AVX512FP16"
+{
+  rtx r1 = gen_reg_rtx (mode);
+  rtx r2 = gen_reg_rtx (mode);
+  rtx (*gen) (rtx, rtx);
+
+  if (mode == V4SImode)
+    gen = gen_avx512fp16_floatv4siv4hf2;
+  else
+    gen = gen_float2;
+  emit_insn (gen (r1, operands[1]));
+  emit_insn (gen (r2, operands[2]));
+  if (mode == V4SImode)
+    emit_insn (gen_sse_movlhps_v8hf (operands[0], r1, r2));
+  else
+    emit_insn (gen_avx_vec_concat (operands[0],
+                                   r1, r2));
+  DONE;
+})
+
 (define_expand "floatv2div2sf2_mask"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
         (vec_concat:V4SF
@@ -8747,11 +8842,14 @@ (define_expand "fix_truncv2sfv2di2"
 })
 
 (define_mode_attr vunpckfixt_mode
-  [(V16SF "V8DI") (V8SF "V4DI") (V4SF "V2DI")])
+  [(V16SF "V8DI") (V8SF "V4DI") (V4SF "V2DI")
+   (V32HF "V16SI") (V16HF "V8SI") (V8HF "V4SI")])
 (define_mode_attr vunpckfixt_model
-  [(V16SF "v8di") (V8SF "v4di") (V4SF "v2di")])
+  [(V16SF "v8di") (V8SF "v4di") (V4SF "v2di")
+   (V32HF "v16si") (V16HF "v8si") (V8HF "v4si")])
 (define_mode_attr vunpckfixt_extract_mode
-  [(V16SF "v16sf") (V8SF "v8sf") (V4SF "v8sf")])
+  [(V16SF "v16sf") (V8SF "v8sf") (V4SF "v8sf")
+   (V32HF "v32hf") (V16HF "v16hf") (V8HF "v16hf")])
 
 (define_expand "vec_unpack_fix_trunc_lo_"
   [(match_operand: 0 "register_operand")
@@ -8803,6 +8901,60 @@ (define_expand "vec_unpack_fix_trunc_hi_"
   DONE;
 })
 
+(define_expand "vec_unpack_fix_trunc_lo_"
+  [(match_operand: 0 "register_operand")
+   (any_fix:
+     (match_operand:VF_AVX512FP16VL 1 "register_operand"))]
+  "TARGET_AVX512FP16"
+{
+  rtx tem = operands[1];
+  rtx (*gen) (rtx, rtx);
+  if (mode != V8HFmode)
+    {
+      tem = gen_reg_rtx (mode);
+      emit_insn (gen_vec_extract_lo_ (tem,
+                                      operands[1]));
+      gen = gen_fix_trunc2;
+    }
+  else
+    gen = gen_avx512fp16_fix_trunc2;
+
+  emit_insn (gen (operands[0], tem));
+  DONE;
+})
+
+(define_expand "vec_unpack_fix_trunc_hi_"
+  [(match_operand: 0 "register_operand")
+   (any_fix:
+     (match_operand:VF_AVX512FP16VL 1 "register_operand"))]
+  "TARGET_AVX512FP16"
+{
+  rtx tem = operands[1];
+  rtx (*gen) (rtx, rtx);
+  if (mode != V8HFmode)
+    {
+      tem = gen_reg_rtx (mode);
+      emit_insn (gen_vec_extract_hi_ (tem,
+                                      operands[1]));
+      gen = gen_fix_trunc2;
+    }
+  else
+    {
+      tem = gen_reg_rtx (V8HFmode);
+      rtvec tmp = rtvec_alloc (8);
+      for (int i = 0; i != 8; i++)
+        RTVEC_ELT (tmp, i) = GEN_INT((i+4)%8);
+
+      rtx selector = gen_rtx_PARALLEL (VOIDmode, tmp);
+      emit_move_insn (tem,
+                      gen_rtx_VEC_SELECT (V8HFmode, operands[1], selector));
+      gen = gen_avx512fp16_fix_trunc2;
+    }
+
+  emit_insn (gen (operands[0], tem));
+  DONE;
+})
+
 (define_insn "fixuns_trunc2"
   [(set (match_operand: 0 "register_operand" "=v")
         (unsigned_fix:
@@ -9616,6 +9768,31 @@ (define_expand "vec_pack_trunc_"
   operands[4] = gen_reg_rtx (mode);
 })
 
+(define_expand "vec_pack_trunc_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VF1_AVX512VL 1 "register_operand")
+   (match_operand:VF1_AVX512VL 2 "register_operand")]
+  "TARGET_AVX512FP16"
+{
+  rtx r1 = gen_reg_rtx (mode);
+  rtx r2 = gen_reg_rtx (mode);
+  rtx (*gen) (rtx, rtx);
+
+  if (mode == V4SFmode)
+    gen = gen_avx512fp16_truncv4sfv4hf2;
+  else
+    gen = gen_trunc2;
+  emit_insn (gen (r1, operands[1]));
+  emit_insn (gen (r2, operands[2]));
+  if (mode == V4SFmode)
+    emit_insn (gen_sse_movlhps_v8hf (operands[0], r1, r2));
+  else
+    emit_insn (gen_avx_vec_concat (operands[0],
+                                   r1, r2));
+  DONE;
+
+})
+
 (define_expand "vec_pack_trunc_v2df"
   [(match_operand:V4SF 0 "register_operand")
    (match_operand:V2DF 1 "vector_operand")
@@ -9921,6 +10098,27 @@ (define_insn "sse_movlhps"
    (set_attr "prefix" "orig,maybe_evex,orig,maybe_evex,maybe_vex")
    (set_attr "mode" "V4SF,V4SF,V2SF,V2SF,V2SF")])
 
+(define_insn "sse_movlhps_"
+  [(set (match_operand:V8_128 0 "nonimmediate_operand" "=x,v,x,o")
+        (vec_select:V8_128
+          (vec_concat:
+            (match_operand:V8_128 1 "nonimmediate_operand" " 0,v,0,0")
+            (match_operand:V8_128 2 "nonimmediate_operand" " x,v,m,v"))
+          (parallel [(const_int 0) (const_int 1)
+                     (const_int 2) (const_int 3)
+                     (const_int 8) (const_int 9)
+                     (const_int 10) (const_int 11)])))]
+  "TARGET_SSE && ix86_binary_operator_ok (UNKNOWN, mode, operands)"
+  "@
+   movlhps\t{%2, %0|%0, %2}
+   vpunpcklqdq\t{%2, %1, %0|%0, %1, %2}
+   movhps\t{%2, %0|%0, %q2}
+   %vmovlps\t{%2, %H0|%H0, %2}"
+  [(set_attr "isa" "noavx,avx,noavx,*")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix" "orig,maybe_evex,orig,maybe_vex")
+   (set_attr "mode" "V4SF,TI,V2SF,V2SF")])
+
 (define_insn "avx512f_unpckhps512"
   [(set (match_operand:V16SF 0 "register_operand" "=v")
         (vec_select:V16SF
@@ -26239,9 +26437,9 @@ (define_insn "*avx_vperm2f128_nozero"
    (set_attr "mode" "")])
 
 (define_insn "*ssse3_palignr_perm"
-  [(set (match_operand:V_128 0 "register_operand" "=x,Yw")
-        (vec_select:V_128
-          (match_operand:V_128 1 "register_operand" "0,Yw")
+  [(set (match_operand:V_128H 0 "register_operand" "=x,Yw")
+        (vec_select:V_128H
+          (match_operand:V_128H 1 "register_operand" "0,Yw")
           (match_parallel 2 "palignr_operand"
             [(match_operand 3 "const_int_operand")])))]
   "TARGET_SSSE3"
diff --git a/gcc/testsuite/gcc.target/i386/vec_pack_fp16-1.c b/gcc/testsuite/gcc.target/i386/vec_pack_fp16-1.c
new file mode 100644
index 00000000000..9eca9c71645
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vec_pack_fp16-1.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512fp16 -mavx512vl -Ofast -mprefer-vector-width=512" } */
+/* { dg-final { scan-assembler-times "vcvttph2dq" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtdq2ph" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtph2ps" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtps2ph" "2" } } */
+
+void
+foo (int* __restrict a, _Float16* b)
+{
+  for (int i = 0; i != 1000000; i++)
+    a[i] = b[i];
+}
+
+void
+foo1 (int* __restrict a, _Float16* b)
+{
+  for (int i = 0; i != 100000; i++)
+    b[i] = a[i];
+}
+
+void
+foo2 (float* __restrict a, _Float16* b)
+{
+  for (int i = 0; i != 1000000; i++)
+    a[i] = b[i];
+}
+
+void
+foo3 (float* __restrict a, _Float16* b)
+{
+  for (int i = 0; i != 100000; i++)
+    b[i] = a[i];
+}
diff --git a/gcc/testsuite/gcc.target/i386/vec_pack_fp16-2.c b/gcc/testsuite/gcc.target/i386/vec_pack_fp16-2.c
new file mode 100644
index 00000000000..0fd0325c193
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vec_pack_fp16-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512fp16 -mavx512vl -Ofast -mprefer-vector-width=256" } */
+/* { dg-final { scan-assembler-times "vcvttph2dq" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtdq2ph" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtph2ps" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtps2ph" "2" } } */
+
+
+#include "vec_pack_fp16-1.c"
diff --git a/gcc/testsuite/gcc.target/i386/vec_pack_fp16-3.c b/gcc/testsuite/gcc.target/i386/vec_pack_fp16-3.c
new file mode 100644
index 00000000000..f6d3fa0bf65
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vec_pack_fp16-3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512fp16 -mavx512vl -Ofast -mprefer-vector-width=128" } */
+/* { dg-final { scan-assembler-times "vcvttph2dq" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtdq2ph" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtph2ps" "2" } } */
+/* { dg-final { scan-assembler-times "vcvtps2ph" "2" } } */
+
+#include "vec_pack_fp16-1.c"
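
Reviewer aid, not part of the patch: a scalar sketch of the two lane
selections the expanders rely on. The V8HF "hi" expanders build the selector
(i+4)%8 so that the upper four lanes land in positions 0..3 before the
convert, and sse_movlhps_ selects lanes 0..3 and 8..11 of the concatenated
operands. The helper names (rotate_hi_to_lo, movlhps_model,
check_lane_models) and the use of uint16_t as a stand-in for HFmode lanes are
assumptions made for illustration only; nothing here models the conversion
itself.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 8 };  /* V8HF has eight 16-bit lanes.  */

/* Model of the selector built by the V8HF "hi" expanders: result
   lane i takes source lane (i + 4) % 8, so the upper four lanes end
   up in positions 0..3, where the following convert reads them.  */
void
rotate_hi_to_lo (const uint16_t src[LANES], uint16_t dst[LANES])
{
  for (int i = 0; i != LANES; i++)
    dst[i] = src[(i + 4) % LANES];
}

/* Model of the vec_select in sse_movlhps_: lanes 0..3 and 8..11 of
   the 16-lane concatenation of A and B, i.e. the low half of A
   followed by the low half of B.  */
void
movlhps_model (const uint16_t a[LANES], const uint16_t b[LANES],
               uint16_t dst[LANES])
{
  static const int sel[LANES] = { 0, 1, 2, 3, 8, 9, 10, 11 };
  for (int i = 0; i != LANES; i++)
    dst[i] = sel[i] < LANES ? a[sel[i]] : b[sel[i] - LANES];
}

/* Small self-check of both models (hypothetical helper).  */
void
check_lane_models (void)
{
  const uint16_t a[LANES] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  const uint16_t b[LANES] = { 10, 11, 12, 13, 14, 15, 16, 17 };
  uint16_t r[LANES];

  rotate_hi_to_lo (a, r);
  for (int i = 0; i != 4; i++)
    assert (r[i] == a[i + 4]);

  movlhps_model (a, b, r);
  for (int i = 0; i != 4; i++)
    {
      assert (r[i] == a[i]);
      assert (r[i + 4] == b[i]);
    }
}
```

Calling check_lane_models () from a small main is enough to exercise both
models. In the pack expanders each V4SI/V4SF input converts to a result whose
meaningful data sits in the low four lanes, which is why joining the two low
halves yields the packed V8HF.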