From patchwork Fri Sep 22 06:23:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 143206
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com, kito.cheng@gmail.com
Subject: [PATCH v1] RISC-V: Support FP floor auto-vectorization
Date: Fri, 22 Sep 2023 14:23:06 +0800
Message-Id: <20230922062306.1220795-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1

From: Pan Li

This patch adds auto-vectorization support for the floor API in math.h.
It requires the -ffast-math option.

A call to floor/floorf such as v2 = floor (v1) is converted into the
insns below (following the LLVM implementation).

* vfcvt.x.f v3, v1, RDN
* vfcvt.f.x v2, v3

However, a floating-point value does not need the conversions above when
it has no fractional bits, that is, when it is already an integer. Take
the single-precision values below as an example.

+-----------+---------------+-------------+
| raw float | binary layout | after floor |
+-----------+---------------+-------------+
| 8388607.5 | 0x4affffff    | 8388607.0   |
| 8388608.0 | 0x4b000000    | 8388608.0   |
| 8388609.0 | 0x4b000001    | 8388609.0   |
+-----------+---------------+-------------+

Every single-precision value whose magnitude is greater than or equal to
8388608.0 (2^23) has no fractional bits. We leverage vmflt and a mask to
filter such elements out of the vector and only do the conversions on the
masked elements.

Before this patch:
math-floor-1.c:21:1: missed: couldn't vectorize loop
  ...
.L3:
  flw     fa0,0(s0)
  addi    s0,s0,4
  addi    s1,s1,4
  call    floorf
  fsw     fa0,-4(s1)
  bne     s0,s2,.L3

After this patch:
  ...
  fsrmi   2  // Rounding Down
.L4:
  vfabs.v     v0,v1
  vmv1r.v     v2,v1
  vmflt.vv    v0,v0,v4
  sub         a3,a3,a4
  vfcvt.x.f.v v3,v1,v0.t
  vfcvt.f.x.v v2,v3,v0.t
  vfsgnj.vv   v2,v2,v1
  bne         .L4
.L14:
  fsrm        a6
  ret

Please note that VLS modes are also handled by this patch and covered by
the test cases.

gcc/ChangeLog:

	* config/riscv/autovec.md (floor2): New pattern.
	* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
	(enum insn_type): Ditto.
	(expand_vec_floor): New function decl.
	* config/riscv/riscv-v.cc (gen_floor_const_fp): New function impl.
	(expand_vec_floor): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/math-floor-0.c: New test.
	* gcc.target/riscv/rvv/autovec/math-floor-1.c: New test.
	* gcc.target/riscv/rvv/autovec/math-floor-2.c: New test.
	* gcc.target/riscv/rvv/autovec/math-floor-3.c: New test.
	* gcc.target/riscv/rvv/autovec/math-floor-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/math-floor-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/math-floor-run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-floor-1.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/autovec.md                   | 11 ++++
 gcc/config/riscv/riscv-protos.h               |  5 ++
 gcc/config/riscv/riscv-v.cc                   | 36 +++++++++++-
 .../riscv/rvv/autovec/math-floor-0.c          | 26 +++++++++
 .../riscv/rvv/autovec/math-floor-1.c          | 26 +++++++++
 .../riscv/rvv/autovec/math-floor-2.c          | 26 +++++++++
 .../riscv/rvv/autovec/math-floor-3.c          | 28 ++++++++++
 .../riscv/rvv/autovec/math-floor-run-0.c      | 39 +++++++++++++
 .../riscv/rvv/autovec/math-floor-run-1.c      | 39 +++++++++++++
 .../riscv/rvv/autovec/math-floor-run-2.c      | 39 +++++++++++++
 .../riscv/rvv/autovec/vls/math-floor-1.c      | 56 +++++++++++++++++++
 11 files changed, 329 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-floor-1.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index b92cb7a5d0f..9ba20e27cf1 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2245,6 +2245,7 @@ (define_expand "avg3_ceil"
 ;; -------------------------------------------------------------------------
 ;; Includes:
 ;; - ceil/ceilf
+;; - floor/floorf
 ;; -------------------------------------------------------------------------
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
@@ -2255,3 +2256,13 @@ (define_expand "ceil2"
     DONE;
   }
 )
+
+(define_expand "floor2"
+  [(match_operand:V_VLSF 0 "register_operand")
+   (match_operand:V_VLSF 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+    riscv_vector::expand_vec_floor (operands[0], operands[1], mode, mode);
+    DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 07b4ffe3edf..04e26c957d7 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -253,6 +253,9 @@ enum insn_flags : unsigned int
 
   /* Means INSN has FRM operand and the value is FRM_RUP. */
   FRM_RUP_P = 1 << 16,
+
+  /* Means INSN has FRM operand and the value is FRM_RDN. */
+  FRM_RDN_P = 1 << 17,
 };
 
 enum insn_type : unsigned int
@@ -294,6 +297,7 @@ enum insn_type : unsigned int
   UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P,
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
+  UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
 
   /* Binary operator. */
   BINARY_OP = __NORMAL_OP | BINARY_OP_P,
@@ -437,6 +441,7 @@ void expand_cond_len_unop (unsigned, rtx *);
 void expand_cond_len_binop (unsigned, rtx *);
 void expand_reduction (unsigned, unsigned, rtx *, rtx);
 void expand_vec_ceil (rtx, rtx, machine_mode, machine_mode);
+void expand_vec_floor (rtx, rtx, machine_mode, machine_mode);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
                           bool, void (*)(rtx *, rtx));
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index f63dec573ef..8eb05b32ef2 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -323,8 +323,10 @@ public:
     /* Add rounding mode operand. */
     if (m_insn_flags & FRM_DYN_P)
       add_rounding_mode_operand (FRM_DYN);
-    if (m_insn_flags & FRM_RUP_P)
+    else if (m_insn_flags & FRM_RUP_P)
       add_rounding_mode_operand (FRM_RUP);
+    else if (m_insn_flags & FRM_RDN_P)
+      add_rounding_mode_operand (FRM_RDN);
 
     gcc_assert (insn_data[(int) icode].n_operands == m_opno);
     expand (icode, any_mem_p);
@@ -3508,6 +3510,13 @@ gen_ceil_const_fp (machine_mode inner_mode)
   return const_double_from_real_value (real, inner_mode);
 }
 
+static rtx
+gen_floor_const_fp (machine_mode inner_mode)
+{
+  /* The floor needs the same floating point const as ceil. */
+  return gen_ceil_const_fp (inner_mode);
+}
+
 static rtx
 expand_vec_float_cmp_mask (rtx fp_vector, rtx_code code, rtx fp_scalar,
                            machine_mode vec_fp_mode)
@@ -3568,7 +3577,30 @@ expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
   icode = code_for_pred (FLOAT, vec_fp_mode);
   emit_vlmax_insn (icode, UNARY_OP_TAMU_FRM_RUP, cvt_fp_ops);
 
-  /* Step-4: Retrieve the sign bit. */
+  /* Step-4: Retrieve the sign bit for -0.0. */
+  expand_vec_copysign (op_0, op_0, op_1, vec_fp_mode);
+}
+
+void
+expand_vec_floor (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
+                  machine_mode vec_int_mode)
+{
+  /* Step-1: Generate the mask on const fp. */
+  rtx const_fp = gen_floor_const_fp (GET_MODE_INNER (vec_fp_mode));
+  rtx mask = expand_vec_float_cmp_mask (op_1, LT, const_fp, vec_fp_mode);
+
+  /* Step-2: Convert to integer on mask, with rounding down (aka floor). */
+  rtx tmp = gen_reg_rtx (vec_int_mode);
+  rtx cvt_x_ops[] = {tmp, mask, tmp, op_1};
+  insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, vec_fp_mode);
+  emit_vlmax_insn (icode, UNARY_OP_TAMU_FRM_RDN, cvt_x_ops);
+
+  /* Step-3: Convert to floating-point on mask for the floor result. */
+  rtx cvt_fp_ops[] = {op_0, mask, op_1, tmp};
+  icode = code_for_pred (FLOAT, vec_fp_mode);
+  emit_vlmax_insn (icode, UNARY_OP_TAMU_FRM_RDN, cvt_fp_ops);
+
+  /* Step-4: Retrieve the sign bit for -0.0. */
   expand_vec_copysign (op_0, op_0, op_1, vec_fp_mode);
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-0.c
new file mode 100644
index 00000000000..a9095e0222f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-0.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test__Float16___builtin_floorf16:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+2
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e16,\s*m1,\s*ta,\s*mu
+** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_UNARY_CALL (_Float16, __builtin_floorf16)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-1.c
new file mode 100644
index 00000000000..3cab1597f02
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float___builtin_floorf:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+2
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*mu
+** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_UNARY_CALL (float, __builtin_floorf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-2.c
new file mode 100644
index 00000000000..9b0a30fd217
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double___builtin_floor:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+2
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*mu
+** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_UNARY_CALL (double, __builtin_floor)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-3.c
new file mode 100644
index 00000000000..b1bd8df0bbc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float___builtin_floorf:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+2
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*mu
+** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
+** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
+** ...
+** vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_COND_UNARY_CALL (float, __builtin_floorf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-0.c
new file mode 100644
index 00000000000..6f017887603
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-0.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -std=c2x -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+_Float16 in[ARRAY_SIZE];
+_Float16 out[ARRAY_SIZE];
+_Float16 ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL (_Float16, __builtin_floorf16)
+TEST_ASSERT (_Float16)
+
+TEST_INIT (_Float16, 1.2, 1.0, 1)
+TEST_INIT (_Float16, -1.2, -2.0, 2)
+TEST_INIT (_Float16, 3.0, 3.0, 3)
+TEST_INIT (_Float16, 1023.5, 1023.0, 4)
+TEST_INIT (_Float16, 1025.0, 1025.0, 5)
+TEST_INIT (_Float16, 0.0, 0.0, 6)
+TEST_INIT (_Float16, -0.0, -0.0, 7)
+TEST_INIT (_Float16, -1023.5, -1024.0, 8)
+TEST_INIT (_Float16, -1024.0, -1024.0, 9)
+
+int
+main ()
+{
+  RUN_TEST (_Float16, 1, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 2, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 3, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 4, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 5, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 6, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 7, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 8, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (_Float16, 9, __builtin_floorf16, in, out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-1.c
new file mode 100644
index 00000000000..25df3f89fa7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-1.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-march=rv64gcv -std=c99 -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+float out[ARRAY_SIZE];
+float ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL (float, __builtin_floorf)
+TEST_ASSERT (float)
+
+TEST_INIT (float, 1.2, 1.0, 1)
+TEST_INIT (float, -1.2, -2.0, 2)
+TEST_INIT (float, 3.0, 3.0, 3)
+TEST_INIT (float, 8388607.5, 8388607.0, 4)
+TEST_INIT (float, 8388609.0, 8388609.0, 5)
+TEST_INIT (float, 0.0, 0.0, 6)
+TEST_INIT (float, -0.0, -0.0, 7)
+TEST_INIT (float, -8388607.5, -8388608.0, 8)
+TEST_INIT (float, -8388608.0, -8388608.0, 9)
+
+int
+main ()
+{
+  RUN_TEST (float, 1, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 2, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 3, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 4, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 5, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 6, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 7, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 8, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (float, 9, __builtin_floorf, in, out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-2.c
new file mode 100644
index 00000000000..7090b95cd2c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-floor-run-2.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-march=rv64gcv -std=c99 -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+double out[ARRAY_SIZE];
+double ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL (double, __builtin_floor)
+TEST_ASSERT (double)
+
+TEST_INIT (double, 1.2, 1.0, 1)
+TEST_INIT (double, -1.2, -2.0, 2)
+TEST_INIT (double, 3.0, 3.0, 3)
+TEST_INIT (double, 4503599627370495.5, 4503599627370495.0, 4)
+TEST_INIT (double, 4503599627370497.0, 4503599627370497.0, 5)
+TEST_INIT (double, 0.0, 0.0, 6)
+TEST_INIT (double, -0.0, -0.0, 7)
+TEST_INIT (double, -4503599627370495.5, -4503599627370496.0, 8)
+TEST_INIT (double, -4503599627370496.0, -4503599627370496.0, 9)
+
+int
+main ()
+{
+  RUN_TEST (double, 1, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 2, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 3, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 4, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 5, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 6, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 7, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 8, __builtin_floor, in, out, ref, ARRAY_SIZE);
+  RUN_TEST (double, 9, __builtin_floor, in, out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-floor-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-floor-1.c
new file mode 100644
index 00000000000..076580e6a58
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-floor-1.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 --param=riscv-autovec-lmul=m8 -ffast-math -fdump-tree-optimized" } */
+
+#include "def.h"
+
+DEF_OP_V (floorf16, 1, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 2, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 4, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 8, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 16, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 32, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 64, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 128, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 256, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 512, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 1024, _Float16, __builtin_floorf16)
+DEF_OP_V (floorf16, 2048, _Float16, __builtin_floorf16)
+
+DEF_OP_V (floorf, 1, float, __builtin_floorf)
+DEF_OP_V (floorf, 2, float, __builtin_floorf)
+DEF_OP_V (floorf, 4, float, __builtin_floorf)
+DEF_OP_V (floorf, 8, float, __builtin_floorf)
+DEF_OP_V (floorf, 16, float, __builtin_floorf)
+DEF_OP_V (floorf, 32, float, __builtin_floorf)
+DEF_OP_V (floorf, 64, float, __builtin_floorf)
+DEF_OP_V (floorf, 128, float, __builtin_floorf)
+DEF_OP_V (floorf, 256, float, __builtin_floorf)
+DEF_OP_V (floorf, 512, float, __builtin_floorf)
+DEF_OP_V (floorf, 1024, float, __builtin_floorf)
+
+DEF_OP_V (floor, 1, double, __builtin_floor)
+DEF_OP_V (floor, 2, double, __builtin_floor)
+DEF_OP_V (floor, 4, double, __builtin_floor)
+DEF_OP_V (floor, 8, double, __builtin_floor)
+DEF_OP_V (floor, 16, double, __builtin_floor)
+DEF_OP_V (floor, 32, double, __builtin_floor)
+DEF_OP_V (floor, 64, double, __builtin_floor)
+DEF_OP_V (floor, 128, double, __builtin_floor)
+DEF_OP_V (floor, 256, double, __builtin_floor)
+DEF_OP_V (floor, 512, double, __builtin_floor)
+
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "4,4" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "16,16" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "32,32" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "64,64" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "128,128" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "256,256" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "512,512" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "1024,1024" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "2048,2048" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "4096,4096" "optimized" } } */
+/* { dg-final { scan-assembler-times {vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t} 30 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t} 30 } } */
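
For reference, the four steps of expand_vec_floor can be modeled per
element in scalar C. This is only a sketch for single precision under
the assumptions stated in the comments; the names float_bits,
sign_inject, floor_model and floor_model_selfcheck are illustrative and
do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 binary32 encoding of F.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Rough model of vfsgnj.vv: magnitude of MAG, sign bit of SGN.  */
static float
sign_inject (float mag, float sgn)
{
  uint32_t u = (float_bits (mag) & 0x7fffffffu)
               | (float_bits (sgn) & 0x80000000u);
  float r;
  memcpy (&r, &u, sizeof r);
  return r;
}

/* Hypothetical scalar model of one element of the vector sequence.  */
static float
floor_model (float x)
{
  const float limit = 8388608.0f; /* 2^23 */

  /* Step-1 (vfabs.v + vmflt.vv): the element is active only when
     |x| < 2^23.  Inactive elements, NaN included, keep the input.  */
  if (!(sign_inject (x, 0.0f) < limit))
    return x;

  /* Step-2 (vfcvt.x.f.v under RDN): a C cast truncates toward zero,
     so step down by one when truncation moved the value up.  */
  int32_t i = (int32_t) x;
  if ((float) i > x)
    i -= 1;

  /* Step-3 (vfcvt.f.x.v): exact, because |i| <= 2^23.  */
  float r = (float) i;

  /* Step-4 (vfsgnj.vv): restore the sign so that -0.0 stays -0.0.  */
  return sign_inject (r, x);
}

/* Checks taken from the expected values in math-floor-run-1.c.  */
void
floor_model_selfcheck (void)
{
  assert (floor_model (1.2f) == 1.0f);
  assert (floor_model (-1.2f) == -2.0f);
  assert (floor_model (8388607.5f) == 8388607.0f);
  assert (floor_model (-8388607.5f) == -8388608.0f);
  assert (floor_model (8388609.0f) == 8388609.0f);
  assert (float_bits (floor_model (-0.0f)) == 0x80000000u);
}
```

Calling floor_model_selfcheck () from a test driver exercises the same
boundary values as the float run test. The _Float16 and double run tests
use boundary values around 1024.0 and 4503599627370496.0 in the same way.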