From patchwork Thu Sep 8 20:38:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kwok Cheung Yeung
X-Patchwork-Id: 1103
Date: Thu, 8 Sep 2022 21:38:54 +0100
From: Kwok Cheung Yeung
To: gcc-patches, Andrew Stubbs
Subject: [PATCH] amdgcn: Add support for additional natively supported floating-point operations

Hello

This patch adds support for some additional floating-point operations, in scalar and vector modes, which are natively supported by the AMD GCN instruction set, but haven't been implemented in GCC yet. With the exception of frexp, these implement standard RTL names, and should be utilised automatically by GCC.

The instructions for the transcendental functions are documented to have limited numerical precision, so they are only used if unsafe_math_optimizations are enabled for now.

The sin and cos instructions for some reason are scaled by 2*PI radians (i.e. 1.0 == 2*PI radians/360 degrees), so their inputs need to be scaled by 1/(2*PI) first. I've implemented this as an expander to two instructions - one to do the pre-scaling, one to do the sin/cos.
1/(2*PI) is a builtin constant for GCN, but the syntax to use it in the LLVM assembler was wrong - now fixed.

I have also added some extra GCN-specific builtins to access the vector versions of some of these operations (to implement vectorized versions of library math routines) and to access the frexp operations.

Okay for trunk?

Thanks

Kwok

From 5592c4512212ba74a7a690821650ddcba05df848 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung
Date: Thu, 8 Sep 2022 17:37:26 +0000
Subject: [PATCH] amdgcn: Add support for additional natively supported floating-point operations

This adds support for the following natively supported floating-point operations, in scalar and vectorized modes:

floor, ceil, exp2*, log2*, sin*, cos*, ldexp, frexp

* These operations are single-precision float only and are only active if unsafe_math_optimizations are enabled (due to potential numerical precision issues).

2022-09-08  Kwok Cheung Yeung

	gcc/
	* config/gcn/gcn-builtins.def (FABSVF, LDEXPVF, LDEXPV, FREXPVF_EXP,
	FREXPVF_MANT, FREXPV_EXP, FREXPV_MANT): Add new builtins.
	* config/gcn/gcn-protos.h (gcn_dconst1over2pi): New prototype.
	* config/gcn/gcn-valu.md (MATH_UNOP_1OR2REG, MATH_UNOP_1REG,
	MATH_UNOP_TRIG): New iterators.
	(math_unop): New attributes.
	(<math_unop><mode>2, <math_unop><mode>2, <math_unop><mode>2,
	<math_unop><mode>2, <math_unop><mode>2_insn,
	<math_unop><mode>2_insn, ldexp<mode>3, ldexp<mode>3,
	frexp<mode>_exp2, frexp<mode>_mant2, frexp<mode>_exp2,
	frexp<mode>_mant2): New instructions.
	(<math_unop><mode>2, <math_unop><mode>2): New expanders.
	* config/gcn/gcn.cc (init_ext_gcn_constants): Update definition of
	dconst1over2pi.
	(gcn_dconst1over2pi): New.
	(gcn_builtin_type_index): Add entry for v64df type.
	(v64df_type_node): New.
	(gcn_init_builtin_types): Initialize v64df_type_node.
	(gcn_expand_builtin_1): Expand new builtins to instructions.
	(print_operand): Fix assembler output for 1/(2*PI) constant.
	* config/gcn/gcn.md (unspec): Add new entries.
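The two ChangeLog entries about the 1/(2*PI) constant can be sanity-checked with a small C sketch. It rounds the decimal string used in the patch to single precision and formats it with eight decimals, which should match the literal now emitted by print_operand. This is only an illustration using host floating point; GCC itself works on REAL_VALUE_TYPE, and format_one_over_2pi is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Round the decimal constant from the patch to single precision and
   format it with eight digits after the decimal point.  */
static void
format_one_over_2pi (char *buf, size_t len)
{
  float f = (float) 0.15915494309189532;  /* same digits as the patch */
  snprintf (buf, len, "%.8f", (double) f);
}
```

Calling format_one_over_2pi on a sufficiently large buffer should yield the string 0.15915494 on a host with IEEE single precision.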
---
 gcc/config/gcn/gcn-builtins.def |  35 ++++++
 gcc/config/gcn/gcn-protos.h     |   1 +
 gcc/config/gcn/gcn-valu.md      | 181 ++++++++++++++++++++++++++++++++
 gcc/config/gcn/gcn.cc           | 114 +++++++++++++++++++-
 gcc/config/gcn/gcn.md           |   4 +-
 5 files changed, 332 insertions(+), 3 deletions(-)

diff --git a/gcc/config/gcn/gcn-builtins.def b/gcc/config/gcn/gcn-builtins.def
index 54e4ea4e953..27691909925 100644
--- a/gcc/config/gcn/gcn-builtins.def
+++ b/gcc/config/gcn/gcn-builtins.def
@@ -59,6 +59,41 @@ DEF_BUILTIN (SQRTF, 3 /*CODE_FOR_sqrtf */,
              _A2 (GCN_BTI_SF, GCN_BTI_SF),
              gcn_expand_builtin_1)
 
+DEF_BUILTIN (FABSVF, 3 /*CODE_FOR_fabsvf */,
+             "fabsvf", B_INSN,
+             _A2 (GCN_BTI_V64SF, GCN_BTI_V64SF),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (LDEXPVF, 3 /*CODE_FOR_ldexpvf */,
+             "ldexpvf", B_INSN,
+             _A3 (GCN_BTI_V64SF, GCN_BTI_V64SF, GCN_BTI_V64SI),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (LDEXPV, 3 /*CODE_FOR_ldexpv */,
+             "ldexpv", B_INSN,
+             _A3 (GCN_BTI_V64DF, GCN_BTI_V64DF, GCN_BTI_V64SI),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (FREXPVF_EXP, 3 /*CODE_FOR_frexpvf_exp */,
+             "frexpvf_exp", B_INSN,
+             _A2 (GCN_BTI_V64SI, GCN_BTI_V64SF),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (FREXPVF_MANT, 3 /*CODE_FOR_frexpvf_mant */,
+             "frexpvf_mant", B_INSN,
+             _A2 (GCN_BTI_V64SF, GCN_BTI_V64SF),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (FREXPV_EXP, 3 /*CODE_FOR_frexpv_exp */,
+             "frexpv_exp", B_INSN,
+             _A2 (GCN_BTI_V64SI, GCN_BTI_V64DF),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (FREXPV_MANT, 3 /*CODE_FOR_frexpv_mant */,
+             "frexpv_mant", B_INSN,
+             _A2 (GCN_BTI_V64DF, GCN_BTI_V64DF),
+             gcn_expand_builtin_1)
+
 DEF_BUILTIN (CMP_SWAP, -1,
              "cmp_swap", B_INSN,
              _A4 (GCN_BTI_UINT, GCN_BTI_VOIDPTR, GCN_BTI_UINT, GCN_BTI_UINT),
diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index 38197b929fd..ca804609c09 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -54,6 +54,7 @@ extern int gcn_hard_regno_nregs (int regno, machine_mode mode);
 extern void gcn_hsa_declare_function_name (FILE *file, const char *name,
                                            tree decl);
 extern HOST_WIDE_INT gcn_initial_elimination_offset (int, int);
+extern REAL_VALUE_TYPE gcn_dconst1over2pi (void);
 extern bool gcn_inline_constant64_p (rtx, bool);
 extern bool gcn_inline_constant_p (rtx);
 extern int gcn_inline_fp_constant_p (rtx, bool);
diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 8c33ae0c717..3bfdf8213fc 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -2290,6 +2290,187 @@
   [(set_attr "type" "vop1")
    (set_attr "length" "8")])
 
+; These FP unops have f64, f32 and f16 versions.
+(define_int_iterator MATH_UNOP_1OR2REG
+  [UNSPEC_FLOOR UNSPEC_CEIL])
+
+; These FP unops only have f16/f32 versions.
+(define_int_iterator MATH_UNOP_1REG
+  [UNSPEC_EXP2 UNSPEC_LOG2])
+
+(define_int_iterator MATH_UNOP_TRIG
+  [UNSPEC_SIN UNSPEC_COS])
+
+(define_int_attr math_unop
+  [(UNSPEC_FLOOR "floor")
+   (UNSPEC_CEIL "ceil")
+   (UNSPEC_EXP2 "exp2")
+   (UNSPEC_LOG2 "log2")
+   (UNSPEC_SIN "sin")
+   (UNSPEC_COS "cos")])
+
+(define_insn "<math_unop><mode>2"
+  [(set (match_operand:FP 0 "register_operand" "= v")
+        (unspec:FP
+          [(match_operand:FP 1 "gcn_alu_operand" "vSvB")]
+          MATH_UNOP_1OR2REG))]
+  ""
+  "v_<math_unop>%i0\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "<math_unop><mode>2"
+  [(set (match_operand:V_FP 0 "register_operand" "= v")
+        (unspec:V_FP
+          [(match_operand:V_FP 1 "gcn_alu_operand" "vSvB")]
+          MATH_UNOP_1OR2REG))]
+  ""
+  "v_<math_unop>%i0\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "<math_unop><mode>2"
+  [(set (match_operand:FP_1REG 0 "register_operand" "= v")
+        (unspec:FP_1REG
+          [(match_operand:FP_1REG 1 "gcn_alu_operand" "vSvB")]
+          MATH_UNOP_1REG))]
+  "flag_unsafe_math_optimizations"
+  "v_<math_unop>%i0\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "<math_unop><mode>2"
+  [(set (match_operand:V_FP_1REG 0 "register_operand" "= v")
+        (unspec:V_FP_1REG
+          [(match_operand:V_FP_1REG 1 "gcn_alu_operand" "vSvB")]
+          MATH_UNOP_1REG))]
+  "flag_unsafe_math_optimizations"
+  "v_<math_unop>%i0\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "*<math_unop><mode>2_insn"
+  [(set (match_operand:FP_1REG 0 "register_operand" "= v")
+        (unspec:FP_1REG
+          [(match_operand:FP_1REG 1 "gcn_alu_operand" "vSvB")]
+          MATH_UNOP_TRIG))]
+  "flag_unsafe_math_optimizations"
+  "v_<math_unop>%i0\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "*<math_unop><mode>2_insn"
+  [(set (match_operand:V_FP_1REG 0 "register_operand" "= v")
+        (unspec:V_FP_1REG
+          [(match_operand:V_FP_1REG 1 "gcn_alu_operand" "vSvB")]
+          MATH_UNOP_TRIG))]
+  "flag_unsafe_math_optimizations"
+  "v_<math_unop>%i0\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+; Trigonometric functions need their input scaled by 1/(2*PI) first.
+
+(define_expand "<math_unop><mode>2"
+  [(set (match_dup 2)
+        (mult:FP_1REG
+          (match_dup 3)
+          (match_operand:FP_1REG 1 "gcn_alu_operand")))
+   (set (match_operand:FP_1REG 0 "register_operand")
+        (unspec:FP_1REG
+          [(match_dup 2)]
+          MATH_UNOP_TRIG))]
+  "flag_unsafe_math_optimizations"
+  {
+    operands[2] = gen_reg_rtx (<MODE>mode);
+    operands[3] = const_double_from_real_value (gcn_dconst1over2pi (),
+                                                <MODE>mode);
+  })
+
+(define_expand "<math_unop><mode>2"
+  [(set (match_dup 2)
+        (mult:V_FP_1REG
+          (match_dup 3)
+          (match_operand:V_FP_1REG 1 "gcn_alu_operand")))
+   (set (match_operand:V_FP_1REG 0 "register_operand")
+        (unspec:V_FP_1REG
+          [(match_dup 2)]
+          MATH_UNOP_TRIG))]
+  "flag_unsafe_math_optimizations"
+  {
+    operands[2] = gen_reg_rtx (<MODE>mode);
+    operands[3] =
+            gcn_vec_constant (<MODE>mode,
+                              const_double_from_real_value (gcn_dconst1over2pi (),
+                                                            <SCALAR_MODE>mode));
+  })
+
+; Implement ldexp pattern
+
+(define_insn "ldexp<mode>3"
+  [(set (match_operand:FP 0 "register_operand" "=v")
+        (unspec:FP
+          [(match_operand:FP 1 "gcn_alu_operand" "vB")
+           (match_operand:SI 2 "gcn_alu_operand" "vSvA")]
+          UNSPEC_LDEXP))]
+  ""
+  "v_ldexp%i0\t%0, %1, %2"
+  [(set_attr "type" "vop3a")
+   (set_attr "length" "8")])
+
+(define_insn "ldexp<mode>3<exec>"
+  [(set (match_operand:V_FP 0 "register_operand" "=v")
+        (unspec:V_FP
+          [(match_operand:V_FP 1 "gcn_alu_operand" "vB")
+           (match_operand:V64SI 2 "gcn_alu_operand" "vSvA")]
+          UNSPEC_LDEXP))]
+  ""
+  "v_ldexp%i0\t%0, %1, %2"
+  [(set_attr "type" "vop3a")
+   (set_attr "length" "8")])
+
+; Implement frexp patterns
+
+(define_insn "frexp<mode>_exp2"
+  [(set (match_operand:SI 0 "register_operand" "=v")
+        (unspec:SI
+          [(match_operand:FP 1 "gcn_alu_operand" "vB")]
+          UNSPEC_FREXP_EXP))]
+  ""
+  "v_frexp_exp_i32%i1\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "frexp<mode>_mant2"
+  [(set (match_operand:FP 0 "register_operand" "=v")
+        (unspec:FP
+          [(match_operand:FP 1 "gcn_alu_operand" "vB")]
+          UNSPEC_FREXP_MANT))]
+  ""
+  "v_frexp_mant%i1\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "frexp<mode>_exp2<exec>"
+  [(set (match_operand:V64SI 0 "register_operand" "=v")
+        (unspec:V64SI
+          [(match_operand:V_FP 1 "gcn_alu_operand" "vB")]
+          UNSPEC_FREXP_EXP))]
+  ""
+  "v_frexp_exp_i32%i1\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
+(define_insn "frexp<mode>_mant2<exec>"
+  [(set (match_operand:V_FP 0 "register_operand" "=v")
+        (unspec:V_FP
+          [(match_operand:V_FP 1 "gcn_alu_operand" "vB")]
+          UNSPEC_FREXP_MANT))]
+  ""
+  "v_frexp_mant%i1\t%0, %1"
+  [(set_attr "type" "vop1")
+   (set_attr "length" "8")])
+
 ;; }}}
 ;; {{{ FP fused multiply and add
 
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 82667556512..eb822e20dd1 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -779,12 +779,20 @@ init_ext_gcn_constants (void)
   /* FIXME: this constant probably does not match what hardware really loads.
      Reality check it eventually.  */
   real_from_string (&dconst1over2pi,
-                    "0.1591549430918953357663423455968866839");
+                    "0.15915494309189532");
   real_convert (&dconst1over2pi, SFmode, &dconst1over2pi);
 
   ext_gcn_constants_init = 1;
 }
 
+REAL_VALUE_TYPE
+gcn_dconst1over2pi (void)
+{
+  if (!ext_gcn_constants_init)
+    init_ext_gcn_constants ();
+  return dconst1over2pi;
+}
+
 /* Return non-zero if X is a constant that can appear as an inline operand.
    This is 0, 0.5, -0.5, 1, -1, 2, -2, 4,-4, 1/(2*pi)
    Or a vector of those.
@@ -3605,6 +3613,7 @@ enum gcn_builtin_type_index
   GCN_BTI_SF,
   GCN_BTI_V64SI,
   GCN_BTI_V64SF,
+  GCN_BTI_V64DF,
   GCN_BTI_V64PTR,
   GCN_BTI_SIPTR,
   GCN_BTI_SFPTR,
@@ -3621,6 +3630,7 @@ static GTY(()) tree gcn_builtin_types[GCN_BTI_MAX];
 #define sf_type_node (gcn_builtin_types[GCN_BTI_SF])
 #define v64si_type_node (gcn_builtin_types[GCN_BTI_V64SI])
 #define v64sf_type_node (gcn_builtin_types[GCN_BTI_V64SF])
+#define v64df_type_node (gcn_builtin_types[GCN_BTI_V64DF])
 #define v64ptr_type_node (gcn_builtin_types[GCN_BTI_V64PTR])
 #define siptr_type_node (gcn_builtin_types[GCN_BTI_SIPTR])
 #define sfptr_type_node (gcn_builtin_types[GCN_BTI_SFPTR])
@@ -3710,6 +3720,7 @@ gcn_init_builtin_types (void)
   sf_type_node = float32_type_node;
   v64si_type_node = build_vector_type (intSI_type_node, 64);
   v64sf_type_node = build_vector_type (float_type_node, 64);
+  v64df_type_node = build_vector_type (double_type_node, 64);
   v64ptr_type_node = build_vector_type (unsigned_intDI_type_node
                                         /*build_pointer_type
                                            (integer_type_node) */
@@ -3977,6 +3988,105 @@ gcn_expand_builtin_1 (tree exp, rtx target, rtx /*subtarget */ ,
         emit_insn (gen_sqrtsf2 (target, arg));
         return target;
       }
+    case GCN_BUILTIN_FABSVF:
+      {
+        if (ignore)
+          return target;
+        rtx exec = gcn_full_exec_reg ();
+        rtx arg = force_reg (V64SFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64SFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_absv64sf2_exec
+                   (target, arg, gcn_gen_undef (V64SFmode), exec));
+        return target;
+      }
+    case GCN_BUILTIN_LDEXPVF:
+      {
+        if (ignore)
+          return target;
+        rtx exec = gcn_full_exec_reg ();
+        rtx arg1 = force_reg (V64SFmode,
+                              expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                           V64SFmode,
+                                           EXPAND_NORMAL));
+        rtx arg2 = force_reg (V64SImode,
+                              expand_expr (CALL_EXPR_ARG (exp, 1), NULL_RTX,
+                                           V64SImode,
+                                           EXPAND_NORMAL));
+        emit_insn (gen_ldexpv64sf3_exec
+                   (target, arg1, arg2, gcn_gen_undef (V64SFmode), exec));
+        return target;
+      }
+    case GCN_BUILTIN_LDEXPV:
+      {
+        if (ignore)
+          return target;
+        rtx exec = gcn_full_exec_reg ();
+        rtx arg1 = force_reg (V64DFmode,
+                              expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                           V64DFmode,
+                                           EXPAND_NORMAL));
+        rtx arg2 = force_reg (V64SImode,
+                              expand_expr (CALL_EXPR_ARG (exp, 1), NULL_RTX,
+                                           V64SImode,
+                                           EXPAND_NORMAL));
+        emit_insn (gen_ldexpv64df3_exec
+                   (target, arg1, arg2, gcn_gen_undef (V64DFmode), exec));
+        return target;
+      }
+    case GCN_BUILTIN_FREXPVF_EXP:
+      {
+        if (ignore)
+          return target;
+        rtx exec = gcn_full_exec_reg ();
+        rtx arg = force_reg (V64SFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64SFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_frexpv64sf_exp2_exec
+                   (target, arg, gcn_gen_undef (V64SImode), exec));
+        return target;
+      }
+    case GCN_BUILTIN_FREXPVF_MANT:
+      {
+        if (ignore)
+          return target;
+        rtx exec = gcn_full_exec_reg ();
+        rtx arg = force_reg (V64SFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64SFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_frexpv64sf_mant2_exec
+                   (target, arg, gcn_gen_undef (V64SFmode), exec));
+        return target;
+      }
+    case GCN_BUILTIN_FREXPV_EXP:
+      {
+        if (ignore)
+          return target;
+        rtx exec = gcn_full_exec_reg ();
+        rtx arg = force_reg (V64DFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64DFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_frexpv64df_exp2_exec
+                   (target, arg, gcn_gen_undef (V64SImode), exec));
+        return target;
+      }
+    case GCN_BUILTIN_FREXPV_MANT:
+      {
+        if (ignore)
+          return target;
+        rtx exec = gcn_full_exec_reg ();
+        rtx arg = force_reg (V64DFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64DFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_frexpv64df_mant2_exec
+                   (target, arg, gcn_gen_undef (V64DFmode), exec));
+        return target;
+      }
     case GCN_BUILTIN_OMP_DIM_SIZE:
       {
         if (ignore)
@@ -6476,7 +6586,7 @@ print_operand (FILE *file, rtx x, int code)
               str = "-4.0";
               break;
             case 248:
-              str = "1/pi";
+              str = "0.15915494";
               break;
             default:
               rtx ix = simplify_gen_subreg (GET_MODE (x) == DFmode
diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index 7805e867901..a3c9523cd6d 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -82,7 +82,9 @@
   UNSPEC_GATHER
   UNSPEC_SCATTER
   UNSPEC_RCP
-  UNSPEC_FLBIT_INT])
+  UNSPEC_FLBIT_INT
+  UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_SIN UNSPEC_COS UNSPEC_EXP2 UNSPEC_LOG2
+  UNSPEC_LDEXP UNSPEC_FREXP_EXP UNSPEC_FREXP_MANT])
 
 ;; }}}
 ;; {{{ Attributes
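
For readers unfamiliar with the split frexp builtins added above, the following C sketch models how an exponent result and a mantissa result can be recombined with ldexp, lane by lane. It assumes the native operations follow the C frexpf convention (mantissa magnitude in [0.5, 1)), which this patch does not itself state. The names model_frexp_exp, model_frexp_mant and split_and_rebuild are hypothetical and only stand in for the frexpvf_exp, frexpvf_mant and ldexpvf builtins.

```c
#include <assert.h>
#include <math.h>

#define LANES 64  /* matches the 64-lane vector modes used by the builtins */

/* Scalar model of the exponent half, assumed to follow C frexpf.  */
static int
model_frexp_exp (float x)
{
  int e;
  (void) frexpf (x, &e);
  return e;
}

/* Scalar model of the mantissa half, assumed to follow C frexpf.  */
static float
model_frexp_mant (float x)
{
  int e;
  return frexpf (x, &e);
}

/* Lane-wise round trip: split each input into mantissa and exponent,
   then rebuild the value with ldexpf.  */
static void
split_and_rebuild (const float *in, float *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = ldexpf (model_frexp_mant (in[i]), model_frexp_exp (in[i]));
}
```

Under the stated assumption the round trip is exact for finite inputs, which is what makes the exponent and mantissa pair useful as a building block for vectorized library math routines.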