From patchwork Thu Aug 31 02:46:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: chenxiaolong
X-Patchwork-Id: 137221
From: chenxiaolong
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v4] LoongArch: Implement 128-bit floating point functions in gcc.
Date: Thu, 31 Aug 2023 10:46:57 +0800
Message-Id: <20230831024657.57063-1-chenxiaolong@loongson.cn>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
List-Id: Gcc-patches mailing list
Cc: chenxiaolong, xuchenghua@loongson.cn, chenglulu@loongson.cn, i@xen0n.name

Brief version history of the patch set:

v1 -> v2:
   Adjust the formatting of the "q"-suffixed function implementations
   to follow the GNU coding standards.

v2 -> v3:
   1. Following the existing 64-bit functions on LoongArch, modify the
      underlying implementation of the __builtin_{nanq,nansq} functions
      in libgcc.
   2. Modify the instruction templates so that the 128-bit
      __builtin_{fabsq,copysignq} functions are implemented with
      instructions such as "bstrins.d" instead of calling into libgcc,
      which makes better use of the hardware.

v3 -> v4:
   1. v1, v2 and v3 all implemented the 128-bit floating-point functions
      with the "q" suffix directly, which is the older approach.  v4
      abandons that implementation and instead associates each
      "q"-suffixed function with the "f128" function that already
      exists in GCC.
   2. Modify the code so that both the "__float128" and "_Float128"
      types are supported by the compiler.
   3. Because each "q"-suffixed function is associated with its "f128"
      counterpart, the two spellings produce the same result, for
      example __builtin_huge_valq/__builtin_huge_valf128,
      __builtin_infq/__builtin_inff128, __builtin_nanq/__builtin_nanf128
      and __builtin_nansq/__builtin_nansf128.
   4. For __builtin_{fabsq,copysignq}, do not rely on the generic "f128"
      implementation; instead, implement them in the machine
      description file with "bstrins" and related instructions, which
      reduces the number of assembly instructions generated.

During implementation, float128_type_node is bound to the type name
"__float128" so that the compiler can correctly identify the type.
The "q"-suffixed functions are associated with the "f128" functions, so
GCC accepts either spelling for
__builtin_{huge_valq,infq,fabsq,copysignq,nanq,nansq}.  In addition, the
__builtin_{copysign,fabs}{q,f128} functions are implemented with
"bstrins" and related instructions on the LoongArch architecture, so
that no library call is needed.

gcc/ChangeLog:

	* config/loongarch/loongarch-builtins.cc (loongarch_init_builtins):
	Associate the __float128 type to float128_type_node so that it can
	be recognized by the compiler.
	* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins):
	Define __FLOAT128_TYPE__ and associate the functions with the
	suffix "q" to the "f128" functions.
	* config/loongarch/loongarch.md (abstf2): Implement the built-in
	function __builtin_fabsf128().
	(abstf_local): Ditto.
	(copysigntf3): Implement the built-in function
	__builtin_copysignf128().
	* doc/extend.texi: Add support for 128-bit floating-point functions
	on the LoongArch architecture.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/math-float-128.c: New test.
---
 gcc/config/loongarch/loongarch-builtins.cc    |   5 +
 gcc/config/loongarch/loongarch-c.cc           |  11 ++
 gcc/config/loongarch/loongarch.md             |  54 ++++++++
 gcc/doc/extend.texi                           |  20 ++-
 .../gcc.target/loongarch/math-float-128.c     | 115 ++++++++++++++++++
 5 files changed, 202 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/math-float-128.c

diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index b929f224dfa..58b612bf445 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -256,6 +256,11 @@ loongarch_init_builtins (void)
   unsigned int i;
   tree type;
 
+  /* Register the type float128_type_node as a built-in type and
+     give it an alias "__float128".  */
+  (*lang_hooks.types.register_builtin_type) (float128_type_node,
+					     "__float128");
+
   /* Iterate through all of the bdesc arrays, initializing all of the
      builtin functions.  */
   for (i = 0; i < ARRAY_SIZE (loongarch_builtins); i++)
diff --git a/gcc/config/loongarch/loongarch-c.cc b/gcc/config/loongarch/loongarch-c.cc
index 67911b78f28..6ffbf748316 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -99,6 +99,17 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
     builtin_define ("__loongarch_frlen=0");
 
+  /* Add support for FLOAT128_TYPE on the LoongArch architecture.  */
+  builtin_define ("__FLOAT128_TYPE__");
+
+  /* Map the old _Float128 'q' builtins into the new 'f128' builtins.  */
+  builtin_define ("__builtin_fabsq=__builtin_fabsf128");
+  builtin_define ("__builtin_copysignq=__builtin_copysignf128");
+  builtin_define ("__builtin_nanq=__builtin_nanf128");
+  builtin_define ("__builtin_nansq=__builtin_nansf128");
+  builtin_define ("__builtin_infq=__builtin_inff128");
+  builtin_define ("__builtin_huge_valq=__builtin_huge_valf128");
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index b37e070660f..230d7aac972 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -38,6 +38,7 @@ (define_c_enum "unspec" [
   UNSPEC_FMAX
   UNSPEC_FMIN
   UNSPEC_FCOPYSIGN
+  UNSPEC_COPYSIGNF128
   UNSPEC_FTINT
   UNSPEC_FTINTRM
   UNSPEC_FTINTRP
@@ -2008,6 +2009,59 @@ (define_insn "movfcc"
   ""
   "movgr2cf\t%0,$r0")
 
+;; Implement __builtin_fabsf128 function.
+
+(define_expand "abstf2"
+  [(match_operand:TF 0 "register_operand")
+   (match_operand:TF 1 "register_operand")]
+  "TARGET_64BIT"
+{
+  loongarch_emit_move (operands[0], operands[1]);
+  emit_insn (gen_abstf_local (operands[0]));
+  DONE;
+})
+
+(define_insn "abstf_local"
+  [(set (match_operand:TF 0 "register_operand" "+r")
+	(abs:TF (match_dup 0)))]
+  "TARGET_64BIT"
+{
+  operands[0] = gen_rtx_REG (DImode, REGNO (operands[0]) + 1);
+  return "bstrins.d\t%0,$r0,0x3f,0x3f";
+})
+
+;; Implement __builtin_copysignf128 function.
+
+(define_insn_and_split "copysigntf3"
+  [(set (match_operand:TF 0 "register_operand" "=&r")
+	(unspec:TF [(match_operand:TF 1 "register_operand" "r")
+		    (match_operand:TF 2 "register_operand" "r")]
+		   UNSPEC_COPYSIGNF128))]
+  "TARGET_64BIT"
+  "#"
+  "reload_completed"
+  [(const_int 0)]
+{
+  rtx op0_lo = gen_rtx_REG (DImode, REGNO (operands[0]) + 0);
+  rtx op0_hi = gen_rtx_REG (DImode, REGNO (operands[0]) + 1);
+  rtx op1_lo = gen_rtx_REG (DImode, REGNO (operands[1]) + 0);
+  rtx op1_hi = gen_rtx_REG (DImode, REGNO (operands[1]) + 1);
+  rtx op2_hi = gen_rtx_REG (DImode, REGNO (operands[2]) + 1);
+
+  if (REGNO (operands[1]) == REGNO (operands[2]))
+    {
+      loongarch_emit_move (operands[0], operands[1]);
+      DONE;
+    }
+  else
+    {
+      loongarch_emit_move (op0_hi, op2_hi);
+      loongarch_emit_move (op0_lo, op1_lo);
+      emit_insn (gen_insvdi (op0_hi, GEN_INT (63), GEN_INT (0), op1_hi));
+      DONE;
+    }
+})
+
 ;; Conditional move instructions.
 
 (define_insn "*sel_using_"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 400284b85f5..074f1ee33a0 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1093,10 +1093,10 @@ types.
 As an extension, GNU C and GNU C++ support additional floating
 types, which are not supported by all targets.
 @itemize @bullet
-@item @code{__float128} is available on i386, x86_64, IA-64, and
-hppa HP-UX, as well as on PowerPC GNU/Linux targets that enable
+@item @code{__float128} is available on i386, x86_64, IA-64, LoongArch
+and hppa HP-UX, as well as on PowerPC GNU/Linux targets that enable
 the vector scalar (VSX) instruction set.  @code{__float128} supports
-the 128-bit floating type.  On i386, x86_64, PowerPC, and IA-64
+the 128-bit floating type.  On i386, x86_64, PowerPC, LoongArch and IA-64,
 other than HP-UX, @code{__float128} is an alias for @code{_Float128}.
 On hppa and IA-64 HP-UX, @code{__float128} is an alias for @code{long
 double}.
@@ -16657,6 +16657,20 @@ function you need to include @code{larchintrin.h}.
     void __break (imm0_32767)
 @end smallexample
 
+Additional built-in functions are available for LoongArch family
+processors to efficiently use 128-bit floating-point (@code{__float128})
+values.
+
+The following are the basic built-in functions supported.
+@smallexample
+__float128 __builtin_fabsq (__float128);
+__float128 __builtin_copysignq (__float128, __float128);
+__float128 __builtin_infq (void);
+__float128 __builtin_huge_valq (void);
+__float128 __builtin_nanq (const char *);
+__float128 __builtin_nansq (const char *);
+@end smallexample
+
 @node MIPS DSP Built-in Functions
 @subsection MIPS DSP Built-in Functions
 
diff --git a/gcc/testsuite/gcc.target/loongarch/math-float-128.c b/gcc/testsuite/gcc.target/loongarch/math-float-128.c
new file mode 100644
index 00000000000..5eab3019278
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/math-float-128.c
@@ -0,0 +1,115 @@
+/* { dg-do compile } */
+/* { dg-options " -march=loongarch64 " } */
+/* { dg-final { scan-assembler-times {bstrins.d} 6} } */
+/* { dg-final { scan-assembler-not "my_fabsq2:.*\\bl\t%plt\\(__fabsq\\).*my_fabsq2" } } */
+/* { dg-final { scan-assembler-not "my_copysignq2:.*\\bl\t%plt\\(__copysignq\\).*my_copysignq2" } } */
+/* { dg-final { scan-assembler-not "my_nanq2:.*\\bl\t%plt\\(__builtin_nanq\\).*my_nanq2" } } */
+/* { dg-final { scan-assembler-not "my_nansq2:.*\\bl\t%plt\\(__builtin_nansq\\).*my_nansq2" } } */
+
+__float128
+my_fabsq1 (__float128 a)
+{
+  return __builtin_fabsq (a);
+}
+
+_Float128
+my_fabsq2 (_Float128 a)
+{
+  return __builtin_fabsq (a);
+}
+
+_Float128
+my_fabsf128 (_Float128 a)
+{
+  return __builtin_fabsf128 (a);
+}
+
+__float128
+my_copysignq1 (__float128 a, __float128 b)
+{
+  return __builtin_copysignq (a, b);
+}
+
+_Float128
+my_copysignq2 (_Float128 a, _Float128 b)
+{
+  return __builtin_copysignq (a, b);
+}
+
+_Float128
+my_copysignf128 (_Float128 a, _Float128 b)
+{
+  return __builtin_copysignf128 (a, b);
+}
+
+__float128
+my_infq1 (void)
+{
+  return __builtin_infq ();
+}
+
+_Float128
+my_infq2 (void)
+{
+  return __builtin_infq ();
+}
+
+_Float128
+my_inff128 (void)
+{
+  return __builtin_inff128 ();
+}
+
+__float128
+my_huge_valq1 (void)
+{
+  return __builtin_huge_valq ();
+}
+
+_Float128
+my_huge_valq2 (void)
+{
+  return __builtin_huge_valq ();
+}
+
+_Float128
+my_huge_valf128 (void)
+{
+  return __builtin_huge_valf128 ();
+}
+
+__float128
+my_nanq1 (void)
+{
+  return __builtin_nanq ("");
+}
+
+_Float128
+my_nanq2 (void)
+{
+  return __builtin_nanq ("");
+}
+
+_Float128
+my_nanf128 (void)
+{
+  return __builtin_nanf128 ("");
+}
+
+__float128
+my_nansq1 (void)
+{
+  return __builtin_nansq ("");
+}
+
+_Float128
+my_nansq2 (void)
+{
+  return __builtin_nansq ("");
+}
+
+_Float128
+my_nansf128 (void)
+{
+  return __builtin_nansf128 ("");
+}
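
For reviewers who want to check the bit manipulation behind the abstf_local and copysigntf3 patterns, the following is a minimal C sketch of the same operations on a pair of 64-bit words. It is not part of the patch. It assumes the IEEE binary128 layout, with the sign in bit 63 of the high doubleword (the register at REGNO + 1 in the patterns). The names tf_bits, tf_make, tf_fabs and tf_copysign are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical model of a binary128 value held in a register pair:
   "lo" stands for the register at REGNO, "hi" for REGNO + 1.  */
typedef struct
{
  uint64_t lo;
  uint64_t hi;
} tf_bits;

/* Bit 63 of the high doubleword is the sign bit (assumed layout).  */
#define TF_SIGN_BIT (UINT64_C (1) << 63)

/* Illustration-only helper to build a value from its two halves.  */
static tf_bits
tf_make (uint64_t hi, uint64_t lo)
{
  tf_bits r = { lo, hi };
  return r;
}

/* Models "bstrins.d hi,$r0,0x3f,0x3f": overwrite bit 63 of the high
   doubleword with zero and leave every other bit untouched.  */
static tf_bits
tf_fabs (tf_bits a)
{
  a.hi &= ~TF_SIGN_BIT;
  return a;
}

/* Models the copysigntf3 split: start from the high doubleword of the
   sign source B, copy the low doubleword of A, then insert bits 62..0
   of A's high doubleword, so that only B's sign bit survives.  */
static tf_bits
tf_copysign (tf_bits a, tf_bits b)
{
  tf_bits r;
  r.hi = b.hi;
  r.lo = a.lo;
  r.hi = (r.hi & TF_SIGN_BIT) | (a.hi & ~TF_SIGN_BIT);
  return r;
}
```

For example, tf_fabs applied to the bit pattern of -2.0 (hi = 0xC000000000000000, lo = 0) gives hi = 0x4000000000000000, which is the pattern of 2.0. In both functions the low doubleword passes through unchanged, which is why each built-in needs only a single bstrins.d on the high register.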