From patchwork Wed Nov 2 02:42:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 13999
Date: Tue, 1 Nov 2022 22:42:30 -0400
From: Michael Meissner
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool,
 "Kewen.Lin", David Edelsohn, Peter Bergner, Will Schmidt, William Seurer
Subject: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

This patch fixes the issue that GCC cannot build when the default long double
is IEEE 128-bit.  The build fails in libgcc, specifically when it is trying to
build the __mulkc3 function.  It is failing in gimple-range-fold.cc during the
evrp pass.  Ultimately it is failing because the code declared the type to use
TFmode but it used F128 functions (i.e. KFmode).

    typedef float TFtype __attribute__((mode (TF)));
    typedef __complex float TCtype __attribute__((mode (TC)));

    TCtype
    __mulkc3_sw (TFtype a, TFtype b, TFtype c, TFtype d)
    {
      TFtype ac, bd, ad, bc, x, y;
      TCtype res;

      ac = a * c;
      bd = b * d;
      ad = a * d;
      bc = b * c;

      x = ac - bd;
      y = ad + bc;

      if (__builtin_isnan (x) && __builtin_isnan (y))
        {
          _Bool recalc = 0;
          if (__builtin_isinf (a) || __builtin_isinf (b))
            {
              a = __builtin_copysignf128 (__builtin_isinf (a) ? 1 : 0, a);
              b = __builtin_copysignf128 (__builtin_isinf (b) ? 1 : 0, b);
              if (__builtin_isnan (c))
                c = __builtin_copysignf128 (0, c);
              if (__builtin_isnan (d))
                d = __builtin_copysignf128 (0, d);
              recalc = 1;
            }
          if (__builtin_isinf (c) || __builtin_isinf (d))
            {
              c = __builtin_copysignf128 (__builtin_isinf (c) ? 1 : 0, c);
              d = __builtin_copysignf128 (__builtin_isinf (d) ? 1 : 0, d);
              if (__builtin_isnan (a))
                a = __builtin_copysignf128 (0, a);
              if (__builtin_isnan (b))
                b = __builtin_copysignf128 (0, b);
              recalc = 1;
            }
          if (!recalc
              && (__builtin_isinf (ac) || __builtin_isinf (bd)
                  || __builtin_isinf (ad) || __builtin_isinf (bc)))
            {
              if (__builtin_isnan (a))
                a = __builtin_copysignf128 (0, a);
              if (__builtin_isnan (b))
                b = __builtin_copysignf128 (0, b);
              if (__builtin_isnan (c))
                c = __builtin_copysignf128 (0, c);
              if (__builtin_isnan (d))
                d = __builtin_copysignf128 (0, d);
              recalc = 1;
            }
          if (recalc)
            {
              x = __builtin_inff128 () * (a * c - b * d);
              y = __builtin_inff128 () * (a * d + b * c);
            }
        }

      __real__ res = x;
      __imag__ res = y;
      return res;
    }

Currently GCC uses the long double type node for __float128 if long double is
IEEE 128-bit.  It does not use that node for _Float128.

This was originally noticed when calling the nansq function to make a
signaling NaN (nansq is mapped to nansf128).  Because the type node for
_Float128 is different from the one for __float128, the machine independent
code converts signaling NaNs to quiet NaNs if the types are not compatible.
The following tests used to fail when run on a system where long double is
IEEE 128-bit:

    gcc.dg/torture/float128-nan.c
    gcc.target/powerpc/nan128-1.c

This patch makes both __float128 and _Float128 use the same type node.

One side effect of not using the long double type node for __float128 is that
we must only use KFmode for _Float128/__float128.  The libstdc++ library won't
build if we use TFmode for _Float128 and __float128 when long double is IEEE
128-bit.

Another minor side effect is that the f128 round to odd fused multiply-add
function will not merge negation with the FMA operation when the type is long
double.  If the type is __float128 or _Float128, then it will continue to do
the optimization.  The round to odd functions are defined in terms of
__float128 arguments.  For example:

    long double
    do_fms (long double a, long double b, long double c)
    {
      return __builtin_fmaf128_round_to_odd (a, b, -c);
    }

will generate (assuming -mabi=ieeelongdouble):

    xsnegqp 4,4
    xsmaddqpo 4,2,3
    xxlor 34,36,36

while:

    __float128
    do_fms (__float128 a, __float128 b, __float128 c)
    {
      return __builtin_fmaf128_round_to_odd (a, b, -c);
    }

will generate:

    xsmsubqpo 4,2,3
    xxlor 34,36,36

I tested all 3 patches for PR target/107299 on:

    1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
    2)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
    3)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
    4)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm

Once all 3 patches have been applied, we can once again build GCC when long
double is IEEE 128-bit.  There were no other regressions with these patches.
Can I check these patches into the trunk?

2022-11-01   Michael Meissner

gcc/

    PR target/107299
    * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Always use the
    _Float128 type for __float128.
    (rs6000_expand_builtin): Only change a KFmode built-in to TFmode, if the
    built-in passes or returns TFmode.  If the predicate failed because the
    modes were different, use convert_move to load up the value instead of
    copy_to_mode_reg.
    * config/rs6000/rs6000.cc (rs6000_translate_mode_attribute): Don't
    translate IEEE 128-bit floating point modes to explicit IEEE 128-bit
    modes (KFmode or KCmode), even if long double is IEEE 128-bit.
    (rs6000_libgcc_floating_mode_supported_p): Support KFmode all of the time
    if we support IEEE 128-bit floating point.
    (rs6000_floatn_mode): _Float128 and _Float128x always use KFmode.

gcc/testsuite/

    PR target/107299
    * gcc.target/powerpc/float128-hw12.c: New test.
    * gcc.target/powerpc/float128-hw13.c: Likewise.
    * gcc.target/powerpc/float128-hw4.c: Update insns.
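
As a reading aid for the rs6000_expand_builtin entry, here is a minimal
stand-alone sketch of the selection rule.  It is not the GCC code: enum
fp_mode, enum variant, uses_tf_mode, pick_variant and check_examples are
hypothetical stand-ins for machine_mode, the insn codes and the logic in the
patch, and the sketch assumes a 128-bit long double.  Only the rule itself
(switch to the TFmode pattern when the call returns or passes TFmode) is taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the machine modes discussed in the patch.  */
enum fp_mode { MODE_DF, MODE_KF, MODE_IF, MODE_TF };

/* Hypothetical stand-ins for the KFmode, IFmode and TFmode insn patterns.  */
enum variant { VARIANT_KF, VARIANT_IF, VARIANT_TF };

/* Return true if the call returns TFmode or passes any TFmode argument.  */
bool
uses_tf_mode (enum fp_mode return_mode, const enum fp_mode *args, size_t nargs)
{
  if (return_mode == MODE_TF)
    return true;

  for (size_t i = 0; i < nargs; i++)
    if (args[i] == MODE_TF)
      return true;

  return false;
}

/* Pick the pattern for a built-in whose default pattern is DEFLT.  The
   default is kept unless TFmode is involved and long double has the same
   format as the default pattern.  */
enum variant
pick_variant (enum variant deflt, bool long_double_is_ieee,
              enum fp_mode return_mode, const enum fp_mode *args, size_t nargs)
{
  if (!uses_tf_mode (return_mode, args, nargs))
    return deflt;

  if (long_double_is_ieee && deflt == VARIANT_KF)
    return VARIANT_TF;

  if (!long_double_is_ieee && deflt == VARIANT_IF)
    return VARIANT_TF;

  return deflt;
}

/* Two illustrative cases with IEEE 128-bit long double.  */
void
check_examples (void)
{
  const enum fp_mode kf_args[] = { MODE_KF, MODE_KF };
  const enum fp_mode ld_args[] = { MODE_TF, MODE_TF };

  /* A __float128 call stays on the KFmode pattern.  */
  assert (pick_variant (VARIANT_KF, true, MODE_KF, kf_args, 2) == VARIANT_KF);

  /* A long double call switches to the TFmode pattern.  */
  assert (pick_variant (VARIANT_KF, true, MODE_TF, ld_args, 2) == VARIANT_TF);
}
```

In this model a call that only involves KFmode values keeps its KFmode
pattern even when long double is IEEE 128-bit, which is the behavior the
ChangeLog entry describes.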
---
 gcc/config/rs6000/rs6000-builtin.cc           | 237 ++++++++++--------
 gcc/config/rs6000/rs6000.cc                   |  31 ++-
 .../gcc.target/powerpc/float128-hw12.c        | 137 ++++++++++
 .../gcc.target/powerpc/float128-hw13.c        | 137 ++++++++++
 .../gcc.target/powerpc/float128-hw4.c         |  10 +-
 5 files changed, 431 insertions(+), 121 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-hw12.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-hw13.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 90ab39dc258..e5298f45363 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -730,25 +730,28 @@ rs6000_init_builtins (void)
 
   if (TARGET_FLOAT128_TYPE)
     {
-      if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
-        ieee128_float_type_node = long_double_type_node;
-      else
+      /* In the past we used long_double_type_node when long double was IEEE
+         128-bit.  However, this means that the _Float128 type
+         (i.e. float128_type_node) is a different type from __float128
+         (i.e. ieee128_float_type_node).  This leads to some corner cases,
+         such as processing signaling NaNs with the nansf128 built-in function
+         (which returns a _Float128 value) and assign it to a long double or
+         __float128 value.  The two explicit IEEE 128-bit types should always
+         use the same internal type.
+
+         For C we only need to register the __ieee128 name for it.  For C++, we
+         create a distinct type which will mangle differently (u9__ieee128)
+         vs. _Float128 (DF128_) and behave backwards compatibly.  */
+      if (float128t_type_node == NULL_TREE)
         {
-          /* For C we only need to register the __ieee128 name for
-             it.  For C++, we create a distinct type which will mangle
-             differently (u9__ieee128) vs. _Float128 (DF128_) and behave
-             backwards compatibly.  */
-          if (float128t_type_node == NULL_TREE)
-            {
-              float128t_type_node = make_node (REAL_TYPE);
-              TYPE_PRECISION (float128t_type_node)
-                = TYPE_PRECISION (float128_type_node);
-              layout_type (float128t_type_node);
-              SET_TYPE_MODE (float128t_type_node,
-                             TYPE_MODE (float128_type_node));
-            }
-          ieee128_float_type_node = float128t_type_node;
+          float128t_type_node = make_node (REAL_TYPE);
+          TYPE_PRECISION (float128t_type_node)
+            = TYPE_PRECISION (float128_type_node);
+          layout_type (float128t_type_node);
+          SET_TYPE_MODE (float128t_type_node,
+                         TYPE_MODE (float128_type_node));
         }
+      ieee128_float_type_node = float128t_type_node;
       t = build_qualified_type (ieee128_float_type_node, TYPE_QUAL_CONST);
       lang_hooks.types.register_builtin_type (ieee128_float_type_node,
                                               "__ieee128");
@@ -3265,13 +3268,13 @@ htm_expand_builtin (bifdata *bifaddr, rs6000_gen_builtins fcode,
 
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
-   (and in mode MODE if that's convenient).
+   (and in mode RETURN_MODE if that's convenient).
    SUBTARGET may be used as the target for computing one of EXP's operands.
    IGNORE is nonzero if the value is to be ignored.
    Use the new builtin infrastructure.  */
 rtx
 rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
-                       machine_mode /* mode */, int ignore)
+                       machine_mode return_mode, int ignore)
 {
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   enum rs6000_gen_builtins fcode
@@ -3287,78 +3290,99 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
   size_t uns_fcode = (size_t)fcode;
   enum insn_code icode = rs6000_builtin_info[uns_fcode].icode;
 
-  /* TODO: The following commentary and code is inherited from the original
-     builtin processing code.  The commentary is a bit confusing, with the
-     intent being that KFmode is always IEEE-128, IFmode is always IBM
-     double-double, and TFmode is the current long double.  The code is
-     confusing in that it converts from KFmode to TFmode pattern names,
-     when the other direction is more intuitive.  Try to address this.  */
-
-  /* We have two different modes (KFmode, TFmode) that are the IEEE
-     128-bit floating point type, depending on whether long double is the
-     IBM extended double (KFmode) or long double is IEEE 128-bit (TFmode).
-     It is simpler if we only define one variant of the built-in function,
-     and switch the code when defining it, rather than defining two built-
-     ins and using the overload table in rs6000-c.cc to switch between the
-     two.  If we don't have the proper assembler, don't do this switch
-     because CODE_FOR_*kf* and CODE_FOR_*tf* will be CODE_FOR_nothing.  */
-  if (FLOAT128_IEEE_P (TFmode))
-    switch (icode)
-      {
-      case CODE_FOR_sqrtkf2_odd:
-        icode = CODE_FOR_sqrttf2_odd;
-        break;
-      case CODE_FOR_trunckfdf2_odd:
-        icode = CODE_FOR_trunctfdf2_odd;
-        break;
-      case CODE_FOR_addkf3_odd:
-        icode = CODE_FOR_addtf3_odd;
-        break;
-      case CODE_FOR_subkf3_odd:
-        icode = CODE_FOR_subtf3_odd;
-        break;
-      case CODE_FOR_mulkf3_odd:
-        icode = CODE_FOR_multf3_odd;
-        break;
-      case CODE_FOR_divkf3_odd:
-        icode = CODE_FOR_divtf3_odd;
-        break;
-      case CODE_FOR_fmakf4_odd:
-        icode = CODE_FOR_fmatf4_odd;
-        break;
-      case CODE_FOR_xsxexpqp_kf:
-        icode = CODE_FOR_xsxexpqp_tf;
-        break;
-      case CODE_FOR_xsxsigqp_kf:
-        icode = CODE_FOR_xsxsigqp_tf;
-        break;
-      case CODE_FOR_xststdcnegqp_kf:
-        icode = CODE_FOR_xststdcnegqp_tf;
-        break;
-      case CODE_FOR_xsiexpqp_kf:
-        icode = CODE_FOR_xsiexpqp_tf;
-        break;
-      case CODE_FOR_xsiexpqpf_kf:
-        icode = CODE_FOR_xsiexpqpf_tf;
-        break;
-      case CODE_FOR_xststdcqp_kf:
-        icode = CODE_FOR_xststdcqp_tf;
-        break;
-      case CODE_FOR_xscmpexpqp_eq_kf:
-        icode = CODE_FOR_xscmpexpqp_eq_tf;
-        break;
-      case CODE_FOR_xscmpexpqp_lt_kf:
-        icode = CODE_FOR_xscmpexpqp_lt_tf;
-        break;
-      case CODE_FOR_xscmpexpqp_gt_kf:
-        icode = CODE_FOR_xscmpexpqp_gt_tf;
-        break;
-      case CODE_FOR_xscmpexpqp_unordered_kf:
-        icode = CODE_FOR_xscmpexpqp_unordered_tf;
-        break;
-      default:
-        break;
-      }
+  /* For 128-bit long double, we may need both the KFmode built-in functions
+     and IFmode built-in functions to the equivalent TFmode built-in function,
+     if either a TFmode result is expected or any of the arguments use
+     TFmode.  */
+  if (TARGET_LONG_DOUBLE_128)
+    {
+      bool uses_tf_mode = return_mode == TFmode;
+      if (!uses_tf_mode)
+        {
+          call_expr_arg_iterator iter;
+          tree arg;
+          FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
+            {
+              if (arg != error_mark_node
+                  && TYPE_MODE (TREE_TYPE (arg)) == TFmode)
+                {
+                  uses_tf_mode = true;
+                  break;
+                }
+            }
+        }
+
+      /* Convert KFmode built-in functions to TFmode when long double is IEEE
+         128-bit.  */
+      if (uses_tf_mode && FLOAT128_IEEE_P (TFmode))
+        switch (icode)
+          {
+          case CODE_FOR_sqrtkf2_odd:
+            icode = CODE_FOR_sqrttf2_odd;
+            break;
+          case CODE_FOR_trunckfdf2_odd:
+            icode = CODE_FOR_trunctfdf2_odd;
+            break;
+          case CODE_FOR_addkf3_odd:
+            icode = CODE_FOR_addtf3_odd;
+            break;
+          case CODE_FOR_subkf3_odd:
+            icode = CODE_FOR_subtf3_odd;
+            break;
+          case CODE_FOR_mulkf3_odd:
+            icode = CODE_FOR_multf3_odd;
+            break;
+          case CODE_FOR_divkf3_odd:
+            icode = CODE_FOR_divtf3_odd;
+            break;
+          case CODE_FOR_fmakf4_odd:
+            icode = CODE_FOR_fmatf4_odd;
+            break;
+          case CODE_FOR_xsxexpqp_kf:
+            icode = CODE_FOR_xsxexpqp_tf;
+            break;
+          case CODE_FOR_xsxsigqp_kf:
+            icode = CODE_FOR_xsxsigqp_tf;
+            break;
+          case CODE_FOR_xststdcnegqp_kf:
+            icode = CODE_FOR_xststdcnegqp_tf;
+            break;
+          case CODE_FOR_xsiexpqp_kf:
+            icode = CODE_FOR_xsiexpqp_tf;
+            break;
+          case CODE_FOR_xsiexpqpf_kf:
+            icode = CODE_FOR_xsiexpqpf_tf;
+            break;
+          case CODE_FOR_xststdcqp_kf:
+            icode = CODE_FOR_xststdcqp_tf;
+            break;
+          case CODE_FOR_xscmpexpqp_eq_kf:
+            icode = CODE_FOR_xscmpexpqp_eq_tf;
+            break;
+          case CODE_FOR_xscmpexpqp_lt_kf:
+            icode = CODE_FOR_xscmpexpqp_lt_tf;
+            break;
+          case CODE_FOR_xscmpexpqp_gt_kf:
+            icode = CODE_FOR_xscmpexpqp_gt_tf;
+            break;
+          case CODE_FOR_xscmpexpqp_unordered_kf:
+            icode = CODE_FOR_xscmpexpqp_unordered_tf;
+            break;
+          default:
+            break;
+          }
+
+      /* Convert IFmode built-in functions to TFmode when long double is IBM
+         128-bit.  */
+      else if (uses_tf_mode && FLOAT128_IBM_P (TFmode))
+        {
+          if (icode == CODE_FOR_packif)
+            icode = CODE_FOR_packtf;
+
+          else if (icode == CODE_FOR_unpackif)
+            icode = CODE_FOR_unpacktf;
+        }
+    }
 
   /* In case of "#pragma target" changes, we initialize all builtins
      but check for actual availability now, during expand time.  For
@@ -3481,18 +3505,6 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
 
   if (bif_is_ibm128 (*bifaddr) && TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD)
     {
-      if (fcode == RS6000_BIF_PACK_IF)
-        {
-          icode = CODE_FOR_packtf;
-          fcode = RS6000_BIF_PACK_TF;
-          uns_fcode = (size_t) fcode;
-        }
-      else if (fcode == RS6000_BIF_UNPACK_IF)
-        {
-          icode = CODE_FOR_unpacktf;
-          fcode = RS6000_BIF_UNPACK_TF;
-          uns_fcode = (size_t) fcode;
-        }
     }
 
   /* TRUE iff the built-in function returns void.  */
@@ -3647,7 +3659,24 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
 
   for (int i = 0; i < nargs; i++)
     if (!insn_data[icode].operand[i+k].predicate (op[i], mode[i+k]))
-      op[i] = copy_to_mode_reg (mode[i+k], op[i]);
+      {
+        /* If the predicate failed because the modes are different, do a
+           convert instead of copy_to_mode_reg, since copy_to_mode_reg will
+           abort in this case.  The modes might be different if we have two
+           different 128-bit floating point modes (i.e. KFmode/TFmode if long
+           double is IEEE 128-bit and IFmode/TFmode if long double is IBM
+           128-bit).  */
+        machine_mode mode_insn = mode[i+k];
+        machine_mode mode_op = GET_MODE (op[i]);
+        if (mode_insn != mode_op && mode_op != VOIDmode)
+          {
+            rtx tmp = gen_reg_rtx (mode_insn);
+            convert_move (tmp, op[i], 0);
+            op[i] = tmp;
+          }
+        else
+          op[i] = copy_to_mode_reg (mode_insn, op[i]);
+      }
 
   rtx pat;
 
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index cfb6227e27b..8a8357512c0 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -23851,15 +23851,23 @@ rs6000_eh_return_filter_mode (void)
   return TARGET_32BIT ? SImode : word_mode;
 }
 
-/* Target hook for translate_mode_attribute.  */
+/* Target hook for translate_mode_attribute.
+
+   When -mabi=ieeelongdouble is used, we want to translate either KFmode or
+   TFmode to KFmode.  This is because user code that wants to specify IEEE
+   128-bit types will use either TFmode or KFmode, and we really want to use
+   the _Float128 and __float128 types instead of long double.
+
+   Similarly when -mabi=ibmlongdouble is used, we want to map IFmode into
+   TFmode.  */
 static machine_mode
 rs6000_translate_mode_attribute (machine_mode mode)
 {
-  if ((FLOAT128_IEEE_P (mode)
-       && ieee128_float_type_node == long_double_type_node)
-      || (FLOAT128_IBM_P (mode)
-          && ibm128_float_type_node == long_double_type_node))
+  if (FLOAT128_IBM_P (mode)
+      && ibm128_float_type_node == long_double_type_node)
     return COMPLEX_MODE_P (mode) ? E_TCmode : E_TFmode;
+  else if (FLOAT128_IEEE_P (mode))
+    return COMPLEX_MODE_P (mode) ? E_KCmode : E_KFmode;
   return mode;
 }
 
@@ -23895,13 +23903,10 @@ rs6000_libgcc_floating_mode_supported_p (scalar_float_mode mode)
     case E_TFmode:
       return true;
 
-      /* We only return true for KFmode if IEEE 128-bit types are supported, and
-         if long double does not use the IEEE 128-bit format.  If long double
-         uses the IEEE 128-bit format, it will use TFmode and not KFmode.
-         Because the code will not use KFmode in that case, there will be aborts
-         because it can't find KFmode in the Floatn types.  */
+      /* We only return true for KFmode if IEEE 128-bit types are
+         supported.  */
     case E_KFmode:
-      return TARGET_FLOAT128_TYPE && !TARGET_IEEEQUAD;
+      return TARGET_FLOAT128_TYPE;
 
     default:
       return false;
@@ -23935,7 +23940,7 @@ rs6000_floatn_mode (int n, bool extended)
 
         case 64:
           if (TARGET_FLOAT128_TYPE)
-            return (FLOAT128_IEEE_P (TFmode)) ? TFmode : KFmode;
+            return KFmode;
           else
             return opt_scalar_float_mode ();
 
@@ -23959,7 +23964,7 @@ rs6000_floatn_mode (int n, bool extended)
 
         case 128:
           if (TARGET_FLOAT128_TYPE)
-            return (FLOAT128_IEEE_P (TFmode)) ? TFmode : KFmode;
+            return KFmode;
           else
             return opt_scalar_float_mode ();
 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw12.c b/gcc/testsuite/gcc.target/powerpc/float128-hw12.c
new file mode 100644
index 00000000000..d08b4cbc883
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-hw12.c
@@ -0,0 +1,137 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-require-effective-target float128 } */
+/* { dg-options "-mpower9-vector -O2 -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Insure that the ISA 3.0 IEEE 128-bit floating point built-in functions work
+   with _Float128.  This is the same as float128-hw4.c, except the type
+   _Float128 is used, and the IEEE 128-bit long double ABI is used.  */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+unsigned int
+get_double_exponent (double a)
+{
+  return __builtin_vec_scalar_extract_exp (a);
+}
+
+unsigned int
+get_float128_exponent (TYPE a)
+{
+  return __builtin_vec_scalar_extract_exp (a);
+}
+
+unsigned long
+get_double_mantissa (double a)
+{
+  return __builtin_vec_scalar_extract_sig (a);
+}
+
+__uint128_t
+get_float128_mantissa (TYPE a)
+{
+  return __builtin_vec_scalar_extract_sig (a);
+}
+
+double
+set_double_exponent_ulong (unsigned long a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+set_float128_exponent_uint128 (__uint128_t a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+double
+set_double_exponent_double (double a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+set_float128_exponent_float128 (TYPE a, __uint128_t e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+sqrt_odd (TYPE a)
+{
+  return __builtin_sqrtf128_round_to_odd (a);
+}
+
+double
+trunc_odd (TYPE a)
+{
+  return __builtin_truncf128_round_to_odd (a);
+}
+
+TYPE
+add_odd (TYPE a, TYPE b)
+{
+  return __builtin_addf128_round_to_odd (a, b);
+}
+
+TYPE
+sub_odd (TYPE a, TYPE b)
+{
+  return __builtin_subf128_round_to_odd (a, b);
+}
+
+TYPE
+mul_odd (TYPE a, TYPE b)
+{
+  return __builtin_mulf128_round_to_odd (a, b);
+}
+
+TYPE
+div_odd (TYPE a, TYPE b)
+{
+  return __builtin_divf128_round_to_odd (a, b);
+}
+
+TYPE
+fma_odd (TYPE a, TYPE b, TYPE c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+TYPE
+fms_odd (TYPE a, TYPE b, TYPE c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+TYPE
+nfma_odd (TYPE a, TYPE b, TYPE c)
+{
+  return -__builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+TYPE
+nfms_odd (TYPE a, TYPE b, TYPE c)
+{
+  return -__builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+/* { dg-final { scan-assembler 	 {\mxsiexpdp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsiexpqp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxexpdp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxexpqp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxsigdp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxsigqp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsaddqpo\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsdivqpo\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsmaddqpo\M}  } } */
+/* { dg-final { scan-assembler 	 {\mxsmsubqpo\M}  } } */
+/* { dg-final { scan-assembler 	 {\mxsmulqpo\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsnmaddqpo\M} } } */
+/* { dg-final { scan-assembler 	 {\mxsnmsubqpo\M} } } */
+/* { dg-final { scan-assembler 	 {\mxssqrtqpo\M}  } } */
+/* { dg-final { scan-assembler 	 {\mxssubqpo\M}   } } */
+/* { dg-final { scan-assembler-not {\mbl\M}         } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw13.c b/gcc/testsuite/gcc.target/powerpc/float128-hw13.c
new file mode 100644
index 00000000000..51a3cd4802b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-hw13.c
@@ -0,0 +1,137 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-require-effective-target float128 } */
+/* { dg-options "-mpower9-vector -O2 -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Insure that the ISA 3.0 IEEE 128-bit floating point built-in functions work
+   with __float128.  This is the same as float128-hw4.c, except the type
+   __float128 is used, and the IBM 128-bit long double ABI is used.  */
+
+#ifndef TYPE
+#define TYPE __float128
+#endif
+
+unsigned int
+get_double_exponent (double a)
+{
+  return __builtin_vec_scalar_extract_exp (a);
+}
+
+unsigned int
+get_float128_exponent (TYPE a)
+{
+  return __builtin_vec_scalar_extract_exp (a);
+}
+
+unsigned long
+get_double_mantissa (double a)
+{
+  return __builtin_vec_scalar_extract_sig (a);
+}
+
+__uint128_t
+get_float128_mantissa (TYPE a)
+{
+  return __builtin_vec_scalar_extract_sig (a);
+}
+
+double
+set_double_exponent_ulong (unsigned long a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+set_float128_exponent_uint128 (__uint128_t a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+double
+set_double_exponent_double (double a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+set_float128_exponent_float128 (TYPE a, __uint128_t e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+sqrt_odd (TYPE a)
+{
+  return __builtin_sqrtf128_round_to_odd (a);
+}
+
+double
+trunc_odd (TYPE a)
+{
+  return __builtin_truncf128_round_to_odd (a);
+}
+
+TYPE
+add_odd (TYPE a, TYPE b)
+{
+  return __builtin_addf128_round_to_odd (a, b);
+}
+
+TYPE
+sub_odd (TYPE a, TYPE b)
+{
+  return __builtin_subf128_round_to_odd (a, b);
+}
+
+TYPE
+mul_odd (TYPE a, TYPE b)
+{
+  return __builtin_mulf128_round_to_odd (a, b);
+}
+
+TYPE
+div_odd (TYPE a, TYPE b)
+{
+  return __builtin_divf128_round_to_odd (a, b);
+}
+
+TYPE
+fma_odd (TYPE a, TYPE b, TYPE c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+TYPE
+fms_odd (TYPE a, TYPE b, TYPE c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+TYPE
+nfma_odd (TYPE a, TYPE b, TYPE c)
+{
+  return -__builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+TYPE
+nfms_odd (TYPE a, TYPE b, TYPE c)
+{
+  return -__builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+/* { dg-final { scan-assembler 	 {\mxsiexpdp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsiexpqp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxexpdp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxexpqp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxsigdp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsxsigqp\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsaddqpo\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsdivqpo\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsmaddqpo\M}  } } */
+/* { dg-final { scan-assembler 	 {\mxsmsubqpo\M}  } } */
+/* { dg-final { scan-assembler 	 {\mxsmulqpo\M}   } } */
+/* { dg-final { scan-assembler 	 {\mxsnmaddqpo\M} } } */
+/* { dg-final { scan-assembler 	 {\mxsnmsubqpo\M} } } */
+/* { dg-final { scan-assembler 	 {\mxssqrtqpo\M}  } } */
+/* { dg-final { scan-assembler 	 {\mxssubqpo\M}   } } */
+/* { dg-final { scan-assembler-not {\mbl\M}         } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw4.c b/gcc/testsuite/gcc.target/powerpc/float128-hw4.c
index fc149169bc6..3f6717825b7 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-hw4.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-hw4.c
@@ -118,6 +118,11 @@ nfms_odd (TYPE a, TYPE b, TYPE c)
   return -__builtin_fmaf128_round_to_odd (a, b, -c);
 }
 
+/* In using long double instead of _Float128, we might not be able to optimize
+   __builtin_fmaf128_round_to_odd (a, b, -c) into using xsmsubqpo instead of
+   xsnegqp and xsmaddqpo due to conversions between TFmode and KFmode.  So just
+   recognize that it did the FMA optimization.  */
+
 /* { dg-final { scan-assembler 	 {\mxsiexpdp\M}   } } */
 /* { dg-final { scan-assembler 	 {\mxsiexpqp\M}   } } */
 /* { dg-final { scan-assembler 	 {\mxsxexpdp\M}   } } */
@@ -126,11 +131,8 @@ nfms_odd (TYPE a, TYPE b, TYPE c)
 /* { dg-final { scan-assembler 	 {\mxsxsigqp\M}   } } */
 /* { dg-final { scan-assembler 	 {\mxsaddqpo\M}   } } */
 /* { dg-final { scan-assembler 	 {\mxsdivqpo\M}   } } */
-/* { dg-final { scan-assembler 	 {\mxsmaddqpo\M}  } } */
-/* { dg-final { scan-assembler 	 {\mxsmsubqpo\M}  } } */
+/* { dg-final { scan-assembler 	 {\mxsn?m(add|sub)qpo\M} } } */
 /* { dg-final { scan-assembler 	 {\mxsmulqpo\M}   } } */
-/* { dg-final { scan-assembler 	 {\mxsnmaddqpo\M} } } */
-/* { dg-final { scan-assembler 	 {\mxsnmsubqpo\M} } } */
 /* { dg-final { scan-assembler 	 {\mxssqrtqpo\M}  } } */
 /* { dg-final { scan-assembler 	 {\mxssubqpo\M}   } } */
 /* { dg-final { scan-assembler-not {\mbl\M}         } } */
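
Note on the rs6000_translate_mode_attribute change: the new mapping can be
summarized with the small stand-alone model below.  This is only a sketch
under stated assumptions, not GCC code.  The enum fmode and the helpers
is_complex, is_ieee128, is_ibm128 and translate_mode are hypothetical
stand-ins for machine_mode, COMPLEX_MODE_P, FLOAT128_IEEE_P, FLOAT128_IBM_P
and the target hook, and the model assumes a 128-bit long double whose format
is selected by a single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the scalar and complex 128-bit modes.  */
enum fmode { M_KF, M_KC, M_IF, M_IC, M_TF, M_TC };

/* Analogue of COMPLEX_MODE_P for the model.  */
bool
is_complex (enum fmode m)
{
  return m == M_KC || m == M_IC || m == M_TC;
}

/* KFmode/KCmode are always IEEE; TFmode/TCmode follow long double.  */
bool
is_ieee128 (enum fmode m, bool long_double_is_ieee)
{
  if (m == M_KF || m == M_KC)
    return true;
  return long_double_is_ieee && (m == M_TF || m == M_TC);
}

/* IFmode/ICmode are always IBM; TFmode/TCmode follow long double.  */
bool
is_ibm128 (enum fmode m, bool long_double_is_ieee)
{
  if (m == M_IF || m == M_IC)
    return true;
  return !long_double_is_ieee && (m == M_TF || m == M_TC);
}

/* Model of the patched hook: IBM modes map to TFmode/TCmode only when long
   double is IBM; IEEE modes always map to KFmode/KCmode.  */
enum fmode
translate_mode (enum fmode m, bool long_double_is_ieee)
{
  if (is_ibm128 (m, long_double_is_ieee) && !long_double_is_ieee)
    return is_complex (m) ? M_TC : M_TF;
  else if (is_ieee128 (m, long_double_is_ieee))
    return is_complex (m) ? M_KC : M_KF;
  return m;
}
```

Under these assumptions, a mode (TF) attribute with -mabi=ieeelongdouble ends
up as KFmode in the model, which matches the intent stated in the new comment
on the hook.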