Message ID | Yv6zreac0PTQgjmA@toto.the-meissners.org |
---|---|
State | New, archived |
Headers |
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6a10:38f:b0:2d5:3c95:9e21 with SMTP id 15csp543514pxh; Thu, 18 Aug 2022 14:49:19 -0700 (PDT) X-Google-Smtp-Source: AA6agR71s6GFKp1dVxV1jXhsSYziNalv7aU2FZn0h7dOomwNWYg96fLKa1EIWAAuyvSCBoRf1vz3 X-Received: by 2002:a17:907:968e:b0:730:bc30:da2f with SMTP id hd14-20020a170907968e00b00730bc30da2fmr3018022ejc.484.1660859359369; Thu, 18 Aug 2022 14:49:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660859359; cv=none; d=google.com; s=arc-20160816; b=lMFWmclsHZBFHu4jc8wK8B6fFzUk4ukez+hQGRGjy5Wnauzl+KNs1LhMAyeke8lhG7 weRXE+Dw66tu+Rf6UwsO24NwMyx4q8Yx+0cHjK1MD+rV72wfaQUmOBbU62gHcfxL81EV BSPeqo0am6dwcPvlE4Y2rRUkCxhJsSnNrnMTUtpZkWHpX+tylzPb176xacB+cfAzG5F8 8839ZOxCyDHU45oW91jqtAyd9FpZwllPomA/8D/S1HV00F2Szc3yNOMZeI6uRjsN8ENH MUoDLQ8upnD7PUtFKS/HJgK3o7DMxWg0r3q5SjiK/QMKW9sf+J3T8jBozbFOW3ZMvtnb Fwzg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-disposition:mime-version:mail-followup-to:message-id :subject:to:date:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=v616hGm9K2+2dsKHRqkQoWZBwdnmwbPYeqeY78gNev4=; b=L7h2EFO2LOsdwRs1q9NAsT2PaIDcmknESkwAN4aASLsqL+peItuBiM84hJGiAi1iCI DVweipdc6/qDuO7srSP2XX2b3tR9idtkYDlqH/1EWcLPP2MxPSTg0wuEgTS1DWy7+sGH 3vqf7Ki46HFTo8Fd/nOV9az5qshJlU7tlk7WphZktvMY5AiiYRWPK2OrJaNCF/zf0hdf OOToL68wtRVvOWqpVU4NmbgBfyzviO9A51Azb2HVsfacuQQB6GYUg28zKUe/6VLUKDq8 7K7F9caJdqKkCOuLJROLfR0MQkNw8vLz8w2pMgGyCScgY5QrsJp0VVqWfIcn7kUK6uut rXZg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=IUXyb1Km; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id ho42-20020a1709070eaa00b0073100e36dd2si2102575ejc.939.2022.08.18.14.49.19 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 18 Aug 2022 14:49:19 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=IUXyb1Km; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 417C93858C56 for <ouuuleilei@gmail.com>; Thu, 18 Aug 2022 21:49:18 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 417C93858C56 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1660859358; bh=v616hGm9K2+2dsKHRqkQoWZBwdnmwbPYeqeY78gNev4=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=IUXyb1KmQ9YiYWvsSfxgHhvUDiPqvgQ8+i4+pVVX014ri9OzgCSBcjFydXVYkvNfy 8ZK27KYfpgG5ClXqVuoBrgPY6r7MOxRjjwTnva+J0NUpmIwdpMCdFGQGAgLCx8u3SH wc58uZGAoY2SuZ03ewPUJGsj5jzl9CqYQMHakAH4= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 9D3233858D28 for <gcc-patches@gcc.gnu.org>; Thu, 18 Aug 2022 21:48:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9D3233858D28 Received: from pps.filterd (m0127361.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 27ILSoWK013815; Thu, 18 Aug 2022 21:48:34 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3j1wck8d3g-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Aug 2022 21:48:33 +0000 Received: from m0127361.ppops.net (m0127361.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 27ILVkdm023310; Thu, 18 Aug 2022 21:48:33 GMT Received: from ppma04wdc.us.ibm.com (1a.90.2fa9.ip4.static.sl-reverse.com [169.47.144.26]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3j1wck8d34-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Aug 2022 21:48:33 +0000 Received: from pps.filterd (ppma04wdc.us.ibm.com [127.0.0.1]) by ppma04wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 27ILKlRZ029726; Thu, 18 Aug 2022 21:48:32 GMT Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by ppma04wdc.us.ibm.com with ESMTP id 3hx3ka19a1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Aug 2022 21:48:32 +0000 Received: from b03ledav002.gho.boulder.ibm.com (b03ledav002.gho.boulder.ibm.com [9.17.130.233]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 27ILmV7b13304342 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 18 Aug 2022 21:48:31 GMT Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id CC61913604F; Thu, 18 Aug 2022 21:48:31 +0000 (GMT) Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3DD5D136051; Thu, 18 Aug 2022 21:48:31 +0000 (GMT) Received: from toto.the-meissners.org (unknown [9.65.225.181]) by b03ledav002.gho.boulder.ibm.com (Postfix) with ESMTPS; Thu, 18 Aug 2022 21:48:31 +0000 (GMT) Date: Thu, 18 Aug 2022 17:48:29 -0400 To: gcc-patches@gcc.gnu.org, Michael Meissner <meissner@linux.ibm.com>, Segher Boessenkool <segher@kernel.crashing.org>, "Kewen.Lin" <linkw@linux.ibm.com>, David Edelsohn <dje.gcc@gmail.com>, Peter Bergner <bergner@linux.ibm.com>, Will Schmidt <will_schmidt@vnet.ibm.com> Subject: [PATCH] Improve converting between 128-bit modes that use the same format Message-ID: <Yv6zreac0PTQgjmA@toto.the-meissners.org> Mail-Followup-To: Michael Meissner <meissner@linux.ibm.com>, gcc-patches@gcc.gnu.org, Segher Boessenkool <segher@kernel.crashing.org>, "Kewen.Lin" <linkw@linux.ibm.com>, David Edelsohn <dje.gcc@gmail.com>, Peter Bergner <bergner@linux.ibm.com>, Will Schmidt <will_schmidt@vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: fNKg1CN2cMie85fn8McZ7EUJMxA2-FtB X-Proofpoint-GUID: TS4NwAIV6Dc-DiaT_A-JO00LXyuljr-C X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-08-18_16,2022-08-18_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 phishscore=0 suspectscore=0 bulkscore=0 lowpriorityscore=0 clxscore=1015 adultscore=0 impostorscore=0 priorityscore=1501 spamscore=0 mlxscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2207270000 definitions=main-2208180077 X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_MANYTO, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Michael Meissner via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Michael Meissner <meissner@linux.ibm.com> Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1741537263851851074?= X-GMAIL-MSGID: =?utf-8?q?1741537263851851074?= |
Series |
Improve converting between 128-bit modes that use the same format
|
|
Commit Message
Michael Meissner
Aug. 18, 2022, 9:48 p.m. UTC
mprove converting between 128-bit modes that use the same format. This patch improves the insns used for converting between two modes using the 128-bit floating point format (i.e. converting between KFmode and TFmode if -mabi=ieeelongdouble is used, and converting between IFmode and TFmode if -mabi=ibmlongdouble is used). The new insns have the correct insn type and instruction length for the move involved. Previously, the two different moves were lumped together (i.e. converting between IEEE 128-bit and IEEE 128-bit was matched by the same insns as converting between IBM 128-bit and IBM 128-bit moves). I have tested this patch on the following systems: 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee 2) LE Power10 using --with-cpu=power9 --with-long-double-format=ibm 3) LE Power10 using --with-cpu=power8 --with-long-double-format=ibm 4) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm 5) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm 6) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm 7) BE Power8 using --with-cpu=power5 --with-long-double-format=ibm There were no regressions in the build or in the tests. On the power10 with long double using the IEEE 128-bit format, pr105334.c now runs where it previously failed. Can I check this patch into the trunk? Did we want to backport this to earlier GCC releases? 2022-08-17 Michael Meissner <meissner@linux.ibm.com> gcc/ * config/rs6000/rs6000.md (IFKF): Delete. (IFKF_reg): Delete. (extendkfif2): New define_expand. (trunckfif2): New define_expand. (extend<mode>tf2_internal): Split into extendiftf2_internal and extendkftf2_internal. Update the insns to use the correct insn type and length attributes based on whether the move uses IEEE 128-bit floating point or IBM 128-bit floating point type. (extendiftf2_internal): Likewise. (extendkftf2_internal): Likewise. (extendtf<mode>2_internal): Split into extendtfif2_internal and extendtfkf2_internal. Update the insns to use the correct insn type and length attributes based on whether the move uses IEEE 128-bit floating e point or IBM 128-bit floating point type. (extendtfif2_internal): Likewise. (extendtfkf2_internal): Likewise. --- gcc/config/rs6000/rs6000.md | 94 +++++++++++++++++++++++++++++-------- 1 file changed, 74 insertions(+), 20 deletions(-)
Comments
Please do not send new patches as replies to other patches. On Thu, Aug 18, 2022 at 05:48:29PM -0400, Michael Meissner wrote: > mprove converting between 128-bit modes that use the same format. You are missing some characters? But this is an edited version of the subject anyway. Just don't do that (neither the copying or the editing), it just confuses things. Please factor this patch into more pieces, pieces that can be reviewed more easily, pieces that change one thing only. As is you are just rewriting the lot, and it is not an improvement at all this way. No doubt there are many good pieces in it, but mixed with a non-trivial amount of bad pieces I cannot approve it. It also isn't clear at all what you want to do; piece by piece it is easier to explain. > -; Iterators for converting to/from TFmode > -(define_mode_iterator IFKF [IF KF]) Yes, IFmode and KFmode have almost nothing in common. Good to see this go. It would be even better if we would not use rs6000_expand_float128_convert when not needed, either, and all this would be just gone after expand. > +(define_expand "extendkfif2" > + [(set (match_operand:IF 0 "gpc_reg_operand") > + (float_extend:IF (match_operand:KF 1 "gpc_reg_operand")))] > + "TARGET_FLOAT128_TYPE" > +{ > + rs6000_expand_float128_convert (operands[0], operands[1], false); > + DONE; > +}) This does not belong here. It really shouldn't *exist* at all: this is *not* a float_extend! It is not converting to a wider mode (as required!), but not even to a mode of higher precision: both IFmode and KFmode can represent (finite, normal) numbers the other can not. But it certainly does not belong here in the middle of no-op moves. > +(define_expand "trunckfif2" > + [(set (match_operand:IF 0 "gpc_reg_operand") > + (float_truncate:IF (match_operand:KF 1 "gpc_reg_operand")))] > + "TARGET_FLOAT128_TYPE" > +{ > + rs6000_expand_float128_convert (operands[0], operands[1], false); > + DONE; > +}) I also would expect IBM128 instead of just IF. This would simplify a lot. Why do you not use that, is there a reason? > +;; Convert between KFmode and TFmode when -mabi=ieeelongdouble > +(define_insn_and_split "*extendkftf2_internal" Same for IEEE128. And this isn't a conversion at all (it's a no-op move), please don't confuse things by saying it is. > + [(set_attr "type" "two") > + (set_attr "num_insns" "2")]) Btw, that really should never be needed. insn_type "two" already means exactly that. > +(define_insn_and_split "*extendtfif2_internal" > + [(set (match_operand:IF 0 "gpc_reg_operand" "=d,&d") > + (float_extend:IF > + (match_operand:TF 1 "input_operand" "0,d")))] > + "FLOAT128_IBM_P (TFmode)" > + "#" > + "&& reload_completed" Why would this ever need reload_completed? It is a no-op move! Segher
On Tue, Aug 23, 2022 at 04:13:45PM -0500, Segher Boessenkool wrote: > Please do not send new patches as replies to other patches. This was sent as a new patch. > On Thu, Aug 18, 2022 at 05:48:29PM -0400, Michael Meissner wrote: > > mprove converting between 128-bit modes that use the same format. > > You are missing some characters? But this is an edited version of the > subject anyway. Just don't do that (neither the copying or the > editing), it just confuses things. That is the first line from the git commit, which git format-patch puts as the subject. I accidently deleted a few extra characters when trimming it down (I remove the From:, etc. lines from the format-patch output). But I can just delete this line if desired. > Please factor this patch into more pieces, pieces that can be reviewed > more easily, pieces that change one thing only. > > As is you are just rewriting the lot, and it is not an improvement at > all this way. No doubt there are many good pieces in it, but mixed with > a non-trivial amount of bad pieces I cannot approve it. It also isn't > clear at all what you want to do; piece by piece it is easier to > explain. > > > -; Iterators for converting to/from TFmode > > -(define_mode_iterator IFKF [IF KF]) > > Yes, IFmode and KFmode have almost nothing in common. Good to see this > go. It would be even better if we would not use > rs6000_expand_float128_convert when not needed, either, and all this > would be just gone after expand. I took a look at it, and I have a new version that only does the moves that are NOPs, and it makes sure all of the functions called have the proper names. I will post it on Tuesday, as some of the machines that I use for testing are now down for the US Labor Day weekend (they need to work on power infrastructure to the lab the machines are in). > > > +(define_expand "extendkfif2" > > + [(set (match_operand:IF 0 "gpc_reg_operand") > > + (float_extend:IF (match_operand:KF 1 "gpc_reg_operand")))] > > + "TARGET_FLOAT128_TYPE" > > +{ > > + rs6000_expand_float128_convert (operands[0], operands[1], false); > > + DONE; > > +}) > > This does not belong here. > > It really shouldn't *exist* at all: this is *not* a float_extend! It is > not converting to a wider mode (as required!), but not even to a mode of > higher precision: both IFmode and KFmode can represent (finite, normal) > numbers the other can not. We know that TFmode (if -mabi=ieeelongdouble) and KFmode are the same, just like TFmode (if -mabi=ibmlongdouble) and IFmode are the same. But RTL does not know that these modes use the same representation. So to convert between them, it needs to use either FLOAT_EXTEND or FLOAT_TRUNCATE, depending on which precision each of the three modes have (i.e. rs6000-modes.h). So you need these conversions in RTL. Unfortunately, you can't just use SUBREG before register allocation is done. So I do define_insn_and_split to cover this. > > But it certainly does not belong here in the middle of no-op moves. > > > +(define_expand "trunckfif2" > > + [(set (match_operand:IF 0 "gpc_reg_operand") > > + (float_truncate:IF (match_operand:KF 1 "gpc_reg_operand")))] > > + "TARGET_FLOAT128_TYPE" > > +{ > > + rs6000_expand_float128_convert (operands[0], operands[1], false); > > + DONE; > > +}) > > I also would expect IBM128 instead of just IF. This would simplify a > lot. Why do you not use that, is there a reason? If you use IBM128, you then need to create a mode_attr that for a given mode gives the other mode. Sure it can be done, but for the insns involved it was just simpler to duplicate the insns. So for example, for IBM floating point my current patches are: (define_insn_and_split "extendtfif2" [(set (match_operand:IF 0 "gpc_reg_operand" "=wa,wa,r,r") (float_extend:IF (match_operand:TF 1 "gpc_reg_operand" "0,wa,0,r")))] "TARGET_HARD_FLOAT && TARGET_IBM128 && FLOAT128_IBM_P (TFmode)" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 2))] { operands[2] = gen_lowpart (IFmode, operands[1]); } [(set_attr "num_insns" "2") (set_attr "length" "8")]) (define_insn_and_split "extendiftf2" [(set (match_operand:TF 0 "gpc_reg_operand" "=wa,wa,r,r") (float_extend:TF (match_operand:IF 1 "gpc_reg_operand" "0,wa,0,r")))] "TARGET_HARD_FLOAT && TARGET_IBM128 && FLOAT128_IBM_P (TFmode)" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 2))] { operands[2] = gen_lowpart (TFmode, operands[1]); } [(set_attr "num_insns" "2") (set_attr "length" "8")]) You could rewrite that as: (define_mode_attr IBM128_other [(IF "TF") ("TF" "IF")]) (define_insn_and_split "extend<mode><IBM128_other>2" [(set (match_operand:IBM128 0 "gpc_reg_operand" "=wa,wa,r,r") (float_extend:IBM128 (match_operand:IBM128_othr 1 "gpc_reg_operand" "0,wa,0,r")))] "TARGET_HARD_FLOAT && TARGET_IBM128 && FLOAT128_IBM_P (TFmode)" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 2))] { operands[2] = gen_lowpart (TFmode, operands[1]); } [(set_attr "num_insns" "2") (set_attr "length" "8")]) But for the IEEE side, combining the two insns won't work, since going from TFmode to KFmode will generate a FLOAT_TRUNCATE instead of a FLOAT_EXTEND. You could then use a code iterator for both FLOAT_TRUNCATE and FLOAT_EXTEND. Recognizing both forms might protect us in case some day somebody reorders the precision within rs6000-modes.h. But that is a lot of work to do for minimal gain. But I'm not convinced that it improves things. > > +;; Convert between KFmode and TFmode when -mabi=ieeelongdouble > > +(define_insn_and_split "*extendkftf2_internal" > > Same for IEEE128. And this isn't a conversion at all (it's a no-op > move), please don't confuse things by saying it is. That is how RTL goes between modes. Without a whole lot of changes to the machine independent portion of the compiler, I don't see anyway of avoiding the converts. You might say well always just use one mode for IEEE and one mode for IBM. I've tried, and I couldn't get it to work. Besides if we do this, we break people using __attribute__((mode(...))) which in turn is used by glibc and libstdc++ to get _Complex forms of the __float128 type. > > > + [(set_attr "type" "two") > > + (set_attr "num_insns" "2")]) > > Btw, that really should never be needed. insn_type "two" already means > exactly that. > > > +(define_insn_and_split "*extendtfif2_internal" > > + [(set (match_operand:IF 0 "gpc_reg_operand" "=d,&d") > > + (float_extend:IF > > + (match_operand:TF 1 "input_operand" "0,d")))] > > + "FLOAT128_IBM_P (TFmode)" > > + "#" > > + "&& reload_completed" > > Why would this ever need reload_completed? It is a no-op move! Various predicates do not allow SUBREG's of these types before register allocation.
On Fri, Sep 02, 2022 at 01:51:33PM -0400, Michael Meissner wrote: > On Tue, Aug 23, 2022 at 04:13:45PM -0500, Segher Boessenkool wrote: > > Please do not send new patches as replies to other patches. > > This was sent as a new patch. It probably was the partially copied to the mail body subject line that confused me. Sorry. > > On Thu, Aug 18, 2022 at 05:48:29PM -0400, Michael Meissner wrote: > > > mprove converting between 128-bit modes that use the same format. > > > > You are missing some characters? But this is an edited version of the > > subject anyway. Just don't do that (neither the copying or the > > editing), it just confuses things. > > That is the first line from the git commit, which git format-patch puts as the > subject. I accidently deleted a few extra characters when trimming it down (I > remove the From:, etc. lines from the format-patch output). But I can just > delete this line if desired. Don't copy that line in the first place? Don't copy *anything* in fact, the way you copied stuff in that other patch (which is very hard to review, so it will have to wait a bit more) makes it impossible to apply the patch (which whould be just git-am). > > Please factor this patch into more pieces, pieces that can be reviewed > > more easily, pieces that change one thing only. Please do this. It is the biggest problem I have with most of your patches: you seem to save up development of a week, and then send it out as big omnibus patch an hour or two before my weekend. This is not ideal. > > > +(define_expand "extendkfif2" > > > + [(set (match_operand:IF 0 "gpc_reg_operand") > > > + (float_extend:IF (match_operand:KF 1 "gpc_reg_operand")))] > > > + "TARGET_FLOAT128_TYPE" > > > +{ > > > + rs6000_expand_float128_convert (operands[0], operands[1], false); > > > + DONE; > > > +}) > > > > This does not belong here. > > > > It really shouldn't *exist* at all: this is *not* a float_extend! It is > > not converting to a wider mode (as required!), but not even to a mode of > > higher precision: both IFmode and KFmode can represent (finite, normal) > > numbers the other can not. > > We know that TFmode (if -mabi=ieeelongdouble) and KFmode are the same, just > like TFmode (if -mabi=ibmlongdouble) and IFmode are the same. But RTL does not > know that these modes use the same representation. So to convert between them, > it needs to use either FLOAT_EXTEND or FLOAT_TRUNCATE, depending on which > precision each of the three modes have (i.e. rs6000-modes.h). So you need > these conversions in RTL. You have "extends" also when it is to a lower precision! Also, let me say this again: this needs to be solved. We cannot make real progress as long as we keep pretending every pait of floating point modes can be ordered. > Unfortunately, you can't just use SUBREG before register allocation is done. Why not? It isn't valid (in all cases) *after* RA, but before is fine. But you do not want a subreg, you need a conversion. > > But it certainly does not belong here in the middle of no-op moves. This is still very true. > So for example, for IBM floating point my current patches are: > But for the IEEE side, combining the two insns won't work, since going from > TFmode to KFmode will generate a FLOAT_TRUNCATE instead of a FLOAT_EXTEND. Yes. Please just fix the code depending on fundamentally wrong and unworkable assumptions, instead of adding perpetually worse workarounds that make things more rickety all the time. When you added this IF/KF/TF ordering stuff first, I was amazed that it worked as well as it did. And this was perhaps the best we could do for GCC 10, sure. But the problem should have been fixed since, there is no sane way forward without doing that. > > > +;; Convert between KFmode and TFmode when -mabi=ieeelongdouble > > > +(define_insn_and_split "*extendkftf2_internal" > > > > Same for IEEE128. And this isn't a conversion at all (it's a no-op > > move), please don't confuse things by saying it is. > > That is how RTL goes between modes. Without a whole lot of changes to the > machine independent portion of the compiler, I don't see anyway of avoiding the > converts. It is not a conversion, not in RTL either. It changes mode, it doesn't change values. That is not the same thing. It's a subreg (one of the five overloaded meanings of subregs, but :-) ) > > > +(define_insn_and_split "*extendtfif2_internal" > > > + [(set (match_operand:IF 0 "gpc_reg_operand" "=d,&d") > > > + (float_extend:IF > > > + (match_operand:TF 1 "input_operand" "0,d")))] > > > + "FLOAT128_IBM_P (TFmode)" > > > + "#" > > > + "&& reload_completed" > > > > Why would this ever need reload_completed? It is a no-op move! > > Various predicates do not allow SUBREG's of these types before register > allocation. Modes, not types, and I do not understand what you mean. Please show an example? Either way that is not a reason to do fully the wrong thing. It might make doing the right thing slightly more work, sure. Segher
On Tue, Sep 06, 2022 at 05:22:11PM -0500, Segher Boessenkool wrote: > Please do this. It is the biggest problem I have with most of your > patches: you seem to save up development of a week, and then send it out > as big omnibus patch an hour or two before my weekend. This is not > ideal. This is always going to be the case. As I'm developing the larger patches, there are usually at least 3 smaller problems wanting to get out. I don't know what these things are until I run into them. With the long patch review cycle, I have to be working n patches ahead of what is submitted just to keep busy. I try my best to segreate the patches into smaller chunks, but at the end of the day there are going to various changes to get to the larger goal. I really don't see any way around that. Things don't always come in 3 line changes. A lot of times I make the 3 line change, and then I have to make several other changes to compensate for it elsewhere. If I submit smaller patches, things will break unless all of the patches in the chain are committed. And invariably you will then ask why this particular change is needed, when it is needed by the next patch or the patch after that. > > > > +(define_expand "extendkfif2" > > > > + [(set (match_operand:IF 0 "gpc_reg_operand") > > > > + (float_extend:IF (match_operand:KF 1 "gpc_reg_operand")))] > > > > + "TARGET_FLOAT128_TYPE" > > > > +{ > > > > + rs6000_expand_float128_convert (operands[0], operands[1], false); > > > > + DONE; > > > > +}) > > > > > > This does not belong here. > > > > > > It really shouldn't *exist* at all: this is *not* a float_extend! It is > > > not converting to a wider mode (as required!), but not even to a mode of > > > higher precision: both IFmode and KFmode can represent (finite, normal) > > > numbers the other can not. > > > > We know that TFmode (if -mabi=ieeelongdouble) and KFmode are the same, just > > like TFmode (if -mabi=ibmlongdouble) and IFmode are the same. But RTL does not > > know that these modes use the same representation. So to convert between them, > > it needs to use either FLOAT_EXTEND or FLOAT_TRUNCATE, depending on which > > precision each of the three modes have (i.e. rs6000-modes.h). So you need > > these conversions in RTL. > > You have "extends" also when it is to a lower precision! > > Also, let me say this again: this needs to be solved. We cannot make > real progress as long as we keep pretending every pait of floating point > modes can be ordered. I have no plans to solve this. I view it as an unsolvable problem that will invariably lead to lots and lots of changes that will need buy in from the other GCC developers. If you can make the changes, fine. But I view it as work with what you have rather than trying to change the whole floating point infrastructure within GCC. > > Unfortunately, you can't just use SUBREG before register allocation is done. > > Why not? It isn't valid (in all cases) *after* RA, but before is fine. I haven't tracked it down. But I tend to think it is a whack-a-mole problem where there will be 50+ small changes that are needed to get this to work. And then we get back to the issue that you need all of these changes. > But you do not want a subreg, you need a conversion. > > > > But it certainly does not belong here in the middle of no-op moves. > > This is still very true. > > > So for example, for IBM floating point my current patches are: > > > But for the IEEE side, combining the two insns won't work, since going from > > TFmode to KFmode will generate a FLOAT_TRUNCATE instead of a FLOAT_EXTEND. > > Yes. Please just fix the code depending on fundamentally wrong and > unworkable assumptions, instead of adding perpetually worse workarounds > that make things more rickety all the time. > > When you added this IF/KF/TF ordering stuff first, I was amazed that it > worked as well as it did. And this was perhaps the best we could do for > GCC 10, sure. But the problem should have been fixed since, there is no > sane way forward without doing that. When I first looked at it, the number of hidden assumptions that I discovered just in the changes I made was high. I believe fundamentally that there will be a lot more changes to do what you want. And as I said, this will need a lot of buy-in from other developers.
On Wed, Sep 07, 2022 at 04:25:49PM -0400, Michael Meissner wrote: > On Tue, Sep 06, 2022 at 05:22:11PM -0500, Segher Boessenkool wrote: > > Please do this. It is the biggest problem I have with most of your > > patches: you seem to save up development of a week, and then send it out > > as big omnibus patch an hour or two before my weekend. This is not > > ideal. > > This is always going to be the case. As I'm developing the larger patches, > there are usually at least 3 smaller problems wanting to get out. I don't know > what these things are until I run into them. This is the only large problem. Humongous, if you want. It has been _the_ problem since the beginning of the IEEE QP work. None of this is news to anyone. It makes all other problems tens of times as hard to solve. The other thing that would have made development way way way less work in total, and much simpler, is supporting this on more systems. Everything with VMX for example (with the exact same ABI as we have now!), or just a la "nof" ("-msoft-float"), which is easy to so as well. You end up testing more cases, "having" to test more cases, *but* there are way fewer exceptions. Anything downstream from us (essentially everything, we are the C and C++ compilers after all) has to do a lot of unnecessary work as well, because of this. But the fundamental problem that IEEE QP and double-double can not be ordered causes so many problems it is just not funny. The *workarounds* for it each cause a cascade of *other problems*, and there is no end in sight. Segher
I submitted a new patch that rewrites what this patch was trying to do. I didn't see the original version I submitted on September 8th, so I just reposted it. https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601504.html
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index f942597c3b4..e17252bb8de 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -543,12 +543,6 @@ (define_mode_iterator FMOVE128_GPR [TI ; Iterator for 128-bit VSX types for pack/unpack (define_mode_iterator FMOVE128_VSX [V1TI KF]) -; Iterators for converting to/from TFmode -(define_mode_iterator IFKF [IF KF]) - -; Constraints for moving IF/KFmode. -(define_mode_attr IFKF_reg [(IF "d") (KF "wa")]) - ; Whether a floating point move is ok, don't allow SD without hardware FP (define_mode_attr fmove_ok [(SF "") (DF "") @@ -9075,6 +9069,15 @@ (define_expand "extendifkf2" DONE; }) +(define_expand "extendkfif2" + [(set (match_operand:IF 0 "gpc_reg_operand") + (float_extend:IF (match_operand:KF 1 "gpc_reg_operand")))] + "TARGET_FLOAT128_TYPE" +{ + rs6000_expand_float128_convert (operands[0], operands[1], false); + DONE; +}) + (define_expand "extendtfkf2" [(set (match_operand:KF 0 "gpc_reg_operand") (float_extend:KF (match_operand:TF 1 "gpc_reg_operand")))] @@ -9111,6 +9114,15 @@ (define_expand "truncifkf2" DONE; }) +(define_expand "trunckfif2" + [(set (match_operand:IF 0 "gpc_reg_operand") + (float_truncate:IF (match_operand:KF 1 "gpc_reg_operand")))] + "TARGET_FLOAT128_TYPE" +{ + rs6000_expand_float128_convert (operands[0], operands[1], false); + DONE; +}) + (define_expand "trunckftf2" [(set (match_operand:TF 0 "gpc_reg_operand") (float_truncate:TF (match_operand:KF 1 "gpc_reg_operand")))] @@ -9129,31 +9141,73 @@ (define_expand "trunctfif2" DONE; }) -(define_insn_and_split "*extend<mode>tf2_internal" - [(set (match_operand:TF 0 "gpc_reg_operand" "=<IFKF_reg>") +;; Convert between KFmode and TFmode when -mabi=ieeelongdouble +(define_insn_and_split "*extendkftf2_internal" + [(set (match_operand:TF 0 "gpc_reg_operand" "=wa,wa") (float_extend:TF - (match_operand:IFKF 1 "gpc_reg_operand" "<IFKF_reg>")))] - "TARGET_FLOAT128_TYPE - && FLOAT128_IBM_P (TFmode) == FLOAT128_IBM_P (<MODE>mode)" + (match_operand:KF 1 "gpc_reg_operand" "0,wa")))] + "FLOAT128_IEEE_P (TFmode)" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 2))] { operands[2] = gen_rtx_REG (TFmode, REGNO (operands[1])); -}) +} + [(set_attr "type" "vecsimple")]) -(define_insn_and_split "*extendtf<mode>2_internal" - [(set (match_operand:IFKF 0 "gpc_reg_operand" "=<IFKF_reg>") - (float_extend:IFKF - (match_operand:TF 1 "gpc_reg_operand" "<IFKF_reg>")))] - "TARGET_FLOAT128_TYPE - && FLOAT128_IBM_P (TFmode) == FLOAT128_IBM_P (<MODE>mode)" +(define_insn_and_split "*extendtfkf2_internal" + [(set (match_operand:KF 0 "gpc_reg_operand" "=wa,wa") + (float_extend:KF + (match_operand:TF 1 "gpc_reg_operand" "0,wa")))] + "FLOAT128_IEEE_P (TFmode)" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 2))] { - operands[2] = gen_rtx_REG (<MODE>mode, REGNO (operands[1])); -}) + operands[2] = gen_rtx_REG (KFmode, REGNO (operands[1])); +} + [(set_attr "type" "vecsimple")]) + +;; Convert between IFmode and TFmode when -mabi=ibmlongdouble +(define_insn_and_split "*extendiftf2_internal" + [(set (match_operand:TF 0 "gpc_reg_operand" "=d,&d") + (float_extend:TF + (match_operand:IF 1 "input_operand" "0,d")))] + "FLOAT128_IBM_P (TFmode)" + "#" + "&& reload_completed" + [(set (match_dup 2) (match_dup 3)) + (set (match_dup 4) (match_dup 5))] +{ + unsigned int op0_regno = reg_or_subregno (operands[0]); + unsigned int op1_regno = reg_or_subregno (operands[1]); + operands[2] = gen_rtx_REG (DFmode, op0_regno); + operands[3] = gen_rtx_REG (DFmode, op1_regno); + operands[4] = gen_rtx_REG (DFmode, op0_regno + 1); + operands[5] = gen_rtx_REG (DFmode, op1_regno + 1); +} + [(set_attr "type" "two") + (set_attr "num_insns" "2")]) + +(define_insn_and_split "*extendtfif2_internal" + [(set (match_operand:IF 0 "gpc_reg_operand" "=d,&d") + (float_extend:IF + (match_operand:TF 1 "input_operand" "0,d")))] + "FLOAT128_IBM_P (TFmode)" + "#" + "&& reload_completed" + [(set (match_dup 2) (match_dup 3)) + (set (match_dup 4) (match_dup 5))] +{ + unsigned int op0_regno = reg_or_subregno (operands[0]); + unsigned int op1_regno = reg_or_subregno (operands[1]); + operands[2] = gen_rtx_REG (DFmode, op0_regno); + operands[3] = gen_rtx_REG (DFmode, op1_regno); + operands[4] = gen_rtx_REG (DFmode, op0_regno + 1); + operands[5] = gen_rtx_REG (DFmode, op1_regno + 1); +} + [(set_attr "type" "two") + (set_attr "num_insns" "2")]) ;; Reload helper functions used by rs6000_secondary_reload. The patterns all