Message ID | 20220826154606.1155977-1-aldyh@redhat.com |
---|---|
State | New, archived |
Headers |
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:ecc5:0:0:0:0:0 with SMTP id s5csp292995wro; Fri, 26 Aug 2022 08:47:00 -0700 (PDT) X-Google-Smtp-Source: AA6agR620AJmr+XknMXKff2lxBipQd+rEQM61me/luNZuq/WXPPMLtLjHOUP0AU5eM1GOZ3OF+ZT X-Received: by 2002:a17:907:3ea1:b0:73e:6881:d189 with SMTP id hs33-20020a1709073ea100b0073e6881d189mr2657374ejc.726.1661528820056; Fri, 26 Aug 2022 08:47:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1661528820; cv=none; d=google.com; s=arc-20160816; b=dS6+uOy/y/kmv0mF7twVSVtblxEyFQ6poOLyiWwAloCo/niJtf9ETiXp20iXTElNOs 8Tle6sZlA+KogjqlVXok7unIhFPpAMzCylNqTBiKLJ4tUX6cZ1lO8bfg4km84KHab2AR HWYiC+2j91JzinSgIpSSm8DrgtL7W/niXtnhjI+2Q0SG34//9ZIl1GISigjbmZ3M6lU0 49XTWXqiD9FQW9RC9ZthZc6hDASC4HOhBeKhMMVmbNmR/BiJu1AEY4KXYUbkEPDDsdb/ xMD0Y1MCrq8yIe+Cr6lme3X2d+IYnFQuTjmnZ+SFtRR8OyGE6MqWLS361hlsmflypGIG aXOw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:message-id:date:subject:to :dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=pIU3bIkwakiAnsGmQIOzm5eGtldqLV+zmgxbQDLTF2o=; b=agKjxplhHBxBTv2SQmcX2KJT8jlL0h4B1eWdc2bhJX3agFsIBSz6SZLTeVwBGjqrHZ 5zx7vYmcZPnGhnAXSLohNFiMqaLqAlCsmX/GIVL5Um6NmvETpT+PJ46uLcs/1Jl4cBMd 8rO+TkqsTTMYFURVNbMHW5BxSUNabxxM1LQftzgduS5wRhvijZdM6PmHxDiEjMI9jdqZ i9ZgkKIEcEzKOPVXZFQwBdU7WWqo7wJwLeEUtF/7j2KaZ+W8+7zptDoARmLzyDgb3nPl h1neypQJKkuUv6Fk7dkYoTcQ/27B8dtPuxwBYp0vbGxx5dcK3Q9FYv6zES/URUP9CVlY XcqA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=CX2BhgEY; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id p34-20020a056402502200b00445d2524d42si1707153eda.493.2022.08.26.08.46.59 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 08:47:00 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=CX2BhgEY; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 996A03851155 for <ouuuleilei@gmail.com>; Fri, 26 Aug 2022 15:46:57 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 996A03851155 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1661528817; bh=pIU3bIkwakiAnsGmQIOzm5eGtldqLV+zmgxbQDLTF2o=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=CX2BhgEYjDzkCzWJbPAj5OLOIcvUdxQLcGCJJYkU3b7+bvp24Vva0pq+S/PdPGcf3 WWP8p+qxt2mER7YjAmKoQ1k/c0FowGHAPK6FZspvxmFiSFunrcgjM0BDQ1y4cFPHwx SvOky89RaEUnZqybmzKq0Ehp5Eiou/DoC/YBTqHc= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 9300B38515D9 for <gcc-patches@gcc.gnu.org>; Fri, 26 Aug 2022 15:46:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9300B38515D9 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-277-PoxYMMGGOMaAx15frkSNHA-1; Fri, 26 Aug 2022 11:46:13 -0400 X-MC-Unique: PoxYMMGGOMaAx15frkSNHA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B5782185A794 for <gcc-patches@gcc.gnu.org>; Fri, 26 Aug 2022 15:46:12 +0000 (UTC) Received: from abulafia.home (unknown [10.39.193.54]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 67383C15BB3; Fri, 26 Aug 2022 15:46:12 +0000 (UTC) Received: from abulafia.home (localhost [127.0.0.1]) by abulafia.home (8.17.1/8.17.1) with ESMTPS id 27QFk9PX1155995 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Fri, 26 Aug 2022 17:46:09 +0200 Received: (from aldyh@localhost) by abulafia.home (8.17.1/8.17.1/Submit) id 27QFk9DZ1155994; Fri, 26 Aug 2022 17:46:09 +0200 To: GCC patches <gcc-patches@gcc.gnu.org> Subject: [PATCH] [ranger] x == -0.0 does not mean we can replace x with -0.0 Date: Fri, 26 Aug 2022 17:46:06 +0200 Message-Id: <20220826154606.1155977-1-aldyh@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.85 on 10.11.54.8 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="US-ASCII"; x-default=true X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Aldy Hernandez via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Aldy Hernandez <aldyh@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1742239244148615005?= X-GMAIL-MSGID: =?utf-8?q?1742239244148615005?= |
Series |
[ranger] x == -0.0 does not mean we can replace x with -0.0
|
|
Commit Message
Aldy Hernandez
Aug. 26, 2022, 3:46 p.m. UTC
On the true side of x == -0.0, we can't just blindly value propagate the -0.0 into every use of x because x could be +0.0 (or vice versa). With this change, we only allow the transformation if !HONOR_SIGNED_ZEROS or if the range is known not to contain 0. Will commit after tests complete. gcc/ChangeLog: * range-op-float.cc (foperator_equal::op1_range): Do not blindly copy op2 range when honoring signed zeros. --- gcc/range-op-float.cc | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
Comments
On Fri, Aug 26, 2022 at 05:46:06PM +0200, Aldy Hernandez wrote: > On the true side of x == -0.0, we can't just blindly value propagate > the -0.0 into every use of x because x could be +0.0 (or vice versa). > > With this change, we only allow the transformation if > !HONOR_SIGNED_ZEROS or if the range is known not to contain 0. > > Will commit after tests complete. > > gcc/ChangeLog: > > * range-op-float.cc (foperator_equal::op1_range): Do not blindly > copy op2 range when honoring signed zeros. > --- > gcc/range-op-float.cc | 17 +++++++++++++++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > > diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc > index ad2fae578d2..ff9fe312acf 100644 > --- a/gcc/range-op-float.cc > +++ b/gcc/range-op-float.cc > @@ -252,8 +252,21 @@ foperator_equal::op1_range (frange &r, tree type, > switch (get_bool_state (r, lhs, type)) > { > case BRS_TRUE: > - // If it's true, the result is the same as OP2. > - r = op2; > + if (HONOR_SIGNED_ZEROS (type) > + && op2.contains_p (build_zero_cst (type))) What exactly does op2.contains_p for zero? Does it use real_compare/real_equal under the hood, so it is equivalent to op2 == 0.0 or op2 == -0.0, where both will be true whether op2 is -0.0 or 0.0? Or is it more strict and checks whether it is actually a positive zero? In any case, for HONOR_SIGNED_ZEROS, VARYING is unnecessary, all you can do is extend the r range to contain both -0.0 and +0.0 if it contains at least one of them. > + { > + // With signed zeros, x == -0.0 does not mean we can replace > + // x with -0.0, because x may be either +0.0 or -0.0. > + r.set_varying (type); > + } > + else > + { > + // If it's true, the result is the same as OP2. > + // > + // If the range does not actually contain zeros, this should > + // always be OK. > + r = op2; > + } !HONOR_SIGNED_ZEROS doesn't imply that negative zeros won't appear, but says that user doesn't care if he gets a positive or negative zero (unless !MODE_HAS_SIGNED_ZEROS - in that case -0.0 doesn't exist and one doesn't need to bother with it). Now, if all the code setting franges makes sure that for MODE_HAS_SIGNED_ZEROS && !HONOR_SIGNED_ZEROS if +0.0 or -0.0 are inside of a range, then both -0.0 and +0.0 are in the range, then yes, you can use r = op2; > // The TRUE side of op1 == op2 implies op1 is !NAN. > r.set_nan (fp_prop::NO); > break; Jakub
On Fri, Aug 26, 2022 at 6:40 PM Jakub Jelinek <jakub@redhat.com> wrote: > > On Fri, Aug 26, 2022 at 05:46:06PM +0200, Aldy Hernandez wrote: > > On the true side of x == -0.0, we can't just blindly value propagate > > the -0.0 into every use of x because x could be +0.0 (or vice versa). > > > > With this change, we only allow the transformation if > > !HONOR_SIGNED_ZEROS or if the range is known not to contain 0. > > > > Will commit after tests complete. > > > > gcc/ChangeLog: > > > > * range-op-float.cc (foperator_equal::op1_range): Do not blindly > > copy op2 range when honoring signed zeros. > > --- > > gcc/range-op-float.cc | 17 +++++++++++++++-- > > 1 file changed, 15 insertions(+), 2 deletions(-) > > > > diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc > > index ad2fae578d2..ff9fe312acf 100644 > > --- a/gcc/range-op-float.cc > > +++ b/gcc/range-op-float.cc > > @@ -252,8 +252,21 @@ foperator_equal::op1_range (frange &r, tree type, > > switch (get_bool_state (r, lhs, type)) > > { > > case BRS_TRUE: > > - // If it's true, the result is the same as OP2. > > - r = op2; > > + if (HONOR_SIGNED_ZEROS (type) > > + && op2.contains_p (build_zero_cst (type))) > > What exactly does op2.contains_p for zero? > Does it use real_compare/real_equal under the hood, so it is > equivalent to op2 == 0.0 or op2 == -0.0, where both will be > true whether op2 is -0.0 or 0.0? Or is it more strict and > checks whether it is actually a positive zero? frange::contains_p() uses real_compare(), so both -0.0 and 0.0 will come out as true: return (real_compare (GE_EXPR, TREE_REAL_CST_PTR (cst), &m_min) && real_compare (LE_EXPR, TREE_REAL_CST_PTR (cst), &m_max)); I thought about this some more, and you're right, dropping to VARYING is a big hammer. It seems to me we can do this optimization regardless, but then treat positive and negative zero the same throughout the frange class. Particularly, in frange::singleton_p(). We should never return TRUE for any version of 0.0. This will keep VRP from propagating an incorrect 0.0, since all VRP does is propagate when a range is provably a singleton. Also, frange::zero_p() shall return true for any version of 0.0. I fleshed out all the relational operators (with endpoints) with this approach, and everything worked out...including go, ada, and fortran, which had given me headaches. As a bonus, we can get rid of the INF/NINF property bits I was keeping around, since now the range will have actual endpoints. I will repost the full frange endpoints patch (and CC you) in the appropriate thread. Aldy > In any case, for HONOR_SIGNED_ZEROS, VARYING is unnecessary, all you > can do is extend the r range to contain both -0.0 and +0.0 if it contains > at least one of them. > > > + { > > + // With signed zeros, x == -0.0 does not mean we can replace > > + // x with -0.0, because x may be either +0.0 or -0.0. > > + r.set_varying (type); > > + } > > + else > > + { > > + // If it's true, the result is the same as OP2. > > + // > > + // If the range does not actually contain zeros, this should > > + // always be OK. > > + r = op2; > > + } > > !HONOR_SIGNED_ZEROS doesn't imply that negative zeros won't appear, > but says that user doesn't care if he gets a positive or negative zero > (unless !MODE_HAS_SIGNED_ZEROS - in that case -0.0 doesn't exist > and one doesn't need to bother with it). > > Now, if all the code setting franges makes sure that for > MODE_HAS_SIGNED_ZEROS && !HONOR_SIGNED_ZEROS if +0.0 or -0.0 are inside > of a range, then both -0.0 and +0.0 are in the range, then yes, > you can use r = op2; > > > // The TRUE side of op1 == op2 implies op1 is !NAN. > > r.set_nan (fp_prop::NO); > > break; > > Jakub >
On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote: > It seems to me we can do this optimization regardless, but then treat > positive and negative zero the same throughout the frange class. > Particularly, in frange::singleton_p(). We should never return TRUE > for any version of 0.0. This will keep VRP from propagating an > incorrect 0.0, since all VRP does is propagate when a range is > provably a singleton. Also, frange::zero_p() shall return true for > any version of 0.0. Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to differentiate between 0.0 and -0.0. One reason is e.g. to be able to optimize copysign/signbit - if we can prove that the sign bit on some value will be always cleared or always set, we can fold those. On the other side, with -fno-signed-zeros it is invalid to use copysign/signbit on values that could be zero (well, nothing guarantees whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS && !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being one thing (just not singleton_p) and not bother with details like whether a range ends or starts with -0.0 or 0.0, either of them would work the same. And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p. Jakub
On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote: > > On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote: > > It seems to me we can do this optimization regardless, but then treat > > positive and negative zero the same throughout the frange class. > > Particularly, in frange::singleton_p(). We should never return TRUE > > for any version of 0.0. This will keep VRP from propagating an > > incorrect 0.0, since all VRP does is propagate when a range is > > provably a singleton. Also, frange::zero_p() shall return true for > > any version of 0.0. > > Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to > differentiate between 0.0 and -0.0. > One reason is e.g. to be able to optimize copysign/signbit - if we can > prove that the sign bit on some value will be always cleared or always set, > we can fold those. > On the other side, with -fno-signed-zeros it is invalid to use > copysign/signbit on values that could be zero (well, nothing guarantees > whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS && > !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being > one thing (just not singleton_p) and not bother with details like whether > a range ends or starts with -0.0 or 0.0, either of them would work the same. > And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p. *head explodes* Ok, I think I can add a zero property we can track (like we do for NAN), and set it appropriately at constant creation and upon results from copysign/signbit. However, I am running out of time before Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do that as a follow-up. Aldy
On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote: > On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote: >> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote: >>> It seems to me we can do this optimization regardless, but then treat >>> positive and negative zero the same throughout the frange class. >>> Particularly, in frange::singleton_p(). We should never return TRUE >>> for any version of 0.0. This will keep VRP from propagating an >>> incorrect 0.0, since all VRP does is propagate when a range is >>> provably a singleton. Also, frange::zero_p() shall return true for >>> any version of 0.0. >> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to >> differentiate between 0.0 and -0.0. >> One reason is e.g. to be able to optimize copysign/signbit - if we can >> prove that the sign bit on some value will be always cleared or always set, >> we can fold those. >> On the other side, with -fno-signed-zeros it is invalid to use >> copysign/signbit on values that could be zero (well, nothing guarantees >> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS && >> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being >> one thing (just not singleton_p) and not bother with details like whether >> a range ends or starts with -0.0 or 0.0, either of them would work the same. >> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p. > *head explodes* > > Ok, I think I can add a zero property we can track (like we do for > NAN), and set it appropriately at constant creation and upon results > from copysign/signbit. However, I am running out of time before > Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do > that as a follow-up. We definitely want to be able to track +-0.0 and distinguish between them. IIRC there's cases where you can start eliminating comparisons and arithmetic once you start tracking -0.0 state. Jeff
On Mon, Aug 29, 2022 at 4:22 PM Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > > > On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote: > > On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote: > >> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote: > >>> It seems to me we can do this optimization regardless, but then treat > >>> positive and negative zero the same throughout the frange class. > >>> Particularly, in frange::singleton_p(). We should never return TRUE > >>> for any version of 0.0. This will keep VRP from propagating an > >>> incorrect 0.0, since all VRP does is propagate when a range is > >>> provably a singleton. Also, frange::zero_p() shall return true for > >>> any version of 0.0. > >> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to > >> differentiate between 0.0 and -0.0. > >> One reason is e.g. to be able to optimize copysign/signbit - if we can > >> prove that the sign bit on some value will be always cleared or always set, > >> we can fold those. > >> On the other side, with -fno-signed-zeros it is invalid to use > >> copysign/signbit on values that could be zero (well, nothing guarantees > >> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS && > >> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being > >> one thing (just not singleton_p) and not bother with details like whether > >> a range ends or starts with -0.0 or 0.0, either of them would work the same. > >> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p. > > *head explodes* > > > > Ok, I think I can add a zero property we can track (like we do for > > NAN), and set it appropriately at constant creation and upon results > > from copysign/signbit. However, I am running out of time before > > Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do > > that as a follow-up. > We definitely want to be able to track +-0.0 and distinguish between > them. IIRC there's cases where you can start eliminating comparisons > and arithmetic once you start tracking -0.0 state. Absolutely. That was always the plan. However, my goal was just to add relop stubs so others could flesh out the rest. Alas, I'm way over that now :). We're tracking NANs, endpoints, infinities, etc. However, I'm hoping to forget as many floating point details, as fast as possible, as soon as I can ;-). Aldy
On 8/29/2022 8:26 AM, Aldy Hernandez wrote: > On Mon, Aug 29, 2022 at 4:22 PM Jeff Law via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> >> On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote: >>> On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote: >>>> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote: >>>>> It seems to me we can do this optimization regardless, but then treat >>>>> positive and negative zero the same throughout the frange class. >>>>> Particularly, in frange::singleton_p(). We should never return TRUE >>>>> for any version of 0.0. This will keep VRP from propagating an >>>>> incorrect 0.0, since all VRP does is propagate when a range is >>>>> provably a singleton. Also, frange::zero_p() shall return true for >>>>> any version of 0.0. >>>> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to >>>> differentiate between 0.0 and -0.0. >>>> One reason is e.g. to be able to optimize copysign/signbit - if we can >>>> prove that the sign bit on some value will be always cleared or always set, >>>> we can fold those. >>>> On the other side, with -fno-signed-zeros it is invalid to use >>>> copysign/signbit on values that could be zero (well, nothing guarantees >>>> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS && >>>> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being >>>> one thing (just not singleton_p) and not bother with details like whether >>>> a range ends or starts with -0.0 or 0.0, either of them would work the same. >>>> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p. >>> *head explodes* >>> >>> Ok, I think I can add a zero property we can track (like we do for >>> NAN), and set it appropriately at constant creation and upon results >>> from copysign/signbit. However, I am running out of time before >>> Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do >>> that as a follow-up. >> We definitely want to be able to track +-0.0 and distinguish between >> them. IIRC there's cases where you can start eliminating comparisons >> and arithmetic once you start tracking -0.0 state. > Absolutely. That was always the plan. However, my goal was just to > add relop stubs so others could flesh out the rest. Alas, I'm way > over that now :). We're tracking NANs, endpoints, infinities, etc. Well, we'll probably cycle back and forth a bit between the solver and figuring out what we need to track. Related: I had a short email thread with Siddhesh and Carlos WRT the idea of putting in some __builtin_unreachables into the math routines. They're generally OK with it, though we do need to verify those conditionals aren't generating extra code. So there's a potential path forward for that side of the problem as well. > > However, I'm hoping to forget as many floating point details, as fast > as possible, as soon as I can ;-). Actually FP isn't that bad -- I'd largely avoided it for decades, but didn't have a choice earlier this year. And there's a lot more headroom for improvements in the FP space than the integer space IMHO. Jeff
On 8/29/22 17:08, Jeff Law via Gcc-patches wrote: >> However, I'm hoping to forget as many floating point details, as fast >> as possible, as soon as I can ;-). > Actually FP isn't that bad -- I'd largely avoided it for decades, but > didn't have a choice earlier this year. And there's a lot more headroom > for improvements in the FP space than the integer space IMHO. One of the more interesting ones is to try to limit the range of the input to the trigonometric functions - that way you could use ones without any argument reduction phase ...
On Mon, Aug 29, 2022 at 5:08 PM Jeff Law <jeffreyalaw@gmail.com> wrote: > > > > On 8/29/2022 8:26 AM, Aldy Hernandez wrote: > > On Mon, Aug 29, 2022 at 4:22 PM Jeff Law via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> > >> > >> On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote: > >>> On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote: > >>>> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote: > >>>>> It seems to me we can do this optimization regardless, but then treat > >>>>> positive and negative zero the same throughout the frange class. > >>>>> Particularly, in frange::singleton_p(). We should never return TRUE > >>>>> for any version of 0.0. This will keep VRP from propagating an > >>>>> incorrect 0.0, since all VRP does is propagate when a range is > >>>>> provably a singleton. Also, frange::zero_p() shall return true for > >>>>> any version of 0.0. > >>>> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to > >>>> differentiate between 0.0 and -0.0. > >>>> One reason is e.g. to be able to optimize copysign/signbit - if we can > >>>> prove that the sign bit on some value will be always cleared or always set, > >>>> we can fold those. > >>>> On the other side, with -fno-signed-zeros it is invalid to use > >>>> copysign/signbit on values that could be zero (well, nothing guarantees > >>>> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS && > >>>> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being > >>>> one thing (just not singleton_p) and not bother with details like whether > >>>> a range ends or starts with -0.0 or 0.0, either of them would work the same. > >>>> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p. > >>> *head explodes* > >>> > >>> Ok, I think I can add a zero property we can track (like we do for > >>> NAN), and set it appropriately at constant creation and upon results > >>> from copysign/signbit. However, I am running out of time before > >>> Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do > >>> that as a follow-up. > >> We definitely want to be able to track +-0.0 and distinguish between > >> them. IIRC there's cases where you can start eliminating comparisons > >> and arithmetic once you start tracking -0.0 state. > > Absolutely. That was always the plan. However, my goal was just to > > add relop stubs so others could flesh out the rest. Alas, I'm way > > over that now :). We're tracking NANs, endpoints, infinities, etc. > Well, we'll probably cycle back and forth a bit between the solver and > figuring out what we need to track. > > Related: I had a short email thread with Siddhesh and Carlos WRT the > idea of putting in some __builtin_unreachables into the math routines. > They're generally OK with it, though we do need to verify those > conditionals aren't generating extra code. So there's a potential path > forward for that side of the problem as well. > > > > > > However, I'm hoping to forget as many floating point details, as fast > > as possible, as soon as I can ;-). > Actually FP isn't that bad -- I'd largely avoided it for decades, but > didn't have a choice earlier this year. And there's a lot more headroom > for improvements in the FP space than the integer space IMHO. Absolutely. Just working with basic relationals I've noticed that we generate absolute garbage for some simple testcases. I'm amazed what makes it all the way to the assembly because we fail to fold simple things. Aldy
On 8/29/2022 9:13 AM, Toon Moene wrote: > On 8/29/22 17:08, Jeff Law via Gcc-patches wrote: > >>> However, I'm hoping to forget as many floating point details, as fast >>> as possible, as soon as I can ;-). > >> Actually FP isn't that bad -- I'd largely avoided it for decades, but >> didn't have a choice earlier this year. And there's a lot more >> headroom for improvements in the FP space than the integer space IMHO. > > One of the more interesting ones is to try to limit the range of the > input to the trigonometric functions - that way you could use ones > without any argument reduction phase ... The difficult part is that most of the trig stuff is in libraries, so we don't have visibility into the full range. What we do sometimes have is knowledge that the special values are already handled which allows us to do things like automatically transform a division into estimation + NR correction steps (atan2). I guess we could do specialization based on the input range. So rather than calling "sin" we could call a special one that didn't have the reduction step when we know the input value is in a sensible range. Jeff
> On Aug 29, 2022, at 1:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > ... > I guess we could do specialization based on the input range. So rather than calling "sin" we could call a special one that didn't have the reduction step when we know the input value is in a sensible range. There's some precedent for that, though for a somewhat different reason: functions like "log1p". And in fact, it would make sense for the optimizer to transform log calls into log1p calls when the range is known to be right for doing so. paul
On 8/29/22 19:07, Jeff Law via Gcc-patches wrote: >> One of the more interesting ones is to try to limit the range of the >> input to the trigonometric functions - that way you could use ones >> without any argument reduction phase ... > The difficult part is that most of the trig stuff is in libraries, so we > don't have visibility into the full range. > > What we do sometimes have is knowledge that the special values are > already handled which allows us to do things like automatically > transform a division into estimation + NR correction steps (atan2). > > I guess we could do specialization based on the input range. So rather > than calling "sin" we could call a special one that didn't have the > reduction step when we know the input value is in a sensible range. Exactly. It's probably not that hard to have sin/cos/tan with a special entry point that foregoes the whole argument reduction step. In every weather forecast, you have to compute the local solar height (to get the effects of solar radiation correct) every time step, in every grid point. You *know* that angle is between 0 and 90 degrees, as are all the angles that go into that computation (latitude, longitude (and time [hour of the day, day of the year]).
I think we can model the signed zero problem by keeping track of a sign property, similar to how we keep track of a NAN property. The property can be yes, no, or unknown, and would apply to the entire range. [0.0, 0.0] SIGN => -0.0 singleton [0.0, 0.0] !SIGN => +0.0 singleton [0.0, 0.0] VARYING => [-0.0, +0.0] sign unknown frange::singleton_p() would return the appropriate zero if the sign bit is definitely known, otherwise it would return NULL, which would keep VRP from propagating it. This is a sample of how I envision it working with __builtin_signbit: =========== BB 2 ============ Imports: x_3(D) Exports: _1 x_3(D) _1 : x_3(D)(I) x_3(D) [frange] float VARYING <bb 2> : _1 = __builtin_signbit (x_3(D)); if (_1 == 0) goto <bb 3>; [INV] else goto <bb 4>; [INV] 2->3 (T) _1 : [irange] int [0, 0] NONZERO 0x0 2->3 (T) x_3(D) : [frange] float [0.0, Inf] !SIGN 2->4 (F) _1 : [irange] int [-INF, -1][1, +INF] 2->4 (F) x_3(D) : [frange] float [ -Inf, 0.0] SIGN That is, on the TRUE side x_3 can be assumed to be positive, including the zero. On the FALSE side x_3 is negative, also including the zero. We can keep the endpoints in sync with the sign bit for free, since we have endpoints. So, setting the sign bit on a range to either yes or no, would automatically intersect it to [-INF, 0] or [0, +INF] respectively. With this in play, VRP could propagate a 0.0 if it knows the sign. For example: if (x == 0.0 && __builtin_signbit (x) == 0) bark(x); ...would generate: =========== BB 2 ============ Imports: x_3(D) Exports: x_3(D) x_3(D) [frange] float VARYING <bb 2> : if (x_3(D) == 0.0) goto <bb 3>; [INV] else goto <bb 5>; [INV] 2->3 (T) x_3(D) : [frange] float [0.0, 0.0] !NAN =========== BB 3 ============ Imports: x_3(D) Exports: _1 x_3(D) _1 : x_3(D)(I) x_3(D) [frange] float [0.0, 0.0] !NAN <bb 3> : _1 = __builtin_signbit (x_3(D)); if (_1 == 0) goto <bb 4>; [INV] else goto <bb 5>; [INV] 3->4 (T) _1 : [irange] int [0, 0] NONZERO 0x0 3->4 (T) x_3(D) : [frange] float [0.0, 0.0] !NAN !SIGN 3->5 (F) _1 : [irange] int [-INF, -1][1, +INF] 3->5 (F) x_3(D) : [frange] float [0.0, 0.0] !NAN SIGN =========== BB 4 ============ x_3(D) [frange] float [0.0, 0.0] !NAN !SIGN <bb 4> : bark (0.0); That is, on the 2->3 edge we know x_3 is 0.0 and !NAN, but have no information on the sign bit. Then out of BB3, we know both that x_3 is 0.0 as well as the appropriate sign. Ultimately this leads us to propagate +0.0 in BB4. I have most^Wall of it coded without regressions, with the exception of how to coerce the range-op machinery to play nice with builtins (details below). But I wanted to make sure we're all on the same page. A couple questions: Can I safely assume that +0.0 in the source (say, x = 0.0) has the sign bit cleared, and vice versa for -0.0? What's the deal with __builtin_signbit? Can I fold it to 0/1, or must I return the actual signbit, because I see differing behavior whether we fold a known value or not: abulafia:~$ cat a.c float nzero = -0.0; main(){ printf("0x%x\n", __builtin_signbit(-0.0)); printf("0x%x\n", __builtin_signbit(nzero)); } abulafia:~$ gcc a.c -w && ./a.out 0x1 0x80000000 When Andrew comes back from PTO, we'll need to talk about propagating builtins. Currently range-ops' op1_range is used to unwind back from conditionals. For example: _1 = x_9 + 5 if (_1 == 0) On the TRUE side we use op1_range to solve: 0 = x_9 + 5; We currently only handle assignments and conditionals. We would need to ability to wind back through builtins since __builtin_signbit is not part of the IL: _1 = __builtin_signbit (x_3(D)); if (_1 == 0) We have no way to augment the range for x_3 when examining the builtin. We do have a way of folding the builtin on a forward analysis, but that's a separate thing. Thoughts? Aldy On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote: > > On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote: > > It seems to me we can do this optimization regardless, but then treat > > positive and negative zero the same throughout the frange class. > > Particularly, in frange::singleton_p(). We should never return TRUE > > for any version of 0.0. This will keep VRP from propagating an > > incorrect 0.0, since all VRP does is propagate when a range is > > provably a singleton. Also, frange::zero_p() shall return true for > > any version of 0.0. > > Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to > differentiate between 0.0 and -0.0. > One reason is e.g. to be able to optimize copysign/signbit - if we can > prove that the sign bit on some value will be always cleared or always set, > we can fold those. > On the other side, with -fno-signed-zeros it is invalid to use > copysign/signbit on values that could be zero (well, nothing guarantees > whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS && > !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being > one thing (just not singleton_p) and not bother with details like whether > a range ends or starts with -0.0 or 0.0, either of them would work the same. > And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p. > > Jakub >
diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc index ad2fae578d2..ff9fe312acf 100644 --- a/gcc/range-op-float.cc +++ b/gcc/range-op-float.cc @@ -252,8 +252,21 @@ foperator_equal::op1_range (frange &r, tree type, switch (get_bool_state (r, lhs, type)) { case BRS_TRUE: - // If it's true, the result is the same as OP2. - r = op2; + if (HONOR_SIGNED_ZEROS (type) + && op2.contains_p (build_zero_cst (type))) + { + // With signed zeros, x == -0.0 does not mean we can replace + // x with -0.0, because x may be either +0.0 or -0.0. + r.set_varying (type); + } + else + { + // If it's true, the result is the same as OP2. + // + // If the range does not actually contain zeros, this should + // always be OK. + r = op2; + } // The TRUE side of op1 == op2 implies op1 is !NAN. r.set_nan (fp_prop::NO); break;