From patchwork Thu Jan 19 08:41:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 45646 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp213250wrn; Thu, 19 Jan 2023 00:42:40 -0800 (PST) X-Google-Smtp-Source: AMrXdXsLbosqveK8HZSC+srlzrfuZFrdq2AlX+gAo4GK78uAhiw+/EAbrlfmduiG5MT295dT2Zt6 X-Received: by 2002:a17:906:3912:b0:86e:c115:1ea8 with SMTP id f18-20020a170906391200b0086ec1151ea8mr9922510eje.41.1674117760566; Thu, 19 Jan 2023 00:42:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1674117760; cv=none; d=google.com; s=arc-20160816; b=pqvCdk9c+12sm70vGCjpAsRjBP6bjnnbRHM0VhZbWehKZjEmlRSod2zqi7vhnaYGds qW3KYZwTvkh/c4LApU4cWbwjFFcHV6AdkFGLn2OnoE4hv4GY6C0mznvaH0OARHbS1Ux1 3cnOZo4wu3Qvb3hYMD3I7U/ckbgmiGJvC5XfGLWnqC6oV7qohfPYqhuvGJshZDNj9liA zIQIvr5rkvgSoF4Wq4Y+oieatcwXr+ur6siLdrAsZaFdNrIf4HtExGVNy1PpQjU7xwqf 5j34xr94sdCizw0lY8k28kXCWajYTWFr+Gv4lnjCVGJda3TPJyuboIt2GjzDfH+ZYKxn b/og== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-disposition:mime-version:message-id:subject:cc:to:date :dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=+eYlDh3iUb2g1m5d1Eaolb4s7ugnClfmvA8cm2FGjhI=; b=CbK9B/KXm9zPDpw+crnzrm0vPYbu7JTDSkI0mOCe4QYyRvt86CJ/63+YiJsgW91UWC 2NHzm2exNLccj8PIWDSiL++k0pOxviqPBkFIQvbSmnqBWjnDs7lvbRiXGa/33lM5IN+1 +w3Zya3iOVIJ/4e2M9BuMvhi9n+meQyQLs5oE0ks2bGIcmv0bF7+yCwQxkugEaqsOZGa Hvj4/9tWqjJTGWJXbvnXE4+X0WnpI+tBWiY22vfNpbvUM1XAtA+2Isx6m3KOV990zNuc tL9Zo75FbYrT1zNbjWXcZyeVClapfKP/+9ex57HSxjN+ui4e4R8lxGKtdjb739BHCVJy v8nw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=N+SYGszW; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id wt13-20020a170906ee8d00b007cb0cf0b18asi41043673ejb.743.2023.01.19.00.42.40 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Jan 2023 00:42:40 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=N+SYGszW; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 68CA038582BC for ; Thu, 19 Jan 2023 08:42:39 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 68CA038582BC DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1674117759; bh=+eYlDh3iUb2g1m5d1Eaolb4s7ugnClfmvA8cm2FGjhI=; h=Date:To:Cc:Subject:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=N+SYGszWFlMJPyRBKZm1swp3sCHSMrqccpSs9ZGkogPoClllhIaRDJqSIlZzTBLBc KXdYO21ADvx3C4zmh27iuIqnTy0df3GLPeSU3P5geAFWJoO4/ChR1v45snUbepsguF eHEOz2YQGoLNEAQA70ReXnz2UcfbSUQ+x8Kp2Q1A= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id DC09D3858D28 for ; Thu, 19 Jan 2023 08:41:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DC09D3858D28 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-339-FZemBU-5OemV8l6oPGL0qw-1; Thu, 19 Jan 2023 03:41:22 -0500 X-MC-Unique: FZemBU-5OemV8l6oPGL0qw-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 92164100F905; Thu, 19 Jan 2023 08:41:22 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.223]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 1DD9051EF; Thu, 19 Jan 2023 08:41:21 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 30J8fJ011232084 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 19 Jan 2023 09:41:19 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 30J8fIB41232083; Thu, 19 Jan 2023 09:41:18 +0100 Date: Thu, 19 Jan 2023 09:41:17 +0100 To: Richard Biener , Aldy Hernandez Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] forwprop: Further fixes for simplify_rotate [PR108440] Message-ID: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Disposition: inline X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jakub Jelinek via Gcc-patches From: Jakub Jelinek Reply-To: Jakub Jelinek Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1755439705362816117?= X-GMAIL-MSGID: =?utf-8?q?1755439705362816117?= Hi! As mentioned in the simplify_rotate comment, for e.g. ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1)))) we already emit X r<< (Y & (B - 1)) as replacement. This PR is about the ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y))) ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y))) forms if T2 is wider than T. Unlike e.g. (X << Y) OP (X >> (B - Y)) which is valid just for Y in [1, B - 1], the above 2 forms are actually valid and do the rotates for Y in [0, B] - for Y 0 the X value is preserved by the left shift and right logical shift by B adds just zeros (but because the shift is in wider precision B is still valid shift count), while for Y equal to B X is preserved through the latter shift and the former adds just zeros. Now, it is unclear if we in the middle-end treat rotates with rotate count equal or larger than precision as UB or not, unlike shifts there are less reasons to do so, but e.g. expansion of X r<< Y if there is no rotate optab for the mode is emitted as (X << Y) | (((unsigned) X) >> ((-Y) & (B - 1))) and so with UB on Y == B. The following patch does multiple things: 1) for the above 2, asks the ranger if Y could be equal to B and if so, instead of using X r<< Y uses X r<< (Y & (B - 1)) 2) for the ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1)))) ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1)))) forms that were fixed 2 days ago it only punts if Y might be in the [B,B2-1] range but isn't known to be in the [0,B][2*B,2*B][3*B,3*B]... range. Because for Y which is a multiple of B but smaller than B2 it acts as a rotate too, left shift provides 0 and (-Y) & (B - 1) is 0 and so preserves X. Though, for the cases where Y is not known to be in [0,B-1] the patch also uses X r<< (Y & (B - 1)) rather than X r<< Y 3) as discussed with Aldy, instead of using global ranger it uses a pass specific copy but lazily created on first simplify_rotate that needs it; this e.g. handles rotate inside of if body where the guarding condition limits the shift count to some range which will not work with the global ranger (unless there is some SSA_NAME to attach the range to). Note, e.g. on x86 X r<< (Y & (B - 1)) and X r<< Y actually emit the same assembly because rotates work the same even for larger rotate counts, but that is handled only during combine. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2023-01-19 Jakub Jelinek PR tree-optimization/108440 * tree-ssa-forwprop.cc: Include gimple-range.h. (simplify_rotate): For the forms with T2 wider than T and shift counts of Y and B - Y add & (B - 1) masking for the rotate count if Y could be equal to B. For the forms with T2 wider than T and shift counts of Y and (-Y) & (B - 1), don't punt if range could be [B, B2], but only if range doesn't guarantee Y < B or Y = N * B. If range doesn't guarantee Y < B, also add & (B - 1) masking for the rotate count. Use lazily created pass specific ranger instead of get_global_range_query. (pass_forwprop::execute): Disable that ranger at the end of pass if it has been created. * c-c++-common/rotate-10.c: New test. * c-c++-common/rotate-11.c: New test. Jakub --- gcc/tree-ssa-forwprop.cc.jj 2023-01-17 12:14:15.845088330 +0100 +++ gcc/tree-ssa-forwprop.cc 2023-01-18 13:30:59.337914945 +0100 @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. #include "internal-fn.h" #include "cgraph.h" #include "tree-ssa.h" +#include "gimple-range.h" /* This pass propagates the RHS of assignment statements into use sites of the LHS of the assignment. It's basically a specialized @@ -1837,8 +1838,12 @@ defcodefor_name (tree name, enum tree_co ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1)))) ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1)))) - transform these into (last 2 only if ranger can prove Y < B): + transform these into (last 2 only if ranger can prove Y < B + or Y = N * B): X r<< Y + or + X r<< (& & (B - 1)) + The latter for the forms with T2 wider than T if ranger can't prove Y < B. Or for: (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1))) @@ -1868,6 +1873,7 @@ simplify_rotate (gimple_stmt_iterator *g gimple *g; gimple *def_arg_stmt[2] = { NULL, NULL }; int wider_prec = 0; + bool add_masking = false; arg[0] = gimple_assign_rhs1 (stmt); arg[1] = gimple_assign_rhs2 (stmt); @@ -1995,7 +2001,7 @@ simplify_rotate (gimple_stmt_iterator *g tree cdef_arg1[2], cdef_arg2[2], def_arg2_alt[2]; enum tree_code cdef_code[2]; gimple *def_arg_alt_stmt[2] = { NULL, NULL }; - bool check_range = false; + int check_range = 0; gimple *check_range_stmt = NULL; /* Look through conversion of the shift count argument. The C/C++ FE cast any shift count argument to integer_type_node. @@ -2036,6 +2042,11 @@ simplify_rotate (gimple_stmt_iterator *g || cdef_arg2[i] == def_arg2_alt[1 - i]) { rotcnt = cdef_arg2[i]; + check_range = -1; + if (cdef_arg2[i] == def_arg2[1 - i]) + check_range_stmt = def_arg_stmt[1 - i]; + else + check_range_stmt = def_arg_alt_stmt[1 - i]; break; } defcodefor_name (cdef_arg2[i], &code, &tem, NULL); @@ -2048,6 +2059,11 @@ simplify_rotate (gimple_stmt_iterator *g || tem == def_arg2_alt[1 - i])) { rotcnt = tem; + check_range = -1; + if (tem == def_arg2[1 - i]) + check_range_stmt = def_arg_stmt[1 - i]; + else + check_range_stmt = def_arg_alt_stmt[1 - i]; break; } } @@ -2080,7 +2096,7 @@ simplify_rotate (gimple_stmt_iterator *g if (tem == def_arg2[1 - i] || tem == def_arg2_alt[1 - i]) { rotcnt = tem; - check_range = true; + check_range = 1; if (tem == def_arg2[1 - i]) check_range_stmt = def_arg_stmt[1 - i]; else @@ -2099,7 +2115,7 @@ simplify_rotate (gimple_stmt_iterator *g || tem2 == def_arg2_alt[1 - i]) { rotcnt = tem2; - check_range = true; + check_range = 1; if (tem2 == def_arg2[1 - i]) check_range_stmt = def_arg_stmt[1 - i]; else @@ -2144,17 +2160,44 @@ simplify_rotate (gimple_stmt_iterator *g if (TREE_CODE (rotcnt) != SSA_NAME) return false; int_range_max r; - if (!get_global_range_query ()->range_of_expr (r, rotcnt, - check_range_stmt)) - return false; + range_query *q = get_range_query (cfun); + if (q == get_global_range_query ()) + q = enable_ranger (cfun); + if (!q->range_of_expr (r, rotcnt, check_range_stmt)) + { + if (check_range > 0) + return false; + r.set_varying (TREE_TYPE (rotcnt)); + } int prec = TYPE_PRECISION (TREE_TYPE (rotcnt)); signop sign = TYPE_SIGN (TREE_TYPE (rotcnt)); wide_int min = wide_int::from (TYPE_PRECISION (rtype), prec, sign); wide_int max = wide_int::from (wider_prec - 1, prec, sign); - int_range<2> r2 (TREE_TYPE (rotcnt), min, max); + if (check_range < 0) + max = min; + int_range<1> r2 (TREE_TYPE (rotcnt), min, max); r.intersect (r2); if (!r.undefined_p ()) - return false; + { + if (check_range > 0) + { + int_range_max r3; + for (int i = TYPE_PRECISION (rtype) + 1; i < wider_prec; + i += TYPE_PRECISION (rtype)) + { + int j = i + TYPE_PRECISION (rtype) - 2; + min = wide_int::from (i, prec, sign); + max = wide_int::from (MIN (j, wider_prec - 1), + prec, sign); + int_range<1> r4 (TREE_TYPE (rotcnt), min, max); + r3.union_ (r4); + } + r.intersect (r3); + if (!r.undefined_p ()) + return false; + } + add_masking = true; + } } if (rotcnt == NULL_TREE) return false; @@ -2169,6 +2212,15 @@ simplify_rotate (gimple_stmt_iterator *g gsi_insert_before (gsi, g, GSI_SAME_STMT); rotcnt = gimple_assign_lhs (g); } + if (add_masking) + { + g = gimple_build_assign (make_ssa_name (TREE_TYPE (rotcnt)), + BIT_AND_EXPR, rotcnt, + build_int_cst (TREE_TYPE (rotcnt), + TYPE_PRECISION (rtype) - 1)); + gsi_insert_before (gsi, g, GSI_SAME_STMT); + rotcnt = gimple_assign_lhs (g); + } lhs = gimple_assign_lhs (stmt); if (!useless_type_conversion_p (rtype, TREE_TYPE (def_arg1[0]))) lhs = make_ssa_name (TREE_TYPE (def_arg1[0])); @@ -3958,6 +4010,9 @@ pass_forwprop::execute (function *fun) BITMAP_FREE (to_purge); BITMAP_FREE (need_ab_cleanup); + if (get_range_query (cfun) != get_global_range_query ()) + disable_ranger (cfun); + if (cfg_changed) todoflags |= TODO_cleanup_cfg; --- gcc/testsuite/c-c++-common/rotate-10.c.jj 2023-01-18 13:12:05.917425710 +0100 +++ gcc/testsuite/c-c++-common/rotate-10.c 2023-01-18 13:11:37.869835522 +0100 @@ -0,0 +1,53 @@ +/* PR tree-optimization/108440 */ +/* { dg-do compile { target { { ilp32 || lp64 } || llp64 } } } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ +/* { dg-final { scan-tree-dump-times " r<< " 5 "optimized" } } */ +/* { dg-final { scan-tree-dump-times " \\\& 7;" 4 "optimized" } } */ + +unsigned char +foo (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__) + __builtin_unreachable (); + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +bar (unsigned char x, unsigned int y) +{ + if (y >= __CHAR_BIT__) + __builtin_unreachable (); + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +baz (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__ && y != 2 * __CHAR_BIT__) + __builtin_unreachable (); + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +qux (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__ && y != 2 * __CHAR_BIT__ && y != __CHAR_BIT__ + 2) + __builtin_unreachable (); + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +quux (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__) + __builtin_unreachable (); + return (x << y) | (x >> (__CHAR_BIT__ - y)); +} + +unsigned char +corge (unsigned char x, unsigned int y) +{ + if (y >= __CHAR_BIT__) + __builtin_unreachable (); + return (x << y) | (x >> (__CHAR_BIT__ - y)); +} --- gcc/testsuite/c-c++-common/rotate-11.c.jj 2023-01-18 13:14:07.146654356 +0100 +++ gcc/testsuite/c-c++-common/rotate-11.c 2023-01-18 13:14:19.916467769 +0100 @@ -0,0 +1,53 @@ +/* PR tree-optimization/108440 */ +/* { dg-do compile { target { { ilp32 || lp64 } || llp64 } } } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ +/* { dg-final { scan-tree-dump-times " r<< " 5 "optimized" } } */ +/* { dg-final { scan-tree-dump-times " \\\& 7;" 4 "optimized" } } */ + +unsigned char +foo (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__) + return 42; + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +bar (unsigned char x, unsigned int y) +{ + if (y >= __CHAR_BIT__) + return 42; + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +baz (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__ && y != 2 * __CHAR_BIT__) + return 42; + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +qux (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__ && y != 2 * __CHAR_BIT__ && y != __CHAR_BIT__ + 2) + return 42; + return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1))); +} + +unsigned char +quux (unsigned char x, unsigned int y) +{ + if (y > __CHAR_BIT__) + return 42; + return (x << y) | (x >> (__CHAR_BIT__ - y)); +} + +unsigned char +corge (unsigned char x, unsigned int y) +{ + if (y >= __CHAR_BIT__) + return 42; + return (x << y) | (x >> (__CHAR_BIT__ - y)); +}