Message ID | 20220829034216.94029-1-guojiufu@linux.ibm.com |
---|---|
State | New, archived |
Headers |
To: gcc-patches@gcc.gnu.org
Subject: [PATCH V6] rs6000: Optimize cmp on rotated 16bits constant
Date: Mon, 29 Aug 2022 11:42:16 +0800
Message-Id: <20220829034216.94029-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.17.1
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
From: Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org>
Reply-To: Jiufu Guo <guojiufu@linux.ibm.com>
Cc: dje.gcc@gmail.com, segher@kernel.crashing.org, linkw@gcc.gnu.org |
Series |
[V6] rs6000: Optimize cmp on rotated 16bits constant
|
|
Commit Message
Jiufu Guo
Aug. 29, 2022, 3:42 a.m. UTC
Hi,

When checking eq/ne against a constant that has only 16 significant
bits (that is, a rotated 16-bit immediate), the comparison can be done
on the rotated data instead.  This avoids building the full constant
in a register.

As in the example from PR103743:
For "in == 0x8000000000000000LL", this patch generates:
        rotldi %r3,%r3,16
        cmpldi %cr0,%r3,32768
instead of:
        li %r9,-1
        rldicr %r9,%r9,0,0
        cmpd %cr0,%r3,%r9

Compared with the previous patches:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600385.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600198.html

This patch relaxes the condition on can_create_pseudo_p and adds
clobbers so that the splitter can run both before and after RA.

This is an updated patch based on the previous patch and comments:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600315.html

This patch passes bootstrap and regtest on ppc64 and ppc64le.
Is it ok for trunk?  Thanks for comments!

BR,
Jeff(Jiufu)

        PR target/103743

gcc/ChangeLog:

        * config/rs6000/rs6000-protos.h (rotate_from_leading_zeros_const): New.
        (compare_rotate_immediate_p): New.
        * config/rs6000/rs6000.cc (rotate_from_leading_zeros_const): New
        definition.
        (compare_rotate_immediate_p): New definition.
        * config/rs6000/rs6000.md (EQNE): New code_attr.
        (*rotate_on_cmpdi): New define_insn_and_split.

gcc/testsuite/ChangeLog:

        * gcc.target/powerpc/pr103743.c: New test.
        * gcc.target/powerpc/pr103743_1.c: New test.

---
 gcc/config/rs6000/rs6000-protos.h             |  2 +
 gcc/config/rs6000/rs6000.cc                   | 41 ++++++++
 gcc/config/rs6000/rs6000.md                   | 62 +++++++++++-
 gcc/testsuite/gcc.target/powerpc/pr103743.c   | 52 ++++++++++
 gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++++++++++++++++++
 5 files changed, 251 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c
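The idea behind the transformation can be sketched in standalone C (this is an illustration, not part of the patch). Rotation is a bijection on 64-bit values, so "in == C" holds exactly when "rotl (in, n) == rotl (C, n)". The helper names rotl64, eq_plain, eq_rotated and check_rotated_compare are made up for this example; the constant, the rotate count 16 and the immediate 32768 are taken from the PR103743 example above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a 64-bit value left by n bits (0 < n < 64).  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  return (x << n) | (x >> (64 - n));
}

/* Original test: needs the full 64-bit constant in a register.  */
static int
eq_plain (uint64_t in)
{
  return in == 0x8000000000000000ULL;
}

/* Rotated test: rotl64 (C, 16) is 0x8000 (32768), which fits a 16-bit
   unsigned immediate, so only the input has to be rotated at run time.  */
static int
eq_rotated (uint64_t in)
{
  return rotl64 (in, 16) == 0x8000ULL;
}

/* Check that both forms agree on a few inputs.  */
static void
check_rotated_compare (void)
{
  assert (rotl64 (0x8000000000000000ULL, 16) == 0x8000ULL);
  assert (eq_plain (0x8000000000000000ULL)
          == eq_rotated (0x8000000000000000ULL));
  assert (eq_plain (0x8000ULL) == eq_rotated (0x8000ULL));
  assert (eq_plain (0) == eq_rotated (0));
}
```

Calling check_rotated_compare () from a test driver exercises the equivalence; the sketch only covers eq, and ne follows by negation.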
Comments
Ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html

BR,
Jeff(Jiufu)

Jiufu Guo <guojiufu@linux.ibm.com> writes:

> Hi,
>
> When checking eq/ne with a constant which has only 16bits, it can be
> optimized to check the rotated data.  By this, the constant building
> is optimized.
>
> As the example in PR103743:
> For "in == 0x8000000000000000LL", this patch generates:
>         rotldi %r3,%r3,16
>         cmpldi %cr0,%r3,32768
> instead:
>         li %r9,-1
>         rldicr %r9,%r9,0,0
>         cmpd %cr0,%r3,%r9
>
> Compare with previous patchs:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600385.html
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600198.html
>
> This patch releases the condition on can_create_pseudo_p and adds
> clobbers to allow the splitter can be run both before and after RA.
>
> This is updated patch based on previous patch and comments:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600315.html
>
> This patch pass bootstrap and regtest on ppc64 and ppc64le.
> Is it ok for trunk?  Thanks for comments!
>
> BR,
> Jeff(Jiufu)
>
>
>         PR target/103743
>
> gcc/ChangeLog:
>
>         * config/rs6000/rs6000-protos.h (rotate_from_leading_zeros_const): New.
>         (compare_rotate_immediate_p): New.
>         * config/rs6000/rs6000.cc (rotate_from_leading_zeros_const): New
>         definition.
>         (compare_rotate_immediate_p): New definition.
>         * config/rs6000/rs6000.md (EQNE): New code_attr.
>         (*rotate_on_cmpdi): New define_insn_and_split.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/pr103743.c: New test.
>         * gcc.target/powerpc/pr103743_1.c: New test.
>
> ---
>  gcc/config/rs6000/rs6000-protos.h             |  2 +
>  gcc/config/rs6000/rs6000.cc                   | 41 ++++++++
>  gcc/config/rs6000/rs6000.md                   | 62 +++++++++++-
>  gcc/testsuite/gcc.target/powerpc/pr103743.c   | 52 ++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++++++++++++++++++
>  5 files changed, 251 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c
>
> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> index b3c16e7448d..78847e6b3db 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
>  extern int vspltis_shifted (rtx);
>  extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
>  extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
> +extern int rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT, int);
> +extern bool compare_rotate_immediate_p (unsigned HOST_WIDE_INT);
>  extern int num_insns_constant (rtx, machine_mode);
>  extern int small_data_operand (rtx, machine_mode);
>  extern bool mem_operand_gpr (rtx, machine_mode);
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index df491bee2ea..a548db42660 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -14797,6 +14797,47 @@ rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
>      return reverse_condition (code);
>  }
>
> +/* Check if C can be rotated from an immediate which starts (as 64bit integer)
> +   with at least CLZ bits zero.
> +
> +   Return the number by which C can be rotated from the immediate.
> +   Return -1 if C can not be rotated as from.  */
> +
> +int
> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)
> +{
> +  /* case a. 0..0xxx: already at least clz zeros.  */
> +  int lz = clz_hwi (c);
> +  if (lz >= clz)
> +    return 0;
> +
> +  /* case b. 0..0xxx0..0: at least clz zeros.  */
> +  int tz = ctz_hwi (c);
> +  if (lz + tz >= clz)
> +    return tz;
> +
> +  /* case c. xx10.....0xx: rotate 'clz + 1' bits firstly, then check case b.
> +               ^bit -> Vbit
> +             00...00xxx100, 'clz + 1' >= bits of xxxx.  */
> +  const int rot_bits = HOST_BITS_PER_WIDE_INT - clz + 1;
> +  unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1));
> +  tz = ctz_hwi (rc);
> +  if (clz_hwi (rc) + tz >= clz)
> +    return tz + rot_bits;
> +
> +  return -1;
> +}
> +
> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi.  */
> +
> +bool
> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)
> +{
> +  /* leading 48 zeros (cmpldi), or leading 49 ones (cmpdi).  */
> +  return rotate_from_leading_zeros_const (~c, 49) > 0
> +         || rotate_from_leading_zeros_const (c, 48) > 0;
> +}
> +
>  /* Generate a compare for CODE.  Return a brand-new rtx that
>     represents the result of the compare.  */
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index e9e5cd1e54d..cad3cfc98cd 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -7766,6 +7766,67 @@ (define_insn "*movsi_from_df"
>    "xscvdpsp %x0,%x1"
>    [(set_attr "type" "fp")])
>
> +
> +(define_code_iterator eqne [eq ne])
> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])
> +
> +;; "i == C" ==> "rotl(i,N) == rotl(C,N)"
> +(define_insn_and_split "*rotate_on_cmpdi"
> +  [(set (pc)
> +     (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r")
> +                         (match_operand:DI 2 "const_int_operand" "n"))
> +                   (label_ref (match_operand 0 ""))
> +                   (pc)))
> +  (clobber (match_scratch:DI 3 "=r"))
> +  (clobber (match_scratch:CCUNS 4 "=y"))]
> +  "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1
> +   && compare_rotate_immediate_p (UINTVAL (operands[2]))"
> +  "#"
> +  "&& 1"
> +  [(pc)]
> +{
> +  rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0);
> +  bool sgn = false;
> +  unsigned HOST_WIDE_INT C = INTVAL (operands[2]);
> +  int rot = rotate_from_leading_zeros_const (C, 48);
> +  if (rot < 0)
> +    {
> +      sgn = true;
> +      rot = rotate_from_leading_zeros_const (~C, 49);
> +    }
> +  rtx n = GEN_INT (HOST_BITS_PER_WIDE_INT - rot);
> +
> +  /* i' = rotl (i, n) */
> +  rtx op0 = can_create_pseudo_p () ? gen_reg_rtx (DImode) : operands[3];
> +  emit_insn (gen_rtx_SET (op0, gen_rtx_ROTATE (DImode, operands[1], n)));
> +
> +  /* C' = rotl (C, n) */
> +  rtx op1 = GEN_INT ((C << INTVAL (n)) | (C >> rot));
> +
> +  /* i' == C' */
> +  machine_mode comp_mode = sgn ? CCmode : CCUNSmode;
> +  rtx cc = can_create_pseudo_p () ? gen_reg_rtx (comp_mode) : operands[4];
> +  PUT_MODE (cc, comp_mode);
> +  emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (comp_mode, op0, op1)));
> +  rtx cmp = gen_rtx_<EQNE> (CCmode, cc, const0_rtx);
> +  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[0]);
> +  emit_jump_insn (gen_rtx_SET (pc_rtx,
> +                               gen_rtx_IF_THEN_ELSE (VOIDmode, cmp,
> +                                                     loc_ref, pc_rtx)));
> +
> +  /* keep the probability info for the prediction of the branch insn.  */
> +  if (note)
> +    {
> +      profile_probability prob
> +        = profile_probability::from_reg_br_prob_note (XINT (note, 0));
> +
> +      add_reg_br_prob_note (get_last_insn (), prob);
> +    }
> +
> +  DONE;
> +}
> +)
> +
>  ;; Split a load of a large constant into the appropriate two-insn
>  ;; sequence.
>
> @@ -13472,7 +13533,6 @@ (define_expand "@ctr<mode>"
>  ;; rs6000_legitimate_combined_insn prevents combine creating any of
>  ;; the ctr<mode> insns.
>
> -(define_code_iterator eqne [eq ne])
>  (define_code_attr bd [(eq "bdz") (ne "bdnz")])
>  (define_code_attr bd_neg [(eq "bdnz") (ne "bdz")])
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743.c b/gcc/testsuite/gcc.target/powerpc/pr103743.c
> new file mode 100644
> index 00000000000..abb876ed79e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c
> @@ -0,0 +1,52 @@
> +/* { dg-options "-O2" } */
> +/* { dg-do compile { target has_arch_ppc64 } } */
> +
> +/* { dg-final { scan-assembler-times {\mcmpldi\M} 10 } } */
> +/* { dg-final { scan-assembler-times {\mcmpdi\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mrotldi\M} 14 } } */
> +
> +int foo (int a);
> +
> +int __attribute__ ((noinline)) udi_fun (unsigned long long in)
> +{
> +  if (in == (0x8642000000000000ULL))
> +    return foo (1);
> +  if (in == (0x7642000000000000ULL))
> +    return foo (12);
> +  if (in == (0x8000000000000000ULL))
> +    return foo (32);
> +  if (in == (0x8000000000000001ULL))
> +    return foo (33);
> +  if (in == (0x8642FFFFFFFFFFFFULL))
> +    return foo (46);
> +  if (in == (0x7642FFFFFFFFFFFFULL))
> +    return foo (51);
> +  if (in == (0x7567000000ULL))
> +    return foo (9);
> +  if (in == (0xFFF8567FFFFFFFFFULL))
> +    return foo (19);
> +
> +  return 0;
> +}
> +
> +int __attribute__ ((noinline)) di_fun (long long in)
> +{
> +  if (in == (0x8642000000000000LL))
> +    return foo (1);
> +  if (in == (0x7642000000000000LL))
> +    return foo (12);
> +  if (in == (0x8000000000000000LL))
> +    return foo (32);
> +  if (in == (0x8000000000000001LL))
> +    return foo (33);
> +  if (in == (0x8642FFFFFFFFFFFFLL))
> +    return foo (46);
> +  if (in == (0x7642FFFFFFFFFFFFLL))
> +    return foo (51);
> +  if (in == (0x7567000000LL))
> +    return foo (9);
> +  if (in == (0xFFF8567FFFFFFFFFLL))
> +    return foo (19);
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743_1.c b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c
> new file mode 100644
> index 00000000000..2c08c56714a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c
> @@ -0,0 +1,95 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -std=c99" } */
> +
> +int
> +foo (int a)
> +{
> +  return a + 6;
> +}
> +
> +int __attribute__ ((noinline)) udi_fun (unsigned long long in)
> +{
> +  if (in == (0x8642000000000000ULL))
> +    return foo (1);
> +  if (in == (0x7642000000000000ULL))
> +    return foo (12);
> +  if (in == (0x8000000000000000ULL))
> +    return foo (32);
> +  if (in == (0x8000000000000001ULL))
> +    return foo (33);
> +  if (in == (0x8642FFFFFFFFFFFFULL))
> +    return foo (46);
> +  if (in == (0x7642FFFFFFFFFFFFULL))
> +    return foo (51);
> +  if (in == (0x7567000000ULL))
> +    return foo (9);
> +  if (in == (0xFFF8567FFFFFFFFFULL))
> +    return foo (19);
> +
> +  return 0;
> +}
> +
> +int __attribute__ ((noinline)) di_fun (long long in)
> +{
> +  if (in == (0x8642000000000000LL))
> +    return foo (1);
> +  if (in == (0x7642000000000000LL))
> +    return foo (12);
> +  if (in == (0x8000000000000000LL))
> +    return foo (32);
> +  if (in == (0x8000000000000001LL))
> +    return foo (33);
> +  if (in == (0x8642FFFFFFFFFFFFLL))
> +    return foo (46);
> +  if (in == (0x7642FFFFFFFFFFFFLL))
> +    return foo (51);
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  int e = 0;
> +  if (udi_fun (6) != 0)
> +    e++;
> +  if (udi_fun (0x8642000000000000ULL) != foo (1))
> +    e++;
> +  if (udi_fun (0x7642000000000000ULL) != foo (12))
> +    e++;
> +  if (udi_fun (0x8000000000000000ULL) != foo (32))
> +    e++;
> +  if (udi_fun (0x8000000000000001ULL) != foo (33))
> +    e++;
> +  if (udi_fun (0x8642FFFFFFFFFFFFULL) != foo (46))
> +    e++;
> +  if (udi_fun (0x7642FFFFFFFFFFFFULL) != foo (51))
> +    e++;
> +  if (udi_fun (0x7567000000ULL) != foo (9))
> +    e++;
> +  if (udi_fun (0xFFF8567FFFFFFFFFULL) != foo (19))
> +    e++;
> +
> +  if (di_fun (6) != 0)
> +    e++;
> +  if (di_fun (0x8642000000000000LL) != foo (1))
> +    e++;
> +  if (di_fun (0x7642000000000000LL) != foo (12))
> +    e++;
> +  if (di_fun (0x8000000000000000LL) != foo (32))
> +    e++;
> +  if (di_fun (0x8000000000000001LL) != foo (33))
> +    e++;
> +  if (di_fun (0x8642FFFFFFFFFFFFLL) != foo (46))
> +    e++;
> +  if (di_fun (0x7642FFFFFFFFFFFFLL) != foo (51))
> +    e++;
> +  if (udi_fun (0x7567000000LL) != foo (9))
> +    e++;
> +  if (udi_fun (0xFFF8567FFFFFFFFFLL) != foo (19))
> +    e++;
> +
> +  if (e)
> +    __builtin_abort ();
> +  return 0;
> +}
> +
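As a rough standalone illustration of the rotation search in the quoted patch, the following C sketch re-implements the logic of rotate_from_leading_zeros_const outside GCC. It is not the patch itself: it assumes a 64-bit HOST_WIDE_INT and the GCC/Clang builtins __builtin_clzll and __builtin_ctzll, and the names clz64, ctz64, rotr64, rot_from_lz and check_rot_from_lz are made up for this example. The constants come from the test cases in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading/trailing zeros, returning 64 for zero input (the
   builtins are undefined for 0).  */
static int clz64 (uint64_t x) { return x ? __builtin_clzll (x) : 64; }
static int ctz64 (uint64_t x) { return x ? __builtin_ctzll (x) : 64; }

/* Hypothetical helper: rotate right by n bits (0 < n < 64).  */
static uint64_t
rotr64 (uint64_t x, unsigned n)
{
  return (x >> n) | (x << (64 - n));
}

/* Return r such that rotating C right by r yields a value with at least
   CLZ leading zeros, or -1 if no such rotation is found.  Assumes
   2 <= CLZ < 64 so that all shift counts stay in range.  */
static int
rot_from_lz (uint64_t c, int clz)
{
  /* Case a: already enough leading zeros.  */
  int lz = clz64 (c);
  if (lz >= clz)
    return 0;

  /* Case b: zeros on both sides; shifting out the trailing zeros is enough.  */
  int tz = ctz64 (c);
  if (lz + tz >= clz)
    return tz;

  /* Case c: the value wraps around bit 63/0; pre-rotate, then retry case b.  */
  const int rot_bits = 64 - clz + 1;
  uint64_t rc = (c >> rot_bits) | (c << (clz - 1));
  tz = ctz64 (rc);
  if (clz64 (rc) + tz >= clz)
    return tz + rot_bits;

  return -1;
}

/* Exercise the three cases with constants from the patch's tests.  */
static void
check_rot_from_lz (void)
{
  assert (rot_from_lz (0x1234ULL, 48) == 0);                 /* case a */
  assert (rot_from_lz (0x8642000000000000ULL, 48) == 49);    /* case b */
  assert (rotr64 (0x8642000000000000ULL, 49) == 0x4321ULL);
  assert (rot_from_lz (0x8000000000000001ULL, 48) == 63);    /* case c */
  assert (rot_from_lz (~0x8642FFFFFFFFFFFFULL, 49) == 48);   /* signed form */
  assert (rot_from_lz (0x1234567800000000ULL, 48) == -1);    /* no fit */
}
```

In the sketch, the unsigned form (48 leading zeros) corresponds to a cmpldi immediate and the inverted form (49 leading zeros of ~C) to a cmpdi immediate, following the comment in compare_rotate_immediate_p.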
Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html

BR,
Jeff (Jiufu)
Hi,

Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html

BR,
Jeff (Jiufu)

Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> Gentle ping:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>
> BR,
> Jeff (Jiufu)
>
> Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>
>> Ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>>
>> BR,
>> Jeff(Jiufu)
>>
>>
>> Jiufu Guo <guojiufu@linux.ibm.com> writes:
>>
>>> Hi,
>>>
>>> When checking eq/ne against a constant that is a rotation of a 16-bit
>>> immediate, the check can be done on the rotated data instead. This
>>> avoids building the full constant.
>>>
>>> As the example in PR103743:
>>> For "in == 0x8000000000000000LL", this patch generates:
>>>   rotldi %r3,%r3,16
>>>   cmpldi %cr0,%r3,32768
>>> instead of:
>>>   li %r9,-1
>>>   rldicr %r9,%r9,0,0
>>>   cmpd %cr0,%r3,%r9
>>>
>>> Compared with the previous patches:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600385.html
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600198.html
>>>
>>> This patch relaxes the condition on can_create_pseudo_p and adds
>>> clobbers so that the splitter can run both before and after RA.
>>>
>>> This is an updated patch based on the previous patch and comments:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600315.html
>>>
>>> This patch passes bootstrap and regtest on ppc64 and ppc64le.
>>> Is it ok for trunk? Thanks for comments!
>>>
>>> BR,
>>> Jeff(Jiufu)
>>>
>>>
>>> PR target/103743
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/rs6000/rs6000-protos.h (rotate_from_leading_zeros_const): New.
>>> (compare_rotate_immediate_p): New.
>>> * config/rs6000/rs6000.cc (rotate_from_leading_zeros_const): New
>>> definition.
>>> (compare_rotate_immediate_p): New definition.
>>> * config/rs6000/rs6000.md (EQNE): New code_attr.
>>> (*rotate_on_cmpdi): New define_insn_and_split.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/powerpc/pr103743.c: New test.
>>> * gcc.target/powerpc/pr103743_1.c: New test. >>> >>> --- >>> gcc/config/rs6000/rs6000-protos.h | 2 + >>> gcc/config/rs6000/rs6000.cc | 41 ++++++++ >>> gcc/config/rs6000/rs6000.md | 62 +++++++++++- >>> gcc/testsuite/gcc.target/powerpc/pr103743.c | 52 ++++++++++ >>> gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++++++++++++++++++ >>> 5 files changed, 251 insertions(+), 1 deletion(-) >>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c >>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c >>> >>> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h >>> index b3c16e7448d..78847e6b3db 100644 >>> --- a/gcc/config/rs6000/rs6000-protos.h >>> +++ b/gcc/config/rs6000/rs6000-protos.h >>> @@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); >>> extern int vspltis_shifted (rtx); >>> extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); >>> extern bool macho_lo_sum_memory_operand (rtx, machine_mode); >>> +extern int rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT, int); >>> +extern bool compare_rotate_immediate_p (unsigned HOST_WIDE_INT); >>> extern int num_insns_constant (rtx, machine_mode); >>> extern int small_data_operand (rtx, machine_mode); >>> extern bool mem_operand_gpr (rtx, machine_mode); >>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc >>> index df491bee2ea..a548db42660 100644 >>> --- a/gcc/config/rs6000/rs6000.cc >>> +++ b/gcc/config/rs6000/rs6000.cc >>> @@ -14797,6 +14797,47 @@ rs6000_reverse_condition (machine_mode mode, enum rtx_code code) >>> return reverse_condition (code); >>> } >>> >>> +/* Check if C can be rotated from an immediate which starts (as 64bit integer) >>> + with at least CLZ bits zero. >>> + >>> + Return the number by which C can be rotated from the immediate. >>> + Return -1 if C can not be rotated as from. 
*/ >>> + >>> +int >>> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz) >>> +{ >>> + /* case a. 0..0xxx: already at least clz zeros. */ >>> + int lz = clz_hwi (c); >>> + if (lz >= clz) >>> + return 0; >>> + >>> + /* case b. 0..0xxx0..0: at least clz zeros. */ >>> + int tz = ctz_hwi (c); >>> + if (lz + tz >= clz) >>> + return tz; >>> + >>> + /* case c. xx10.....0xx: rotate 'clz + 1' bits firstly, then check case b. >>> + ^bit -> Vbit >>> + 00...00xxx100, 'clz + 1' >= bits of xxxx. */ >>> + const int rot_bits = HOST_BITS_PER_WIDE_INT - clz + 1; >>> + unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1)); >>> + tz = ctz_hwi (rc); >>> + if (clz_hwi (rc) + tz >= clz) >>> + return tz + rot_bits; >>> + >>> + return -1; >>> +} >>> + >>> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi. */ >>> + >>> +bool >>> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c) >>> +{ >>> + /* leading 48 zeros (cmpldi), or leading 49 ones (cmpdi). */ >>> + return rotate_from_leading_zeros_const (~c, 49) > 0 >>> + || rotate_from_leading_zeros_const (c, 48) > 0; >>> +} >>> + >>> /* Generate a compare for CODE. Return a brand-new rtx that >>> represents the result of the compare. 
*/ >>> >>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >>> index e9e5cd1e54d..cad3cfc98cd 100644 >>> --- a/gcc/config/rs6000/rs6000.md >>> +++ b/gcc/config/rs6000/rs6000.md >>> @@ -7766,6 +7766,67 @@ (define_insn "*movsi_from_df" >>> "xscvdpsp %x0,%x1" >>> [(set_attr "type" "fp")]) >>> >>> + >>> +(define_code_iterator eqne [eq ne]) >>> +(define_code_attr EQNE [(eq "EQ") (ne "NE")]) >>> + >>> +;; "i == C" ==> "rotl(i,N) == rotl(C,N)" >>> +(define_insn_and_split "*rotate_on_cmpdi" >>> + [(set (pc) >>> + (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r") >>> + (match_operand:DI 2 "const_int_operand" "n")) >>> + (label_ref (match_operand 0 "")) >>> + (pc))) >>> + (clobber (match_scratch:DI 3 "=r")) >>> + (clobber (match_scratch:CCUNS 4 "=y"))] >>> + "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1 >>> + && compare_rotate_immediate_p (UINTVAL (operands[2]))" >>> + "#" >>> + "&& 1" >>> + [(pc)] >>> +{ >>> + rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0); >>> + bool sgn = false; >>> + unsigned HOST_WIDE_INT C = INTVAL (operands[2]); >>> + int rot = rotate_from_leading_zeros_const (C, 48); >>> + if (rot < 0) >>> + { >>> + sgn = true; >>> + rot = rotate_from_leading_zeros_const (~C, 49); >>> + } >>> + rtx n = GEN_INT (HOST_BITS_PER_WIDE_INT - rot); >>> + >>> + /* i' = rotl (i, n) */ >>> + rtx op0 = can_create_pseudo_p () ? gen_reg_rtx (DImode) : operands[3]; >>> + emit_insn (gen_rtx_SET (op0, gen_rtx_ROTATE (DImode, operands[1], n))); >>> + >>> + /* C' = rotl (C, n) */ >>> + rtx op1 = GEN_INT ((C << INTVAL (n)) | (C >> rot)); >>> + >>> + /* i' == C' */ >>> + machine_mode comp_mode = sgn ? CCmode : CCUNSmode; >>> + rtx cc = can_create_pseudo_p () ? 
gen_reg_rtx (comp_mode) : operands[4]; >>> + PUT_MODE (cc, comp_mode); >>> + emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (comp_mode, op0, op1))); >>> + rtx cmp = gen_rtx_<EQNE> (CCmode, cc, const0_rtx); >>> + rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[0]); >>> + emit_jump_insn (gen_rtx_SET (pc_rtx, >>> + gen_rtx_IF_THEN_ELSE (VOIDmode, cmp, >>> + loc_ref, pc_rtx))); >>> + >>> + /* keep the probability info for the prediction of the branch insn. */ >>> + if (note) >>> + { >>> + profile_probability prob >>> + = profile_probability::from_reg_br_prob_note (XINT (note, 0)); >>> + >>> + add_reg_br_prob_note (get_last_insn (), prob); >>> + } >>> + >>> + DONE; >>> +} >>> +) >>> + >>> ;; Split a load of a large constant into the appropriate two-insn >>> ;; sequence. >>> >>> @@ -13472,7 +13533,6 @@ (define_expand "@ctr<mode>" >>> ;; rs6000_legitimate_combined_insn prevents combine creating any of >>> ;; the ctr<mode> insns. >>> >>> -(define_code_iterator eqne [eq ne]) >>> (define_code_attr bd [(eq "bdz") (ne "bdnz")]) >>> (define_code_attr bd_neg [(eq "bdnz") (ne "bdz")]) >>> >>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743.c b/gcc/testsuite/gcc.target/powerpc/pr103743.c >>> new file mode 100644 >>> index 00000000000..abb876ed79e >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c >>> @@ -0,0 +1,52 @@ >>> +/* { dg-options "-O2" } */ >>> +/* { dg-do compile { target has_arch_ppc64 } } */ >>> + >>> +/* { dg-final { scan-assembler-times {\mcmpldi\M} 10 } } */ >>> +/* { dg-final { scan-assembler-times {\mcmpdi\M} 4 } } */ >>> +/* { dg-final { scan-assembler-times {\mrotldi\M} 14 } } */ >>> + >>> +int foo (int a); >>> + >>> +int __attribute__ ((noinline)) udi_fun (unsigned long long in) >>> +{ >>> + if (in == (0x8642000000000000ULL)) >>> + return foo (1); >>> + if (in == (0x7642000000000000ULL)) >>> + return foo (12); >>> + if (in == (0x8000000000000000ULL)) >>> + return foo (32); >>> + if (in == (0x8000000000000001ULL)) >>> + return foo 
(33); >>> + if (in == (0x8642FFFFFFFFFFFFULL)) >>> + return foo (46); >>> + if (in == (0x7642FFFFFFFFFFFFULL)) >>> + return foo (51); >>> + if (in == (0x7567000000ULL)) >>> + return foo (9); >>> + if (in == (0xFFF8567FFFFFFFFFULL)) >>> + return foo (19); >>> + >>> + return 0; >>> +} >>> + >>> +int __attribute__ ((noinline)) di_fun (long long in) >>> +{ >>> + if (in == (0x8642000000000000LL)) >>> + return foo (1); >>> + if (in == (0x7642000000000000LL)) >>> + return foo (12); >>> + if (in == (0x8000000000000000LL)) >>> + return foo (32); >>> + if (in == (0x8000000000000001LL)) >>> + return foo (33); >>> + if (in == (0x8642FFFFFFFFFFFFLL)) >>> + return foo (46); >>> + if (in == (0x7642FFFFFFFFFFFFLL)) >>> + return foo (51); >>> + if (in == (0x7567000000LL)) >>> + return foo (9); >>> + if (in == (0xFFF8567FFFFFFFFFLL)) >>> + return foo (19); >>> + >>> + return 0; >>> +} >>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743_1.c b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c >>> new file mode 100644 >>> index 00000000000..2c08c56714a >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c >>> @@ -0,0 +1,95 @@ >>> +/* { dg-do run } */ >>> +/* { dg-options "-O2 -std=c99" } */ >>> + >>> +int >>> +foo (int a) >>> +{ >>> + return a + 6; >>> +} >>> + >>> +int __attribute__ ((noinline)) udi_fun (unsigned long long in) >>> +{ >>> + if (in == (0x8642000000000000ULL)) >>> + return foo (1); >>> + if (in == (0x7642000000000000ULL)) >>> + return foo (12); >>> + if (in == (0x8000000000000000ULL)) >>> + return foo (32); >>> + if (in == (0x8000000000000001ULL)) >>> + return foo (33); >>> + if (in == (0x8642FFFFFFFFFFFFULL)) >>> + return foo (46); >>> + if (in == (0x7642FFFFFFFFFFFFULL)) >>> + return foo (51); >>> + if (in == (0x7567000000ULL)) >>> + return foo (9); >>> + if (in == (0xFFF8567FFFFFFFFFULL)) >>> + return foo (19); >>> + >>> + return 0; >>> +} >>> + >>> +int __attribute__ ((noinline)) di_fun (long long in) >>> +{ >>> + if (in == 
(0x8642000000000000LL)) >>> + return foo (1); >>> + if (in == (0x7642000000000000LL)) >>> + return foo (12); >>> + if (in == (0x8000000000000000LL)) >>> + return foo (32); >>> + if (in == (0x8000000000000001LL)) >>> + return foo (33); >>> + if (in == (0x8642FFFFFFFFFFFFLL)) >>> + return foo (46); >>> + if (in == (0x7642FFFFFFFFFFFFLL)) >>> + return foo (51); >>> + return 0; >>> +} >>> + >>> +int >>> +main () >>> +{ >>> + int e = 0; >>> + if (udi_fun (6) != 0) >>> + e++; >>> + if (udi_fun (0x8642000000000000ULL) != foo (1)) >>> + e++; >>> + if (udi_fun (0x7642000000000000ULL) != foo (12)) >>> + e++; >>> + if (udi_fun (0x8000000000000000ULL) != foo (32)) >>> + e++; >>> + if (udi_fun (0x8000000000000001ULL) != foo (33)) >>> + e++; >>> + if (udi_fun (0x8642FFFFFFFFFFFFULL) != foo (46)) >>> + e++; >>> + if (udi_fun (0x7642FFFFFFFFFFFFULL) != foo (51)) >>> + e++; >>> + if (udi_fun (0x7567000000ULL) != foo (9)) >>> + e++; >>> + if (udi_fun (0xFFF8567FFFFFFFFFULL) != foo (19)) >>> + e++; >>> + >>> + if (di_fun (6) != 0) >>> + e++; >>> + if (di_fun (0x8642000000000000LL) != foo (1)) >>> + e++; >>> + if (di_fun (0x7642000000000000LL) != foo (12)) >>> + e++; >>> + if (di_fun (0x8000000000000000LL) != foo (32)) >>> + e++; >>> + if (di_fun (0x8000000000000001LL) != foo (33)) >>> + e++; >>> + if (di_fun (0x8642FFFFFFFFFFFFLL) != foo (46)) >>> + e++; >>> + if (di_fun (0x7642FFFFFFFFFFFFLL) != foo (51)) >>> + e++; >>> + if (udi_fun (0x7567000000LL) != foo (9)) >>> + e++; >>> + if (udi_fun (0xFFF8567FFFFFFFFFLL) != foo (19)) >>> + e++; >>> + >>> + if (e) >>> + __builtin_abort (); >>> + return 0; >>> +} >>> +
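The rotation search in the quoted rs6000.cc hunk can be tried outside the compiler with a standalone C sketch. This is a port under stated assumptions, not the patch itself: `clz64`, `ctz64` and `rotate_from_leading_zeros` are hypothetical names, the two bit-count helpers stand in for GCC's `clz_hwi` and `ctz_hwi` (assumed here to return 64 for a zero input), and `HOST_WIDE_INT` is assumed to be 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zero bits; 64 for a zero input (assumed to mirror clz_hwi).  */
static int
clz64 (uint64_t x)
{
  int n = 0;
  if (x == 0)
    return 64;
  while (!(x >> 63))
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Count trailing zero bits; 64 for a zero input (assumed to mirror ctz_hwi).  */
static int
ctz64 (uint64_t x)
{
  int n = 0;
  if (x == 0)
    return 64;
  while (!(x & 1))
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Return r such that c is some immediate with at least clz leading zeros
   rotated left by r, or -1 if the three cases below find no such r.
   The sketch is only meant for the clz values the patch uses (48 and 49),
   so the shift counts stay within 1..63.  */
static int
rotate_from_leading_zeros (uint64_t c, int clz)
{
  /* Case a: 0..0xxx, c already has enough leading zeros.  */
  int lz = clz64 (c);
  if (lz >= clz)
    return 0;

  /* Case b: 0..0xxx0..0, shifting out the trailing zeros is enough.  */
  int tz = ctz64 (c);
  if (lz + tz >= clz)
    return tz;

  /* Case c: xx10..0xx, the value wraps around the word boundary.
     Rotate right by rot_bits first, then apply the case b test.  */
  const int rot_bits = 64 - clz + 1;
  uint64_t rc = (c >> rot_bits) | (c << (clz - 1));
  tz = ctz64 (rc);
  if (clz64 (rc) + tz >= clz)
    return tz + rot_bits;

  return -1;
}
```

In this sketch a result r means that rotating c left by 64 - r recovers the immediate, which is the amount the quoted splitter computes as n. For example, `0x8000000000000001ULL` with clz 48 gives 63, and rotating that constant left by 1 yields 3.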
Hi,

Gentle ping:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html

BR,
Jeff(Jiufu)
Hi,

I would like to ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html

BR,
Jeff (Jiufu)

Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> Hi,
>
> Gentle ping:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>
> BR,
> Jeff(Jiufu)
>
> Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>
>> Hi,
>>
>> Gentle ping this:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>>
>> BR,
>> Jeff (Jiufu)
>>
>>
>> Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>
>>> Gentle ping:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>>>
>>> BR,
>>> Jeff (Jiufu)
>>>
>>> Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>>
>>>> Ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>>>>
>>>> BR,
>>>> Jeff(Jiufu)
>>>>
>>>>
>>>> Jiufu Guo <guojiufu@linux.ibm.com> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> When checking eq/ne against a constant that is a rotation of a 16-bit
>>>>> immediate, the check can be done on the rotated data instead. This
>>>>> avoids building the full constant.
>>>>>
>>>>> As the example in PR103743:
>>>>> For "in == 0x8000000000000000LL", this patch generates:
>>>>>   rotldi %r3,%r3,16
>>>>>   cmpldi %cr0,%r3,32768
>>>>> instead of:
>>>>>   li %r9,-1
>>>>>   rldicr %r9,%r9,0,0
>>>>>   cmpd %cr0,%r3,%r9
>>>>>
>>>>> Compared with the previous patches:
>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600385.html
>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600198.html
>>>>>
>>>>> This patch relaxes the condition on can_create_pseudo_p and adds
>>>>> clobbers so that the splitter can run both before and after RA.
>>>>>
>>>>> This is an updated patch based on the previous patch and comments:
>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600315.html
>>>>>
>>>>> This patch passes bootstrap and regtest on ppc64 and ppc64le.
>>>>> Is it ok for trunk? Thanks for comments!
>>>>> >>>>> BR, >>>>> Jeff(Jiufu) >>>>> >>>>> >>>>> PR target/103743 >>>>> >>>>> gcc/ChangeLog: >>>>> >>>>> * config/rs6000/rs6000-protos.h (rotate_from_leading_zeros_const): New. >>>>> (compare_rotate_immediate_p): New. >>>>> * config/rs6000/rs6000.cc (rotate_from_leading_zeros_const): New >>>>> definition. >>>>> (compare_rotate_immediate_p): New definition. >>>>> * config/rs6000/rs6000.md (EQNE): New code_attr. >>>>> (*rotate_on_cmpdi): New define_insn_and_split. >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> >>>>> * gcc.target/powerpc/pr103743.c: New test. >>>>> * gcc.target/powerpc/pr103743_1.c: New test. >>>>> >>>>> --- >>>>> gcc/config/rs6000/rs6000-protos.h | 2 + >>>>> gcc/config/rs6000/rs6000.cc | 41 ++++++++ >>>>> gcc/config/rs6000/rs6000.md | 62 +++++++++++- >>>>> gcc/testsuite/gcc.target/powerpc/pr103743.c | 52 ++++++++++ >>>>> gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++++++++++++++++++ >>>>> 5 files changed, 251 insertions(+), 1 deletion(-) >>>>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c >>>>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c >>>>> >>>>> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h >>>>> index b3c16e7448d..78847e6b3db 100644 >>>>> --- a/gcc/config/rs6000/rs6000-protos.h >>>>> +++ b/gcc/config/rs6000/rs6000-protos.h >>>>> @@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); >>>>> extern int vspltis_shifted (rtx); >>>>> extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); >>>>> extern bool macho_lo_sum_memory_operand (rtx, machine_mode); >>>>> +extern int rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT, int); >>>>> +extern bool compare_rotate_immediate_p (unsigned HOST_WIDE_INT); >>>>> extern int num_insns_constant (rtx, machine_mode); >>>>> extern int small_data_operand (rtx, machine_mode); >>>>> extern bool mem_operand_gpr (rtx, machine_mode); >>>>> diff --git a/gcc/config/rs6000/rs6000.cc 
b/gcc/config/rs6000/rs6000.cc >>>>> index df491bee2ea..a548db42660 100644 >>>>> --- a/gcc/config/rs6000/rs6000.cc >>>>> +++ b/gcc/config/rs6000/rs6000.cc >>>>> @@ -14797,6 +14797,47 @@ rs6000_reverse_condition (machine_mode mode, enum rtx_code code) >>>>> return reverse_condition (code); >>>>> } >>>>> >>>>> +/* Check if C can be rotated from an immediate which starts (as 64bit integer) >>>>> + with at least CLZ bits zero. >>>>> + >>>>> + Return the number by which C can be rotated from the immediate. >>>>> + Return -1 if C can not be rotated as from. */ >>>>> + >>>>> +int >>>>> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz) >>>>> +{ >>>>> + /* case a. 0..0xxx: already at least clz zeros. */ >>>>> + int lz = clz_hwi (c); >>>>> + if (lz >= clz) >>>>> + return 0; >>>>> + >>>>> + /* case b. 0..0xxx0..0: at least clz zeros. */ >>>>> + int tz = ctz_hwi (c); >>>>> + if (lz + tz >= clz) >>>>> + return tz; >>>>> + >>>>> + /* case c. xx10.....0xx: rotate 'clz + 1' bits firstly, then check case b. >>>>> + ^bit -> Vbit >>>>> + 00...00xxx100, 'clz + 1' >= bits of xxxx. */ >>>>> + const int rot_bits = HOST_BITS_PER_WIDE_INT - clz + 1; >>>>> + unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1)); >>>>> + tz = ctz_hwi (rc); >>>>> + if (clz_hwi (rc) + tz >= clz) >>>>> + return tz + rot_bits; >>>>> + >>>>> + return -1; >>>>> +} >>>>> + >>>>> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi. */ >>>>> + >>>>> +bool >>>>> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c) >>>>> +{ >>>>> + /* leading 48 zeros (cmpldi), or leading 49 ones (cmpdi). */ >>>>> + return rotate_from_leading_zeros_const (~c, 49) > 0 >>>>> + || rotate_from_leading_zeros_const (c, 48) > 0; >>>>> +} >>>>> + >>>>> /* Generate a compare for CODE. Return a brand-new rtx that >>>>> represents the result of the compare. 
*/ >>>>> >>>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >>>>> index e9e5cd1e54d..cad3cfc98cd 100644 >>>>> --- a/gcc/config/rs6000/rs6000.md >>>>> +++ b/gcc/config/rs6000/rs6000.md >>>>> @@ -7766,6 +7766,67 @@ (define_insn "*movsi_from_df" >>>>> "xscvdpsp %x0,%x1" >>>>> [(set_attr "type" "fp")]) >>>>> >>>>> + >>>>> +(define_code_iterator eqne [eq ne]) >>>>> +(define_code_attr EQNE [(eq "EQ") (ne "NE")]) >>>>> + >>>>> +;; "i == C" ==> "rotl(i,N) == rotl(C,N)" >>>>> +(define_insn_and_split "*rotate_on_cmpdi" >>>>> + [(set (pc) >>>>> + (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r") >>>>> + (match_operand:DI 2 "const_int_operand" "n")) >>>>> + (label_ref (match_operand 0 "")) >>>>> + (pc))) >>>>> + (clobber (match_scratch:DI 3 "=r")) >>>>> + (clobber (match_scratch:CCUNS 4 "=y"))] >>>>> + "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1 >>>>> + && compare_rotate_immediate_p (UINTVAL (operands[2]))" >>>>> + "#" >>>>> + "&& 1" >>>>> + [(pc)] >>>>> +{ >>>>> + rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0); >>>>> + bool sgn = false; >>>>> + unsigned HOST_WIDE_INT C = INTVAL (operands[2]); >>>>> + int rot = rotate_from_leading_zeros_const (C, 48); >>>>> + if (rot < 0) >>>>> + { >>>>> + sgn = true; >>>>> + rot = rotate_from_leading_zeros_const (~C, 49); >>>>> + } >>>>> + rtx n = GEN_INT (HOST_BITS_PER_WIDE_INT - rot); >>>>> + >>>>> + /* i' = rotl (i, n) */ >>>>> + rtx op0 = can_create_pseudo_p () ? gen_reg_rtx (DImode) : operands[3]; >>>>> + emit_insn (gen_rtx_SET (op0, gen_rtx_ROTATE (DImode, operands[1], n))); >>>>> + >>>>> + /* C' = rotl (C, n) */ >>>>> + rtx op1 = GEN_INT ((C << INTVAL (n)) | (C >> rot)); >>>>> + >>>>> + /* i' == C' */ >>>>> + machine_mode comp_mode = sgn ? CCmode : CCUNSmode; >>>>> + rtx cc = can_create_pseudo_p () ? 
gen_reg_rtx (comp_mode) : operands[4]; >>>>> + PUT_MODE (cc, comp_mode); >>>>> + emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (comp_mode, op0, op1))); >>>>> + rtx cmp = gen_rtx_<EQNE> (CCmode, cc, const0_rtx); >>>>> + rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[0]); >>>>> + emit_jump_insn (gen_rtx_SET (pc_rtx, >>>>> + gen_rtx_IF_THEN_ELSE (VOIDmode, cmp, >>>>> + loc_ref, pc_rtx))); >>>>> + >>>>> + /* keep the probability info for the prediction of the branch insn. */ >>>>> + if (note) >>>>> + { >>>>> + profile_probability prob >>>>> + = profile_probability::from_reg_br_prob_note (XINT (note, 0)); >>>>> + >>>>> + add_reg_br_prob_note (get_last_insn (), prob); >>>>> + } >>>>> + >>>>> + DONE; >>>>> +} >>>>> +) >>>>> + >>>>> ;; Split a load of a large constant into the appropriate two-insn >>>>> ;; sequence. >>>>> >>>>> @@ -13472,7 +13533,6 @@ (define_expand "@ctr<mode>" >>>>> ;; rs6000_legitimate_combined_insn prevents combine creating any of >>>>> ;; the ctr<mode> insns. >>>>> >>>>> -(define_code_iterator eqne [eq ne]) >>>>> (define_code_attr bd [(eq "bdz") (ne "bdnz")]) >>>>> (define_code_attr bd_neg [(eq "bdnz") (ne "bdz")]) >>>>> >>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743.c b/gcc/testsuite/gcc.target/powerpc/pr103743.c >>>>> new file mode 100644 >>>>> index 00000000000..abb876ed79e >>>>> --- /dev/null >>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c >>>>> @@ -0,0 +1,52 @@ >>>>> +/* { dg-options "-O2" } */ >>>>> +/* { dg-do compile { target has_arch_ppc64 } } */ >>>>> + >>>>> +/* { dg-final { scan-assembler-times {\mcmpldi\M} 10 } } */ >>>>> +/* { dg-final { scan-assembler-times {\mcmpdi\M} 4 } } */ >>>>> +/* { dg-final { scan-assembler-times {\mrotldi\M} 14 } } */ >>>>> + >>>>> +int foo (int a); >>>>> + >>>>> +int __attribute__ ((noinline)) udi_fun (unsigned long long in) >>>>> +{ >>>>> + if (in == (0x8642000000000000ULL)) >>>>> + return foo (1); >>>>> + if (in == (0x7642000000000000ULL)) >>>>> + return foo (12); >>>>> + if (in == 
(0x8000000000000000ULL)) >>>>> + return foo (32); >>>>> + if (in == (0x8000000000000001ULL)) >>>>> + return foo (33); >>>>> + if (in == (0x8642FFFFFFFFFFFFULL)) >>>>> + return foo (46); >>>>> + if (in == (0x7642FFFFFFFFFFFFULL)) >>>>> + return foo (51); >>>>> + if (in == (0x7567000000ULL)) >>>>> + return foo (9); >>>>> + if (in == (0xFFF8567FFFFFFFFFULL)) >>>>> + return foo (19); >>>>> + >>>>> + return 0; >>>>> +} >>>>> + >>>>> +int __attribute__ ((noinline)) di_fun (long long in) >>>>> +{ >>>>> + if (in == (0x8642000000000000LL)) >>>>> + return foo (1); >>>>> + if (in == (0x7642000000000000LL)) >>>>> + return foo (12); >>>>> + if (in == (0x8000000000000000LL)) >>>>> + return foo (32); >>>>> + if (in == (0x8000000000000001LL)) >>>>> + return foo (33); >>>>> + if (in == (0x8642FFFFFFFFFFFFLL)) >>>>> + return foo (46); >>>>> + if (in == (0x7642FFFFFFFFFFFFLL)) >>>>> + return foo (51); >>>>> + if (in == (0x7567000000LL)) >>>>> + return foo (9); >>>>> + if (in == (0xFFF8567FFFFFFFFFLL)) >>>>> + return foo (19); >>>>> + >>>>> + return 0; >>>>> +} >>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743_1.c b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c >>>>> new file mode 100644 >>>>> index 00000000000..2c08c56714a >>>>> --- /dev/null >>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c >>>>> @@ -0,0 +1,95 @@ >>>>> +/* { dg-do run } */ >>>>> +/* { dg-options "-O2 -std=c99" } */ >>>>> + >>>>> +int >>>>> +foo (int a) >>>>> +{ >>>>> + return a + 6; >>>>> +} >>>>> + >>>>> +int __attribute__ ((noinline)) udi_fun (unsigned long long in) >>>>> +{ >>>>> + if (in == (0x8642000000000000ULL)) >>>>> + return foo (1); >>>>> + if (in == (0x7642000000000000ULL)) >>>>> + return foo (12); >>>>> + if (in == (0x8000000000000000ULL)) >>>>> + return foo (32); >>>>> + if (in == (0x8000000000000001ULL)) >>>>> + return foo (33); >>>>> + if (in == (0x8642FFFFFFFFFFFFULL)) >>>>> + return foo (46); >>>>> + if (in == (0x7642FFFFFFFFFFFFULL)) >>>>> + return foo (51); >>>>> + if (in == 
(0x7567000000ULL)) >>>>> + return foo (9); >>>>> + if (in == (0xFFF8567FFFFFFFFFULL)) >>>>> + return foo (19); >>>>> + >>>>> + return 0; >>>>> +} >>>>> + >>>>> +int __attribute__ ((noinline)) di_fun (long long in) >>>>> +{ >>>>> + if (in == (0x8642000000000000LL)) >>>>> + return foo (1); >>>>> + if (in == (0x7642000000000000LL)) >>>>> + return foo (12); >>>>> + if (in == (0x8000000000000000LL)) >>>>> + return foo (32); >>>>> + if (in == (0x8000000000000001LL)) >>>>> + return foo (33); >>>>> + if (in == (0x8642FFFFFFFFFFFFLL)) >>>>> + return foo (46); >>>>> + if (in == (0x7642FFFFFFFFFFFFLL)) >>>>> + return foo (51); >>>>> + return 0; >>>>> +} >>>>> + >>>>> +int >>>>> +main () >>>>> +{ >>>>> + int e = 0; >>>>> + if (udi_fun (6) != 0) >>>>> + e++; >>>>> + if (udi_fun (0x8642000000000000ULL) != foo (1)) >>>>> + e++; >>>>> + if (udi_fun (0x7642000000000000ULL) != foo (12)) >>>>> + e++; >>>>> + if (udi_fun (0x8000000000000000ULL) != foo (32)) >>>>> + e++; >>>>> + if (udi_fun (0x8000000000000001ULL) != foo (33)) >>>>> + e++; >>>>> + if (udi_fun (0x8642FFFFFFFFFFFFULL) != foo (46)) >>>>> + e++; >>>>> + if (udi_fun (0x7642FFFFFFFFFFFFULL) != foo (51)) >>>>> + e++; >>>>> + if (udi_fun (0x7567000000ULL) != foo (9)) >>>>> + e++; >>>>> + if (udi_fun (0xFFF8567FFFFFFFFFULL) != foo (19)) >>>>> + e++; >>>>> + >>>>> + if (di_fun (6) != 0) >>>>> + e++; >>>>> + if (di_fun (0x8642000000000000LL) != foo (1)) >>>>> + e++; >>>>> + if (di_fun (0x7642000000000000LL) != foo (12)) >>>>> + e++; >>>>> + if (di_fun (0x8000000000000000LL) != foo (32)) >>>>> + e++; >>>>> + if (di_fun (0x8000000000000001LL) != foo (33)) >>>>> + e++; >>>>> + if (di_fun (0x8642FFFFFFFFFFFFLL) != foo (46)) >>>>> + e++; >>>>> + if (di_fun (0x7642FFFFFFFFFFFFLL) != foo (51)) >>>>> + e++; >>>>> + if (udi_fun (0x7567000000LL) != foo (9)) >>>>> + e++; >>>>> + if (udi_fun (0xFFF8567FFFFFFFFFLL) != foo (19)) >>>>> + e++; >>>>> + >>>>> + if (e) >>>>> + __builtin_abort (); >>>>> + return 0; >>>>> +} >>>>> +
Hi!

Sorry for the tardiness.

On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
> When checking eq/ne with a constant which has only 16bits, it can be
> optimized to check the rotated data. By this, the constant building
> is optimized.
>
> As the example in PR103743:
> For "in == 0x8000000000000000LL", this patch generates:
>         rotldi %r3,%r3,16
>         cmpldi %cr0,%r3,32768
> instead:
>         li %r9,-1
>         rldicr %r9,%r9,0,0
>         cmpd %cr0,%r3,%r9

FWIW, I find the winnt assembler syntax very hard to read, and I doubt
I am the only one.

So you're doing
  rotldi 3,3,16 ; cmpldi 3,0x8000
instead of
  li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9

> +/* Check if C can be rotated from an immediate which starts (as 64bit integer)
> +   with at least CLZ bits zero.
> +
> +   Return the number by which C can be rotated from the immediate.
> +   Return -1 if C can not be rotated as from. */
> +
> +int
> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)

The name does not say what the function does. Can you think of a better
name?

Maybe it is better to not return magic values anyway? So perhaps

bool
can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int *rot)

(with *rot written if the return value is true).

> +  /* case c. xx10.....0xx: rotate 'clz + 1' bits firstly, then check case b.

s/firstly/first/

> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi. */
> +
> +bool
> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)

No _p please, this function is not a predicate (at least, the name does
not say what it tests). So a better name please. This matters even
more for extern functions (like this one) because the function
implementation is always farther away so you do not easily have all
interface details in mind. Good names help :-)

> +(define_code_iterator eqne [eq ne])
> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])

Just <CODE> or <CODE:eqne> should work?

Please fix these things. Almost there :-)


Segher
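
As a side note for readers of the thread: the two-instruction sequence above works because a rotate is a bijection on 64-bit values, so "i == C" holds exactly when "rotl (i, n) == rotl (C, n)". Below is a minimal sketch of that identity in plain C. The helper names rotl64, eq_via_rotate and check_example are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits, for 0 < N < 64.
   Hypothetical helper, not taken from the patch.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  return (x << n) | (x >> (64 - n));
}

/* Evaluate "i == c" as "rotl (i, n) == rotl (c, n)".  Rotation only
   permutes bit positions, so both comparisons always agree.  */
static int
eq_via_rotate (uint64_t i, uint64_t c, unsigned n)
{
  return rotl64 (i, n) == rotl64 (c, n);
}

/* The PR103743 constant: rotating left by 16 moves the single set bit
   into the low 16 bits, which is the range of an unsigned 16-bit
   compare immediate.  */
static void
check_example (void)
{
  const uint64_t c = 0x8000000000000000ULL;

  assert (rotl64 (c, 16) == 0x8000);
  assert (eq_via_rotate (c, c, 16));
  assert (!eq_via_rotate (c + 1, c, 16));
}
```

Calling check_example () from a test driver exercises the constant from the PR; any other rotate count that brings all set bits into the low 16 bits would serve equally well.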
Hi,

Segher Boessenkool <segher@kernel.crashing.org> writes:

> Hi!
>
> Sorry for the tardiness.
>
> On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
>> When checking eq/ne with a constant which has only 16bits, it can be
>> optimized to check the rotated data. By this, the constant building
>> is optimized.
>>
>> As the example in PR103743:
>> For "in == 0x8000000000000000LL", this patch generates:
>>         rotldi %r3,%r3,16
>>         cmpldi %cr0,%r3,32768
>> instead:
>>         li %r9,-1
>>         rldicr %r9,%r9,0,0
>>         cmpd %cr0,%r3,%r9
>
> FWIW, I find the winnt assembler syntax very hard to read, and I doubt
> I am the only one.
Oh, sorry about that. I will avoid adding '-mregnames' when dumping asm. :)
BTW, what options do you use to dump asm code?
>
> So you're doing
>   rotldi 3,3,16 ; cmpldi 3,0x8000
> instead of
>   li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9
>
>> +/* Check if C can be rotated from an immediate which starts (as 64bit integer)
>> +   with at least CLZ bits zero.
>> +
>> +   Return the number by which C can be rotated from the immediate.
>> +   Return -1 if C can not be rotated as from. */
>> +
>> +int
>> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)
>
> The name does not say what the function does. Can you think of a better
> name?
>
> Maybe it is better to not return magic values anyway? So perhaps
>
> bool
> can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int *rot)
>
> (with *rot written if the return value is true).
Thanks for your suggestion!
The function checks whether a constant can be rotated from/to a value
which has only a few trailing nonzero bits (all leading bits are zeros).

So I'm thinking of naming the function something like
can_be_rotated_to_lowbits.

>
>> +  /* case c. xx10.....0xx: rotate 'clz + 1' bits firstly, then check case b.
>
> s/firstly/first/
Thanks!
>
>> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi. */
>> +
>> +bool
>> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)
>
> No _p please, this function is not a predicate (at least, the name does
> not say what it tests). So a better name please. This matters even
> more for extern functions (like this one) because the function
> implementation is always farther away so you do not easily have all
> interface details in mind. Good names help :-)
Thanks! Naming always matters. :)

Maybe we can name this function "can_be_rotated_as_compare_operand"
or "is_constant_rotateable_for_compare", because it checks
"if a constant can be rotated to/from an immediate operand of
cmpdi/cmpldi".

>
>> +(define_code_iterator eqne [eq ne])
>> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])
>
> Just <CODE> or <CODE:eqne> should work?
Great! Thanks for pointing this out! <eqne:CODE> works.
>
> Please fix these things. Almost there :-)

I updated the patch as below. Bootstrapping and regtesting are ongoing.
Thanks again for your careful and insightful review!


BR,
Jeff (Jiufu)

----------
When checking eq/ne against a constant which has only 16 bits, the check
can be optimized to compare the rotated data instead. This avoids the
cost of building the constant.

As the example in PR103743:
For "in == 0x8000000000000000LL", this patch generates:
        rotldi 3,3,1 ; cmpldi 0,3,1
instead of:
        li 9,-1 ; rldicr 9,9,0,0 ; cmpd 0,3,9

Compared with the previous version, this patch refactors the code
according to the review comments, e.g. updating function names,
comments, and code.


        PR target/103743

gcc/ChangeLog:

        * config/rs6000/rs6000-protos.h (can_be_rotated_to_lowbits): New.
        (can_be_rotated_as_compare_operand): New.
        * config/rs6000/rs6000.cc (can_be_rotated_to_lowbits): New definition.
        (can_be_rotated_as_compare_operand): New definition.
        * config/rs6000/rs6000.md (*rotate_on_cmpdi): New define_insn_and_split.

gcc/testsuite/ChangeLog:

        * gcc.target/powerpc/pr103743.c: New test.
        * gcc.target/powerpc/pr103743_1.c: New test.

---
 gcc/config/rs6000/rs6000-protos.h             |  2 +
 gcc/config/rs6000/rs6000.cc                   | 56 +++++++++++
 gcc/config/rs6000/rs6000.md                   | 63 +++++++++++-
 gcc/testsuite/gcc.target/powerpc/pr103743.c   | 52 ++++++++++
 gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++++++++++++++++++
 5 files changed, 267 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index d0d89320ef6..9626917e359 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
+extern bool can_be_rotated_to_lowbits (unsigned HOST_WIDE_INT, int, int *);
+extern bool can_be_rotated_as_compare_operand (unsigned HOST_WIDE_INT);
 extern int num_insns_constant (rtx, machine_mode);
 extern int small_data_operand (rtx, machine_mode);
 extern bool mem_operand_gpr (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index b3a609f3aa3..2b0df8479f0 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -14925,6 +14925,62 @@ rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
   return reverse_condition (code);
 }
 
+/* Check if C (as 64bit integer) can be rotated to a constant which contains
+   nonzero bits at LOWBITS only.
+
+   Return true if C can be rotated to such constant. And *ROT is written to
+   the number by which C is rotated.
+   Return false otherwise. */
+
+bool
+can_be_rotated_to_lowbits (unsigned HOST_WIDE_INT c, int lowbits, int *rot)
+{
+  int clz = HOST_BITS_PER_WIDE_INT - lowbits;
+
+  /* case a. 0..0xxx: already at least clz zeros. */
+  int lz = clz_hwi (c);
+  if (lz >= clz)
+    {
+      *rot = 0;
+      return true;
+    }
+
+  /* case b. 0..0xxx0..0: at least clz zeros. */
+  int tz = ctz_hwi (c);
+  if (lz + tz >= clz)
+    {
+      *rot = HOST_BITS_PER_WIDE_INT - tz;
+      return true;
+    }
+
+  /* case c. xx10.....0xx: rotate 'clz - 1' bits first, then check case b.
+     ^bit -> Vbit, then zeros are at head or tail.
+     00...00xxx100, 'clz - 1' >= 'bits of xxxx'. */
+  const int rot_bits = lowbits + 1;
+  unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1));
+  tz = ctz_hwi (rc);
+  if (clz_hwi (rc) + tz >= clz)
+    {
+      *rot = HOST_BITS_PER_WIDE_INT - (tz + rot_bits);
+      return true;
+    }
+
+  return false;
+}
+
+/* Check if C can be rotated to an immediate operand of cmpdi or cmpldi. */
+
+bool
+can_be_rotated_as_compare_operand (unsigned HOST_WIDE_INT c)
+{
+  int rot = 0;
+  /* leading 48 zeros + 16 lowbits (cmpldi),
+     or leading 49 ones + 15 lowbits (cmpdi). */
+  bool res = can_be_rotated_to_lowbits (~c, 15, &rot)
+             || can_be_rotated_to_lowbits (c, 16, &rot);
+  return res && rot > 0;
+}
+
 /* Generate a compare for CODE. Return a brand-new rtx that
    represents the result of the compare. */
 
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 4bd1dfd3da9..dabfe5dfdfd 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7765,6 +7765,68 @@ (define_insn "*movsi_from_df"
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
+
+(define_code_iterator eqne [eq ne])
+
+;; "i == C" ==> "rotl(i,N) == rotl(C,N)"
+(define_insn_and_split "*rotate_on_cmpdi"
+  [(set (pc)
+     (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r")
+                         (match_operand:DI 2 "const_int_operand" "n"))
+                   (label_ref (match_operand 0 ""))
+                   (pc)))
+  (clobber (match_scratch:DI 3 "=r"))
+  (clobber (match_scratch:CCUNS 4 "=y"))]
+  "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1
+   && can_be_rotated_as_compare_operand (UINTVAL (operands[2]))"
+  "#"
+  "&& 1"
+  [(pc)]
+{
+  rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0);
+  bool sgn = false;
+  unsigned HOST_WIDE_INT C = INTVAL (operands[2]);
+  int rot;
+  if (!can_be_rotated_to_lowbits (C, 16, &rot))
+    {
+      /* cmpdi */
+      sgn = true;
+      bool res = can_be_rotated_to_lowbits (~C, 15, &rot);
+      gcc_assert (res);
+    }
+  rtx n = GEN_INT (rot);
+
+  /* i' = rotl (i, n) */
+  rtx op0 = can_create_pseudo_p () ? gen_reg_rtx (DImode) : operands[3];
+  emit_insn (gen_rtx_SET (op0, gen_rtx_ROTATE (DImode, operands[1], n)));
+
+  /* C' = rotl (C, n) */
+  rtx op1 = GEN_INT ((C << rot) | (C >> (HOST_BITS_PER_WIDE_INT - rot)));
+
+  /* i' == C' */
+  machine_mode comp_mode = sgn ? CCmode : CCUNSmode;
+  rtx cc = can_create_pseudo_p () ? gen_reg_rtx (comp_mode) : operands[4];
+  PUT_MODE (cc, comp_mode);
+  emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (comp_mode, op0, op1)));
+  rtx cmp = gen_rtx_<eqne:CODE> (CCmode, cc, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[0]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+                               gen_rtx_IF_THEN_ELSE (VOIDmode, cmp,
+                                                     loc_ref, pc_rtx)));
+
+  /* keep the probability info for the prediction of the branch insn. */
+  if (note)
+    {
+      profile_probability prob
+        = profile_probability::from_reg_br_prob_note (XINT (note, 0));
+
+      add_reg_br_prob_note (get_last_insn (), prob);
+    }
+
+  DONE;
+}
+)
+
 ;; Split a load of a large constant into the appropriate two-insn
 ;; sequence.
 
@@ -13453,7 +13515,6 @@ (define_expand "@ctr<mode>"
 ;; rs6000_legitimate_combined_insn prevents combine creating any of
 ;; the ctr<mode> insns.
 
-(define_code_iterator eqne [eq ne])
 (define_code_attr bd [(eq "bdz") (ne "bdnz")])
 (define_code_attr bd_neg [(eq "bdnz") (ne "bdz")])
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743.c b/gcc/testsuite/gcc.target/powerpc/pr103743.c
new file mode 100644
index 00000000000..41c686bb4cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c
@@ -0,0 +1,52 @@
+/* { dg-options "-O2" } */
+/* { dg-do compile { target has_arch_ppc64 } } */
+
+/* { dg-final { scan-assembler-times {\mcmpldi\M} 10 } } */
+/* { dg-final { scan-assembler-times {\mcmpdi\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mrotldi\M} 14 } } */
+
+int foo (int a);
+
+int __attribute__ ((noinline)) udi_fun (unsigned long long in)
+{
+  if (in == (0x8642000000000000ULL))
+    return foo (1);
+  if (in == (0x7642000000000000ULL))
+    return foo (12);
+  if (in == (0x8000000000000000ULL))
+    return foo (32);
+  if (in == (0x8700000000000091ULL))
+    return foo (33);
+  if (in == (0x8642FFFFFFFFFFFFULL))
+    return foo (46);
+  if (in == (0x7642FFFFFFFFFFFFULL))
+    return foo (51);
+  if (in == (0x7567000000ULL))
+    return foo (9);
+  if (in == (0xFFF8567FFFFFFFFFULL))
+    return foo (19);
+
+  return 0;
+}
+
+int __attribute__ ((noinline)) di_fun (long long in)
+{
+  if (in == (0x8642000000000000LL))
+    return foo (1);
+  if (in == (0x7642000000000000LL))
+    return foo (12);
+  if (in == (0x8000000000000000LL))
+    return foo (32);
+  if (in == (0x8700000000000091LL))
+    return foo (33);
+  if (in == (0x8642FFFFFFFFFFFFLL))
+    return foo (46);
+  if (in == (0x7642FFFFFFFFFFFFLL))
+    return foo (51);
+  if (in == (0x7567000000LL))
+    return foo (9);
+  if (in == (0xFFF8567FFFFFFFFFLL))
+    return foo (19);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743_1.c b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c
new file mode 100644
index 00000000000..e128aae7574
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c
@@ -0,0 +1,95 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -std=c99" } */
+
+int
+foo (int a)
+{
+  return a + 6;
+}
+
+int __attribute__ ((noipa)) udi_fun (unsigned long long in)
+{
+  if (in == (0x8642000000000000ULL))
+    return foo (1);
+  if (in == (0x7642000000000000ULL))
+    return foo (12);
+  if (in == (0x8000000000000000ULL))
+    return foo (32);
+  if (in == (0x8700000000000091ULL))
+    return foo (33);
+  if (in == (0x8642FFFFFFFFFFFFULL))
+    return foo (46);
+  if (in == (0x7642FFFFFFFFFFFFULL))
+    return foo (51);
+  if (in == (0x7567000000ULL))
+    return foo (9);
+  if (in == (0xFFF8567FFFFFFFFFULL))
+    return foo (19);
+
+  return 0;
+}
+
+int __attribute__ ((noipa)) di_fun (long long in)
+{
+  if (in == (0x8642000000000000LL))
+    return foo (1);
+  if (in == (0x7642000000000000LL))
+    return foo (12);
+  if (in == (0x8000000000000000LL))
+    return foo (32);
+  if (in == (0x8700000000000091LL))
+    return foo (33);
+  if (in == (0x8642FFFFFFFFFFFFLL))
+    return foo (46);
+  if (in == (0x7642FFFFFFFFFFFFLL))
+    return foo (51);
+  return 0;
+}
+
+int
+main ()
+{
+  int e = 0;
+  if (udi_fun (6) != 0)
+    e++;
+  if (udi_fun (0x8642000000000000ULL) != foo (1))
+    e++;
+  if (udi_fun (0x7642000000000000ULL) != foo (12))
+    e++;
+  if (udi_fun (0x8000000000000000ULL) != foo (32))
+    e++;
+  if (udi_fun (0x8700000000000091ULL) != foo (33))
+    e++;
+  if (udi_fun (0x8642FFFFFFFFFFFFULL) != foo (46))
+    e++;
+  if (udi_fun (0x7642FFFFFFFFFFFFULL) != foo (51))
+    e++;
+  if (udi_fun (0x7567000000ULL) != foo (9))
+    e++;
+  if (udi_fun (0xFFF8567FFFFFFFFFULL) != foo (19))
+    e++;
+
+  if (di_fun (6) != 0)
+    e++;
+  if (di_fun (0x8642000000000000LL) != foo (1))
+    e++;
+  if (di_fun (0x7642000000000000LL) != foo (12))
+    e++;
+  if (di_fun (0x8000000000000000LL) != foo (32))
+    e++;
+  if (di_fun (0x8700000000000091LL) != foo (33))
+    e++;
+  if (di_fun (0x8642FFFFFFFFFFFFLL) != foo (46))
+    e++;
+  if (di_fun (0x7642FFFFFFFFFFFFLL) != foo (51))
+    e++;
+  if (udi_fun (0x7567000000LL) != foo (9))
+    e++;
+  if (udi_fun (0xFFF8567FFFFFFFFFLL) != foo (19))
+    e++;
+
+  if (e)
+    __builtin_abort ();
+  return 0;
+}
+
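
For readers who want to try the three cases of can_be_rotated_to_lowbits outside of GCC, here is a rough standalone sketch in plain C. It is an approximation under stated assumptions, not the patch itself: clz64 and ctz64 stand in for GCC's clz_hwi and ctz_hwi (assumed here to return 64 for a zero input), uint64_t replaces unsigned HOST_WIDE_INT, and the names rot_to_lowbits, rotl64 and check_constants are invented for this example. It relies on the GCC/Clang builtins __builtin_clzll and __builtin_ctzll.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for clz_hwi/ctz_hwi; assumed to yield 64 for zero.  */
static int clz64 (uint64_t x) { return x ? __builtin_clzll (x) : 64; }
static int ctz64 (uint64_t x) { return x ? __builtin_ctzll (x) : 64; }

/* Rotate X left by N bits (N taken modulo 64).  */
static uint64_t
rotl64 (uint64_t x, int n)
{
  n &= 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Sketch of the patch's logic: return true and set *ROT if rotating C
   left by *ROT leaves nonzero bits only in the low LOWBITS bits.  */
static bool
rot_to_lowbits (uint64_t c, int lowbits, int *rot)
{
  int clz = 64 - lowbits;

  /* case a: enough leading zeros already.  */
  int lz = clz64 (c);
  if (lz >= clz)
    {
      *rot = 0;
      return true;
    }

  /* case b: leading plus trailing zeros are enough.  */
  int tz = ctz64 (c);
  if (lz + tz >= clz)
    {
      *rot = 64 - tz;
      return true;
    }

  /* case c: nonzero bits wrap around both ends; rotate right by
     lowbits + 1 so the zero run can reach an end, then redo case b.  */
  const int rot_bits = lowbits + 1;
  uint64_t rc = (c >> rot_bits) | (c << (clz - 1));
  tz = ctz64 (rc);
  if (clz64 (rc) + tz >= clz)
    {
      *rot = 64 - (tz + rot_bits);
      return true;
    }

  return false;
}

/* Check a few constants from the pr103743 tests.  */
static void
check_constants (void)
{
  int rot = -1;

  /* case b: a single high bit rotates to 1.  */
  assert (rot_to_lowbits (0x8000000000000000ULL, 16, &rot) && rot == 1);
  assert (rotl64 (0x8000000000000000ULL, rot) == 1);

  /* case c: bits at both ends rotate to 0x9187.  */
  assert (rot_to_lowbits (0x8700000000000091ULL, 16, &rot) && rot == 8);
  assert (rotl64 (0x8700000000000091ULL, rot) == 0x9187);
}
```

For constants with a long run of ones, such as 0x8642FFFFFFFFFFFF, the same routine is applied to the complement with lowbits set to 15, which corresponds to the signed immediate form (cmpdi) in the patch.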
Hi, Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > Hi, > > Segher Boessenkool <segher@kernel.crashing.org> writes: > >> Hi! >> >> Sorry for the tardiness. >> >> On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote: >>> When checking eq/ne with a constant which has only 16bits, it can be >>> optimized to check the rotated data. By this, the constant building >>> is optimized. >>> >>> As the example in PR103743: >>> For "in == 0x8000000000000000LL", this patch generates: >>> rotldi %r3,%r3,16 >>> cmpldi %cr0,%r3,32768 >>> instead: >>> li %r9,-1 >>> rldicr %r9,%r9,0,0 >>> cmpd %cr0,%r3,%r9 >> >> FWIW, I find the winnt assembler syntax very hard to read, and I doubt >> I am the only one. > Oh, sorry about that. I will avoid to add '-mregnames' to dump asm. :) > BTW, what options are you used to dump asm code? >> >> So you're doing >> rotldi 3,3,16 ; cmpldi 3,0x8000 >> instead of >> li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9 >> >>> +/* Check if C can be rotated from an immediate which starts (as 64bit integer) >>> + with at least CLZ bits zero. >>> + >>> + Return the number by which C can be rotated from the immediate. >>> + Return -1 if C can not be rotated as from. */ >>> + >>> +int >>> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz) >> >> The name does not say what the function does. Can you think of a better >> name? >> >> Maybe it is better to not return magic values anyway? So perhaps >> >> bool >> can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int *rot) >> >> (with *rot written if the return value is true). > Thanks for your suggestion! > It is checking if a constant can be rotated from/to a value which has > only few tailing nonzero bits (all leading bits are zeros). > > So, I'm thinking to name the function as something like: > can_be_rotated_to_lowbits. > >> >>> + /* case c. xx10.....0xx: rotate 'clz + 1' bits firstly, then check case b. >> >> s/firstly/first/ > Thanks! 
>> >>> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi. */ >>> + >>> +bool >>> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c) >> >> No _p please, this function is not a predicate (at least, the name does >> not say what it tests). So a better name please. This matters even >> more for extern functions (like this one) because the function >> implementation is always farther away so you do not easily have all >> interface details in mind. Good names help :-) > Thanks! Name is always a matter. :) > > Maybe we can name this funciton as "can_be_rotated_as_compare_operand", > or "is_constant_rotateable_for_compare", because this function checks > "if a constant can be rotated to/from an immediate operand of > cmpdi/cmpldi". > >> >>> +(define_code_iterator eqne [eq ne]) >>> +(define_code_attr EQNE [(eq "EQ") (ne "NE")]) >> >> Just <CODE> or <CODE:eqne> should work? > Great! Thanks for point out this! <eqne:CODE> works. >> >> Please fix these things. Almost there :-) > > I updated the patch as below. Bootstraping and regtesting is ongoing. > Thanks again for your careful and insight review! Bootstrap and regtests pass on ppc64{,le}. BR, Jeff (Jiufu) > > > BR, > Jeff (Jiufu) > > ---------- > When checking eq/ne with a constant which has only 16bits, it can be > optimized to check the rotated data. By this, the constant building > is optimized. > > As the example in PR103743: > For "in == 0x8000000000000000LL", this patch generates: > rotldi 3,3,1 ; cmpldi 0,3,1 > instead of: > li 9,-1 ; rldicr 9,9,0,0 ; cmpd 0,3,9 > > Compare with previous version: > This patch refactor the code according to review comments. > e.g. updating function names/comments/code. > > > PR target/103743 > > gcc/ChangeLog: > > * config/rs6000/rs6000-protos.h (can_be_rotated_to_lowbits): New. > (can_be_rotated_as_compare_operand): New. > * config/rs6000/rs6000.cc (can_be_rotated_to_lowbits): New definition. > (can_be_rotated_as_compare_operand): New definition. 
> * config/rs6000/rs6000.md (*rotate_on_cmpdi): New define_insn_and_split. > > gcc/testsuite/ChangeLog: > > * gcc.target/powerpc/pr103743.c: New test. > * gcc.target/powerpc/pr103743_1.c: New test. > > --- > gcc/config/rs6000/rs6000-protos.h | 2 + > gcc/config/rs6000/rs6000.cc | 56 +++++++++++ > gcc/config/rs6000/rs6000.md | 63 +++++++++++- > gcc/testsuite/gcc.target/powerpc/pr103743.c | 52 ++++++++++ > gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++++++++++++++++++ > 5 files changed, 267 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c > > diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h > index d0d89320ef6..9626917e359 100644 > --- a/gcc/config/rs6000/rs6000-protos.h > +++ b/gcc/config/rs6000/rs6000-protos.h > @@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); > extern int vspltis_shifted (rtx); > extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); > extern bool macho_lo_sum_memory_operand (rtx, machine_mode); > +extern bool can_be_rotated_to_lowbits (unsigned HOST_WIDE_INT, int, int *); > +extern bool can_be_rotated_as_compare_operand (unsigned HOST_WIDE_INT); > extern int num_insns_constant (rtx, machine_mode); > extern int small_data_operand (rtx, machine_mode); > extern bool mem_operand_gpr (rtx, machine_mode); > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index b3a609f3aa3..2b0df8479f0 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -14925,6 +14925,62 @@ rs6000_reverse_condition (machine_mode mode, enum rtx_code code) > return reverse_condition (code); > } > > +/* Check if C (as 64bit integer) can be rotated to a constant which constains > + nonzero bits at LOWBITS only. > + > + Return true if C can be rotated to such constant. And *ROT is written to > + the number by which C is rotated. 
> + Return false otherwise. */ > + > +bool > +can_be_rotated_to_lowbits (unsigned HOST_WIDE_INT c, int lowbits, int *rot) > +{ > + int clz = HOST_BITS_PER_WIDE_INT - lowbits; > + > + /* case a. 0..0xxx: already at least clz zeros. */ > + int lz = clz_hwi (c); > + if (lz >= clz) > + { > + *rot = 0; > + return true; > + } > + > + /* case b. 0..0xxx0..0: at least clz zeros. */ > + int tz = ctz_hwi (c); > + if (lz + tz >= clz) > + { > + *rot = HOST_BITS_PER_WIDE_INT - tz; > + return true; > + } > + > + /* case c. xx10.....0xx: rotate 'clz - 1' bits first, then check case b. > + ^bit -> Vbit, , then zeros are at head or tail. > + 00...00xxx100, 'clz - 1' >= 'bits of xxxx'. */ > + const int rot_bits = lowbits + 1; > + unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1)); > + tz = ctz_hwi (rc); > + if (clz_hwi (rc) + tz >= clz) > + { > + *rot = HOST_BITS_PER_WIDE_INT - (tz + rot_bits); > + return true; > + } > + > + return false; > +} > + > +/* Check if C can be rotated to an immediate operand of cmpdi or cmpldi. */ > + > +bool > +can_be_rotated_as_compare_operand (unsigned HOST_WIDE_INT c) > +{ > + int rot = 0; > + /* leading 48 zeros + 16 lowbits (cmpldi), > + or leading 49 ones + 15 lowbits (cmpdi). */ > + bool res = can_be_rotated_to_lowbits (~c, 15, &rot) > + || can_be_rotated_to_lowbits (c, 16, &rot); > + return res && rot > 0; > +} > + > /* Generate a compare for CODE. Return a brand-new rtx that > represents the result of the compare. 
*/ > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 4bd1dfd3da9..dabfe5dfdfd 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -7765,6 +7765,68 @@ (define_insn "*movsi_from_df" > "xscvdpsp %x0,%x1" > [(set_attr "type" "fp")]) > > + > +(define_code_iterator eqne [eq ne]) > + > +;; "i == C" ==> "rotl(i,N) == rotl(C,N)" > +(define_insn_and_split "*rotate_on_cmpdi" > + [(set (pc) > + (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r") > + (match_operand:DI 2 "const_int_operand" "n")) > + (label_ref (match_operand 0 "")) > + (pc))) > + (clobber (match_scratch:DI 3 "=r")) > + (clobber (match_scratch:CCUNS 4 "=y"))] > + "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1 > + && can_be_rotated_as_compare_operand (UINTVAL (operands[2]))" > + "#" > + "&& 1" > + [(pc)] > +{ > + rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0); > + bool sgn = false; > + unsigned HOST_WIDE_INT C = INTVAL (operands[2]); > + int rot; > + if (!can_be_rotated_to_lowbits (C, 16, &rot)) > + { > + /* cmpdi */ > + sgn = true; > + bool res = can_be_rotated_to_lowbits (~C, 15, &rot); > + gcc_assert (res); > + } > + rtx n = GEN_INT (rot); > + > + /* i' = rotl (i, n) */ > + rtx op0 = can_create_pseudo_p () ? gen_reg_rtx (DImode) : operands[3]; > + emit_insn (gen_rtx_SET (op0, gen_rtx_ROTATE (DImode, operands[1], n))); > + > + /* C' = rotl (C, n) */ > + rtx op1 = GEN_INT ((C << rot) | (C >> (HOST_BITS_PER_WIDE_INT - rot))); > + > + /* i' == C' */ > + machine_mode comp_mode = sgn ? CCmode : CCUNSmode; > + rtx cc = can_create_pseudo_p () ? 
gen_reg_rtx (comp_mode) : operands[4]; > + PUT_MODE (cc, comp_mode); > + emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (comp_mode, op0, op1))); > + rtx cmp = gen_rtx_<eqne:CODE> (CCmode, cc, const0_rtx); > + rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[0]); > + emit_jump_insn (gen_rtx_SET (pc_rtx, > + gen_rtx_IF_THEN_ELSE (VOIDmode, cmp, > + loc_ref, pc_rtx))); > + > + /* keep the probability info for the prediction of the branch insn. */ > + if (note) > + { > + profile_probability prob > + = profile_probability::from_reg_br_prob_note (XINT (note, 0)); > + > + add_reg_br_prob_note (get_last_insn (), prob); > + } > + > + DONE; > +} > +) > + > ;; Split a load of a large constant into the appropriate two-insn > ;; sequence. > > @@ -13453,7 +13515,6 @@ (define_expand "@ctr<mode>" > ;; rs6000_legitimate_combined_insn prevents combine creating any of > ;; the ctr<mode> insns. > > -(define_code_iterator eqne [eq ne]) > (define_code_attr bd [(eq "bdz") (ne "bdnz")]) > (define_code_attr bd_neg [(eq "bdnz") (ne "bdz")]) > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743.c b/gcc/testsuite/gcc.target/powerpc/pr103743.c > new file mode 100644 > index 00000000000..41c686bb4cb > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c > @@ -0,0 +1,52 @@ > +/* { dg-options "-O2" } */ > +/* { dg-do compile { target has_arch_ppc64 } } */ > + > +/* { dg-final { scan-assembler-times {\mcmpldi\M} 10 } } */ > +/* { dg-final { scan-assembler-times {\mcmpdi\M} 4 } } */ > +/* { dg-final { scan-assembler-times {\mrotldi\M} 14 } } */ > + > +int foo (int a); > + > +int __attribute__ ((noinline)) udi_fun (unsigned long long in) > +{ > + if (in == (0x8642000000000000ULL)) > + return foo (1); > + if (in == (0x7642000000000000ULL)) > + return foo (12); > + if (in == (0x8000000000000000ULL)) > + return foo (32); > + if (in == (0x8700000000000091ULL)) > + return foo (33); > + if (in == (0x8642FFFFFFFFFFFFULL)) > + return foo (46); > + if (in == (0x7642FFFFFFFFFFFFULL)) > + 
return foo (51); > + if (in == (0x7567000000ULL)) > + return foo (9); > + if (in == (0xFFF8567FFFFFFFFFULL)) > + return foo (19); > + > + return 0; > +} > + > +int __attribute__ ((noinline)) di_fun (long long in) > +{ > + if (in == (0x8642000000000000LL)) > + return foo (1); > + if (in == (0x7642000000000000LL)) > + return foo (12); > + if (in == (0x8000000000000000LL)) > + return foo (32); > + if (in == (0x8700000000000091LL)) > + return foo (33); > + if (in == (0x8642FFFFFFFFFFFFLL)) > + return foo (46); > + if (in == (0x7642FFFFFFFFFFFFLL)) > + return foo (51); > + if (in == (0x7567000000LL)) > + return foo (9); > + if (in == (0xFFF8567FFFFFFFFFLL)) > + return foo (19); > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743_1.c b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c > new file mode 100644 > index 00000000000..e128aae7574 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c > @@ -0,0 +1,95 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -std=c99" } */ > + > +int > +foo (int a) > +{ > + return a + 6; > +} > + > +int __attribute__ ((noipa)) udi_fun (unsigned long long in) > +{ > + if (in == (0x8642000000000000ULL)) > + return foo (1); > + if (in == (0x7642000000000000ULL)) > + return foo (12); > + if (in == (0x8000000000000000ULL)) > + return foo (32); > + if (in == (0x8700000000000091ULL)) > + return foo (33); > + if (in == (0x8642FFFFFFFFFFFFULL)) > + return foo (46); > + if (in == (0x7642FFFFFFFFFFFFULL)) > + return foo (51); > + if (in == (0x7567000000ULL)) > + return foo (9); > + if (in == (0xFFF8567FFFFFFFFFULL)) > + return foo (19); > + > + return 0; > +} > + > +int __attribute__ ((noipa)) di_fun (long long in) > +{ > + if (in == (0x8642000000000000LL)) > + return foo (1); > + if (in == (0x7642000000000000LL)) > + return foo (12); > + if (in == (0x8000000000000000LL)) > + return foo (32); > + if (in == (0x8700000000000091LL)) > + return foo (33); > + if (in == (0x8642FFFFFFFFFFFFLL)) > + return foo (46); > 
+ if (in == (0x7642FFFFFFFFFFFFLL)) > + return foo (51); > + return 0; > +} > + > +int > +main () > +{ > + int e = 0; > + if (udi_fun (6) != 0) > + e++; > + if (udi_fun (0x8642000000000000ULL) != foo (1)) > + e++; > + if (udi_fun (0x7642000000000000ULL) != foo (12)) > + e++; > + if (udi_fun (0x8000000000000000ULL) != foo (32)) > + e++; > + if (udi_fun (0x8700000000000091ULL) != foo (33)) > + e++; > + if (udi_fun (0x8642FFFFFFFFFFFFULL) != foo (46)) > + e++; > + if (udi_fun (0x7642FFFFFFFFFFFFULL) != foo (51)) > + e++; > + if (udi_fun (0x7567000000ULL) != foo (9)) > + e++; > + if (udi_fun (0xFFF8567FFFFFFFFFULL) != foo (19)) > + e++; > + > + if (di_fun (6) != 0) > + e++; > + if (di_fun (0x8642000000000000LL) != foo (1)) > + e++; > + if (di_fun (0x7642000000000000LL) != foo (12)) > + e++; > + if (di_fun (0x8000000000000000LL) != foo (32)) > + e++; > + if (di_fun (0x8700000000000091LL) != foo (33)) > + e++; > + if (di_fun (0x8642FFFFFFFFFFFFLL) != foo (46)) > + e++; > + if (di_fun (0x7642FFFFFFFFFFFFLL) != foo (51)) > + e++; > + if (udi_fun (0x7567000000LL) != foo (9)) > + e++; > + if (udi_fun (0xFFF8567FFFFFFFFFLL) != foo (19)) > + e++; > + > + if (e) > + __builtin_abort (); > + return 0; > +} > +
On Wed, Dec 14, 2022 at 04:26:54PM +0800, Jiufu Guo wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
> >>     li %r9,-1
> >>     rldicr %r9,%r9,0,0
> >>     cmpd %cr0,%r3,%r9
> >
> > FWIW, I find the winnt assembler syntax very hard to read, and I doubt
> > I am the only one.
> Oh, sorry about that.  I will avoid adding '-mregnames' to dump asm. :)
> BTW, what options do you use to dump asm code?

The same as GCC outputs, and as I write assembler code as: bare numbers.
It is much easier to type, and very much easier to read.

-mregnames is fine for output (and it is the default as well), but
problematic for input.  Take for example
    li r10,r10
which translates to
    li 10,10
while what was probably wanted is to load the address of the global
symbol r10, which can be written as
    li r10,(r10)

I do understand that liking the bare numbers syntax is an acquired taste
of course.  But less clutter is very useful.  This goes hand in hand
with writing multiple asm statements per line, which allows you to group
things together nicely:
    li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9

> > Maybe it is better to not return magic values anyway?  So perhaps
> >
> > bool
> > can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int *rot)
> >
> > (with *rot written if the return value is true).
> Thanks for your suggestion!
> It is checking if a constant can be rotated from/to a value which has
> only a few trailing nonzero bits (all leading bits are zero).
>
> So, I'm thinking to name the function as something like:
> can_be_rotated_to_lowbits.

That is a good name yeah.

> >> +bool
> >> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)
> >
> > No _p please, this function is not a predicate (at least, the name does
> > not say what it tests).  So a better name please.  This matters even
> > more for extern functions (like this one) because the function
> > implementation is always farther away so you do not easily have all
> > interface details in mind.  Good names help :-)
> Thanks!  Naming always matters. :)
>
> Maybe we can name this function as "can_be_rotated_as_compare_operand",
> or "is_constant_rotateable_for_compare", because this function checks
> "if a constant can be rotated to/from an immediate operand of
> cmpdi/cmpldi".

Maybe just "constant_can_be_rotated_to_lowbits"?  (If that is what the
function does).  It doesn't clearly say that it allows negative numbers
as well, but that is a problem of the function itself already; maybe it
would be better to do signed and unsigned separately.

> >> +(define_code_iterator eqne [eq ne])
> >> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])
> >
> > Just <CODE> or <CODE:eqne> should work?
> Great!  Thanks for pointing this out!  <eqne:CODE> works.
> >
> > Please fix these things.  Almost there :-)
>
> I updated the patch as below.  Bootstrapping and regtesting are ongoing.
> Thanks again for your careful and insightful review!

Please send as new message (not as reply even), that is much easier to
handle.  Thanks!


Segher
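The interface suggested in the review (a bool result, with *rot written only on success) can be illustrated with a minimal sketch. The names rotates_to_lowbits, rotl64 and check_examples are hypothetical, and the brute-force search over all 64 rotation counts is an assumption made for clarity; it is not the clz_hwi/ctz_hwi logic of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 64-bit value left by n bits (n taken modulo 64).  */
static uint64_t rotl64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Hypothetical stand-in for the reviewed helper: return true if some
   left rotation of C leaves nonzero bits only in the low LOWBITS bits,
   and write that rotation count to *ROT.  *ROT is untouched on failure.
   LOWBITS must be in the range 1..63.  */
static bool rotates_to_lowbits(uint64_t c, int lowbits, int *rot)
{
    for (int n = 0; n < 64; n++) {
        if ((rotl64(c, (unsigned)n) >> lowbits) == 0) {
            *rot = n;
            return true;
        }
    }
    return false;
}

/* Small self-check using one constant from the testcase.  */
static void check_examples(void)
{
    int rot = -1;
    const uint64_t c = 0x8700000000000091ull;

    assert(rotates_to_lowbits(c, 16, &rot));
    assert((rotl64(c, (unsigned)rot) >> 16) == 0);
    /* Bits spread over the whole word: no rotation fits in 16 bits.  */
    assert(!rotates_to_lowbits(0x0123456789ABCDEFull, 16, &rot));
}
```

With this shape the caller never has to interpret a magic return value such as -1; the rotation count is only meaningful when the function returns true.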
Hi,

Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Wed, Dec 14, 2022 at 04:26:54PM +0800, Jiufu Guo wrote:
>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>> > On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
>> >>     li %r9,-1
>> >>     rldicr %r9,%r9,0,0
>> >>     cmpd %cr0,%r3,%r9
>> >
>> > FWIW, I find the winnt assembler syntax very hard to read, and I doubt
>> > I am the only one.
>> Oh, sorry about that.  I will avoid adding '-mregnames' to dump asm. :)
>> BTW, what options do you use to dump asm code?
>
> The same as GCC outputs, and as I write assembler code as: bare numbers.
> It is much easier to type, and very much easier to read.
>
> -mregnames is fine for output (and it is the default as well), but
> problematic for input.  Take for example
>     li r10,r10
> which translates to
>     li 10,10
> while what was probably wanted is to load the address of the global
> symbol r10, which can be written as
>     li r10,(r10)
>
> I do understand that liking the bare numbers syntax is an acquired taste
> of course.  But less clutter is very useful.  This goes hand in hand
> with writing multiple asm statements per line, which allows you to group
> things together nicely:
>     li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9
>
Great!  Thanks for your helpful comments!

>> > Maybe it is better to not return magic values anyway?  So perhaps
>> >
>> > bool
>> > can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int *rot)
>> >
>> > (with *rot written if the return value is true).
>> Thanks for your suggestion!
>> It is checking if a constant can be rotated from/to a value which has
>> only a few trailing nonzero bits (all leading bits are zero).
>>
>> So, I'm thinking to name the function as something like:
>> can_be_rotated_to_lowbits.
>
> That is a good name yeah.
>
>> >> +bool
>> >> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)
>> >
>> > No _p please, this function is not a predicate (at least, the name does
>> > not say what it tests).  So a better name please.  This matters even
>> > more for extern functions (like this one) because the function
>> > implementation is always farther away so you do not easily have all
>> > interface details in mind.  Good names help :-)
>> Thanks!  Naming always matters. :)
>>
>> Maybe we can name this function as "can_be_rotated_as_compare_operand",
>> or "is_constant_rotateable_for_compare", because this function checks
>> "if a constant can be rotated to/from an immediate operand of
>> cmpdi/cmpldi".
>
> Maybe just "constant_can_be_rotated_to_lowbits"?  (If that is what the
> function does).  It doesn't clearly say that it allows negative numbers
> as well, but that is a problem of the function itself already; maybe it
> would be better to do signed and unsigned separately.
That makes sense.  I have updated the patch in a new version.
>
>> >> +(define_code_iterator eqne [eq ne])
>> >> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])
>> >
>> > Just <CODE> or <CODE:eqne> should work?
>> Great!  Thanks for pointing this out!  <eqne:CODE> works.
>> >
>> > Please fix these things.  Almost there :-)
>>
>> I updated the patch as below.  Bootstrapping and regtesting are ongoing.
>> Thanks again for your careful and insightful review!
>
> Please send as new message (not as reply even), that is much easier to
> handle.  Thanks!
Sure, I have just submitted a new patch version.
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608765.html

Thanks a lot for your review.

BR,
Jeff (Jiufu)

>
>
> Segher
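The idea of handling signed and unsigned immediates separately might look roughly like the sketch below. This is only an illustration under stated assumptions: rotates_to_u16, rotates_to_s16, rotl64 and check_examples are hypothetical names, and the exhaustive loop stands in for whatever the resubmitted patch actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 64-bit value left by n bits (n taken modulo 64).  */
static uint64_t rotl64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Unsigned case (cmpldi): some rotation of C has 48 leading zeros.  */
static bool rotates_to_u16(uint64_t c, int *rot)
{
    for (int n = 0; n < 64; n++) {
        if (rotl64(c, (unsigned)n) <= 0xFFFFu) {
            *rot = n;
            return true;
        }
    }
    return false;
}

/* Signed case (cmpdi): some rotation of C is a sign-extended 16-bit
   value, i.e. it has 49 leading zeros or 49 leading ones.  */
static bool rotates_to_s16(uint64_t c, int *rot)
{
    for (int n = 0; n < 64; n++) {
        uint64_t r = rotl64(c, (unsigned)n);
        if (r <= 0x7FFFu || r >= 0xFFFFFFFFFFFF8000ull) {
            *rot = n;
            return true;
        }
    }
    return false;
}

/* Self-check with constants taken from the testcase.  */
static void check_examples(void)
{
    int rot = -1;

    assert(rotates_to_u16(0x7642000000000000ull, &rot));
    assert(rotl64(0x7642000000000000ull, (unsigned)rot) <= 0xFFFFu);

    /* Mostly ones: only the signed form can match.  */
    assert(!rotates_to_u16(0xFFF8567FFFFFFFFFull, &rot));
    assert(rotates_to_s16(0xFFF8567FFFFFFFFFull, &rot));
}
```

Splitting the check this way makes the accepted range explicit in each function name, which is the naming concern raised in the review.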
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index b3c16e7448d..78847e6b3db 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); extern int vspltis_shifted (rtx); extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); extern bool macho_lo_sum_memory_operand (rtx, machine_mode); +extern int rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT, int); +extern bool compare_rotate_immediate_p (unsigned HOST_WIDE_INT); extern int num_insns_constant (rtx, machine_mode); extern int small_data_operand (rtx, machine_mode); extern bool mem_operand_gpr (rtx, machine_mode); diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index df491bee2ea..a548db42660 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -14797,6 +14797,47 @@ rs6000_reverse_condition (machine_mode mode, enum rtx_code code) return reverse_condition (code); } +/* Check if C can be rotated from an immediate which starts (as 64bit integer) + with at least CLZ bits zero. + + Return the number by which C can be rotated from the immediate. + Return -1 if C can not be rotated as from. */ + +int +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz) +{ + /* case a. 0..0xxx: already at least clz zeros. */ + int lz = clz_hwi (c); + if (lz >= clz) + return 0; + + /* case b. 0..0xxx0..0: at least clz zeros. */ + int tz = ctz_hwi (c); + if (lz + tz >= clz) + return tz; + + /* case c. xx10.....0xx: rotate 'clz + 1' bits firstly, then check case b. + ^bit -> Vbit + 00...00xxx100, 'clz + 1' >= bits of xxxx. */ + const int rot_bits = HOST_BITS_PER_WIDE_INT - clz + 1; + unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1)); + tz = ctz_hwi (rc); + if (clz_hwi (rc) + tz >= clz) + return tz + rot_bits; + + return -1; +} + +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi. 
*/ + +bool +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c) +{ + /* leading 48 zeros (cmpldi), or leading 49 ones (cmpdi). */ + return rotate_from_leading_zeros_const (~c, 49) > 0 + || rotate_from_leading_zeros_const (c, 48) > 0; +} + /* Generate a compare for CODE. Return a brand-new rtx that represents the result of the compare. */ diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index e9e5cd1e54d..cad3cfc98cd 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -7766,6 +7766,67 @@ (define_insn "*movsi_from_df" "xscvdpsp %x0,%x1" [(set_attr "type" "fp")]) + +(define_code_iterator eqne [eq ne]) +(define_code_attr EQNE [(eq "EQ") (ne "NE")]) + +;; "i == C" ==> "rotl(i,N) == rotl(C,N)" +(define_insn_and_split "*rotate_on_cmpdi" + [(set (pc) + (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r") + (match_operand:DI 2 "const_int_operand" "n")) + (label_ref (match_operand 0 "")) + (pc))) + (clobber (match_scratch:DI 3 "=r")) + (clobber (match_scratch:CCUNS 4 "=y"))] + "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1 + && compare_rotate_immediate_p (UINTVAL (operands[2]))" + "#" + "&& 1" + [(pc)] +{ + rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0); + bool sgn = false; + unsigned HOST_WIDE_INT C = INTVAL (operands[2]); + int rot = rotate_from_leading_zeros_const (C, 48); + if (rot < 0) + { + sgn = true; + rot = rotate_from_leading_zeros_const (~C, 49); + } + rtx n = GEN_INT (HOST_BITS_PER_WIDE_INT - rot); + + /* i' = rotl (i, n) */ + rtx op0 = can_create_pseudo_p () ? gen_reg_rtx (DImode) : operands[3]; + emit_insn (gen_rtx_SET (op0, gen_rtx_ROTATE (DImode, operands[1], n))); + + /* C' = rotl (C, n) */ + rtx op1 = GEN_INT ((C << INTVAL (n)) | (C >> rot)); + + /* i' == C' */ + machine_mode comp_mode = sgn ? CCmode : CCUNSmode; + rtx cc = can_create_pseudo_p () ? 
gen_reg_rtx (comp_mode) : operands[4]; + PUT_MODE (cc, comp_mode); + emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (comp_mode, op0, op1))); + rtx cmp = gen_rtx_<EQNE> (CCmode, cc, const0_rtx); + rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[0]); + emit_jump_insn (gen_rtx_SET (pc_rtx, + gen_rtx_IF_THEN_ELSE (VOIDmode, cmp, + loc_ref, pc_rtx))); + + /* keep the probability info for the prediction of the branch insn. */ + if (note) + { + profile_probability prob + = profile_probability::from_reg_br_prob_note (XINT (note, 0)); + + add_reg_br_prob_note (get_last_insn (), prob); + } + + DONE; +} +) + ;; Split a load of a large constant into the appropriate two-insn ;; sequence. @@ -13472,7 +13533,6 @@ (define_expand "@ctr<mode>" ;; rs6000_legitimate_combined_insn prevents combine creating any of ;; the ctr<mode> insns. -(define_code_iterator eqne [eq ne]) (define_code_attr bd [(eq "bdz") (ne "bdnz")]) (define_code_attr bd_neg [(eq "bdnz") (ne "bdz")]) diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743.c b/gcc/testsuite/gcc.target/powerpc/pr103743.c new file mode 100644 index 00000000000..abb876ed79e --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c @@ -0,0 +1,52 @@ +/* { dg-options "-O2" } */ +/* { dg-do compile { target has_arch_ppc64 } } */ + +/* { dg-final { scan-assembler-times {\mcmpldi\M} 10 } } */ +/* { dg-final { scan-assembler-times {\mcmpdi\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mrotldi\M} 14 } } */ + +int foo (int a); + +int __attribute__ ((noinline)) udi_fun (unsigned long long in) +{ + if (in == (0x8642000000000000ULL)) + return foo (1); + if (in == (0x7642000000000000ULL)) + return foo (12); + if (in == (0x8000000000000000ULL)) + return foo (32); + if (in == (0x8000000000000001ULL)) + return foo (33); + if (in == (0x8642FFFFFFFFFFFFULL)) + return foo (46); + if (in == (0x7642FFFFFFFFFFFFULL)) + return foo (51); + if (in == (0x7567000000ULL)) + return foo (9); + if (in == (0xFFF8567FFFFFFFFFULL)) + return foo (19); + + 
return 0; +} + +int __attribute__ ((noinline)) di_fun (long long in) +{ + if (in == (0x8642000000000000LL)) + return foo (1); + if (in == (0x7642000000000000LL)) + return foo (12); + if (in == (0x8000000000000000LL)) + return foo (32); + if (in == (0x8000000000000001LL)) + return foo (33); + if (in == (0x8642FFFFFFFFFFFFLL)) + return foo (46); + if (in == (0x7642FFFFFFFFFFFFLL)) + return foo (51); + if (in == (0x7567000000LL)) + return foo (9); + if (in == (0xFFF8567FFFFFFFFFLL)) + return foo (19); + + return 0; +} diff --git a/gcc/testsuite/gcc.target/powerpc/pr103743_1.c b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c new file mode 100644 index 00000000000..2c08c56714a --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr103743_1.c @@ -0,0 +1,95 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -std=c99" } */ + +int +foo (int a) +{ + return a + 6; +} + +int __attribute__ ((noinline)) udi_fun (unsigned long long in) +{ + if (in == (0x8642000000000000ULL)) + return foo (1); + if (in == (0x7642000000000000ULL)) + return foo (12); + if (in == (0x8000000000000000ULL)) + return foo (32); + if (in == (0x8000000000000001ULL)) + return foo (33); + if (in == (0x8642FFFFFFFFFFFFULL)) + return foo (46); + if (in == (0x7642FFFFFFFFFFFFULL)) + return foo (51); + if (in == (0x7567000000ULL)) + return foo (9); + if (in == (0xFFF8567FFFFFFFFFULL)) + return foo (19); + + return 0; +} + +int __attribute__ ((noinline)) di_fun (long long in) +{ + if (in == (0x8642000000000000LL)) + return foo (1); + if (in == (0x7642000000000000LL)) + return foo (12); + if (in == (0x8000000000000000LL)) + return foo (32); + if (in == (0x8000000000000001LL)) + return foo (33); + if (in == (0x8642FFFFFFFFFFFFLL)) + return foo (46); + if (in == (0x7642FFFFFFFFFFFFLL)) + return foo (51); + return 0; +} + +int +main () +{ + int e = 0; + if (udi_fun (6) != 0) + e++; + if (udi_fun (0x8642000000000000ULL) != foo (1)) + e++; + if (udi_fun (0x7642000000000000ULL) != foo (12)) + e++; + if (udi_fun 
(0x8000000000000000ULL) != foo (32)) + e++; + if (udi_fun (0x8000000000000001ULL) != foo (33)) + e++; + if (udi_fun (0x8642FFFFFFFFFFFFULL) != foo (46)) + e++; + if (udi_fun (0x7642FFFFFFFFFFFFULL) != foo (51)) + e++; + if (udi_fun (0x7567000000ULL) != foo (9)) + e++; + if (udi_fun (0xFFF8567FFFFFFFFFULL) != foo (19)) + e++; + + if (di_fun (6) != 0) + e++; + if (di_fun (0x8642000000000000LL) != foo (1)) + e++; + if (di_fun (0x7642000000000000LL) != foo (12)) + e++; + if (di_fun (0x8000000000000000LL) != foo (32)) + e++; + if (di_fun (0x8000000000000001LL) != foo (33)) + e++; + if (di_fun (0x8642FFFFFFFFFFFFLL) != foo (46)) + e++; + if (di_fun (0x7642FFFFFFFFFFFFLL) != foo (51)) + e++; + if (udi_fun (0x7567000000LL) != foo (9)) + e++; + if (udi_fun (0xFFF8567FFFFFFFFFLL) != foo (19)) + e++; + + if (e) + __builtin_abort (); + return 0; +} +
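The rewrite "i == C" ==> "rotl(i,N) == rotl(C,N)" used by the splitter relies on rotation being a bijection on 64-bit values, so equality is preserved while the rotated constant may fit a 16-bit compare immediate. The following is a minimal sketch of that reasoning in plain C, not code from the patch; the helper names are hypothetical and the rotation count 16 was picked by hand for these two testcase constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate left by n (mod 64).  n == 0 is special-cased because shifting
   a 64-bit value by 64 is undefined behavior in C.  */
static uint64_t rotl64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Fits the unsigned 16-bit immediate of cmpldi.  */
static bool fits_u16(uint64_t v)
{
    return v <= 0xFFFFu;
}

/* Fits the sign-extended 16-bit immediate of cmpdi.  */
static bool fits_s16(uint64_t v)
{
    return v <= 0x7FFFu || v >= 0xFFFFFFFFFFFF8000ull;
}

/* "i == c" evaluated as "rotl(i, n) == rotl(c, n)".  */
static bool eq_via_rotate(uint64_t i, uint64_t c, unsigned n)
{
    return rotl64(i, n) == rotl64(c, n);
}

static void check_examples(void)
{
    const uint64_t c1 = 0x8642000000000000ull;
    const uint64_t c2 = 0x8642FFFFFFFFFFFFull;

    assert(fits_u16(rotl64(c1, 16)));    /* rotated value: 0x8642 */
    assert(!fits_u16(rotl64(c2, 16)));   /* 0xFFFFFFFFFFFF8642 */
    assert(fits_s16(rotl64(c2, 16)));    /* same value, read as signed */

    assert(eq_via_rotate(c1, c1, 16));
    assert(!eq_via_rotate(c1 + 1, c1, 16));
}
```

The first constant corresponds to the unsigned compare case and the second to the signed one, which is why the testcase expects both cmpldi and cmpdi in the output.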