From patchwork Wed Dec 6 05:24:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiufu Guo X-Patchwork-Id: 174297 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3893991vqy; Tue, 5 Dec 2023 21:25:31 -0800 (PST) X-Google-Smtp-Source: AGHT+IEnQZfIyEFPuUtsBu2UQoFEMDa68+iQJ+Qn5prJLIS8fnU/nFJH0UEoj5T8VlimR84lmwcq X-Received: by 2002:a05:620a:8008:b0:77d:78c9:4da1 with SMTP id ee8-20020a05620a800800b0077d78c94da1mr467735qkb.28.1701840331561; Tue, 05 Dec 2023 21:25:31 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701840331; cv=pass; d=google.com; s=arc-20160816; b=NNI8f0Pl4cP7Q8YwLCRuLfSHYxo57p7xclkDmu6Yz3KaCQZeND6hroP/+rbjo5zwpk X3pqu2RNjoxZcyTVsf3/KJSDtF6FZXuCtUOdr3MRXwo1/XwjKlMW7szxg8goioebsrRV ubQi4LHGYUZhlHPDqy9lprhKdBBBfVcSlLAZs4W9+SZOFJqwygmPjsaKm2Mei0sCoJNh ou+QLGcKTkv7XLDAUMSj2ZtQqZsqtTf9JR2wgpplzR1Ww6vrj//YOxLLHNWTox+k6AQv C78Z9wQj2ql2inbf6fqN/3PMBEQhvKPQa3e3mt2Ke7YKkMkY3HGsNLF8lQt6H7jD7BBP dG0A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:mime-version :content-transfer-encoding:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature:arc-filter:dmarc-filter :delivered-to; bh=DdbiOpjG1N2263o9mgLyb76zqY9dHT8iBGzgwQfi6qY=; fh=AJ+XyiCK9C45D31PcueuCqgpHIDRP2j25uBOQsJKm1w=; b=P5PcsgmSZvehtmD/3b8b3yhGJTaXd0MU9M8+y4YFPfaADgvDa7KfOBo7LCuMspMbhr HX//Tcfj0VhqvqrC8s9AZZ9cmvoSvmSFKdQ8Y4rTG2NZoWIwnkjOfD4rzsCi8D6j3oWw OmVMrQz6A8qjjzio+zwx/61e3r3UyhqwLf6DupIZyA632Gjkv4vEM3b6ELNZKEuHeIJE 9fdNXU9v7Pd7mfKH1/wARMssVkcyb7oceMoy69PGODge6Mxcc4l+LFsX7Z2JzaA60rU6 5E9TFRHNHe6iEPSDl7TiPFn2x44y6FU7S4AtqKzCobzJbNZnurKon/rTq1JiRdo3u12q kZuw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="UlUmF/TT"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id s14-20020a05620a254e00b00773d54d6a11si14125238qko.387.2023.12.05.21.25.31 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Dec 2023 21:25:31 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b="UlUmF/TT"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=REJECT sp=NONE dis=NONE) header.from=ibm.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5362838708B4 for ; Wed, 6 Dec 2023 05:25:31 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id A46E73857B95; Wed, 6 Dec 2023 05:24:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A46E73857B95 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A46E73857B95 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701840279; cv=none; b=ss3850U+R8trhD1KAMkinorvKy1/8GWqity4YuG7/peGDpyL1Ftq7w7pU6icu6QwJkKn1eqLD/LIUFXiCiPm6K9d4KR9edowjuM1Rge9AUIStYdki30P1//8RADODBD689MhBoljQWbwcsjIWOEekDS27sIDgMqB8LS/vtbkkWU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701840279; c=relaxed/simple; bh=zg0la/N2ygqStR1u9tUqKr7Xq9b/a610AqXFMnhIb4g=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Aqh/4O9PZhRltXhmiVkH/EMrT9k5H74dv2Hxui+rf5PsBPbqK6A1f5428MaWB/cIVKzb/2KUtvMt2/qyUrjG3pNarOCqh9YQ6USiN26Zm1ZUWvIh18E3azedKtFY7KwwoWZOIfkw4qvQWTttCnr5aB9MHYU5vaUuvSvtFO/nrd8= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from pps.filterd (m0353727.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3B64CUns000708; Wed, 6 Dec 2023 05:24:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : mime-version; s=pp1; bh=DdbiOpjG1N2263o9mgLyb76zqY9dHT8iBGzgwQfi6qY=; b=UlUmF/TT2alvnvpO3AiGbIPPPnbXi8VxF9A95vesU3l9WEHr3Xk1jr18ehMfSfPiUMMk AWmzEhaN94QQ8gNdaAbGFIJXD4Go13693qwxbB7uTLNi9Ai5/Sc3ZsgcwN1LCtesQ78H MQRqf8+S2RBihb3KA5KiPQbhJfDvuU0ivdVEuVrvW+K9GeyqCt9tlLtPEwHEpc52uEVH mZVhzgc3+9479t+BVTzQyKkQikTB0i+l3nVch1cZP88+NZA/7mSgenxtIJCp6gdW6/4e AgzQ8smG//kmgRN4RrJhijtPDKPyWbNy06FVR5fiiCJ7aYTS1zHKtAdIH0/qQkhtyQGD Bg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3uthqk1k9e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 06 Dec 2023 05:24:36 +0000 Received: from m0353727.ppops.net (m0353727.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 3B65BxNT004096; Wed, 6 Dec 2023 05:24:35 GMT Received: from ppma11.dal12v.mail.ibm.com (db.9e.1632.ip4.static.sl-reverse.com [50.22.158.219]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3uthqk1k93-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 06 Dec 2023 05:24:35 +0000 Received: from pps.filterd (ppma11.dal12v.mail.ibm.com [127.0.0.1]) by ppma11.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 3B65K4Cn011068; Wed, 6 Dec 2023 05:24:34 GMT Received: from smtprelay01.fra02v.mail.ibm.com ([9.218.2.227]) by ppma11.dal12v.mail.ibm.com (PPS) with ESMTPS id 3utav2tayv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 06 Dec 2023 05:24:34 +0000 Received: from smtpav04.fra02v.mail.ibm.com (smtpav04.fra02v.mail.ibm.com [10.20.54.103]) by smtprelay01.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 3B65OWOa20316808 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 6 Dec 2023 05:24:32 GMT Received: from smtpav04.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id EAF0520065; Wed, 6 Dec 2023 05:24:31 +0000 (GMT) Received: from smtpav04.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DBE8E2004B; Wed, 6 Dec 2023 05:24:30 +0000 (GMT) Received: from genoa.aus.stglabs.ibm.com (unknown [9.40.192.157]) by smtpav04.fra02v.mail.ibm.com (Postfix) with ESMTP; Wed, 6 Dec 2023 05:24:30 +0000 (GMT) From: Jiufu Guo To: gcc-patches@gcc.gnu.org Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org, bergner@linux.ibm.com, guojiufu@linux.ibm.com Subject: [PATCH V3 3/3] split complicate constant to memory Date: Wed, 6 Dec 2023 13:24:27 +0800 Message-Id: <20231206052427.143889-3-guojiufu@linux.ibm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231206052427.143889-1-guojiufu@linux.ibm.com> References: <20231206052427.143889-1-guojiufu@linux.ibm.com> X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 6rdrT_96sOEOngEKX94T8-4C_dgpoW57 X-Proofpoint-ORIG-GUID: VhpRA4FCXzaxHBqcY4ZX1OwLORAdprMV X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-06_04,2023-12-05_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 lowpriorityscore=0 mlxlogscore=999 clxscore=1015 priorityscore=1501 bulkscore=0 phishscore=0 spamscore=0 suspectscore=0 malwarescore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2311290000 definitions=main-2312060043 X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, KAM_STOCKGEN, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784508927360965096 X-GMAIL-MSGID: 1784508927360965096 Hi, Sometimes, a complicated constant is built via 3(or more) instructions to build. Generally speaking, it would not be as fast as loading it from the constant pool (as a few discussions in PR63281): * "ld" is one instruction. If consider "address/toc" adjust, we may count it as 2 instructions (the high part of address computation could be optimized as nop by linker further). And "pld" may need fewer cycles. * As testing(SPEC2017), it could get better/stable runtime if set the threshold as "> 2" (compare with "> 3"). As tested on spec2017, for visible performance changes, we can find the runtime improvement on 500.perlbench_r about ~1.8% (-O2, P10) with the patch. And for performance downgrades on other benchmarks, as investigated, the recessions are not caused by this patch. Compare with the previous version: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636566.html This version is refreshed based on the latest code. Boostrap & regtest pass on ppc64{,le}. Is this ok for trunk? BR, Jeff (Jiufu Guo) PR target/63281 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split complicate constant to memory. gcc/testsuite/ChangeLog: * gcc.target/powerpc/const_anchors.c: Update to test final-rtl. * gcc.target/powerpc/parall_5insn_const.c: Update to keep original test point. * gcc.target/powerpc/pr106550.c: Likewise.. * gcc.target/powerpc/pr106550_1.c: Likewise. * gcc.target/powerpc/pr87870.c: Update according to latest behavior. * gcc.target/powerpc/pr93012.c: Likewise. --- gcc/config/rs6000/rs6000.cc | 16 ++++++++++++++++ .../gcc.target/powerpc/const_anchors.c | 5 ++--- .../gcc.target/powerpc/parall_5insn_const.c | 14 ++++++++++++-- gcc/testsuite/gcc.target/powerpc/pr106550.c | 17 +++++++++++++++-- gcc/testsuite/gcc.target/powerpc/pr106550_1.c | 15 +++++++++++++-- gcc/testsuite/gcc.target/powerpc/pr87870.c | 5 ++++- gcc/testsuite/gcc.target/powerpc/pr93012.c | 5 ++++- 7 files changed, 66 insertions(+), 11 deletions(-) diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 2e074a21a05..e44a6da91ae 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -10271,6 +10271,22 @@ rs6000_emit_set_const (rtx dest, rtx source) c = sext_hwi (c, 32); emit_move_insn (lo, GEN_INT (c)); } + + /* If it can be stored to the constant pool and profitable. */ + else if (base_reg_operand (dest, mode) + && num_insns_constant (source, mode) > 2) + { + rtx sym = force_const_mem (mode, source); + if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0)) + && use_toc_relative_ref (XEXP (sym, 0), mode)) + { + rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest)); + sym = gen_const_mem (mode, toc); + set_mem_alias_set (sym, get_TOC_alias_set ()); + } + + emit_insn (gen_rtx_SET (dest, sym)); + } else rs6000_emit_set_long_const (dest, c); break; diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c b/gcc/testsuite/gcc.target/powerpc/const_anchors.c index 542e2674b12..188744165f2 100644 --- a/gcc/testsuite/gcc.target/powerpc/const_anchors.c +++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c @@ -1,5 +1,5 @@ /* { dg-do compile { target has_arch_ppc64 } } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -fdump-rtl-final" } */ #define C1 0x2351847027482577ULL #define C2 0x2351847027482578ULL @@ -16,5 +16,4 @@ void __attribute__ ((noinline)) foo1 (long long *a, long long b) if (b) *a++ = C2; } - -/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */ +/* { dg-final { scan-rtl-dump-times {\madddi3\M} 2 "final" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c index e3a9a7264cf..df0690b90be 100644 --- a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c @@ -9,8 +9,18 @@ void __attribute__ ((noinline)) foo (unsigned long long *a) { /* 2 lis + 2 ori + 1 rldimi for each constant. */ - *a++ = 0x800aabcdc167fa16ULL; - *a++ = 0x7543a876867f616ULL; + { + register long long d asm("r0") = 0x800aabcdc167fa16ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } + { + register long long d asm("r0") = 0x7543a876867f616ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } } long long A[] = {0x800aabcdc167fa16ULL, 0x7543a876867f616ULL}; diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c b/gcc/testsuite/gcc.target/powerpc/pr106550.c index 74e395331ab..5eca2b2f701 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr106550.c +++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c @@ -1,12 +1,25 @@ /* PR target/106550 */ /* { dg-options "-O2 -mdejagnu-cpu=power10" } */ /* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target has_arch_ppc64 } */ void foo (unsigned long long *a) { - *a++ = 0x020805006106003; /* pli+pli+rldimi */ - *a++ = 0x2351847027482577;/* pli+pli+rldimi */ + { + /* pli+pli+rldimi */ + register long long d asm("r0") = 0x020805006106003ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } + { + /* pli+pli+rldimi */ + register long long d asm("r0") = 0x2351847027482577ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } } /* { dg-final { scan-assembler-times {\mpli\M} 4 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c index 5ab40d71a56..80e6b817dff 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c +++ b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c @@ -13,8 +13,19 @@ foo (unsigned long long *a) asm("cntlzd %0, %1" : "=r"(n) : "r"(d)); *a++ = n; - *a++ = 0x235a8470a7480000ULL; /* pli+sldi+oris */ - *a++ = 0x23a184700000b677ULL; /* pli+sldi+ori */ + { + register long long d asm("r0") = 0x235a8470a7480000ULL; /* pli+sldi+oris */ + long long n; + asm("cntlzd %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } + + { + register long long d asm("r0") = 0x23a184700000b677ULL; /* pli+sldi+ori */ + long long n; + asm("cntlzd %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } } /* { dg-final { scan-assembler-times {\mpli\M} 3 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr87870.c b/gcc/testsuite/gcc.target/powerpc/pr87870.c index d2108ac3386..5fee06744ae 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr87870.c +++ b/gcc/testsuite/gcc.target/powerpc/pr87870.c @@ -25,4 +25,7 @@ test3 (void) return ((__int128)0xdeadbeefcafebabe << 64) | 0xfacefeedbaaaaaad; } -/* { dg-final { scan-assembler-not {\mld\M} } } */ +/* test3 using "ld" to load the value for r3 and r4. + test0, test1 and test2 are using "li". */ +/* { dg-final { scan-assembler-times {\mp?ld\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mli\M} 6 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr93012.c b/gcc/testsuite/gcc.target/powerpc/pr93012.c index 4f764d0576f..b9e869e4285 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr93012.c +++ b/gcc/testsuite/gcc.target/powerpc/pr93012.c @@ -10,4 +10,7 @@ unsigned long long mskh1() { return 0xffff9234ffff9234ULL; } unsigned long long mskl1() { return 0x2bcdffff2bcdffffULL; } unsigned long long mskse() { return 0xffff1234ffff1234ULL; } -/* { dg-final { scan-assembler-times {\mrldimi\M} 7 } } */ +/* { dg-final { scan-assembler-times {\mpli\M} 4 { target has_arch_pwr10 }} } */ +/* { dg-final { scan-assembler-times {\mrldimi\M} 7 { target has_arch_pwr10 } } } */ +/* { dg-final { scan-assembler-times {\mrldimi\M} 3 { target { ! has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\mld\M} 4 { target { ! has_arch_pwr10 } } } } */