From patchwork Fri Mar 1 02:41:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 208619
Date: Fri, 1 Mar 2024 10:41:16 +0800
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
From: HAO CHEN GUI
Subject: [PATCH, rs6000] Add subreg patterns for SImode rotate and mask insert
List-Id: Gcc-patches mailing list

Hi,
  This patch fixes regression cases in gcc.target/powerpc/rlwimi-2.c. In
the combine pass, an SImode (subreg from DImode) lshiftrt is converted to a
DImode lshiftrt with an outer AND. It matches a DImode rotate and mask
insert on rs6000.

Trying 2 -> 7:
    2: r122:DI=r129:DI
      REG_DEAD r129:DI
    7: r125:SI=r122:DI#0 0>>0x1f
      REG_DEAD r122:DI
Failed to match this instruction:
(set (subreg:DI (reg:SI 125 [ x ]) 0)
    (zero_extract:DI (reg:DI 129)
        (const_int 32 [0x20])
        (const_int 1 [0x1])))
Successfully matched this instruction:
(set (subreg:DI (reg:SI 125 [ x ]) 0)
    (and:DI (lshiftrt:DI (reg:DI 129)
            (const_int 31 [0x1f]))
        (const_int 4294967295 [0xffffffff])))

This conversion blocks the further combination that would produce an SImode
rotate and mask insert insn.
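
As a rough arithmetic illustration of the two forms in the dump above (not
part of the patch; the helper names are made up for this sketch), the narrow
SImode shift and the widened DImode shift with the 0xffffffff AND can be
written in C as below. They give the same low word whenever the upper 32 bits
of the source register are zero; what combine actually knows about those bits
is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* SImode view (hypothetical helper): logical right shift of the low
   32 bits of the register, as in "r125:SI=r122:DI#0 0>>0x1f".  */
static uint32_t narrow_lshiftrt(uint64_t reg, unsigned n)
{
    assert(n < 32);               /* assumption: SImode shift count */
    return (uint32_t)reg >> n;
}

/* DImode view (hypothetical helper): shift the full 64-bit register,
   AND with 0xffffffff, then take the low word of the result.  */
static uint32_t wide_lshiftrt_masked(uint64_t reg, unsigned n)
{
    assert(n < 32);               /* same assumption as above */
    return (uint32_t)((reg >> n) & 0xffffffffu);
}
```

With a source such as 0x80000000 both helpers return 1 for a shift of 31;
with 0x100000000 (a bit set above the low word) they differ.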

Trying 9, 7 -> 10:
    9: r127:SI=r130:DI#0&0xfffffffffffffffe
      REG_DEAD r130:DI
    7: r125:SI#0=r129:DI 0>>0x1f&0xffffffff
      REG_DEAD r129:DI
   10: r124:SI=r127:SI|r125:SI
      REG_DEAD r125:SI
      REG_DEAD r127:SI
Failed to match this instruction:
(set (reg:SI 124)
    (ior:SI (and:SI (subreg:SI (reg:DI 130) 0)
            (const_int -2 [0xfffffffffffffffe]))
        (subreg:SI (zero_extract:DI (reg:DI 129)
                (const_int 32 [0x20])
                (const_int 1 [0x1])) 0)))
Failed to match this instruction:
(set (reg:SI 124)
    (ior:SI (and:SI (subreg:SI (reg:DI 130) 0)
            (const_int -2 [0xfffffffffffffffe]))
        (subreg:SI (and:DI (lshiftrt:DI (reg:DI 129)
                    (const_int 31 [0x1f]))
                (const_int 4294967295 [0xffffffff])) 0)))

  The underlying question is whether it is necessary to widen the mode of
lshiftrt when the target already has a narrow-mode lshiftrt whose cost is
not high. My former patch tried to fix the problem but has not been accepted
yet.
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624852.html

  As it's stage 4 now, I drafted this patch to fix the regression by adding
subreg patterns of SImode rotate and mask insert. It actually does the
reverse and narrows the mode of lshiftrt so that it can match the SImode
rotate and mask insert.

  The case "rlwimi-2.c" is fixed and its insn counts are restored to the
original ones. The case "rlwinm-0.c" also changes: 9 "rlwinm" are replaced
with 9 "rldicl" because the order of combination changes. It's not a
regression as the total number of insns is unchanged.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is it OK for the trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Add subreg patterns for SImode rotate and mask insert

In combine pass, SImode (subreg from DImode) lshiftrt is converted to DImode
lshiftrt with an AND.  The new pattern matches rotate and mask insert on
rs6000.  Thus it blocks the pattern from being further combined into an
SImode rotate and mask insert pattern.
This patch fixes the problem by adding two subreg patterns for the SImode
rotate and mask insert patterns.

gcc/
	PR target/93738
	* config/rs6000/rs6000.md (*rotlsi3_insert_9): New.
	(*rotlsi3_insert_8): New.

gcc/testsuite/
	PR target/93738
	* gcc.target/powerpc/rlwimi-2.c: Adjust the number of 64bit and 32bit
	rotate instructions.
	* gcc.target/powerpc/rlwinm-0.c: Likewise.

patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bc8bc6ab060..b0b40f91e3e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4253,6 +4253,36 @@ (define_insn "*rotl<mode>3_insert"
 ; difference between rlwimi and rldimi.  We also might want dot forms,
 ; but not for rlwimi on POWER4 and similar processors.
 
+; Subreg pattern of insn "*rotlsi3_insert"
+(define_insn_and_split "*rotlsi3_insert_9"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (ior:SI (and:SI
+                 (match_operator:SI 8 "lowpart_subreg_operator"
+                  [(and:DI (match_operator:DI 4 "rotate_mask_operator"
+                            [(match_operand:DI 1 "gpc_reg_operand" "r")
+                             (match_operand:SI 2 "const_int_operand" "n")])
+                           (match_operand:DI 3 "const_int_operand" "n"))])
+                 (match_operand:SI 5 "const_int_operand" "n"))
+                (and:SI (match_operand:SI 6 "gpc_reg_operand" "0")
+                        (match_operand:SI 7 "const_int_operand" "n"))))]
+  "rs6000_is_valid_insert_mask (operands[5], operands[4], SImode)
+   && GET_CODE (operands[4]) == LSHIFTRT
+   && INTVAL (operands[3]) == 0xffffffff
+   && UINTVAL (operands[5]) + UINTVAL (operands[7]) + 1 == 0"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (ior:SI (and:SI (lshiftrt:SI (match_dup 9)
+                                     (match_dup 2))
+                        (match_dup 5))
+                (and:SI (match_dup 6)
+                        (match_dup 7))))]
+{
+  int offset = BYTES_BIG_ENDIAN ? 4 : 0;
+  operands[9] = gen_rtx_SUBREG (SImode, operands[1], offset);
+}
+  [(set_attr "type" "insert")])
+
 (define_insn "*rotl<mode>3_insert_2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (ior:GPR (and:GPR (match_operand:GPR 5 "gpc_reg_operand" "0")
@@ -4331,6 +4361,31 @@ (define_insn "*rotlsi3_insert_4"
   "rlwimi %0,%1,32-%h2,%h2,31"
   [(set_attr "type" "insert")])
 
+; Subreg pattern of insn "*rotlsi3_insert_4"
+(define_insn_and_split "*rotlsi3_insert_8"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (ior:SI (and:SI (match_operand:SI 3 "gpc_reg_operand" "0")
+                        (match_operand:SI 4 "const_int_operand" "n"))
+                (match_operator:SI 6 "lowpart_subreg_operator"
+                 [(and:DI
+                   (lshiftrt:DI (match_operand:DI 1 "gpc_reg_operand" "r")
+                                (match_operand:SI 2 "const_int_operand" "n"))
+                   (match_operand:DI 5 "const_int_operand" "n"))])))]
+  "INTVAL (operands[2]) + exact_log2 (-UINTVAL (operands[4])) == 32
+   && INTVAL (operands[5]) == 0xffffffff"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (ior:SI (and:SI (match_dup 3)
+                        (match_dup 4))
+                (lshiftrt:SI (match_dup 7)
+                             (match_dup 2))))]
+{
+  int offset = BYTES_BIG_ENDIAN ? 4 : 0;
+  operands[7] = gen_rtx_SUBREG (SImode, operands[1], offset);
+}
+  [(set_attr "type" "insert")])
+
 (define_insn "*rotlsi3_insert_5"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
         (ior:SI (and:SI (match_operand:SI 1 "gpc_reg_operand" "0,r")
diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
index bafa371db73..62344a95aa0 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
@@ -6,10 +6,9 @@
 /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 6728 { target lp64 } } } */
 
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 } } */
 
 /* { dg-final { scan-assembler-times {(?n)^\s+mulli} 5036 } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
index 4f4fca2d8ef..a10d9174306 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
@@ -4,10 +4,10 @@
 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 6739 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9716 { target lp64 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+blr} 3375 } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3081 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3090 { target lp64 } } } */
 
 /* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3197 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3093 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3084 { target lp64 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+rotlwi} 154 } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+srwi} 13 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+srdi} 13 { target lp64 } } } */
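
For reviewers who want to sanity-check the two arithmetic insn conditions in
the new patterns, here is a minimal C sketch of what they test. It is only an
illustration: the function names are hypothetical, exact_log2_u64 is a
stand-in written for this sketch rather than GCC's exact_log2, and const_int
values are modelled as sign-extended 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors "UINTVAL (operands[5]) + UINTVAL (operands[7]) + 1 == 0":
   true exactly when the two masks are bitwise complements.  */
static int masks_are_complementary(int64_t mask5, int64_t mask7)
{
    return (uint64_t)mask5 + (uint64_t)mask7 + 1 == 0;
}

/* Stand-in for exact_log2 (hypothetical, for this sketch only):
   log2 of x when x is a power of two, otherwise -1.  */
static int exact_log2_u64(uint64_t x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* Mirrors "INTVAL (operands[2]) + exact_log2 (-UINTVAL (operands[4])) == 32":
   the keep mask clears the low k bits and the shift count is 32 - k.  */
static int shift_matches_keep_mask(int64_t shift, int64_t keep_mask)
{
    assert(shift >= 0 && shift < 32);  /* assumption: SImode shift count */
    return shift + exact_log2_u64(-(uint64_t)keep_mask) == 32;
}
```

For the "Trying 9, 7 -> 10" dump, the keep mask is -2 and the shift count is
31, so the second check holds (31 + 1 == 32); likewise the masks 1 and -2 are
complementary in the sense of the first check.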