From patchwork Mon Jun 19 01:14:55 2023
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 109702
Date: Mon, 19 Jun 2023 09:14:55 +0800
From: HAO CHEN GUI
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
Subject: [PATCH, rs6000] Generate mfvsrwz for all platforms and remove redundant zero extend [PR106769]

Hi,
  This patch modifies vsx extract expander and generates mfvsrwz/stxsiwx
for all platforms when the mode is V4SI and the index of extracted
element is 1 for BE and 2 for LE. Also this patch adds an insn pattern for
mfvsrwz which helps eliminate the redundant zero extend.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen

ChangeLog
rs6000: Generate mfvsrwz for all platforms and remove redundant zero extend

mfvsrwz has lower latency than xxextractuw.  So it should be generated
even with p9 vector enabled if possible.  Also, the instruction already
zero-extends its result.  A combine pattern is needed to eliminate
redundant zero extend instructions.

gcc/
        PR target/106769
        * config/rs6000/vsx.md (expand vsx_extract_<mode>): Skip calling
        gen_vsx_extract_<mode>_p9 when it can be implemented by
        mfvsrwz/stxsiwx.
        (*vsx_extract_<mode>_di_p9): Do not generate the insn when it can
        be generated by mfvsrwz.
        (mfvsrwz): New insn pattern.
        (*vsx_extract_si): Rename to...
        (vsx_extract_si): ..., remove redundant insn condition and
        generate the insn on p9 when it can be implemented by
        mfvsrwz/stxsiwx.  Add a dup alternative for simple vector moving.
        Remove reload_completed from split condition as it's unnecessary.
        Remove unnecessary checking from preparation statements.  Set
        type and length attributes for each alternative.

gcc/testsuite/
        PR target/106769
        * gcc.target/powerpc/pr106769.h: New.
        * gcc.target/powerpc/pr106769-p8.c: New.
        * gcc.target/powerpc/pr106769-p9.c: New.

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0a34ceebeb5..09b0f83db86 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3728,7 +3728,9 @@ (define_expand "vsx_extract_<mode>"
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
 {
   /* If we have ISA 3.0, we can do a xxextractuw/vextractu{b,h}.  */
-  if (TARGET_P9_VECTOR)
+  if (TARGET_P9_VECTOR
+      && (<MODE>mode != V4SImode
+          || INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 1 : 2)))
     {
       emit_insn (gen_vsx_extract_<mode>_p9 (operands[0], operands[1],
                                             operands[2]));
@@ -3798,7 +3800,9 @@ (define_insn_and_split "*vsx_extract_<mode>_di_p9"
           (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "v,<VSX_EX>")
           (parallel [(match_operand:QI 2 "const_int_operand" "n,n")]))))
    (clobber (match_scratch:SI 3 "=r,X"))]
-  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_VEXTRACTUB"
+  "TARGET_VEXTRACTUB
+   && (<MODE>mode != V4SImode
+       || INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 1 : 2))"
   "#"
   "&& reload_completed"
   [(parallel [(set (match_dup 4)
@@ -3830,58 +3834,67 @@ (define_insn_and_split "*vsx_extract_<mode>_store_p9"
    (set (match_dup 0)
         (match_dup 3))])
 
-(define_insn_and_split "*vsx_extract_si"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,wa,Z")
+(define_insn "mfvsrwz"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (zero_extend:DI
+          (vec_select:SI
+            (match_operand:V4SI 1 "vsx_register_operand" "wa")
+            (parallel [(match_operand:QI 2 "const_int_operand" "n")]))))
+   (clobber (match_scratch:V4SI 3 "=v"))]
+  "TARGET_DIRECT_MOVE_64BIT
+   && INTVAL (operands[2]) == (BYTES_BIG_ENDIAN ? 1 : 2)"
+  "mfvsrwz %0,%x1"
+  [(set_attr "type" "mfvsr")
+   (set_attr "isa" "p8v")])
+
+(define_insn_and_split "vsx_extract_si"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,wa,Z,wa")
         (vec_select:SI
-         (match_operand:V4SI 1 "gpc_reg_operand" "v,v,v")
-         (parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n")])))
-   (clobber (match_scratch:V4SI 3 "=v,v,v"))]
-  "VECTOR_MEM_VSX_P (V4SImode) && TARGET_DIRECT_MOVE_64BIT && !TARGET_P9_VECTOR"
-  "#"
-  "&& reload_completed"
+         (match_operand:V4SI 1 "gpc_reg_operand" "v,v,v,0")
+         (parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n,n")])))
+   (clobber (match_scratch:V4SI 3 "=v,v,v,v"))]
+  "TARGET_DIRECT_MOVE_64BIT
+   && (!TARGET_P9_VECTOR || INTVAL (operands[2]) == (BYTES_BIG_ENDIAN ? 1 : 2))"
+{
+  if (which_alternative == 0)
+    return "mfvsrwz %0,%x1";
+
+  if (which_alternative == 1)
+    return "xxlor %x0,%x1,%x1";
+
+  if (which_alternative == 2)
+    return "stxsiwx %x1,%y0";
+
+  return ASM_COMMENT_START " vec_extract to same register";
+}
+  "&& INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 1 : 2)"
   [(const_int 0)]
 {
   rtx dest = operands[0];
   rtx src = operands[1];
   rtx element = operands[2];
-  rtx vec_tmp = operands[3];
-  int value;
+  rtx vec_tmp;
+
+  if (GET_CODE (operands[3]) == SCRATCH)
+    vec_tmp = gen_reg_rtx (V4SImode);
+  else
+    vec_tmp = operands[3];
 
   /* Adjust index for LE element ordering, the below minuend 3 is computed by
      GET_MODE_NUNITS (V4SImode) - 1.  */
   if (!BYTES_BIG_ENDIAN)
     element = GEN_INT (3 - INTVAL (element));
 
-  /* If the value is in the correct position, we can avoid doing the VSPLT
-     instruction.  */
-  value = INTVAL (element);
-  if (value != 1)
-    emit_insn (gen_altivec_vspltw_direct (vec_tmp, src, element));
-  else
-    vec_tmp = src;
-
-  if (MEM_P (operands[0]))
-    {
-      if (can_create_pseudo_p ())
-        dest = rs6000_force_indexed_or_indirect_mem (dest);
+  emit_insn (gen_altivec_vspltw_direct (vec_tmp, src, element));
 
-      if (TARGET_P8_VECTOR)
-        emit_move_insn (dest, gen_rtx_REG (SImode, REGNO (vec_tmp)));
-      else
-        emit_insn (gen_stfiwx (dest, gen_rtx_REG (DImode, REGNO (vec_tmp))));
-    }
-
-  else if (TARGET_P8_VECTOR)
-    emit_move_insn (dest, gen_rtx_REG (SImode, REGNO (vec_tmp)));
-  else
-    emit_move_insn (gen_rtx_REG (DImode, REGNO (dest)),
-                    gen_rtx_REG (DImode, REGNO (vec_tmp)));
+  int value = BYTES_BIG_ENDIAN ? 1 : 2;
+  emit_insn (gen_vsx_extract_si (dest, vec_tmp, GEN_INT (value)));
 
   DONE;
 }
-  [(set_attr "type" "mfvsr,vecperm,fpstore")
-   (set_attr "length" "8")
-   (set_attr "isa" "*,p8v,*")])
+  [(set_attr "type" "mfvsr,veclogical,fpstore,*")
+   (set_attr "length" "4,4,4,0")
+   (set_attr "isa" "p8v,*,p8v,*")])
 
 (define_insn_and_split "*vsx_extract_<mode>_p8"
   [(set (match_operand:<VEC_base> 0 "nonimmediate_operand" "=r")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106769-p8.c b/gcc/testsuite/gcc.target/powerpc/pr106769-p8.c
new file mode 100644
index 00000000000..e7cdbc76298
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106769-p8.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+
+#include "pr106769.h"
+
+/* { dg-final { scan-assembler {\mmfvsrwz\M} } } */
+/* { dg-final { scan-assembler {\mstxsiwx\M} } } */
+/* { dg-final { scan-assembler-not {\mrldicl\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106769-p9.c b/gcc/testsuite/gcc.target/powerpc/pr106769-p9.c
new file mode 100644
index 00000000000..d21c7f13382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106769-p9.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+
+#include "pr106769.h"
+
+/* { dg-final { scan-assembler {\mmfvsrwz\M} } } */
+/* { dg-final { scan-assembler {\mstxsiwx\M} } } */
+/* { dg-final { scan-assembler-not {\mrldicl\M} } } */
+/* { dg-final { scan-assembler-not {\mxxextractuw\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106769.h b/gcc/testsuite/gcc.target/powerpc/pr106769.h
new file mode 100644
index 00000000000..1c8c8a024f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106769.h
@@ -0,0 +1,17 @@
+#include <altivec.h>
+
+#ifdef __BIG_ENDIAN__
+#define LANE 1
+#else
+#define LANE 2
+#endif
+
+unsigned int foo1 (vector unsigned int v)
+{
+  return vec_extract(v, LANE);
+}
+
+void foo2 (vector unsigned int v, unsigned int* p)
+{
+  *p = vec_extract(v, LANE);
+}
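
For readers who want to check the lane arithmetic by hand, the following is a small portable C model of the index test the patch relies on. It is only a sketch and not part of the patch: the function names (register_word, direct_move_ok, extract_zero_extended) are hypothetical and do not exist in GCC. The model assumes, as the insn conditions in the patch state, that the directly movable element is index 1 on BE and index 2 on LE, and it mirrors the "3 - INTVAL (element)" adjustment from the split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: map a C-level element index of a 4 x 32-bit vector
   to the word number in big-endian register numbering.  On LE the element
   order is reversed, which is the "3 - INTVAL (element)" step in the split.  */
int register_word(int c_index, bool big_endian)
{
    assert(c_index >= 0 && c_index <= 3);
    return big_endian ? c_index : 3 - c_index;
}

/* Hypothetical helper: true when the element already sits in register
   word 1, i.e. index 1 on BE or index 2 on LE, so no splat is needed
   before the direct move or store.  */
bool direct_move_ok(int c_index, bool big_endian)
{
    return register_word(c_index, big_endian) == 1;
}

/* Model of the value a zero-extending 32-bit extract produces.  The array
   is indexed by register word number, not by C element index.  The upper
   32 bits of the result are always zero, so a following explicit zero
   extend would be redundant.  */
uint64_t extract_zero_extended(const uint32_t words[4], int c_index,
                               bool big_endian)
{
    return (uint64_t) words[register_word(c_index, big_endian)];
}
```

Under this model, direct_move_ok (1, true) and direct_move_ok (2, false) both hold, which matches the LANE macro in pr106769.h; any other index would first go through the vspltw step in the split.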