From patchwork Fri Nov 10 23:11:20 2023
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 164025
Date: Fri, 10 Nov 2023 18:11:20 -0500
From: Michael Meissner
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool,
    "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: [PATCH 2/4] Add support for integer vector pair built-ins
List-Id: Gcc-patches mailing list

This patch adds a series of built-in functions that let users write code for a
number of simple operations where the loop is done using the __vector_pair
type.  The __vector_pair type is an opaque type.  These built-in functions keep
the two 128-bit vectors within the __vector_pair together, and split the
operation after register allocation.

This patch provides vector pair operations for 8, 16, 32, and 64-bit integers.

I have built and tested these patches on:

    *   A little endian power10 server using --with-cpu=power10
    *   A little endian power9 server using --with-cpu=power9
    *   A big endian power9 server using --with-cpu=power9.

Can I check this patch into the master branch after the preceding patch is
checked in?

2023-11-09  Michael Meissner

gcc/

        * config/rs6000/rs6000-builtins.def (__builtin_vpair_i8*): Add built-in
        functions for integer vector pairs.
        (__builtin_vpair_i16*): Likewise.
        (__builtin_vpair_i32*): Likewise.
        (__builtin_vpair_i64*): Likewise.
        * config/rs6000/vector-pair.md (UNSPEC_VPAIR_V32QI): New unspec.
        (UNSPEC_VPAIR_V16HI): Likewise.
        (UNSPEC_VPAIR_V8SI): Likewise.
        (UNSPEC_VPAIR_V4DI): Likewise.
        (VP_INT_BINARY): New iterator for integer vector pair.
        (vp_insn): Add support for integer vector pairs.
        (vp_ireg): New code attribute for integer vector pairs.
        (vp_ipredicate): Likewise.
        (VP_INT): New int iterator for integer vector pairs.
        (VP_VEC_MODE): Likewise.
        (vp_pmode): Likewise.
        (vp_vmode): Likewise.
        (vp_neg_reg): New int iterator for integer vector pairs.
        (vpair_neg_): Add integer vector pair support insns.
        (vpair_not_2): Likewise.
        (vpair__3): Likewise.
        (vpair_andc_): Likewise.
        (vpair_nand__1): Likewise.
        (vpair_nand__2): Likewise.
        (vpair_nor__1): Likewise.
        (vpair_nor__2): Likewise.
        * doc/extend.texi (PowerPC Vector Pair Built-in Functions): Document the
        integer vector pair built-in functions.

gcc/testsuite/

        * gcc.target/powerpc/vector-pair-5.c: New test.
        * gcc.target/powerpc/vector-pair-6.c: New test.
        * gcc.target/powerpc/vector-pair-7.c: New test.
        * gcc.target/powerpc/vector-pair-8.c: New test.
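
For readers who want to see the intended semantics without a power10 toolchain,
the following is a minimal portable C sketch, not code from the patch.  It
models a vector pair as two 128-bit halves and applies each operation half by
half, which mirrors how the insns are split after register allocation.  The
type vpair_u8_model and the model_* functions are hypothetical names invented
for this illustration; the real __vector_pair type is opaque and the real
built-ins are only available when compiling for a target with MMA support.
The negate model follows the approach the patch describes for 8-bit and 16-bit
elements, where -a is computed as 0 - a using a zeroed scratch value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __vector_pair (an assumption for illustration,
   not the real opaque type): two 128-bit halves of 16 bytes each.  */
typedef struct { uint8_t half[2][16]; } vpair_u8_model;

/* Models __builtin_vpair_i8_add: one pair-level operation carried out as two
   independent 128-bit adds.  Element arithmetic wraps modulo 2^8.  */
static vpair_u8_model
model_i8_add (vpair_u8_model a, vpair_u8_model b)
{
  vpair_u8_model r;
  for (int h = 0; h < 2; h++)
    for (int i = 0; i < 16; i++)
      r.half[h][i] = (uint8_t) (a.half[h][i] + b.half[h][i]);
  return r;
}

/* Models __builtin_vpair_i8_sub in the same half-by-half fashion.  */
static vpair_u8_model
model_i8_sub (vpair_u8_model a, vpair_u8_model b)
{
  vpair_u8_model r;
  for (int h = 0; h < 2; h++)
    for (int i = 0; i < 16; i++)
      r.half[h][i] = (uint8_t) (a.half[h][i] - b.half[h][i]);
  return r;
}

/* Models __builtin_vpair_i8_neg the way the patch describes it for 8-bit and
   16-bit elements: load zero into a scratch value, then compute 0 - a.  */
static vpair_u8_model
model_i8_neg (vpair_u8_model a)
{
  vpair_u8_model zero;
  memset (&zero, 0, sizeof zero);
  return model_i8_sub (zero, a);
}

/* Small self-check: a + (-a) must be zero in every element of both halves.  */
static void
vpair_model_selfcheck (void)
{
  vpair_u8_model a;
  for (int h = 0; h < 2; h++)
    for (int i = 0; i < 16; i++)
      a.half[h][i] = (uint8_t) (17 * i + 3 * h);

  vpair_u8_model sum = model_i8_add (a, model_i8_neg (a));
  for (int h = 0; h < 2; h++)
    for (int i = 0; i < 16; i++)
      assert (sum.half[h][i] == 0);
}
```

Calling vpair_model_selfcheck () from a test driver exercises both halves.  On
real hardware the same source-level shape would instead use the
__builtin_vpair_i8_* functions on __vector_pair values.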
---
 gcc/config/rs6000/rs6000-builtins.def         | 144 +++++++++
 gcc/config/rs6000/vector-pair.md              | 280 +++++++++++++++++-
 gcc/doc/extend.texi                           |  72 +++++
 .../gcc.target/powerpc/vector-pair-5.c        | 193 ++++++++++++
 .../gcc.target/powerpc/vector-pair-6.c        | 193 ++++++++++++
 .../gcc.target/powerpc/vector-pair-7.c        | 193 ++++++++++++
 .../gcc.target/powerpc/vector-pair-8.c        | 194 ++++++++++++
 7 files changed, 1266 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-5.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-6.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-7.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-8.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 89b248b50ef..3b2db39c1ab 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4183,3 +4183,147 @@

   v256 __builtin_vpair_f64_sub (v256, v256);
     VPAIR_F64_SUB vpair_sub_v4df3 {mma,pair}
+
+;; vector pair built-in functions for 32 8-bit unsigned char or
+;; signed char values
+
+  v256 __builtin_vpair_i8_add (v256, v256);
+    VPAIR_I8_ADD vpair_add_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_and (v256, v256);
+    VPAIR_I8_AND vpair_and_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_ior (v256, v256);
+    VPAIR_I8_IOR vpair_ior_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_max (v256, v256);
+    VPAIR_I8_MAX vpair_smax_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_min (v256, v256);
+    VPAIR_I8_MIN vpair_smin_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_neg (v256);
+    VPAIR_I8_NEG vpair_neg_v32qi2 {mma,pair}
+
+  v256 __builtin_vpair_i8_not (v256);
+    VPAIR_I8_NOT vpair_not_v32qi2 {mma,pair}
+
+  v256 __builtin_vpair_i8_sub (v256, v256);
+    VPAIR_I8_SUB vpair_sub_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8_xor (v256, v256);
+    VPAIR_I8_XOR vpair_xor_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8u_max (v256, v256);
+    VPAIR_I8U_MAX vpair_umax_v32qi3 {mma,pair}
+
+  v256 __builtin_vpair_i8u_min (v256, v256);
+    VPAIR_I8U_MIN vpair_umin_v32qi3 {mma,pair}
+
+;; vector pair built-in functions for 16 16-bit unsigned short or
+;; signed short values
+
+  v256 __builtin_vpair_i16_add (v256, v256);
+    VPAIR_I16_ADD vpair_add_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_and (v256, v256);
+    VPAIR_I16_AND vpair_and_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_ior (v256, v256);
+    VPAIR_I16_IOR vpair_ior_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_max (v256, v256);
+    VPAIR_I16_MAX vpair_smax_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_min (v256, v256);
+    VPAIR_I16_MIN vpair_smin_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_neg (v256);
+    VPAIR_I16_NEG vpair_neg_v16hi2 {mma,pair}
+
+  v256 __builtin_vpair_i16_not (v256);
+    VPAIR_I16_NOT vpair_not_v16hi2 {mma,pair}
+
+  v256 __builtin_vpair_i16_sub (v256, v256);
+    VPAIR_I16_SUB vpair_sub_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16_xor (v256, v256);
+    VPAIR_I16_XOR vpair_xor_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16u_max (v256, v256);
+    VPAIR_I16U_MAX vpair_umax_v16hi3 {mma,pair}
+
+  v256 __builtin_vpair_i16u_min (v256, v256);
+    VPAIR_I16U_MIN vpair_umin_v16hi3 {mma,pair}
+
+;; vector pair built-in functions for 8 32-bit unsigned int or
+;; signed int values
+
+  v256 __builtin_vpair_i32_add (v256, v256);
+    VPAIR_I32_ADD vpair_add_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32_and (v256, v256);
+    VPAIR_I32_AND vpair_and_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32_ior (v256, v256);
+    VPAIR_I32_IOR vpair_ior_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32_max (v256, v256);
+    VPAIR_I32_MAX vpair_smax_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32_min (v256, v256);
+    VPAIR_I32_MIN vpair_smin_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32_neg (v256);
+    VPAIR_I32_NEG vpair_neg_v8si2 {mma,pair}
+
+  v256 __builtin_vpair_i32_not (v256);
+    VPAIR_I32_NOT vpair_not_v8si2 {mma,pair}
+
+  v256 __builtin_vpair_i32_sub (v256, v256);
+    VPAIR_I32_SUB vpair_sub_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32_xor (v256, v256);
+    VPAIR_I32_XOR vpair_xor_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32u_max (v256, v256);
+    VPAIR_I32U_MAX vpair_umax_v8si3 {mma,pair}
+
+  v256 __builtin_vpair_i32u_min (v256, v256);
+    VPAIR_I32U_MIN vpair_umin_v8si3 {mma,pair}
+
+;; vector pair built-in functions for 4 64-bit unsigned long long or
+;; signed long long values
+
+  v256 __builtin_vpair_i64_add (v256, v256);
+    VPAIR_I64_ADD vpair_add_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64_and (v256, v256);
+    VPAIR_I64_AND vpair_and_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64_ior (v256, v256);
+    VPAIR_I64_IOR vpair_ior_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64_max (v256, v256);
+    VPAIR_I64_MAX vpair_smax_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64_min (v256, v256);
+    VPAIR_I64_MIN vpair_smin_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64_neg (v256);
+    VPAIR_I64_NEG vpair_neg_v4di2 {mma,pair}
+
+  v256 __builtin_vpair_i64_not (v256);
+    VPAIR_I64_NOT vpair_not_v4di2 {mma,pair}
+
+  v256 __builtin_vpair_i64_sub (v256, v256);
+    VPAIR_I64_SUB vpair_sub_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64_xor (v256, v256);
+    VPAIR_I64_XOR vpair_xor_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64u_max (v256, v256);
+    VPAIR_I64U_MAX vpair_umax_v4di3 {mma,pair}
+
+  v256 __builtin_vpair_i64u_min (v256, v256);
+    VPAIR_I64U_MIN vpair_umin_v4di3 {mma,pair}
diff --git a/gcc/config/rs6000/vector-pair.md b/gcc/config/rs6000/vector-pair.md
index 2dcac6a31e2..cd14430f47a 100644
--- a/gcc/config/rs6000/vector-pair.md
+++ b/gcc/config/rs6000/vector-pair.md
@@ -29,38 +29,102 @@
 (define_c_enum "unspec"
   [UNSPEC_VPAIR_V4DF
    UNSPEC_VPAIR_V8SF
+   UNSPEC_VPAIR_V32QI
+   UNSPEC_VPAIR_V16HI
+   UNSPEC_VPAIR_V8SI
+   UNSPEC_VPAIR_V4DI
    ])

 ;; Iterator doing unary/binary arithmetic on vector pairs
 (define_code_iterator VP_FP_UNARY  [abs neg])
 (define_code_iterator VP_FP_BINARY [minus mult plus smin smax])

+(define_code_iterator VP_INT_BINARY [and ior minus plus smax smin umax umin xor])
+
 ;; Return the insn name from the VP_* code iterator
 (define_code_attr vp_insn [(abs   "abs")
+                           (and   "and")
+                           (ior   "ior")
                            (minus "sub")
                            (mult  "mul")
+                           (not   "one_cmpl")
                            (neg   "neg")
                            (plus  "add")
                            (smin  "smin")
                            (smax  "smax")
+                           (umin  "umin")
+                           (umax  "umax")
                            (xor   "xor")])

+;; Return the register constraint ("v" or "wa") for the integer code iterator
+;; used.  For arithmetic operations, we need to use "v" in order to use the
+;; Altivec instruction.  For logical operations, we can use wa.
+(define_code_attr vp_ireg [(and   "wa")
+                           (ior   "wa")
+                           (minus "v")
+                           (not   "wa")
+                           (neg   "v")
+                           (plus  "v")
+                           (smax  "v")
+                           (smin  "v")
+                           (umax  "v")
+                           (umin  "v")
+                           (xor   "wa")])
+
+;; Return the register predicate for the integer code iterator used
+(define_code_attr vp_ipredicate [(and   "vsx_register_operand")
+                                 (ior   "vsx_register_operand")
+                                 (minus "altivec_register_operand")
+                                 (not   "vsx_register_operand")
+                                 (neg   "altivec_register_operand")
+                                 (plus  "altivec_register_operand")
+                                 (smax  "altivec_register_operand")
+                                 (smin  "altivec_register_operand")
+                                 (umax  "altivec_register_operand")
+                                 (umin  "altivec_register_operand")
+                                 (xor   "vsx_register_operand")])
+
 ;; Iterator for creating the unspecs for vector pair built-ins
 (define_int_iterator VP_FP [UNSPEC_VPAIR_V4DF
                             UNSPEC_VPAIR_V8SF])

+(define_int_iterator VP_INT [UNSPEC_VPAIR_V4DI
+                             UNSPEC_VPAIR_V8SI
+                             UNSPEC_VPAIR_V16HI
+                             UNSPEC_VPAIR_V32QI])
+
 ;; Map VP_* to vector mode of the arguments after they are split
 (define_int_attr VP_VEC_MODE [(UNSPEC_VPAIR_V4DF "V2DF")
-                              (UNSPEC_VPAIR_V8SF "V4SF")])
+                              (UNSPEC_VPAIR_V8SF "V4SF")
+                              (UNSPEC_VPAIR_V32QI "V16QI")
+                              (UNSPEC_VPAIR_V16HI "V8HI")
+                              (UNSPEC_VPAIR_V8SI "V4SI")
+                              (UNSPEC_VPAIR_V4DI "V2DI")])

 ;; Map VP_* to a lower case name to identify the vector pair.
 (define_int_attr vp_pmode [(UNSPEC_VPAIR_V4DF "v4df")
-                           (UNSPEC_VPAIR_V8SF "v8sf")])
+                           (UNSPEC_VPAIR_V8SF "v8sf")
+                           (UNSPEC_VPAIR_V32QI "v32qi")
+                           (UNSPEC_VPAIR_V16HI "v16hi")
+                           (UNSPEC_VPAIR_V8SI "v8si")
+                           (UNSPEC_VPAIR_V4DI "v4di")])

 ;; Map VP_* to a lower case name to identify the vector after the vector pair
 ;; has been split.
 (define_int_attr vp_vmode [(UNSPEC_VPAIR_V4DF "v2df")
-                           (UNSPEC_VPAIR_V8SF "v4sf")])
+                           (UNSPEC_VPAIR_V8SF "v4sf")
+                           (UNSPEC_VPAIR_V32QI "v16qi")
+                           (UNSPEC_VPAIR_V16HI "v8hi")
+                           (UNSPEC_VPAIR_V8SI "v4si")
+                           (UNSPEC_VPAIR_V4DI "v2di")])
+
+;; Map VP_INT to constraints used for the negate scratch register.  For vectors
+;; of QI and HI, we need to change -a into 0 - a since we don't have a negate
+;; operation.  We do have a vnegw/vnegd operation for SI and DI modes.
+(define_int_attr vp_neg_reg [(UNSPEC_VPAIR_V32QI "&v")
+                             (UNSPEC_VPAIR_V16HI "&v")
+                             (UNSPEC_VPAIR_V8SI "X")
+                             (UNSPEC_VPAIR_V4DI "X")])

 ;; Vector pair floating point unary operations
@@ -327,3 +391,213 @@ (define_insn_and_split "*vpair_nfms_fpcontract_4"
 {
 }
   [(set_attr "length" "8")])
+
+
+;; Vector pair integer negate support.
+(define_insn_and_split "vpair_neg_2"
+  [(set (match_operand:OO 0 "altivec_register_operand" "=v")
+        (unspec:OO [(neg:OO
+                     (match_operand:OO 1 "altivec_register_operand" "v"))]
+                   VP_INT))
+   (clobber (match_scratch: 2 "="))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (minus: (match_dup 2)
+                              (match_dup 5)))
+   (set (match_dup 6) (minus: (match_dup 2)
+                              (match_dup 7)))]
+{
+  unsigned reg0 = reg_or_subregno (operands[0]);
+  unsigned reg1 = reg_or_subregno (operands[1]);
+  machine_mode vmode = mode;
+
+  operands[3] = CONST0_RTX (vmode);
+
+  operands[4] = gen_rtx_REG (vmode, reg0);
+  operands[5] = gen_rtx_REG (vmode, reg1);
+
+  operands[6] = gen_rtx_REG (vmode, reg0 + 1);
+  operands[7] = gen_rtx_REG (vmode, reg1 + 1);
+
+  /* If the vector integer size is 32 or 64 bits, we can use the vneg{w,d}
+     instructions.  */
+  if (vmode == V4SImode)
+    {
+      emit_insn (gen_negv4si2 (operands[4], operands[5]));
+      emit_insn (gen_negv4si2 (operands[6], operands[7]));
+      DONE;
+    }
+  else if (vmode == V2DImode)
+    {
+      emit_insn (gen_negv2di2 (operands[4], operands[5]));
+      emit_insn (gen_negv2di2 (operands[6], operands[7]));
+      DONE;
+    }
+}
+  [(set_attr "length" "8")])
+
+;; Vector pair integer not support.
+(define_insn_and_split "vpair_not_2"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
+        (unspec:OO [(not:OO (match_operand:OO 1 "vsx_register_operand" "wa"))]
+                   VP_INT))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_unary_vector_pair (mode, operands,
+                           gen_one_cmpl2);
+  DONE;
+}
+  [(set_attr "length" "8")])
+
+;; Vector pair integer binary operations.
+(define_insn_and_split "vpair__3"
+  [(set (match_operand:OO 0 "" "=")
+        (unspec:OO [(VP_INT_BINARY:OO
+                     (match_operand:OO 1 "" "")
+                     (match_operand:OO 2 "" ""))]
+                   VP_INT))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (mode, operands,
+                            gen_3);
+  DONE;
+}
+  [(set_attr "length" "8")])
+
+;; Optimize vector pair a & ~b
+(define_insn_and_split "*vpair_andc_"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
+        (unspec:OO [(and:OO
+                     (unspec:OO
+                      [(not:OO
+                        (match_operand:OO 1 "vsx_register_operand" "wa"))]
+                      VP_INT)
+                     (match_operand:OO 2 "vsx_register_operand" "wa"))]
+                   VP_INT))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (mode, operands,
+                            gen_andc3);
+  DONE;
+}
+  [(set_attr "length" "8")])
+
+;; Optimize vector pair a | ~b
+(define_insn_and_split "*vpair_iorc_"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
+        (unspec:OO [(ior:OO
+                     (unspec:OO
+                      [(not:OO
+                        (match_operand:OO 1 "vsx_register_operand" "wa"))]
+                      VP_INT)
+                     (match_operand:OO 2 "vsx_register_operand" "wa"))]
+                   VP_INT))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (mode, operands,
+                            gen_orc3);
+  DONE;
+}
+  [(set_attr "length" "8")])
+
+;; Optimize vector pair ~(a & b) or ((~a) | (~b))
+(define_insn_and_split "*vpair_nand__1"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
+        (unspec:OO
+         [(not:OO
+           (unspec:OO [(and:OO
+                        (match_operand:OO 1 "vsx_register_operand" "wa")
+                        (match_operand:OO 2 "vsx_register_operand" "wa"))]
+                      VP_INT))]
+         VP_INT))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (mode, operands,
+                            gen_nand3);
+  DONE;
+}
+  [(set_attr "length" "8")])
+
+(define_insn_and_split "*vpair_nand__2"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
+        (unspec:OO
+         [(ior:OO
+           (unspec:OO
+            [(not:OO
+              (match_operand:OO 1 "vsx_register_operand" "wa"))]
+            VP_INT)
+           (unspec:OO
+            [(not:OO
+ (match_operand:OO 2 "vsx_register_operand" "wa"))] + VP_INT))] + VP_INT))] + "TARGET_MMA" + "#" + "&& reload_completed" + [(const_int 0)] +{ + split_binary_vector_pair (mode, operands, + gen_nand3); + DONE; +} + [(set_attr "length" "8")]) + +;; Optiomize vector pair ~(a | b) or ((~a) & (~b)) +(define_insn_and_split "*vpair_nor__1" + [(set (match_operand:OO 0 "vsx_register_operand" "=wa") + (unspec:OO + [(not:OO + (unspec:OO [(ior:OO + (match_operand:OO 1 "vsx_register_operand" "wa") + (match_operand:OO 2 "vsx_register_operand" "wa"))] + VP_INT))] + VP_INT))] + "TARGET_MMA" + "#" + "&& reload_completed" + [(const_int 0)] +{ + split_binary_vector_pair (mode, operands, + gen_nor3); + DONE; +} + [(set_attr "length" "8")]) + +(define_insn_and_split "*vpair_nor__2" + [(set (match_operand:OO 0 "vsx_register_operand" "=wa") + (unspec:OO + [(ior:OO + (unspec:OO + [(not:OO (match_operand:OO 1 "vsx_register_operand" "wa"))] + VP_INT) + (unspec:OO + [(not:OO (match_operand:OO 2 "vsx_register_operand" "wa"))] + VP_INT))] + VP_INT))] + "TARGET_MMA" + "#" + "&& reload_completed" + [(const_int 0)] +{ + split_binary_vector_pair (mode, operands, + gen_nor3); + DONE; +} + [(set_attr "length" "8")]) diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index a830ad06b90..ff7918c7a58 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21414,6 +21414,78 @@ __vector_pair __builtin_vpair_f64_min (__vector_pair, __vector_pair); __vector_pair __builtin_vpair_f64_sub (__vector_pair, __vector_pair); @end smallexample +The following built-in functions operate on pairs of +@code{vector long long} or @code{vector unsigned long long} values: + +@smallexample +__vector_pair __builtin_vpair_i64_add (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i64_and (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i64_ior (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i64_max (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i64_min 
(__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i64_neg (__vector_pair); +__vector_pair __builtin_vpair_i64_not (__vector_pair); +__vector_pair __builtin_vpair_i64_sub (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i64_xor (__vector_pair, __vector_pair); + +__vector_pair __builtin_vpair_i64u_max (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i64u_min (__vector_pair, __vector_pair); +@end smallexample + +The following built-in functions operate on pairs of +@code{vector int} or @code{vector unsigned int} values: + +@smallexample +__vector_pair __builtin_vpair_i32_add (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i32_and (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i32_ior (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i32_neg (__vector_pair); +__vector_pair __builtin_vpair_i32_not (__vector_pair); +__vector_pair __builtin_vpair_i32_max (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i32_min (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i32_sub (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i32_xor (__vector_pair, __vector_pair); + +__vector_pair __builtin_vpair_i32u_max (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i32u_min (__vector_pair, __vector_pair); +@end smallexample + +The following built-in functions operate on pairs of +@code{vector short} or @code{vector unsigned short} values: + +@smallexample +__vector_pair __builtin_vpair_i16_add (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i16_and (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i16_ior (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i16_max (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i16_min (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i16_neg (__vector_pair); +__vector_pair __builtin_vpair_i16_not (__vector_pair); +__vector_pair __builtin_vpair_i16_sub 
(__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i16_xor (__vector_pair, __vector_pair); + +__vector_pair __builtin_vpair_i16u_max (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i16u_min (__vector_pair, __vector_pair); +@end smallexample + +The following built-in functions operate on pairs of +@code{vector signed char} or @code{vector unsigned char} values: + +@smallexample +__vector_pair __builtin_vpair_i8_add (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i8_and (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i8_ior (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i8_max (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i8_min (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i8_neg (__vector_pair); +__vector_pair __builtin_vpair_i8_not (__vector_pair); +__vector_pair __builtin_vpair_i8_sub (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i8_xor (__vector_pair, __vector_pair); + +__vector_pair __builtin_vpair_i8_umax (__vector_pair, __vector_pair); +__vector_pair __builtin_vpair_i8_umin (__vector_pair, __vector_pair); +@end smallexample + @node PowerPC Hardware Transactional Memory Built-in Functions @subsection PowerPC Hardware Transactional Memory Built-in Functions GCC provides two interfaces for accessing the Hardware Transactional diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-5.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-5.c new file mode 100644 index 00000000000..924919cae1b --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-5.c @@ -0,0 +1,193 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +/* Test whether the vector buitin code generates the expected instructions for + vector pairs with 4 64-bit integer elements. */ + +void +test_add (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vaddudm, 1 stxvp. 
*/ + *dest = __builtin_vpair_i64_add (*x, *y); +} + +void +test_sub (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vaddudm, 1 stxvp. */ + *dest = __builtin_vpair_i64_sub (*x, *y); +} + +void +test_and (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxland, 1 stxvp. */ + *dest = __builtin_vpair_i64_and (*x, *y); +} + +void +test_or (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlor, 1 stxvp. */ + *dest = __builtin_vpair_i64_ior (*x, *y); +} + +void +test_xor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlxor, 1 stxvp. */ + *dest = __builtin_vpair_i64_xor (*x, *y); +} + +void +test_smax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxsd, 1 stxvp. */ + *dest = __builtin_vpair_i64_max (*x, *y); +} + +void +test_smin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminsd, 1 stxvp. */ + *dest = __builtin_vpair_i64_min (*x, *y); +} + +void +test_umax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxud, 1 stxvp. */ + *dest = __builtin_vpair_i64u_max (*x, *y); +} + +void +test_umin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminud, 1 stxvp. */ + *dest = __builtin_vpair_i64u_min (*x, *y); +} + +void +test_negate (__vector_pair *dest, + __vector_pair *x) +{ + /* 2 lxvp, 2 vnegd, 1 stxvp. */ + *dest = __builtin_vpair_i64_neg (*x); +} + +void +test_not (__vector_pair *dest, + __vector_pair *x) +{ + /* 2 lxvp, 2 xxlnor, 1 stxvp. */ + *dest = __builtin_vpair_i64_not (*x); +} + +/* Combination of logical operators. */ + +void +test_andc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp. 
*/ + __vector_pair n = __builtin_vpair_i64_not (*y); + *dest = __builtin_vpair_i64_and (*x, n); +} + +void +test_andc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i64_not (*x); + *dest = __builtin_vpair_i64_and (n, *y); +} + +void +test_orc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i64_not (*y); + *dest = __builtin_vpair_i64_ior (*x, n); +} + +void +test_orc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i64_not (*x); + *dest = __builtin_vpair_i64_ior (n, *y); +} + +void +test_nand_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair a = __builtin_vpair_i64_and (*x, *y); + *dest = __builtin_vpair_i64_not (a); +} + +void +test_nand_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair nx = __builtin_vpair_i64_not (*x); + __vector_pair ny = __builtin_vpair_i64_not (*y); + *dest = __builtin_vpair_i64_ior (nx, ny); +} + +void +test_nor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnor, 1 stxvp. 
*/ + __vector_pair a = __builtin_vpair_i64_ior (*x, *y); + *dest = __builtin_vpair_i64_not (a); +} + +/* { dg-final { scan-assembler-times {\mlxvp\M} 34 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 18 } } */ +/* { dg-final { scan-assembler-times {\mvaddudm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxsd\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxud\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminsd\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminud\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvnegd\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsubudm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxland\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlandc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnand\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnor\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlor\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlorc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlxor\M} 2 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-6.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-6.c new file mode 100644 index 00000000000..f22949c1f95 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-6.c @@ -0,0 +1,193 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +/* Test whether the vector builtin code generates the expected instructions for + vector pairs with 8 32-bit integer elements. */ + +void +test_add (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vadduwm, 1 stxvp. */ + *dest = __builtin_vpair_i32_add (*x, *y); +} + +void +test_sub (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vsubuwm, 1 stxvp.
*/ + *dest = __builtin_vpair_i32_sub (*x, *y); +} + +void +test_and (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxland, 1 stxvp. */ + *dest = __builtin_vpair_i32_and (*x, *y); +} + +void +test_or (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlor, 1 stxvp. */ + *dest = __builtin_vpair_i32_ior (*x, *y); +} + +void +test_xor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlxor, 1 stxvp. */ + *dest = __builtin_vpair_i32_xor (*x, *y); +} + +void +test_smax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxsw, 1 stxvp. */ + *dest = __builtin_vpair_i32_max (*x, *y); +} + +void +test_smin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminsw, 1 stxvp. */ + *dest = __builtin_vpair_i32_min (*x, *y); +} + +void +test_umax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxuw, 1 stxvp. */ + *dest = __builtin_vpair_i32u_max (*x, *y); +} + +void +test_umin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminuw, 1 stxvp. */ + *dest = __builtin_vpair_i32u_min (*x, *y); +} + +void +test_negate (__vector_pair *dest, + __vector_pair *x) +{ + /* 1 lxvp, 2 vnegw, 1 stxvp. */ + *dest = __builtin_vpair_i32_neg (*x); +} + +void +test_not (__vector_pair *dest, + __vector_pair *x) +{ + /* 1 lxvp, 2 xxlnor, 1 stxvp. */ + *dest = __builtin_vpair_i32_not (*x); +} + +/* Combination of logical operators. */ + +void +test_andc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i32_not (*y); + *dest = __builtin_vpair_i32_and (*x, n); +} + +void +test_andc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp.
*/ + __vector_pair n = __builtin_vpair_i32_not (*x); + *dest = __builtin_vpair_i32_and (n, *y); +} + +void +test_orc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i32_not (*y); + *dest = __builtin_vpair_i32_ior (*x, n); +} + +void +test_orc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i32_not (*x); + *dest = __builtin_vpair_i32_ior (n, *y); +} + +void +test_nand_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair a = __builtin_vpair_i32_and (*x, *y); + *dest = __builtin_vpair_i32_not (a); +} + +void +test_nand_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair nx = __builtin_vpair_i32_not (*x); + __vector_pair ny = __builtin_vpair_i32_not (*y); + *dest = __builtin_vpair_i32_ior (nx, ny); +} + +void +test_nor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnor, 1 stxvp. 
*/ + __vector_pair a = __builtin_vpair_i32_ior (*x, *y); + *dest = __builtin_vpair_i32_not (a); +} + +/* { dg-final { scan-assembler-times {\mlxvp\M} 34 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 18 } } */ +/* { dg-final { scan-assembler-times {\mvadduwm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxsw\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxuw\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminsw\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminuw\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvnegw\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsubuwm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxland\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlandc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnand\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnor\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlor\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlorc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlxor\M} 2 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-7.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-7.c new file mode 100644 index 00000000000..71452f59284 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-7.c @@ -0,0 +1,193 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +/* Test whether the vector builtin code generates the expected instructions for + vector pairs with 16 16-bit integer elements. */ + +void +test_add (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vadduhm, 1 stxvp. */ + *dest = __builtin_vpair_i16_add (*x, *y); +} + +void +test_sub (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vsubuhm, 1 stxvp.
*/ + *dest = __builtin_vpair_i16_sub (*x, *y); +} + +void +test_and (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxland, 1 stxvp. */ + *dest = __builtin_vpair_i16_and (*x, *y); +} + +void +test_or (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlor, 1 stxvp. */ + *dest = __builtin_vpair_i16_ior (*x, *y); +} + +void +test_xor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlxor, 1 stxvp. */ + *dest = __builtin_vpair_i16_xor (*x, *y); +} + +void +test_smax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxsh, 1 stxvp. */ + *dest = __builtin_vpair_i16_max (*x, *y); +} + +void +test_smin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminsh, 1 stxvp. */ + *dest = __builtin_vpair_i16_min (*x, *y); +} + +void +test_umax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxuh, 1 stxvp. */ + *dest = __builtin_vpair_i16u_max (*x, *y); +} + +void +test_umin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminuh, 1 stxvp. */ + *dest = __builtin_vpair_i16u_min (*x, *y); +} + +void +test_negate (__vector_pair *dest, + __vector_pair *x) +{ + /* 1 lxvp, 1 xxspltib, 2 vsubuhm, 1 stxvp. */ + *dest = __builtin_vpair_i16_neg (*x); +} + +void +test_not (__vector_pair *dest, + __vector_pair *x) +{ + /* 1 lxvp, 2 xxlnor, 1 stxvp. */ + *dest = __builtin_vpair_i16_not (*x); +} + +/* Combination of logical operators. */ + +void +test_andc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i16_not (*y); + *dest = __builtin_vpair_i16_and (*x, n); +} + +void +test_andc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp.
*/ + __vector_pair n = __builtin_vpair_i16_not (*x); + *dest = __builtin_vpair_i16_and (n, *y); +} + +void +test_orc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i16_not (*y); + *dest = __builtin_vpair_i16_ior (*x, n); +} + +void +test_orc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i16_not (*x); + *dest = __builtin_vpair_i16_ior (n, *y); +} + +void +test_nand_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair a = __builtin_vpair_i16_and (*x, *y); + *dest = __builtin_vpair_i16_not (a); +} + +void +test_nand_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair nx = __builtin_vpair_i16_not (*x); + __vector_pair ny = __builtin_vpair_i16_not (*y); + *dest = __builtin_vpair_i16_ior (nx, ny); +} + +void +test_nor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnor, 1 stxvp. 
*/ + __vector_pair a = __builtin_vpair_i16_ior (*x, *y); + *dest = __builtin_vpair_i16_not (a); +} + +/* { dg-final { scan-assembler-times {\mlxvp\M} 34 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 18 } } */ +/* { dg-final { scan-assembler-times {\mvadduhm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxsh\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxuh\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminsh\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminuh\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsubuhm\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxland\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlandc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnand\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnor\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlor\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlorc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlxor\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxspltib\M} 1 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-8.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-8.c new file mode 100644 index 00000000000..8db9056d4cc --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-8.c @@ -0,0 +1,194 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +/* Test whether the vector builtin code generates the expected instructions for + vector pairs with 32 8-bit integer elements. */ + + +void +test_add (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vaddubm, 1 stxvp. */ + *dest = __builtin_vpair_i8_add (*x, *y); +} + +void +test_sub (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vsububm, 1 stxvp.
*/ + *dest = __builtin_vpair_i8_sub (*x, *y); +} + +void +test_and (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxland, 1 stxvp. */ + *dest = __builtin_vpair_i8_and (*x, *y); +} + +void +test_or (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlor, 1 stxvp. */ + *dest = __builtin_vpair_i8_ior (*x, *y); +} + +void +test_xor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlxor, 1 stxvp. */ + *dest = __builtin_vpair_i8_xor (*x, *y); +} + +void +test_smax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxsb, 1 stxvp. */ + *dest = __builtin_vpair_i8_max (*x, *y); +} + +void +test_smin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminsb, 1 stxvp. */ + *dest = __builtin_vpair_i8_min (*x, *y); +} + +void +test_umax (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vmaxub, 1 stxvp. */ + *dest = __builtin_vpair_i8u_max (*x, *y); +} + +void +test_umin (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 vminub, 1 stxvp. */ + *dest = __builtin_vpair_i8u_min (*x, *y); +} + +void +test_negate (__vector_pair *dest, + __vector_pair *x) +{ + /* 1 lxvp, 1 xxspltib, 2 vsububm, 1 stxvp. */ + *dest = __builtin_vpair_i8_neg (*x); +} + +void +test_not (__vector_pair *dest, + __vector_pair *x) +{ + /* 1 lxvp, 2 xxlnor, 1 stxvp. */ + *dest = __builtin_vpair_i8_not (*x); +} + +/* Combination of logical operators. */ + +void +test_andc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i8_not (*y); + *dest = __builtin_vpair_i8_and (*x, n); +} + +void +test_andc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlandc, 1 stxvp.
*/ + __vector_pair n = __builtin_vpair_i8_not (*x); + *dest = __builtin_vpair_i8_and (n, *y); +} + +void +test_orc_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i8_not (*y); + *dest = __builtin_vpair_i8_ior (*x, n); +} + +void +test_orc_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlorc, 1 stxvp. */ + __vector_pair n = __builtin_vpair_i8_not (*x); + *dest = __builtin_vpair_i8_ior (n, *y); +} + +void +test_nand_1 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair a = __builtin_vpair_i8_and (*x, *y); + *dest = __builtin_vpair_i8_not (a); +} + +void +test_nand_2 (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnand, 1 stxvp. */ + __vector_pair nx = __builtin_vpair_i8_not (*x); + __vector_pair ny = __builtin_vpair_i8_not (*y); + *dest = __builtin_vpair_i8_ior (nx, ny); +} + +void +test_nor (__vector_pair *dest, + __vector_pair *x, + __vector_pair *y) +{ + /* 2 lxvp, 2 xxlnor, 1 stxvp. 
*/ + __vector_pair a = __builtin_vpair_i8_ior (*x, *y); + *dest = __builtin_vpair_i8_not (a); +} + +/* { dg-final { scan-assembler-times {\mlxvp\M} 34 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 18 } } */ +/* { dg-final { scan-assembler-times {\mvaddubm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxsb\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvmaxub\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminsb\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvminub\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsububm\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxland\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlandc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnand\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlnor\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlor\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxlorc\M} 4 } } */ +/* { dg-final { scan-assembler-times {\mxxlxor\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxspltib\M} 1 } } */