From patchwork Fri Nov 10 23:13:58 2023
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 164024
Date: Fri, 10 Nov 2023 18:13:58 -0500
From: Michael Meissner
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool, "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: [PATCH 4/4] Add support for doing a horizontal add on vector pair elements.
List-Id: Gcc-patches mailing list

This patch adds a series of built-in functions to allow users to write code to
do a number of simple operations where the loop is done using the __vector_pair
type.  The __vector_pair type is an opaque type.  These built-in functions keep
the two 128-bit vectors within the __vector_pair together, and split the
operation after register allocation.

This patch provides vector pair built-in functions to do a horizontal add on
vector pair elements.  Only floating point and 64-bit horizontal adds are
provided in this patch.

I have built and tested these patches on:

    *	A little endian power10 server using --with-cpu=power10
    *	A little endian power9 server using --with-cpu=power9
    *	A big endian power9 server using --with-cpu=power9.

Can I check this patch into the master branch after the preceding patches have
been checked in?

2023-11-08  Michael Meissner

gcc/

	* config/rs6000/rs6000-builtins.def (__builtin_vpair_f32_add_elements): New
	built-in function.
	(__builtin_vpair_f64_add_elements): Likewise.
	(__builtin_vpair_i64_add_elements): Likewise.
	(__builtin_vpair_i64u_add_elements): Likewise.
	* config/rs6000/vector-pair.md (UNSPEC_VPAIR_REDUCE_PLUS_F32): New unspec.
	(UNSPEC_VPAIR_REDUCE_PLUS_F64): Likewise.
	(UNSPEC_VPAIR_REDUCE_PLUS_I64): Likewise.
	(vpair_reduc_plus_scale_v8sf): New insn.
	(vpair_reduc_plus_scale_v4df): Likewise.
	(vpair_reduc_plus_scale_v4di): Likewise.
	* doc/extend.texi (__builtin_vpair_f32_add_elements): Document.
	(__builtin_vpair_f64_add_elements): Likewise.
	(__builtin_vpair_i64_add_elements): Likewise.

gcc/testsuite/

	* gcc.target/powerpc/vector-pair-16.c: New test.
---
 gcc/config/rs6000/rs6000-builtins.def         | 12 +++
 gcc/config/rs6000/vector-pair.md              | 93 +++++++++++++++++++
 gcc/doc/extend.texi                           |  3 +
 .../gcc.target/powerpc/vector-pair-16.c       | 45 +++++++++
 4 files changed, 153 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-16.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index fbd416ceb87..b9a16c01420 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4145,6 +4145,9 @@
   v256 __builtin_vpair_f32_add (v256, v256);
     VPAIR_F32_ADD vpair_add_v8sf3 {mma,pair}
 
+  float __builtin_vpair_f32_add_elements (v256);
+    VPAIR_F32_ADD_ELEMENTS vpair_reduc_plus_scale_v8sf {mma,pair}
+
   v256 __builtin_vpair_f32_assemble (vf, vf);
     VPAIR_F32_ASSEMBLE vpair_assemble_v8sf {mma,pair}
 
@@ -4180,6 +4183,9 @@
   v256 __builtin_vpair_f64_add (v256, v256);
     VPAIR_F64_ADD vpair_add_v4df3 {mma,pair}
 
+  double __builtin_vpair_f64_add_elements (v256);
+    VPAIR_F64_ADD_ELEMENTS vpair_reduc_plus_scale_v4df {mma,pair}
+
   v256 __builtin_vpair_f64_assemble (vd, vd);
     VPAIR_F64_ASSEMBLE vpair_assemble_v4df {mma,pair}
 
@@ -4375,6 +4381,9 @@ v256 __builtin_vpair_f64_assemble (vd, vd);
   v256 __builtin_vpair_i64_add (v256, v256);
     VPAIR_I64_ADD vpair_add_v4di3 {mma,pair}
 
+  long long __builtin_vpair_i64_add_elements (v256);
+    VPAIR_I64_ADD_ELEMENTS vpair_reduc_plus_scale_v4di {mma,pair,no32bit}
+
   v256 __builtin_vpair_i64_and (v256, v256);
     VPAIR_I64_AND vpair_and_v4di3 {mma,pair}
 
@@ -4408,6 +4417,9 @@ v256 __builtin_vpair_f64_assemble (vd, vd);
   v256 __builtin_vpair_i64_xor (v256, v256);
     VPAIR_I64_XOR vpair_xor_v4di3 {mma,pair}
 
+  unsigned long long __builtin_vpair_i64u_add_elements (v256);
+    VPAIR_I64U_ADD_ELEMENTS vpair_reduc_plus_scale_v4di {mma,pair,no32bit}
+
   v256 __builtin_vpair_i64u_assemble (vull, vull);
     VPAIR_I64U_ASSEMBLE vpair_assemble_v4di {mma,pair}
 
diff --git a/gcc/config/rs6000/vector-pair.md b/gcc/config/rs6000/vector-pair.md
index f6d0b2a39fc..b5e9330e71f 100644
--- a/gcc/config/rs6000/vector-pair.md
+++ b/gcc/config/rs6000/vector-pair.md
@@ -35,6 +35,9 @@ (define_c_enum "unspec"
    UNSPEC_VPAIR_V4DI
    UNSPEC_VPAIR_ZERO
    UNSPEC_VPAIR_SPLAT
+   UNSPEC_VPAIR_REDUCE_PLUS_F32
+   UNSPEC_VPAIR_REDUCE_PLUS_F64
+   UNSPEC_VPAIR_REDUCE_PLUS_I64
    ])
 
 ;; Iterator doing unary/binary arithmetic on vector pairs
@@ -577,6 +580,66 @@ (define_insn_and_split "*vpair_nfms_fpcontract_4"
 }
   [(set_attr "length" "8")])
 
+
+;; Add all elements in a pair of V4SF vectors.
+(define_insn_and_split "vpair_reduc_plus_scale_v8sf"
+  [(set (match_operand:SF 0 "vsx_register_operand" "=wa")
+        (unspec:SF [(match_operand:OO 1 "vsx_register_operand" "v")]
+                   UNSPEC_VPAIR_REDUCE_PLUS_F32))
+   (clobber (match_scratch:V4SF 2 "=&v"))
+   (clobber (match_scratch:V4SF 3 "=&v"))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(pc)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  rtx tmp1 = operands[2];
+  rtx tmp2 = operands[3];
+  unsigned r = reg_or_subregno (op1);
+  rtx op1_hi = gen_rtx_REG (V4SFmode, r);
+  rtx op1_lo = gen_rtx_REG (V4SFmode, r + 1);
+
+  emit_insn (gen_addv4sf3 (tmp1, op1_hi, op1_lo));
+  emit_insn (gen_altivec_vsldoi_v4sf (tmp2, tmp1, tmp1, GEN_INT (8)));
+  emit_insn (gen_addv4sf3 (tmp2, tmp1, tmp2));
+  emit_insn (gen_altivec_vsldoi_v4sf (tmp1, tmp2, tmp2, GEN_INT (4)));
+  emit_insn (gen_addv4sf3 (tmp2, tmp1, tmp2));
+  emit_insn (gen_vsx_xscvspdp_scalar2 (op0, tmp2));
+  DONE;
+}
+  [(set_attr "length" "24")])
+
+;; Add all elements in a pair of V2DF vectors
+(define_insn_and_split "vpair_reduc_plus_scale_v4df"
+  [(set (match_operand:DF 0 "vsx_register_operand" "=&wa")
+        (unspec:DF [(match_operand:OO 1 "vsx_register_operand" "wa")]
+                   UNSPEC_VPAIR_REDUCE_PLUS_F64))
+   (clobber (match_scratch:DF 2 "=&wa"))
+   (clobber (match_scratch:V2DF 3 "=&wa"))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 3)
+        (plus:V2DF (match_dup 4)
+                   (match_dup 5)))
+   (set (match_dup 2)
+        (vec_select:DF (match_dup 3)
+                       (parallel [(match_dup 6)])))
+   (set (match_dup 0)
+        (plus:DF (match_dup 7)
+                 (match_dup 2)))]
+{
+  unsigned reg1 = reg_or_subregno (operands[1]);
+  unsigned reg3 = reg_or_subregno (operands[3]);
+
+  operands[4] = gen_rtx_REG (V2DFmode, reg1);
+  operands[5] = gen_rtx_REG (V2DFmode, reg1 + 1);
+  operands[6] = GEN_INT (BYTES_BIG_ENDIAN ? 1 : 0);
+  operands[7] = gen_rtx_REG (DFmode, reg3);
+})
+
 
 ;; Vector pair integer negate support.
 (define_insn_and_split "vpair_neg_2"
@@ -786,3 +849,33 @@ (define_insn_and_split "*vpair_nor__2"
   DONE;
 }
   [(set_attr "length" "8")])
+
+;; Add all elements in a pair of V2DI vectors
+(define_insn_and_split "vpair_reduc_plus_scale_v4di"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
+        (unspec:DI [(match_operand:OO 1 "altivec_register_operand" "v")]
+                   UNSPEC_VPAIR_REDUCE_PLUS_I64))
+   (clobber (match_scratch:V2DI 2 "=&v"))
+   (clobber (match_scratch:DI 3 "=&r"))]
+  "TARGET_MMA && TARGET_POWERPC64"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 2)
+        (plus:V2DI (match_dup 4)
+                   (match_dup 5)))
+   (set (match_dup 3)
+        (vec_select:DI (match_dup 2)
+                       (parallel [(const_int 0)])))
+   (set (match_dup 0)
+        (vec_select:DI (match_dup 2)
+                       (parallel [(const_int 1)])))
+   (set (match_dup 0)
+        (plus:DI (match_dup 0)
+                 (match_dup 3)))]
+{
+  unsigned reg1 = reg_or_subregno (operands[1]);
+
+  operands[4] = gen_rtx_REG (V2DImode, reg1);
+  operands[5] = gen_rtx_REG (V2DImode, reg1 + 1);
+}
+  [(set_attr "length" "16")])
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 600e2c393db..0e6e74b8087 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21399,6 +21399,7 @@ The following built-in functions operate on pairs of
 @smallexample
 __vector_pair __builtin_vpair_f32_abs (__vector_pair);
 __vector_pair __builtin_vpair_f32_add (__vector_pair, __vector_pair);
+float __builtin_vpair_f32_add_elements (__vector_pair);
 __vector_pair __builtin_vpair_f32_assemble (vector float, vector float);
 vector float __builtin_vpair_f32_extract_vector (__vector_pair, int);
 __vector_pair __builtin_vpair_f32_fma (__vector_pair, __vector_pair, __vector_pair);
@@ -21416,6 +21417,7 @@ The following built-in functions operate on pairs of
 @smallexample
 __vector_pair __builtin_vpair_f64_abs (__vector_pair);
 __vector_pair __builtin_vpair_f64_add (__vector_pair, __vector_pair);
+double __builtin_vpair_f64_add_elements (__vector_pair);
 __vector_pair __builtin_vpair_f64_assemble (vector double, vector double);
 vector double __builtin_vpair_f64_extract_vector (__vector_pair, int);
 __vector_pair __builtin_vpair_f64_fma (__vector_pair, __vector_pair, __vector_pair);
@@ -21432,6 +21434,7 @@ The following built-in functions operate on pairs of
 
 @smallexample
 __vector_pair __builtin_vpair_i64_add (__vector_pair, __vector_pair);
+long long __builtin_vpair_i64_add_elements (__vector_pair);
 __vector_pair __builtin_vpair_i64_and (__vector_pair, __vector_pair);
 __vector_pair __builtin_vpair_i64_assemble (vector long long,
                                             vector long long);
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-16.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-16.c
new file mode 100644
index 00000000000..a8c206c4093
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-16.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test vector pair built-in functions to do a horizontal add of the
+   elements.  */
+
+float
+f32_add_elements (__vector_pair *p)
+{
+  /* 1 lxvp, 1 xvaddsp, 2 vsldoi, 2 xvaddsp, 1 xscvspdp.  */
+  return __builtin_vpair_f32_add_elements (*p);
+}
+
+double
+f64_add_elements (__vector_pair *p)
+{
+  /* 1 lxvp, 1 xvadddp, 1 xxpermdi, 1 fadd/xsadddp.  */
+  return __builtin_vpair_f64_add_elements (*p);
+}
+
+long long
+i64_add_elements (__vector_pair *p)
+{
+  /* 1 lxvp, 1 vaddudm, 1 mfvsrld, 1 mfvsrd, 1 add.  */
+  return __builtin_vpair_i64_add_elements (*p);
+}
+
+unsigned long long
+i64u_add_elements (__vector_pair *p)
+{
+  /* 1 lxvp, 1 vaddudm, 1 mfvsrld, 1 mfvsrd, 1 add.  */
+  return __builtin_vpair_i64u_add_elements (*p);
+}
+
+/* { dg-final { scan-assembler-times {\mfadd\M|\mxsadddp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mmfvsrd\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mmfvsrld\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvaddudm\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsldoi\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxscvspdp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvadddp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvaddsp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 1 } } */
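
For anyone who wants to check the reduction order without access to a power10 system, here is a minimal portable C sketch of the sequence the vpair_reduc_plus_scale_v8sf split emits, plus the simpler V2DI variant.  It is only a model and not part of the patch: the v4sf_model type, the add4 and rotate_left helpers, and the model_* function names are made up for illustration.  It assumes vsldoi with both inputs equal behaves as a rotate of the register, so a shift of 8 bytes is modeled as a rotate by two floats and a shift of 4 bytes as a rotate by one.

```c
#include <assert.h>

/* Hypothetical stand-in for one 128-bit V4SF register (not a GCC type).  */
typedef struct { float e[4]; } v4sf_model;

/* Element-wise add, modeling xvaddsp.  */
static v4sf_model
add4 (v4sf_model a, v4sf_model b)
{
  v4sf_model r;
  for (int i = 0; i < 4; i++)
    r.e[i] = a.e[i] + b.e[i];
  return r;
}

/* Rotate by N elements, modeling vsldoi with both inputs equal
   (a shift of 8 bytes is N = 2, a shift of 4 bytes is N = 1).  */
static v4sf_model
rotate_left (v4sf_model a, unsigned n)
{
  v4sf_model r;
  assert (n < 4);
  for (unsigned i = 0; i < 4; i++)
    r.e[i] = a.e[(i + n) % 4];
  return r;
}

/* Model of the V4SF sequence: add the two halves of the pair, then do
   two rotate-and-add steps so that every lane holds the full sum.  */
static float
model_f32_add_elements (const float in[8])
{
  v4sf_model hi, lo, tmp1, tmp2;
  for (int i = 0; i < 4; i++)
    {
      hi.e[i] = in[i];
      lo.e[i] = in[i + 4];
    }
  tmp1 = add4 (hi, lo);
  tmp2 = rotate_left (tmp1, 2);
  tmp2 = add4 (tmp1, tmp2);
  tmp1 = rotate_left (tmp2, 1);
  tmp2 = add4 (tmp1, tmp2);
  return tmp2.e[0];	/* All four lanes are equal at this point.  */
}

/* Model of the V2DI sequence: one vector add of the two halves, then a
   scalar add of the two lanes.  Unsigned arithmetic is used so that
   overflow wraps, as with a modulo add.  */
static unsigned long long
model_i64_add_elements (const unsigned long long in[4])
{
  unsigned long long lane0 = in[0] + in[2];
  unsigned long long lane1 = in[1] + in[3];
  return lane0 + lane1;
}
```

With the inputs 1.0 through 8.0 the V4SF model goes through {6,8,10,12}, then {16,20,16,20}, and ends with 36 in every lane, which is why the final conversion can take any single element.  Note that the floating point additions are done pairwise rather than left to right, so results may differ in rounding from a simple scalar loop.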