From patchwork Fri Feb 3 21:29:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Meissner X-Patchwork-Id: 52657 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp1066174wrn; Fri, 3 Feb 2023 13:30:16 -0800 (PST) X-Google-Smtp-Source: AK7set9HXYDOvv7oNS2C3jkXIuELhDS1RGRZ7tBiyRirHThxivih+zifXpCl+cqyRj4sX8VEzCHX X-Received: by 2002:a17:907:8e93:b0:878:43b9:6b8d with SMTP id tx19-20020a1709078e9300b0087843b96b8dmr13292036ejc.25.1675459816640; Fri, 03 Feb 2023 13:30:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1675459816; cv=none; d=google.com; s=arc-20160816; b=X8PRRT2SDRSvUw+R9IU79iavOgVd0nQIpvXj93GTm8B+f1WiaaN2lRwMI2W53fcfTg f4ZzFGLLzqXaIRvYOH7Mkex2RT1FLbR9Hl4n8Hawb5phYj/bI9OjQKExe7hTwVE0kdqr 2YzLJih1cFeXoUuzu/ocLKsGg6yTZdYU+BG5Ugxahz/eY2EenTEdkZw/6ZK81x83wYPD xthmg4KVYB0xGMd6WIOoyJhxlH0WmGqcEU4Yp2GO3T5fK8SD6ggnK/XcCH0R+J9VlMTH csbGBWtoHYe3n98aVU7cNREvYWIKkxdhIpR7OvX+T/LNzcE4gV8Ad38JvyNikYiZsgPf Z+Lw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:mime-version :in-reply-to:content-disposition:references:mail-followup-to :message-id:subject:to:date:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=yYB2+/IuP4Z0/WzzXmYxlF3Jg1nuw/Qw3Rvf+TSMFug=; b=kfI6Mhj36X8TRXIUyoDNy1gjWYIR5g4zSafBFznpUmypGYr9jvsNR1To6MDF68wr/s DH/X8yy1ZKohLDjriqsxOetOPFxdB8iivoFTFgB5Uo42VL9ukRR/cd/Ljg8KwohfA6Of Ewjgjo5hK9+9XunjQ++aFTCP5JsF1uF5DP8XTZGD92feVQ6+vk1CedHQJFHvj1lCeQtn MY2S3OwlljzmvvNVasP9bcsmVcZ0ow/KaPVZRr992GSzzocuHl+3QItSujkWYdeDdZkI dIx7TORZMKPm3Kgrqzp7eIUN+LjaPapiGU0CdF9DdfZRx8oMfEzFuNz8kwNZKyEcP4+3 oEcg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="Esu/Cmug"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id vl11-20020a17090730cb00b008787b6dd91bsi4411320ejb.493.2023.02.03.13.30.16 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Feb 2023 13:30:16 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="Esu/Cmug"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8B9163858C2D for ; Fri, 3 Feb 2023 21:30:15 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8B9163858C2D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1675459815; bh=yYB2+/IuP4Z0/WzzXmYxlF3Jg1nuw/Qw3Rvf+TSMFug=; h=Date:To:Subject:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=Esu/CmugVS947+PUej5Jtg1lIvfobWjtE1wE/TIr+8jm4hh96kO1QEO0yh6ithuR7 mFqvp3V2z5XmpZmVziyJtzlULJ6hZHhxsM6W2VPMcwpaCWbxxmLvMF6MtsRVJKIrYu EqrjoTafApPGaj4D23FLwrmTcWvnbHwgBFPBdxa8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 9790F3858C52 for ; Fri, 3 Feb 2023 21:29:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9790F3858C52 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 313KluAq014726; Fri, 3 Feb 2023 21:29:30 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3nh9981hf8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 03 Feb 2023 21:29:29 +0000 Received: from m0098419.ppops.net (m0098419.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 313Kssgf004434; Fri, 3 Feb 2023 21:29:29 GMT Received: from ppma02dal.us.ibm.com (a.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.10]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3nh9981hf0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 03 Feb 2023 21:29:29 +0000 Received: from pps.filterd (ppma02dal.us.ibm.com [127.0.0.1]) by ppma02dal.us.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 313KMfvA019159; Fri, 3 Feb 2023 21:29:28 GMT Received: from smtprelay06.dal12v.mail.ibm.com ([9.208.130.100]) by ppma02dal.us.ibm.com (PPS) with ESMTPS id 3ncvurbedq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 03 Feb 2023 21:29:28 +0000 Received: from smtpav02.dal12v.mail.ibm.com (smtpav02.dal12v.mail.ibm.com [10.241.53.101]) by smtprelay06.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 313LTRrW3211976 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 3 Feb 2023 21:29:27 GMT Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 571A55805C; Fri, 3 Feb 2023 21:29:27 +0000 (GMT) Received: from smtpav02.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id AA9335805A; Fri, 3 Feb 2023 21:29:26 +0000 (GMT) Received: from toto.the-meissners.org (unknown [9.65.233.34]) by smtpav02.dal12v.mail.ibm.com (Postfix) with ESMTPS; Fri, 3 Feb 2023 21:29:26 +0000 (GMT) Date: Fri, 3 Feb 2023 16:29:25 -0500 To: Michael Meissner , gcc-patches@gcc.gnu.org, Segher Boessenkool , "Kewen.Lin" , David Edelsohn , Peter Bergner , Will Schmidt Subject: [PATCH 4/8] PowerPC: Switch to dense math names for all MMA operations Message-ID: Mail-Followup-To: Michael Meissner , gcc-patches@gcc.gnu.org, Segher Boessenkool , "Kewen.Lin" , David Edelsohn , Peter Bergner , Will Schmidt References: Content-Disposition: inline In-Reply-To: X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 9orKIHOdC8_5FlyKuNya42pLM2lGRopT X-Proofpoint-ORIG-GUID: xrjq53lD7QI5cDWZnWjjyEiLBEBBc1k6 X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-02-03_19,2023-02-03_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 priorityscore=1501 mlxscore=0 malwarescore=0 adultscore=0 suspectscore=0 impostorscore=0 phishscore=0 spamscore=0 clxscore=1015 bulkscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2302030189 X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_MANYTO, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Michael Meissner via Gcc-patches From: Michael Meissner Reply-To: Michael Meissner Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1756846952557486686?= X-GMAIL-MSGID: =?utf-8?q?1756846952557486686?= This patch changes the assembler instruction names for MMA instructions from the original name used in power10 to the new name when used with the dense math system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the same bits for either spelling. The patches have been tested on the following platforms. I added the patches for PR target/107299 that I submitted on November 2nd before doing the builds so that GCC would build on systems using IEEE 128-bit long double. * https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604834.html There were no regressions with doing bootstrap builds and running the regression tests: 1) Power10 LE using --with-cpu=power10 --with-long-double-format=ieee; 2) Power10 LE using --with-cpu=power10 --with-long-double-format=ibm; 3) Power9 LE using --with-cpu=power9 --with-long-double-format=ibm; and 4) Power8 BE using --with-cpu=power8 (both 32-bit & 64-bit tested). Note, I will be on vacation from Tuesday February 7th through Tuesday February 14th. Can I check this patch into the GCC 13 master branch? 2023-02-03 Michael Meissner gcc/ * config/rs6000/mma.md (vvi4i4i8_dm): New int attribute. (avvi4i4i8_dm): Likewise. (vvi4i4i2_dm): Likewise. (avvi4i4i2_dm): Likewise. (vvi4i4_dm): Likewise. (avvi4i4_dm): Likewise. (pvi4i2_dm): Likewise. (apvi4i2_dm): Likewise. (vvi4i4i4_dm): Likewise. (avvi4i4i4_dm): Likewise. (mma_): Add support for running on DMF systems, generating the dense math instruction and using the dense math accumulators. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. (mma_): Likewise. gcc/testsuite/ * gcc.target/powerpc/dm-double-test.c: New test. * lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New target test. --- gcc/config/rs6000/mma.md | 98 +++++++-- .../gcc.target/powerpc/dm-double-test.c | 194 ++++++++++++++++++ gcc/testsuite/lib/target-supports.exp | 19 ++ 3 files changed, 299 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/dm-double-test.c diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md index 9e3feb3ea54..411e2345291 100644 --- a/gcc/config/rs6000/mma.md +++ b/gcc/config/rs6000/mma.md @@ -227,13 +227,22 @@ (define_int_attr apv [(UNSPEC_MMA_XVF64GERPP "xvf64gerpp") (define_int_attr vvi4i4i8 [(UNSPEC_MMA_PMXVI4GER8 "pmxvi4ger8")]) +(define_int_attr vvi4i4i8_dm [(UNSPEC_MMA_PMXVI4GER8 "pmdmxvi4ger8")]) + (define_int_attr avvi4i4i8 [(UNSPEC_MMA_PMXVI4GER8PP "pmxvi4ger8pp")]) +(define_int_attr avvi4i4i8_dm [(UNSPEC_MMA_PMXVI4GER8PP "pmdmxvi4ger8pp")]) + (define_int_attr vvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2 "pmxvi16ger2") (UNSPEC_MMA_PMXVI16GER2S "pmxvi16ger2s") (UNSPEC_MMA_PMXVF16GER2 "pmxvf16ger2") (UNSPEC_MMA_PMXVBF16GER2 "pmxvbf16ger2")]) +(define_int_attr vvi4i4i2_dm [(UNSPEC_MMA_PMXVI16GER2 "pmdmxvi16ger2") + (UNSPEC_MMA_PMXVI16GER2S "pmdmxvi16ger2s") + (UNSPEC_MMA_PMXVF16GER2 "pmdmxvf16ger2") + (UNSPEC_MMA_PMXVBF16GER2 "pmdmxvbf16ger2")]) + (define_int_attr avvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2PP "pmxvi16ger2pp") (UNSPEC_MMA_PMXVI16GER2SPP "pmxvi16ger2spp") (UNSPEC_MMA_PMXVF16GER2PP "pmxvf16ger2pp") @@ -245,25 +254,54 @@ (define_int_attr avvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2PP "pmxvi16ger2pp") (UNSPEC_MMA_PMXVBF16GER2NP "pmxvbf16ger2np") (UNSPEC_MMA_PMXVBF16GER2NN "pmxvbf16ger2nn")]) +(define_int_attr avvi4i4i2_dm [(UNSPEC_MMA_PMXVI16GER2PP "pmdmxvi16ger2pp") + (UNSPEC_MMA_PMXVI16GER2SPP "pmdmxvi16ger2spp") + (UNSPEC_MMA_PMXVF16GER2PP "pmdmxvf16ger2pp") + (UNSPEC_MMA_PMXVF16GER2PN "pmdmxvf16ger2pn") + (UNSPEC_MMA_PMXVF16GER2NP "pmdmxvf16ger2np") + (UNSPEC_MMA_PMXVF16GER2NN "pmdmxvf16ger2nn") + (UNSPEC_MMA_PMXVBF16GER2PP "pmdmxvbf16ger2pp") + (UNSPEC_MMA_PMXVBF16GER2PN "pmdmxvbf16ger2pn") + (UNSPEC_MMA_PMXVBF16GER2NP "pmdmxvbf16ger2np") + (UNSPEC_MMA_PMXVBF16GER2NN "pmdmxvbf16ger2nn")]) + (define_int_attr vvi4i4 [(UNSPEC_MMA_PMXVF32GER "pmxvf32ger")]) +(define_int_attr vvi4i4_dm [(UNSPEC_MMA_PMXVF32GER "pmdmxvf32ger")]) + (define_int_attr avvi4i4 [(UNSPEC_MMA_PMXVF32GERPP "pmxvf32gerpp") (UNSPEC_MMA_PMXVF32GERPN "pmxvf32gerpn") (UNSPEC_MMA_PMXVF32GERNP "pmxvf32gernp") (UNSPEC_MMA_PMXVF32GERNN "pmxvf32gernn")]) +(define_int_attr avvi4i4_dm [(UNSPEC_MMA_PMXVF32GERPP "pmdmxvf32gerpp") + (UNSPEC_MMA_PMXVF32GERPN "pmdmxvf32gerpn") + (UNSPEC_MMA_PMXVF32GERNP "pmdmxvf32gernp") + (UNSPEC_MMA_PMXVF32GERNN "pmdmxvf32gernn")]) + (define_int_attr pvi4i2 [(UNSPEC_MMA_PMXVF64GER "pmxvf64ger")]) +(define_int_attr pvi4i2_dm [(UNSPEC_MMA_PMXVF64GER "pmdmxvf64ger")]) + (define_int_attr apvi4i2 [(UNSPEC_MMA_PMXVF64GERPP "pmxvf64gerpp") (UNSPEC_MMA_PMXVF64GERPN "pmxvf64gerpn") (UNSPEC_MMA_PMXVF64GERNP "pmxvf64gernp") (UNSPEC_MMA_PMXVF64GERNN "pmxvf64gernn")]) +(define_int_attr apvi4i2_dm [(UNSPEC_MMA_PMXVF64GERPP "pmdmxvf64gerpp") + (UNSPEC_MMA_PMXVF64GERPN "pmdmxvf64gerpn") + (UNSPEC_MMA_PMXVF64GERNP "pmdmxvf64gernp") + (UNSPEC_MMA_PMXVF64GERNN "pmdmxvf64gernn")]) + (define_int_attr vvi4i4i4 [(UNSPEC_MMA_PMXVI8GER4 "pmxvi8ger4")]) +(define_int_attr vvi4i4i4_dm [(UNSPEC_MMA_PMXVI8GER4 "pmdmxvi8ger4")]) + (define_int_attr avvi4i4i4 [(UNSPEC_MMA_PMXVI8GER4PP "pmxvi8ger4pp") (UNSPEC_MMA_PMXVI8GER4SPP "pmxvi8ger4spp")]) +(define_int_attr avvi4i4i4_dm [(UNSPEC_MMA_PMXVI8GER4PP "pmdmxvi8ger4pp") + (UNSPEC_MMA_PMXVI8GER4SPP "pmdmxvi8ger4spp")]) ;; Vector pair support. OOmode can only live in VSRs. (define_expand "movoo" @@ -622,7 +660,10 @@ (define_insn "mma_" (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")] MMA_VV))] "TARGET_MMA" - " %A0,%x1,%x2" + "@ + dm %A0,%x1,%x2 + %A0,%x1,%x2 + %A0,%x1,%x2" [(set_attr "type" "mma") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -643,7 +684,10 @@ (define_insn "mma_" (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")] MMA_PV))] "TARGET_MMA" - " %A0,%x1,%x2" + "@ + dm %A0,%x1,%x2 + %A0,%x1,%x2 + %A0,%x1,%x2" [(set_attr "type" "mma") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -654,7 +698,10 @@ (define_insn "mma_" (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")] MMA_APV))] "TARGET_MMA" - " %A0,%x2,%x3" + "@ + dm %A0,%x2,%x3 + %A0,%x2,%x3 + %A0,%x2,%x3" [(set_attr "type" "mma") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -667,7 +714,10 @@ (define_insn "mma_" (match_operand:SI 5 "u8bit_cint_operand" "n,n,n")] MMA_VVI4I4I8))] "TARGET_MMA" - " %A0,%x1,%x2,%3,%4,%5" + "@ + dm %A0,%x1,%x2,%3,%4,%5 + %A0,%x1,%x2,%3,%4,%5 + %A0,%x1,%x2,%3,%4,%5" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -696,7 +746,10 @@ (define_insn "mma_" (match_operand:SI 5 "const_0_to_3_operand" "n,n,n")] MMA_VVI4I4I2))] "TARGET_MMA" - " %A0,%x1,%x2,%3,%4,%5" + "@ + %A0,%x1,%x2,%3,%4,%5 + %A0,%x1,%x2,%3,%4,%5 + %A0,%x1,%x2,%3,%4,%5" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -711,7 +764,10 @@ (define_insn "mma_" (match_operand:SI 6 "const_0_to_3_operand" "n,n,n")] MMA_AVVI4I4I2))] "TARGET_MMA" - " %A0,%x2,%x3,%4,%5,%6" + "@ + %A0,%x2,%x3,%4,%5,%6 + %A0,%x2,%x3,%4,%5,%6 + %A0,%x2,%x3,%4,%5,%6" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -724,7 +780,10 @@ (define_insn "mma_" (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")] MMA_VVI4I4))] "TARGET_MMA" - " %A0,%x1,%x2,%3,%4" + "@ + %A0,%x1,%x2,%3,%4 + %A0,%x1,%x2,%3,%4 + %A0,%x1,%x2,%3,%4" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -738,7 +797,10 @@ (define_insn "mma_" (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")] MMA_AVVI4I4))] "TARGET_MMA" - " %A0,%x2,%x3,%4,%5" + "@ + %A0,%x2,%x3,%4,%5 + %A0,%x2,%x3,%4,%5 + %A0,%x2,%x3,%4,%5" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -751,7 +813,10 @@ (define_insn "mma_" (match_operand:SI 4 "const_0_to_3_operand" "n,n,n")] MMA_PVI4I2))] "TARGET_MMA" - " %A0,%x1,%x2,%3,%4" + "@ + %A0,%x1,%x2,%3,%4 + %A0,%x1,%x2,%3,%4 + %A0,%x1,%x2,%3,%4" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -765,7 +830,10 @@ (define_insn "mma_" (match_operand:SI 5 "const_0_to_3_operand" "n,n,n")] MMA_APVI4I2))] "TARGET_MMA" - " %A0,%x2,%x3,%4,%5" + "@ + %A0,%x2,%x3,%4,%5 + %A0,%x2,%x3,%4,%5 + %A0,%x2,%x3,%4,%5" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -779,7 +847,10 @@ (define_insn "mma_" (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")] MMA_VVI4I4I4))] "TARGET_MMA" - " %A0,%x1,%x2,%3,%4,%5" + "@ + %A0,%x1,%x2,%3,%4,%5 + %A0,%x1,%x2,%3,%4,%5 + %A0,%x1,%x2,%3,%4,%5" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) @@ -794,7 +865,10 @@ (define_insn "mma_" (match_operand:SI 6 "const_0_to_15_operand" "n,n,n")] MMA_AVVI4I4I4))] "TARGET_MMA" - " %A0,%x2,%x3,%4,%5,%6" + "@ + %A0,%x2,%x3,%4,%5,%6 + %A0,%x2,%x3,%4,%5,%6 + %A0,%x2,%x3,%4,%5,%6" [(set_attr "type" "mma") (set_attr "prefixed" "yes") (set_attr "isa" "dm,not_dm,not_dm")]) diff --git a/gcc/testsuite/gcc.target/powerpc/dm-double-test.c b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c new file mode 100644 index 00000000000..66c19779585 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c @@ -0,0 +1,194 @@ +/* Test derived from mma-double-1.c, modified for dense math. */ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_dense_math_ok } */ +/* { dg-options "-mdejagnu-cpu=future -O2" } */ + +#include +#include +#include + +typedef unsigned char vec_t __attribute__ ((vector_size (16))); +typedef double v4sf_t __attribute__ ((vector_size (16))); +#define SAVE_ACC(ACC, ldc, J) \ + __builtin_mma_disassemble_acc (result, ACC); \ + rowC = (v4sf_t *) &CO[0*ldc+J]; \ + rowC[0] += result[0]; \ + rowC = (v4sf_t *) &CO[1*ldc+J]; \ + rowC[0] += result[1]; \ + rowC = (v4sf_t *) &CO[2*ldc+J]; \ + rowC[0] += result[2]; \ + rowC = (v4sf_t *) &CO[3*ldc+J]; \ + rowC[0] += result[3]; + +void +DM (int m, int n, int k, double *A, double *B, double *C) +{ + __vector_quad acc0, acc1, acc2, acc3, acc4, acc5, acc6, acc7; + v4sf_t result[4]; + v4sf_t *rowC; + for (int l = 0; l < n; l += 4) + { + double *CO; + double *AO; + AO = A; + CO = C; + C += m * 4; + for (int j = 0; j < m; j += 16) + { + double *BO = B; + __builtin_mma_xxsetaccz (&acc0); + __builtin_mma_xxsetaccz (&acc1); + __builtin_mma_xxsetaccz (&acc2); + __builtin_mma_xxsetaccz (&acc3); + __builtin_mma_xxsetaccz (&acc4); + __builtin_mma_xxsetaccz (&acc5); + __builtin_mma_xxsetaccz (&acc6); + __builtin_mma_xxsetaccz (&acc7); + unsigned long i; + + for (i = 0; i < k; i++) + { + vec_t *rowA = (vec_t *) & AO[i * 16]; + __vector_pair rowB; + vec_t *rb = (vec_t *) & BO[i * 4]; + __builtin_mma_assemble_pair (&rowB, rb[1], rb[0]); + __builtin_mma_xvf64gerpp (&acc0, rowB, rowA[0]); + __builtin_mma_xvf64gerpp (&acc1, rowB, rowA[1]); + __builtin_mma_xvf64gerpp (&acc2, rowB, rowA[2]); + __builtin_mma_xvf64gerpp (&acc3, rowB, rowA[3]); + __builtin_mma_xvf64gerpp (&acc4, rowB, rowA[4]); + __builtin_mma_xvf64gerpp (&acc5, rowB, rowA[5]); + __builtin_mma_xvf64gerpp (&acc6, rowB, rowA[6]); + __builtin_mma_xvf64gerpp (&acc7, rowB, rowA[7]); + } + SAVE_ACC (&acc0, m, 0); + SAVE_ACC (&acc2, m, 4); + SAVE_ACC (&acc1, m, 2); + SAVE_ACC (&acc3, m, 6); + SAVE_ACC (&acc4, m, 8); + SAVE_ACC (&acc6, m, 12); + SAVE_ACC (&acc5, m, 10); + SAVE_ACC (&acc7, m, 14); + AO += k * 16; + BO += k * 4; + CO += 16; + } + B += k * 4; + } +} + +void +init (double *matrix, int row, int column) +{ + for (int j = 0; j < column; j++) + { + for (int i = 0; i < row; i++) + { + matrix[j * row + i] = (i * 16 + 2 + j) / 0.123; + } + } +} + +void +init0 (double *matrix, double *matrix1, int row, int column) +{ + for (int j = 0; j < column; j++) + for (int i = 0; i < row; i++) + matrix[j * row + i] = matrix1[j * row + i] = 0; +} + + +void +print (const char *name, const double *matrix, int row, int column) +{ + printf ("Matrix %s has %d rows and %d columns:\n", name, row, column); + for (int i = 0; i < row; i++) + { + for (int j = 0; j < column; j++) + { + printf ("%f ", matrix[j * row + i]); + } + printf ("\n"); + } + printf ("\n"); +} + +int +main (int argc, char *argv[]) +{ + int rowsA, colsB, common; + int i, j, k; + int ret = 0; + + for (int t = 16; t <= 128; t += 16) + { + for (int t1 = 4; t1 <= 16; t1 += 4) + { + rowsA = t; + colsB = t1; + common = 1; + /* printf ("Running test for rows = %d,cols = %d\n", t, t1); */ + double A[rowsA * common]; + double B[common * colsB]; + double C[rowsA * colsB]; + double D[rowsA * colsB]; + + + init (A, rowsA, common); + init (B, common, colsB); + init0 (C, D, rowsA, colsB); + DM (rowsA, colsB, common, A, B, C); + + for (i = 0; i < colsB; i++) + { + for (j = 0; j < rowsA; j++) + { + D[i * rowsA + j] = 0; + for (k = 0; k < common; k++) + { + D[i * rowsA + j] += + A[k * rowsA + j] * B[k + common * i]; + } + } + } + for (i = 0; i < colsB; i++) + { + for (j = 0; j < rowsA; j++) + { + for (k = 0; k < common; k++) + { + if (D[i * rowsA + j] != C[i * rowsA + j]) + { + printf ("Error %d,%d,%d\n",i,j,k); + ret++; + } + } + } + } + if (ret) + { + print ("A", A, rowsA, common); + print ("B", B, common, colsB); + print ("C", C, rowsA, colsB); + print ("D", D, rowsA, colsB); + } + } + } + +#ifdef VERBOSE + if (ret) + printf ("DM double test fail: %d errors\n",ret); + else + printf ("DM double test success: 0 DM errors\n"); +#else + if (ret) + abort(); +#endif + + return ret; +} + +/* { dg-final { scan-assembler {\mdmsetdmrz\M} } } */ +/* { dg-final { scan-assembler {\mdmxvf64gerpp\M} } } */ +/* { dg-final { scan-assembler {\mdmxxextfdmr512\M} } } */ + diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 227e3004077..9586ed3ae47 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -6581,6 +6581,25 @@ proc check_effective_target_power10_ok { } { } } +# Return 1 if this is a PowerPC target supporting -mcpu=future or -mdense-math +# which enables the dense math operations. +proc check_effective_target_powerpc_dense_math_ok { } { + return [check_no_compiler_messages_nocache powerpc_dense_math_ok assembly { + __vector_quad vq; + void test (void) + { + #ifndef __PPC_DMR__ + #error "target does not have dense math support." + #else + /* Make sure we have dense math support. */ + __vector_quad dmr; + __asm__ ("dmsetaccz %A0" : "=wD" (dmr)); + vq = dmr; + #endif + } + } "-mcpu=future"] +} + # Return 1 if this is a PowerPC target supporting -mfloat128 via either # software emulation on power7/power8 systems or hardware support on power9.