From patchwork Fri Nov 10 22:54:56 2023
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 164021
Date: Fri, 10 Nov 2023 17:54:56 -0500
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Michael Meissner, Segher Boessenkool, "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: [PATCH, V2] Power10: Add options to disable load and store vector pair.

This is version 2 of the patch to add -mno-load-vector-pair and
-mno-store-vector-pair undocumented tuning switches.

The difference between the first version of the patch and this version is that
I added explicit RTL isa attributes for when the compiler can generate the load
vector pair and store vector pair instructions.
By having this attribute, the movoo insn has separate alternatives for when we
generate the instruction and when we want to split the instruction into 2
separate vector loads or stores.

In the first version of the patch, I had previously provided built-in functions
that would always generate load vector pair and store vector pair instructions
even if these instructions are normally disabled. I found these built-ins
weren't specified like the other vector pair built-ins, and I didn't include
documentation for the built-in functions. If we want such built-in functions,
we can add them as a separate patch later.

In addition, since both versions of the patch add #pragma target and attribute
support to change the results for individual functions, we can select on a
function by function basis what the defaults for load/store vector pair are.

The original text for the patch is:

In working on some future patches that involve utilizing vector pair
instructions, I wanted to be able to tune my program to enable or disable using
the vector pair load or store operations while still keeping the other
operations on the vector pair.

This patch adds two undocumented tuning options. The -mno-load-vector-pair
option would tell GCC to generate two load vector instructions instead of a
single load vector pair. The -mno-store-vector-pair option would tell GCC to
generate two store vector instructions instead of a single store vector pair.

If -mno-load-vector-pair is used, GCC will also not generate the indexed stxvpx
instruction. Similarly, if -mno-store-vector-pair is used, GCC will also not
generate the indexed lxvpx instruction. The reason for this is to enable
splitting the {,p}lxvp or {,p}stxvp instructions after reload without needing a
scratch GPR register.

The default for -mcpu=power10 is that both load vector pair and store vector
pair are enabled.
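
As a rough illustration of the behavior described above, here is a small
standalone C model. It is not GCC code: the struct and function names
(vp_options, vp_insn_counts, vp_indexed_address_ok, vp_model_copy) are
hypothetical and exist only for this sketch. It counts which instructions one
vector pair copy through memory would use for each option combination, and
whether indexed addressing stays available.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two tuning options; these names are not GCC's.  */
struct vp_options
{
  bool load_vector_pair;   /* -mload-vector-pair, on by default for power10.  */
  bool store_vector_pair;  /* -mstore-vector-pair, on by default for power10.  */
};

/* Instructions used for one vector pair copy through memory (*p = *q).  */
struct vp_insn_counts
{
  int lxvp;   /* Load vector pair.  */
  int lxv;    /* Single 128-bit vector load.  */
  int stxvp;  /* Store vector pair.  */
  int stxv;   /* Single 128-bit vector store.  */
};

/* Indexed addressing (lxvpx/stxvpx) is only kept when neither instruction is
   disabled, so that a split after reload never needs a scratch GPR.  */
bool
vp_indexed_address_ok (struct vp_options opts)
{
  return opts.load_vector_pair && opts.store_vector_pair;
}

/* Count the loads and stores for one 256-bit copy under OPTS.  */
struct vp_insn_counts
vp_model_copy (struct vp_options opts)
{
  struct vp_insn_counts n = { 0, 0, 0, 0 };

  if (opts.load_vector_pair)
    n.lxvp = 1;
  else
    n.lxv = 2;

  if (opts.store_vector_pair)
    n.stxvp = 1;
  else
    n.stxv = 2;

  /* Either way, exactly two 128-bit vectors are loaded and two are stored.  */
  assert (2 * n.lxvp + n.lxv == 2);
  assert (2 * n.stxvp + n.stxv == 2);
  return n;
}
```

Calling vp_model_copy with load_vector_pair set and store_vector_pair clear
gives one lxvp and two stxv, which is the combination that
-mno-store-vector-pair is meant to produce.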

I added code so that the user code can modify these settings using either a
'#pragma GCC target' directive or __attribute__((__target__(...))) in the
function declaration.

I added tests for the switches, #pragma, and attribute options.

I have built this on both little endian power10 systems and big endian power9
systems doing the normal bootstrap and test. There were no regressions in any
of the tests, and the new tests passed. Can I check this patch into the master
branch?

2023-11-09  Michael Meissner

gcc/

	* config/rs6000/mma.md (movoo): Add support for -mno-load-vector-pair and
	-mno-store-vector-pair.
	* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support for
	-mload-vector-pair and -mstore-vector-pair.
	(POWERPC_MASKS): Likewise.
	* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
	indexed mode for OOmode if we are generating both load vector pair and
	store vector pair instructions.
	(rs6000_option_override_internal): Add support for -mno-load-vector-pair
	and -mno-store-vector-pair.
	(rs6000_opt_masks): Likewise.
	* config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp attributes.
	(enabled attribute): Likewise.
	* config/rs6000/rs6000.opt (-mload-vector-pair): New option.
	(-mstore-vector-pair): Likewise.

gcc/testsuite/

	* gcc.target/powerpc/vector-pair-attribute.c: New test.
	* gcc.target/powerpc/vector-pair-pragma.c: New test.
	* gcc.target/powerpc/vector-pair-switch1.c: New test.
	* gcc.target/powerpc/vector-pair-switch2.c: New test.
	* gcc.target/powerpc/vector-pair-switch3.c: New test.
	* gcc.target/powerpc/vector-pair-switch4.c: New test.
---
 gcc/config/rs6000/mma.md                      | 19 +++++--
 gcc/config/rs6000/rs6000-cpus.def             |  8 ++-
 gcc/config/rs6000/rs6000.cc                   | 30 +++++++++-
 gcc/config/rs6000/rs6000.md                   | 10 +++-
 gcc/config/rs6000/rs6000.opt                  |  8 +++
 .../powerpc/vector-pair-attribute.c           | 39 +++++++++++++
 .../gcc.target/powerpc/vector-pair-pragma.c   | 55 +++++++++++++++++++
 .../gcc.target/powerpc/vector-pair-switch1.c  | 16 ++++++
 .../gcc.target/powerpc/vector-pair-switch2.c  | 17 ++++++
 .../gcc.target/powerpc/vector-pair-switch3.c  | 17 ++++++
 .../gcc.target/powerpc/vector-pair-switch4.c  | 17 ++++++
 11 files changed, 225 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-switch2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-switch3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vector-pair-switch4.c

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 575751d477e..3efb94be84f 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -292,27 +292,34 @@ (define_expand "movoo"
     gcc_assert (false);
 })
 
+;; If the user used -mno-store-vector-pair or -mno-load-vector-pair, use an
+;; alternative that does not allow indexed addresses so we can split the load
+;; or store.
 (define_insn_and_split "*movoo"
-  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
-        (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,wa,ZwO,QwO,wa")
+        (match_operand:OO 1 "input_operand" "ZwO,QwO,wa,wa,wa"))]
   "TARGET_MMA
    && (gpc_reg_operand (operands[0], OOmode)
        || gpc_reg_operand (operands[1], OOmode))"
   "@
    lxvp%X1 %x0,%1
+   #
    stxvp%X0 %x1,%0
+   #
    #"
   "&& reload_completed
-   && (!MEM_P (operands[0]) && !MEM_P (operands[1]))"
+   && ((MEM_P (operands[0]) && !TARGET_STORE_VECTOR_PAIR)
+       || (MEM_P (operands[1]) && !TARGET_LOAD_VECTOR_PAIR)
+       || (!MEM_P (operands[0]) && !MEM_P (operands[1])))"
   [(const_int 0)]
 {
   rs6000_split_multireg_move (operands[0], operands[1]);
   DONE;
 }
-  [(set_attr "type" "vecload,vecstore,veclogical")
+  [(set_attr "type" "vecload,vecload,vecstore,vecstore,veclogical")
    (set_attr "size" "256")
-   (set_attr "length" "*,*,8")])
-
+   (set_attr "length" "*,8,*,8,8")
+   (set_attr "isa" "lxvp,*,stxvp,*,*")])
 
 ;; Vector quad support.  XOmode can only live in FPRs.
 (define_expand "movxo"
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 4f350da378c..75435a52d1a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -77,10 +77,12 @@
 /* Flags that need to be turned off if -mno-power10.  */
 /* We comment out PCREL_OPT here to disable it by default because SPEC2017
    performance was degraded by it.  */
-#define OTHER_POWER10_MASKS     (OPTION_MASK_MMA                        \
+#define OTHER_POWER10_MASKS     (OPTION_MASK_LOAD_VECTOR_PAIR           \
+                                 | OPTION_MASK_MMA                      \
                                  | OPTION_MASK_PCREL                    \
                                  /* | OPTION_MASK_PCREL_OPT */          \
-                                 | OPTION_MASK_PREFIXED)
+                                 | OPTION_MASK_PREFIXED                 \
+                                 | OPTION_MASK_STORE_VECTOR_PAIR)
 
 #define ISA_3_1_MASKS_SERVER    (ISA_3_0_MASKS_SERVER                   \
                                  | OPTION_MASK_POWER10                  \
@@ -130,6 +132,7 @@
                                  | OPTION_MASK_FLOAT128_HW              \
                                  | OPTION_MASK_FLOAT128_KEYWORD         \
                                  | OPTION_MASK_FPRND                    \
+                                 | OPTION_MASK_LOAD_VECTOR_PAIR         \
                                  | OPTION_MASK_POWER10                  \
                                  | OPTION_MASK_P10_FUSION               \
                                  | OPTION_MASK_HTM                      \
@@ -156,6 +159,7 @@
                                  | OPTION_MASK_QUAD_MEMORY_ATOMIC       \
                                  | OPTION_MASK_RECIP_PRECISION          \
                                  | OPTION_MASK_SOFT_FLOAT               \
+                                 | OPTION_MASK_STORE_VECTOR_PAIR        \
                                  | OPTION_MASK_STRICT_ALIGN_OPTIONAL    \
                                  | OPTION_MASK_VSX)
 
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5f56c3ed85b..db60d3ca960 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -2711,7 +2711,9 @@ rs6000_setup_reg_addr_masks (void)
           /* Vector pairs can do both indexed and offset loads if the
              instructions are enabled, otherwise they can only do offset loads
              since it will be broken into two vector moves.  Vector quads can
-             only do offset loads.  */
+             only do offset loads.  If the user restricted generation of either
+             of the LXVP or STXVP instructions, do not allow indexed mode so
+             that we can split the load/store.  */
           else if ((addr_mask != 0) && TARGET_MMA
                    && (m2 == OOmode || m2 == XOmode))
             {
@@ -2719,7 +2721,9 @@ rs6000_setup_reg_addr_masks (void)
               if (rc == RELOAD_REG_FPR || rc == RELOAD_REG_VMX)
                 {
                   addr_mask |= RELOAD_REG_QUAD_OFFSET;
-                  if (m2 == OOmode)
+                  if (m2 == OOmode
+                      && TARGET_LOAD_VECTOR_PAIR
+                      && TARGET_STORE_VECTOR_PAIR)
                     addr_mask |= RELOAD_REG_INDEXED;
                 }
             }
@@ -4405,6 +4409,26 @@ rs6000_option_override_internal (bool global_init_p)
       rs6000_isa_flags &= ~OPTION_MASK_MMA;
     }
 
+  /* Warn if -mload-vector-pair or -mstore-vector-pair are used and MMA is
+     not set.  */
+  if (!TARGET_MMA && TARGET_LOAD_VECTOR_PAIR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_LOAD_VECTOR_PAIR) != 0)
+        warning (0, "%qs should not be used unless you use %qs",
+                 "-mload-vector-pair", "-mmma");
+
+      rs6000_isa_flags &= ~OPTION_MASK_LOAD_VECTOR_PAIR;
+    }
+
+  if (!TARGET_MMA && TARGET_STORE_VECTOR_PAIR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_STORE_VECTOR_PAIR) != 0)
+        warning (0, "%qs should not be used unless you use %qs",
+                 "-mstore-vector-pair", "-mmma");
+
+      rs6000_isa_flags &= ~OPTION_MASK_STORE_VECTOR_PAIR;
+    }
+
   /* Enable power10 fusion if we are tuning for power10, even if we aren't
      generating power10 instructions.  */
   if (!(rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION))
@@ -24437,6 +24461,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "hard-dfp",                 OPTION_MASK_DFP,                false, true  },
   { "htm",                      OPTION_MASK_HTM,                false, true  },
   { "isel",                     OPTION_MASK_ISEL,               false, true  },
+  { "load-vector-pair",         OPTION_MASK_LOAD_VECTOR_PAIR,   false, true  },
   { "mfcrf",                    OPTION_MASK_MFCRF,              false, true  },
   { "mfpgpr",                   0,                              false, true  },
   { "mma",                      OPTION_MASK_MMA,                false, true  },
@@ -24461,6 +24486,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "quad-memory-atomic",       OPTION_MASK_QUAD_MEMORY_ATOMIC, false, true  },
   { "recip-precision",          OPTION_MASK_RECIP_PRECISION,    false, true  },
   { "save-toc-indirect",        OPTION_MASK_SAVE_TOC_INDIRECT,  false, true  },
+  { "store-vector-pair",        OPTION_MASK_STORE_VECTOR_PAIR,  false, true  },
   { "string",                   0,                              false, true  },
   { "update",                   OPTION_MASK_NO_UPDATE,          true , true  },
   { "vsx",                      OPTION_MASK_VSX,                false, true  },
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2a1b5ecfaee..dcf1f3526f5 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -355,7 +355,7 @@ (define_attr "cpu"
   (const (symbol_ref "(enum attr_cpu) rs6000_tune")))
 
 ;; The ISA we implement.
-(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
+(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10,lxvp,stxvp"
   (const_string "any"))
 
 ;; Is this alternative enabled for the current CPU/ISA/etc.?
@@ -403,6 +403,14 @@ (define_attr "enabled" ""
      (and (eq_attr "isa" "p10")
           (match_test "TARGET_POWER10"))
      (const_int 1)
+
+     (and (eq_attr "isa" "lxvp")
+          (match_test "TARGET_LOAD_VECTOR_PAIR"))
+     (const_int 1)
+
+     (and (eq_attr "isa" "stxvp")
+          (match_test "TARGET_STORE_VECTOR_PAIR"))
+     (const_int 1)
     ] (const_int 0)))
 
 ;; If this instruction is microcoded on the CELL processor
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index bde6d3ff664..369095df9ed 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -597,6 +597,14 @@ mmma
 Target Mask(MMA) Var(rs6000_isa_flags)
 Generate (do not generate) MMA instructions.
 
+mload-vector-pair
+Target Undocumented Mask(LOAD_VECTOR_PAIR) Var(rs6000_isa_flags)
+Generate (do not generate) load vector pair instructions.
+
+mstore-vector-pair
+Target Undocumented Mask(STORE_VECTOR_PAIR) Var(rs6000_isa_flags)
+Generate (do not generate) store vector pair instructions.
+
 mrelative-jumptables
 Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c
new file mode 100644
index 00000000000..985a44aca85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test if we can control generating load and store vector pair via the target
+   attribute.  */
+
+__attribute__((__target__("load-vector-pair,store-vector-pair")))
+void
+test_load_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 1 lxvp, 1 stxvp.  */
+}
+
+__attribute__((__target__("load-vector-pair,no-store-vector-pair")))
+void
+test_load_no_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 1 lxvp, 2 stxv.  */
+}
+
+__attribute__((__target__("no-load-vector-pair,store-vector-pair")))
+void
+test_store_no_load (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 2 lxv, 1 stxvp.  */
+}
+
+__attribute__((__target__("no-load-vector-pair,no-store-vector-pair")))
+void
+test_no_load_or_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 2 lxv, 2 stxv.  */
+}
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   4 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c
new file mode 100644
index 00000000000..74c6baf8185
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c
@@ -0,0 +1,55 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test if we can control generating load and store vector pair via the #pragma
+   directive.  */
+
+#pragma GCC push_options
+#pragma GCC target("load-vector-pair,store-vector-pair")
+
+void
+test_load_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 1 lxvp, 1 stxvp.  */
+}
+
+#pragma GCC pop_options
+
+#pragma GCC push_options
+#pragma GCC target("load-vector-pair,no-store-vector-pair")
+
+void
+test_load_no_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 1 lxvp, 2 stxv.  */
+}
+
+#pragma GCC pop_options
+
+#pragma GCC push_options
+#pragma GCC target("no-load-vector-pair,store-vector-pair")
+
+void
+test_store_no_load (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 2 lxv, 1 stxvp.  */
+}
+
+#pragma GCC pop_options
+
+#pragma GCC push_options
+#pragma GCC target("no-load-vector-pair,no-store-vector-pair")
+
+void
+test_no_load_or_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 2 lxv, 2 stxv.  */
+}
+
+#pragma GCC pop_options
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   4 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c
new file mode 100644
index 00000000000..48e433b378e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test if we generate load and store vector pair by default on power10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 1 lxvp, 1 stxvp.  */
+}
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 1 } } */
+/* { dg-final { scan-assembler-not   {\mp?lxvx?\M}     } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvx?\M}    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch2.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch2.c
new file mode 100644
index 00000000000..2a38c2f2aae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mno-store-vector-pair" } */
+
+/* Test if we generate load vector pair but not store vector pair if
+   -mno-store-vector-pair is used on power10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 1 lxvp, 2 stxv.  */
+}
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  1 } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvpx?\M}   } } */
+/* { dg-final { scan-assembler-not   {\mp?lxvx?\M}     } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch3.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch3.c
new file mode 100644
index 00000000000..fd273056b8f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mno-load-vector-pair" } */
+
+/* Test if we do not generate load vector pair but generate store vector pair
+   if -mno-load-vector-pair is used on power10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 2 lxv, 1 stxvp.  */
+}
+
+/* { dg-final { scan-assembler-not   {\mp?lxvpx?\M}    } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   2 } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvx?\M}    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch4.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch4.c
new file mode 100644
index 00000000000..01686e073fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mno-load-vector-pair -mno-store-vector-pair" } */
+
+/* Test if we do not generate load and store vector pair if directed to on
+   power10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;			/* 2 lxv, 2 stxv.  */
+}
+
+/* { dg-final { scan-assembler-not   {\mp?lxvpx?\M}    } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvpx?\M}   } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  2 } } */
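
For readers who want to reason about the option handling without building GCC,
the following standalone C sketch models two pieces of the patch: clearing the
vector pair option bits when MMA is not enabled, and the post-reload split
condition of the *movoo pattern. It is only a model under stated assumptions.
The MODEL_MASK_* bit values and both function names are hypothetical; the real
OPTION_MASK_* values are generated from rs6000.opt and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for the model, not the real OPTION_MASK_* values.  */
#define MODEL_MASK_MMA                (UINT64_C (1) << 0)
#define MODEL_MASK_LOAD_VECTOR_PAIR   (UINT64_C (1) << 1)
#define MODEL_MASK_STORE_VECTOR_PAIR  (UINT64_C (1) << 2)

/* Without MMA the vector pair options have no effect, so clear both bits.
   Clearing a bit needs the complement of its mask: FLAGS &= ~MASK.  */
uint64_t
model_override_flags (uint64_t isa_flags)
{
  if ((isa_flags & MODEL_MASK_MMA) == 0)
    isa_flags &= ~(MODEL_MASK_LOAD_VECTOR_PAIR | MODEL_MASK_STORE_VECTOR_PAIR);
  return isa_flags;
}

/* Model of the split condition in *movoo: split after reload when the move
   is a store and paired stores are disabled, a load and paired loads are
   disabled, or a register to register move.  */
bool
model_movoo_needs_split (bool dest_is_mem, bool src_is_mem, uint64_t isa_flags)
{
  bool load_ok = (isa_flags & MODEL_MASK_LOAD_VECTOR_PAIR) != 0;
  bool store_ok = (isa_flags & MODEL_MASK_STORE_VECTOR_PAIR) != 0;

  /* The insn condition requires at least one operand to be a register.  */
  assert (!(dest_is_mem && src_is_mem));

  return ((dest_is_mem && !store_ok)
          || (src_is_mem && !load_ok)
          || (!dest_is_mem && !src_is_mem));
}
```

Note that writing FLAGS &= MASK instead of FLAGS &= ~MASK would keep only the
masked bit and clear every other flag, which is why the complement matters in
model_override_flags.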