From patchwork Mon Jul 3 06:30:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 115197
Message-ID: <08fef5da-91b2-9e26-4fb2-ce4bb72293e9@linux.ibm.com>
Date: Mon, 3 Jul 2023 14:30:01 +0800
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
Subject: [PATCH, rs6000] Extract the element in dword0 by mfvsrd and shift/mask [PR110331]
From: HAO CHEN GUI

Hi,
  This patch implements vector element extraction with mfvsrd and
shift/mask when the element is in dword0 of the vector. Currently GCC
generates vsplat/mfvsrd on P8 and li/vextract on P9. Since mfvsrd has
lower latency than vextract and rldicl has lower latency than vsplat, the
new sequence is cheaper. In particular, no shift/mask is needed when the
element is the first element of dword0, which saves another rldicl when a
sign-extended value is returned.

  This patch is based on the previous one:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622101.html

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen

ChangeLog
rs6000: Extract the element in dword0 by mfvsrd and shift/mask

gcc/
	PR target/110331
	* config/rs6000/rs6000-protos.h (rs6000_vsx_element_in_dword0_p):
	Declare.
	(rs6000_vsx_extract_element_from_dword0): Declare.
	* config/rs6000/rs6000.cc (rs6000_vsx_element_in_dword0_p): New
	function to judge if an element is in dword0 of a vector.
	(rs6000_vsx_extract_element_from_dword0): Extract an element from
	dword0 by mfvsrd and lshiftrt and mask.
	* config/rs6000/rs6000.md (*rotl3_mask): Rename to...
	(rotl3_mask): ...this.
	* config/rs6000/vsx.md (vsx_extract_): Add a comment.
	(split pattern for p9 vector extract): Call
	rs6000_vsx_extract_element_from_dword0 if the element is in dword0.
	(*vsx_extract__di_p9): Exclude the elements in dword0, which are
	processed by *vsx_extract__zero_extend for both p8 and p9.
	(*vsx_extract__zero_extend): Zero extend pattern for vector extract
	on the element of dword0.
	(*vsx_extract__p8): Call rs6000_vsx_extract_element_from_dword0 when
	the extracted element is in dword0. Refine the pattern and remove
	reload_completed from the split condition.

gcc/testsuite/
	PR target/110331
	* gcc.target/powerpc/fold-vec-extract-char.p8.c: Set the extracted
	elements in dword1.
	* gcc.target/powerpc/fold-vec-extract-char.p9.c: Likewise.
* gcc.target/powerpc/fold-vec-extract-int.p8.c: Likewise. * gcc.target/powerpc/fold-vec-extract-int.p9.c: Likewise. * gcc.target/powerpc/fold-vec-extract-short.p8.c: Likewise. * gcc.target/powerpc/fold-vec-extract-short.p9.c: Likewise. * gcc.target/powerpc/p9-extract-1.c: Likewise. * gcc.target/powerpc/pr110331-p8.c: New. * gcc.target/powerpc/pr110331-p9.c: New. * gcc.target/powerpc/pr110331.h: New. patch.diff diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index f70118ea40f..ccef280122b 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -161,6 +161,8 @@ extern bool rs6000_function_pcrel_p (struct function *); extern bool rs6000_pcrel_p (void); extern bool rs6000_fndecl_pcrel_p (const_tree); extern void rs6000_output_addr_vec_elt (FILE *, int); +extern bool rs6000_vsx_element_in_dword0_p (rtx, enum machine_mode); +extern void rs6000_vsx_extract_element_from_dword0 (rtx, rtx, rtx, bool); /* Different PowerPC instruction formats that are used by GCC. There are various other instruction formats used by the PowerPC hardware, but these diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 07c3a3d15ac..fad01d6b5dd 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -29098,6 +29098,74 @@ rs6000_opaque_type_invalid_use_p (gimple *stmt) return false; } +/* Return true when the element is in dword0 of a vector. Exclude word + element 1 of VS4SI as the word can be extracted by mfvsrwz directly. */ + +bool +rs6000_vsx_element_in_dword0_p (rtx op, enum machine_mode mode) +{ + gcc_assert (CONST_INT_P (op)); + gcc_assert (mode == V16QImode || mode == V8HImode || mode == V4SImode); + + int units = GET_MODE_NUNITS (mode); + int elt = INTVAL (op); + elt = BYTES_BIG_ENDIAN ? 
units - 1 - elt : elt; + + if (elt > units / 2 + || (elt == units / 2 && mode != V4SImode)) + return true; + else + return false; +} + +/* Extract element from dword0 by mfvsrd and lshiftrt and mask. Extend_p + indicates if zero extend is needed or not. */ + +void +rs6000_vsx_extract_element_from_dword0 (rtx dest, rtx src, rtx element, + bool extend_p) +{ + enum machine_mode mode = GET_MODE (src); + gcc_assert (rs6000_vsx_element_in_dword0_p (element, mode)); + + enum machine_mode dest_mode = GET_MODE (dest); + enum machine_mode inner_mode = GET_MODE_INNER (mode); + int units = GET_MODE_NUNITS (mode); + int elt = INTVAL (element); + elt = BYTES_BIG_ENDIAN ? units - 1 - elt : elt; + int value, shift; + unsigned int mask; + + rtx vec_tmp = gen_lowpart (V2DImode, src); + rtx tmp1 = can_create_pseudo_p () + ? gen_reg_rtx (DImode) + : simplify_gen_subreg (DImode, dest, dest_mode, 0); + value = BYTES_BIG_ENDIAN ? 0 : 1; + emit_insn (gen_vsx_extract_v2di (tmp1, vec_tmp, GEN_INT (value))); + + rtx tmp2; + shift = (elt - units / 2) * GET_MODE_BITSIZE (inner_mode); + if (shift || extend_p) + { + tmp2 = (dest_mode == DImode) + ? dest + : (can_create_pseudo_p () + ? gen_reg_rtx (DImode) + : simplify_gen_subreg (DImode, dest, dest_mode, 0)); + mask = (1ULL << GET_MODE_BITSIZE (inner_mode)) - 1; + rtx shift_op = gen_rtx_LSHIFTRT (DImode, tmp1, GEN_INT (shift)); + emit_insn (gen_rotldi3_mask (tmp2, tmp1, GEN_INT (shift), GEN_INT (mask), + shift_op)); + } + else + tmp2 = tmp1; + + if (dest_mode != DImode) + emit_move_insn (dest, gen_lowpart (dest_mode, tmp2)); + + return; +} + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-rs6000.h" diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index cdab49fbb91..f46b65c0107 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -4104,7 +4104,7 @@ (define_insn "*eqv3" ;; Rotate-and-mask and insert. 
-(define_insn "*rotl3_mask" +(define_insn "rotl3_mask" [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") (and:GPR (match_operator:GPR 4 "rotate_mask_operator" [(match_operand:GPR 1 "gpc_reg_operand" "r") diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 9075644271a..10c6428e66f 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -3754,7 +3754,9 @@ (define_expand "vsx_extract_" (clobber (match_scratch:VSX_EXTRACT_I 3))])] "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT" { - /* If we have ISA 3.0, we can do a xxextractuw/vextractu{b,h}. */ + /* If we have ISA 3.0, we can do a xxextractuw/vextractu{b,h}. But if + the element is word element 1 of a V4SI, it can be extracted by + mfvsrwz directly. */ if (TARGET_P9_VECTOR && (mode != V4SImode || INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 1 : 2))) @@ -3805,10 +3807,20 @@ (define_split "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB && reload_completed" [(const_int 0)] { + /* If the element is in dword0, it can be extracted by mfvsrd and lshiftrt + and mask. */ + if (rs6000_vsx_element_in_dword0_p (operands[2], mode)) + { + rs6000_vsx_extract_element_from_dword0 (operands[0], operands[1], + operands[2], false); + DONE; + } + rtx op0_si = gen_rtx_REG (SImode, REGNO (operands[0])); rtx op1 = operands[1]; rtx op2 = operands[2]; rtx op3 = operands[3]; + HOST_WIDE_INT offset = INTVAL (op2) * GET_MODE_UNIT_SIZE (mode); emit_move_insn (op3, GEN_INT (offset)); @@ -3829,7 +3841,8 @@ (define_insn_and_split "*vsx_extract__di_p9" (clobber (match_scratch:SI 3 "=r,X"))] "TARGET_VEXTRACTUB && (mode != V4SImode - || INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 1 : 2))" + || INTVAL (operands[2]) != (BYTES_BIG_ENDIAN ? 
1 : 2)) + && !rs6000_vsx_element_in_dword0_p (operands[2], mode)" "#" "&& reload_completed" [(parallel [(set (match_dup 4) @@ -3837,6 +3850,7 @@ (define_insn_and_split "*vsx_extract__di_p9" (match_dup 1) (parallel [(match_dup 2)]))) (clobber (match_dup 3))])] + { operands[4] = gen_rtx_REG (mode, REGNO (operands[0])); } @@ -3902,6 +3916,14 @@ (define_insn_and_split "vsx_extract_si" rtx element = operands[2]; rtx vec_tmp; + /* If the element is in dword0, it can be extracted by mfvsrd and lshiftrt + and mask. */ + if (rs6000_vsx_element_in_dword0_p (element, V4SImode)) + { + rs6000_vsx_extract_element_from_dword0 (dest, src, element, false); + DONE; + } + if (GET_CODE (operands[3]) == SCRATCH) vec_tmp = gen_reg_rtx (V4SImode); else @@ -3923,49 +3945,78 @@ (define_insn_and_split "vsx_extract_si" (set_attr "length" "4,4,4,0") (set_attr "isa" "p8v,*,p8v,*")]) +(define_insn_and_split "*vsx_extract__zero_extend" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (zero_extend:DI + (vec_select: + (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "v") + (parallel [(match_operand:QI 2 "" "n")])))) + (clobber (match_scratch:VSX_EXTRACT_I 3 "=v"))] + "TARGET_DIRECT_MOVE_64BIT + && rs6000_vsx_element_in_dword0_p (operands[2], mode)" + "#" + "&& 1" + [(const_int 0)] +{ + rtx dest = operands[0]; + rtx src = operands[1]; + rtx element = operands[2]; + + rs6000_vsx_extract_element_from_dword0 (dest, src, element, true); + DONE; +} + [(set_attr "type" "mfvsr")]) + (define_insn_and_split "*vsx_extract__p8" - [(set (match_operand: 0 "nonimmediate_operand" "=r") + [(set (match_operand: 0 "gpc_reg_operand" "=r") (vec_select: (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand" "v") (parallel [(match_operand:QI 2 "" "n")]))) (clobber (match_scratch:VSX_EXTRACT_I2 3 "=v"))] - "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT - && !TARGET_P9_VECTOR" + "TARGET_DIRECT_MOVE_64BIT && !TARGET_P9_VECTOR" "#" - "&& reload_completed" + "&& 1" [(const_int 0)] { rtx dest = operands[0]; rtx src = 
operands[1]; rtx element = operands[2]; - rtx vec_tmp = operands[3]; + rtx vec_tmp; int value; + int num_elt = GET_MODE_NUNITS (mode); + enum machine_mode dest_mode = GET_MODE (dest); - if (!BYTES_BIG_ENDIAN) - element = GEN_INT (GET_MODE_NUNITS (mode) - 1 - INTVAL (element)); + if (rs6000_vsx_element_in_dword0_p (element, mode)) + { + rs6000_vsx_extract_element_from_dword0 (dest, src, element, false); + DONE; + } - /* If the value is in the correct position, we can avoid doing the VSPLT - instruction. */ + if (GET_CODE (operands[3]) == SCRATCH) + vec_tmp = gen_reg_rtx (mode); + else + vec_tmp = operands[3]; + + if (!BYTES_BIG_ENDIAN) + element = GEN_INT (num_elt - 1 - INTVAL (element)); value = INTVAL (element); + if (mode == V16QImode) - { - if (value != 7) - emit_insn (gen_altivec_vspltb_direct (vec_tmp, src, element)); - else - vec_tmp = src; - } + emit_insn (gen_altivec_vspltb_direct (vec_tmp, src, element)); else if (mode == V8HImode) - { - if (value != 3) - emit_insn (gen_altivec_vsplth_direct (vec_tmp, src, element)); - else - vec_tmp = src; - } + emit_insn (gen_altivec_vsplth_direct (vec_tmp, src, element)); else gcc_unreachable (); - emit_move_insn (gen_rtx_REG (DImode, REGNO (dest)), - gen_rtx_REG (DImode, REGNO (vec_tmp))); + value = BYTES_BIG_ENDIAN ? 0 : 1; + rtx tmp1 = can_create_pseudo_p () + ? 
gen_reg_rtx (DImode) + : simplify_gen_subreg (DImode, dest, dest_mode, 0); + rtx vec_tmp1 = gen_lowpart (V2DImode, vec_tmp); + emit_insn (gen_vsx_extract_v2di (tmp1, vec_tmp1, GEN_INT (value))); + + emit_move_insn (dest, gen_lowpart (dest_mode, tmp1)); + DONE; } [(set_attr "type" "mfvsr")]) diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c index f3b9556b2e6..caead22beca 100644 --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c @@ -49,21 +49,27 @@ testuc_var (vector unsigned char vuc2, signed int si) return vec_extract (vuc2, si); } +#ifdef __BIG_ENDIAN__ +#define LANE 12 +#else +#define LANE 3 +#endif + unsigned char testbc_cst (vector bool char vbc2) { - return vec_extract (vbc2, 12); + return vec_extract (vbc2, LANE); } signed char testsc_cst (vector signed char vsc2) { - return vec_extract (vsc2, 12); + return vec_extract (vsc2, LANE); } unsigned char testuc_cst (vector unsigned char vuc2) { - return vec_extract (vuc2, 12); + return vec_extract (vuc2, LANE); } diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p9.c index 8a4c380edad..6b7c3f0c9eb 100644 --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p9.c +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p9.c @@ -34,21 +34,27 @@ testuc_var (vector unsigned char vuc2, signed int si) return vec_extract (vuc2, si); } +#ifdef __BIG_ENDIAN__ +#define LANE 12 +#else +#define LANE 3 +#endif + unsigned char testbc_cst (vector bool char vbc2) { - return vec_extract (vbc2, 12); + return vec_extract (vbc2, LANE); } signed char testsc_cst (vector signed char vsc2) { - return vec_extract (vsc2, 12); + return vec_extract (vsc2, LANE); } unsigned char testuc_cst (vector unsigned char vuc2) { - return vec_extract (vuc2, 12); + return vec_extract (vuc2, LANE); } 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p8.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p8.c index f5f953320d8..961bf2b2a9f 100644 --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p8.c +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p8.c @@ -54,21 +54,27 @@ testui_var (vector unsigned int vui2, signed int si) return vec_extract (vui2, si); } +#ifdef __BIG_ENDIAN__ +#define LANE 11 +#else +#define LANE 0 +#endif + unsigned int testbi_cst (vector bool int vbi2) { - return vec_extract (vbi2, 12); + return vec_extract (vbi2, LANE); } signed int testsi_cst (vector signed int vsi2) { - return vec_extract (vsi2, 12); + return vec_extract (vsi2, LANE); } unsigned int testui_cst (vector unsigned int vui2) { - return vec_extract (vui2, 12); + return vec_extract (vui2, LANE); } diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p9.c index 1abf19da40d..46d6a7ce140 100644 --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p9.c +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-int.p9.c @@ -40,21 +40,27 @@ testui_var (vector unsigned int vui2, signed int si) return vec_extract (vui2, si); } +#ifdef __BIG_ENDIAN__ +#define LANE 11 +#else +#define LANE 0 +#endif + unsigned int testbi_cst (vector bool int vbi2) { - return vec_extract (vbi2, 12); + return vec_extract (vbi2, LANE); } signed int testsi_cst (vector signed int vsi2) { - return vec_extract (vsi2, 12); + return vec_extract (vsi2, LANE); } unsigned int testui_cst (vector unsigned int vui2) { - return vec_extract (vui2, 12); + return vec_extract (vui2, LANE); } diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p8.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p8.c index 0ddecb4e4b5..92ea1f54dec 100644 --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p8.c +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p8.c @@ 
-38,22 +38,28 @@ #include +#ifdef __BIG_ENDIAN__ +#define LANE 15 +#else +#define LANE 1 +#endif + unsigned short testbi_cst (vector bool short vbs2) { - return vec_extract (vbs2, 12); + return vec_extract (vbs2, LANE); } signed short testsi_cst (vector signed short vss2) { - return vec_extract (vss2, 12); + return vec_extract (vss2, LANE); } unsigned short testui_cst12 (vector unsigned short vus2) { - return vec_extract (vus2, 12); + return vec_extract (vus2, LANE); } unsigned short diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p9.c index fac35cb792f..4cb3fa7d313 100644 --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p9.c +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-extract-short.p9.c @@ -17,22 +17,28 @@ #include +#ifdef __BIG_ENDIAN__ +#define LANE 15 +#else +#define LANE 1 +#endif + unsigned short testbi_cst (vector bool short vbs2) { - return vec_extract (vbs2, 12); + return vec_extract (vbs2, LANE); } signed short testsi_cst (vector signed short vss2) { - return vec_extract (vss2, 12); + return vec_extract (vss2, LANE); } unsigned short testui_cst12 (vector unsigned short vus2) { - return vec_extract (vus2, 12); + return vec_extract (vus2, LANE); } unsigned short diff --git a/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c b/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c index d7d3ad77aea..2e5f2d11bf0 100644 --- a/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c +++ b/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c @@ -9,123 +9,124 @@ #include +#ifdef __BIG_ENDIAN__ +#define DWORD1_FIRST_INT 3 +#define DWORD1_LAST_INT 2 +#define DWORD1_FIRST_SHORT 7 +#define DWORD1_LAST_SHORT 4 +#define DWORD1_FIRST_CHAR 15 +#define DWORD1_LAST_CHAR 8 +#else +#define DWORD1_FIRST_INT 0 +#define DWORD1_LAST_INT 1 +#define DWORD1_FIRST_SHORT 0 +#define DWORD1_LAST_SHORT 3 +#define DWORD1_FIRST_CHAR 0 +#define DWORD1_LAST_CHAR 7 +#endif + int extract_int_0 (vector int a) { - int 
c = 0; - int b = vec_extract (a, c); + int b = vec_extract (a, DWORD1_FIRST_INT); return b; } int extract_int_3 (vector int a) { - int c = 3; - int b = vec_extract (a, c); + int b = vec_extract (a, DWORD1_LAST_INT); return b; } unsigned int extract_uint_0 (vector unsigned int a) { - int c = 0; - unsigned int b = vec_extract (a, c); + unsigned int b = vec_extract (a, DWORD1_FIRST_INT); return b; } unsigned int extract_uint_3 (vector unsigned int a) { - int c = 3; - unsigned int b = vec_extract (a, c); + unsigned int b = vec_extract (a, DWORD1_LAST_INT); return b; } short extract_short_0 (vector short a) { - int c = 0; - short b = vec_extract (a, c); + short b = vec_extract (a, DWORD1_FIRST_SHORT); return b; } short extract_short_7 (vector short a) { - int c = 7; - short b = vec_extract (a, c); + short b = vec_extract (a, DWORD1_LAST_SHORT); return b; } unsigned short extract_ushort_0 (vector unsigned short a) { - int c = 0; - unsigned short b = vec_extract (a, c); + unsigned short b = vec_extract (a, DWORD1_FIRST_SHORT); return b; } unsigned short extract_ushort_7 (vector unsigned short a) { - int c = 7; - unsigned short b = vec_extract (a, c); + unsigned short b = vec_extract (a, DWORD1_LAST_SHORT); return b; } signed char extract_schar_0 (vector signed char a) { - int c = 0; - signed char b = vec_extract (a, c); + signed char b = vec_extract (a, DWORD1_FIRST_CHAR); return b; } signed char extract_schar_15 (vector signed char a) { - int c = 15; - signed char b = vec_extract (a, c); + signed char b = vec_extract (a, DWORD1_LAST_CHAR); return b; } unsigned char extract_uchar_0 (vector unsigned char a) { - int c = 0; - unsigned char b = vec_extract (a, c); + unsigned char b = vec_extract (a, DWORD1_FIRST_CHAR); return b; } unsigned char extract_uchar_15 (vector unsigned char a) { - int c = 15; - signed char b = vec_extract (a, c); + signed char b = vec_extract (a, DWORD1_LAST_CHAR); return b; } unsigned char extract_bool_char_0 (vector bool char a) { - int c = 0; - 
unsigned char b = vec_extract (a, c); + unsigned char b = vec_extract (a, DWORD1_FIRST_CHAR); return b; } unsigned int extract_bool_int_0 (vector bool int a) { - int c = 0; - unsigned int b = vec_extract (a, c); + unsigned int b = vec_extract (a, DWORD1_FIRST_INT); return b; } unsigned short int extract_bool_short_int_0 (vector bool short int a) { - int c = 0; - unsigned short int b = vec_extract (a, c); + unsigned short int b = vec_extract (a, DWORD1_FIRST_SHORT); return b; } diff --git a/gcc/testsuite/gcc.target/powerpc/pr110331-p8.c b/gcc/testsuite/gcc.target/powerpc/pr110331-p8.c new file mode 100644 index 00000000000..1d3b8cd6078 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr110331-p8.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */ +/* { dg-require-effective-target has_arch_ppc64 } */ + +#include "pr110331.h" + +/* { dg-final { scan-assembler-times {\mmfvsrd\M} 10 } } */ +/* { dg-final { scan-assembler-times {\mmfvsrwz\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mrldicl\M} 8 } } */ +/* { dg-final { scan-assembler-times "exts\[bhw\]" 6 } } */ +/* { dg-final { scan-assembler-not {\mvsplt} } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr110331-p9.c b/gcc/testsuite/gcc.target/powerpc/pr110331-p9.c new file mode 100644 index 00000000000..0762a3365db --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr110331-p9.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-options "-mdejagnu-cpu=power9 -O2" } */ +/* { dg-require-effective-target has_arch_ppc64 } */ + +#include "pr110331.h" + +/* { dg-final { scan-assembler-times {\mmfvsrd\M} 10 } } */ +/* { dg-final { scan-assembler-times {\mmfvsrwz\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mrldicl\M} 8 } } */ +/* { dg-final { 
scan-assembler-times "exts\[bhw\]" 6 } } */ +/* { dg-final { scan-assembler-not {\mvextu} } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr110331.h b/gcc/testsuite/gcc.target/powerpc/pr110331.h new file mode 100644 index 00000000000..01deb19534a --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr110331.h @@ -0,0 +1,90 @@ +#include + +#ifdef __BIG_ENDIAN__ +#define DWORD0_FIRST_INT 1 +#define DWORD0_LAST_INT 0 +#define DWORD0_FIRST_SHORT 3 +#define DWORD0_LAST_SHORT 0 +#define DWORD0_FIRST_CHAR 7 +#define DWORD0_LAST_CHAR 0 +#else +#define DWORD0_FIRST_INT 2 +#define DWORD0_LAST_INT 3 +#define DWORD0_FIRST_SHORT 4 +#define DWORD0_LAST_SHORT 7 +#define DWORD0_FIRST_CHAR 8 +#define DWORD0_LAST_CHAR 15 +#endif + +/* mfvsrd, rldicl */ +unsigned char testuc_f (vector unsigned char v) +{ + return vec_extract (v, DWORD0_FIRST_CHAR); +} + +/* mfvsrd, extsb */ +signed char testsc_f (vector signed char v) +{ + return vec_extract (v, DWORD0_FIRST_CHAR); +} + +/* mfvsrd, rldicl */ +unsigned char testuc_l (vector unsigned char v) +{ + return vec_extract (v, DWORD0_LAST_CHAR); +} + +/* mfvsrd, rldicl, extsb */ +signed char testsc_l (vector signed char v) +{ + return vec_extract (v, DWORD0_LAST_CHAR); +} + +/* mfvsrd, rldicl */ +unsigned short testus_f (vector unsigned short v) +{ + return vec_extract (v, DWORD0_FIRST_SHORT); +} + +/* mfvsrd, extsh */ +signed short testss_f (vector signed short v) +{ + return vec_extract (v, DWORD0_FIRST_SHORT); +} + +/* mfvsrd, rldicl */ +unsigned short testus_l (vector unsigned short v) +{ + return vec_extract (v, DWORD0_LAST_SHORT); +} + +/* mfvsrd, rldicl, extsh */ +signed short testss_l (vector signed short v) +{ + return vec_extract (v, DWORD0_LAST_SHORT); +} + +/* mfvsrwz */ +unsigned int testui_f (vector unsigned int v) +{ + return vec_extract (v, DWORD0_FIRST_INT); +} + +/* mfvsrwz, extsw */ +signed int testsi_f (vector signed int v) +{ + return vec_extract (v, DWORD0_FIRST_INT); +} + +/* mfvsrd, rldicl */ +unsigned int testui_l 
(vector unsigned int v) +{ + return vec_extract (v, DWORD0_LAST_INT); +} + +/* mfvsrd, rldicl, extsw */ +signed int testsi_l (vector signed int v) +{ + return vec_extract (v, DWORD0_LAST_INT); +} +
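
As a reading aid for the shift/mask arithmetic in rs6000_vsx_extract_element_from_dword0 above, here is a minimal sketch in plain C. It is not GCC code and not part of the patch. The function name model_extract_dword0 and its parameters are hypothetical. It assumes dword0 is the 64-bit value that mfvsrd moves into a GPR, and that norm_elt is the element index after the endian normalization used in the patch (the index itself on LE, units - 1 - index on BE).

```c
#include <stdint.h>

/* Hypothetical model of the mfvsrd + rldicl sequence; illustration only.
   dword0   : the 64 bits that mfvsrd would copy into a GPR (assumption).
   norm_elt : element index after endian normalization, as in the patch.
   elt_bits : element width in bits (8, 16 or 32).  */
static uint64_t
model_extract_dword0 (uint64_t dword0, int norm_elt, int elt_bits)
{
  int units = 128 / elt_bits;                     /* elements per vector */
  int shift = (norm_elt - units / 2) * elt_bits;  /* 0 for the first element */
  uint64_t mask = (UINT64_C (1) << elt_bits) - 1; /* keep a single element */

  /* Logical shift right, then mask: the job of one rldicl.  */
  return (dword0 >> shift) & mask;
}
```

With dword0 = 0x1122334455667788, normalized byte element 8 gives 0x88 (shift 0, mask only) and byte element 15 gives 0x11 (shift 56). When the shift is 0 and the result is sign-extended afterwards, the mask is redundant, which matches the "mfvsrd, extsb" expectation noted in pr110331.h.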