From patchwork Sat Feb 24 03:18:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andrew Pinski (QUIC)" X-Patchwork-Id: 205768 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp968205dyb; Fri, 23 Feb 2024 19:19:49 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCWTo/Nk25b9qDJlgVEniOR/gODlTL5ZijdOMv/znMM9Av6Vc9PIjyqx5mULFm2hPYsnEEsKMj7/16t+HH5pNIh7kqHQ+g== X-Google-Smtp-Source: AGHT+IFhCs37SE30VCjduDBXh90H9faKcP4E3vN/W/DqbxkJOiesF3SXUHRtXxRx6Y1nlj/RX8PN X-Received: by 2002:a05:622a:1002:b0:42e:66cb:eab1 with SMTP id d2-20020a05622a100200b0042e66cbeab1mr1223813qte.31.1708744789164; Fri, 23 Feb 2024 19:19:49 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708744789; cv=pass; d=google.com; s=arc-20160816; b=AsH+sbYVk4aoZH6gstN3lWtVDqVAZlZrgO/beNhEtGwlHWD89wOxBcoHrm3x8QxPFp wpiaNGxgJZh1YIbw844EAdYseysx1c81HMIwZLGJd0qleIDtoz1JbZVtC3mPbeLmmdkn W70NXLeGNM4+T8FJ4/fTagh9yFhJ6CepbBuxkWC4ffgSSpGBDxudhkGalBRZ6UoD3JES 56f8lsNKYKqNa5bjVK9tefcoH0tclGLWhvEP9EiW5yFAyoDtx6xtuS4WpU8w6frEv4zV pQl/H0pCGyHsUSyebglp/vvdglHA2qwvv8mlWLfjbb0BDBeJRGBPsDwDpTBh/qaNvc4S 1VNA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:message-id:date:subject:cc:to:from:dkim-signature :arc-filter:dmarc-filter:delivered-to; bh=DksqJlXr+pGBmFYYg+5K8cNbLUszUZvoeYOpSuMmPEo=; fh=o3CEdS3yGN/eIa89Zn0Kf01WYLxTE9IjKuSQGVPeOHI=; b=wnsIC+y+6UZeynTok9qi0/Vy0Exc1pDnSYjd2oSV5lOBdqgweIbekme4K9SemjIU/z l/roQqkIb6pnM/bopDFniajlFMdrxDPLqC2aOimZMMqas2yRXyv6c8f/QBXaTxmHAeMQ T2sJhwtV/+kuSbZWctzpDkPcoq5vcsdgtFAid1oXjX+DYinAEPUuU+ScSkozhQ45dO5L gP5EfPFhGyOSGLcJeGdxf/hFzqLL5KpAnlWtuRajoOMqT7cAdMNJwFdiW/Ge+c8Cab3r 5vnZS7bfmfTDS263+PVEoWowpCqMuVyCadL59rN9cQaQUTS4oOau42AEFDy892llz8Ub TAXw==; dara=google.com 
ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@quicinc.com header.s=qcppdkim1 header.b=LiCepQv9; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=quicinc.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id v41-20020a05622a18a900b0042e57a3f734si355089qtc.392.2024.02.23.19.19.49 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Feb 2024 19:19:49 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@quicinc.com header.s=qcppdkim1 header.b=LiCepQv9; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=quicinc.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DC4BF385828D for ; Sat, 24 Feb 2024 03:19:48 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com [205.220.180.131]) by sourceware.org (Postfix) with ESMTPS id 78CEB3858C53 for ; Sat, 24 Feb 2024 03:19:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 78CEB3858C53 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=quicinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=quicinc.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 78CEB3858C53 Authentication-Results: server2.sourceware.org; arc=none 
smtp.remote-ip=205.220.180.131 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708744744; cv=none; b=MCcLBWs5ufAszE5cgg1v1n+VYT7TQdz/lNP+DhranMrNcXEdiKTOZZMMi0wa8jMgEuLNx4myAneF0mYpek0woL1JpTyCu6IxOODn4evPgIdd6Iux9hBKXWWM6LcGGpFphjzsh8F/jcJtzidHKoGOyQn8omCBFji43ZOsnrqQK0Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708744744; c=relaxed/simple; bh=etcULU9tiZXxOXwhSHFNAvMls/ujrzVoppLhGxbVBns=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=N23PJM/q5zET/ttCpBBRqN7LPSgydG4hO5fC+uALAa30HgG+LPFUwPWZXwLkvMrj/yr3rwey1VLJMI8kZG13x1sOVLP7fYvdmvq+1ilm2aSAyADdCrfwxOscXkpQgUveUpAJfi7K/b5Di8xj8wXsN+FQtRK4x/V90vNjQsZsWbE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from pps.filterd (m0279869.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.24/8.17.1.24) with ESMTP id 41O34EHa027893 for ; Sat, 24 Feb 2024 03:19:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h= from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding:content-type; s=qcppdkim1; bh=DksqJlX r+pGBmFYYg+5K8cNbLUszUZvoeYOpSuMmPEo=; b=LiCepQv9vufl0e6VDiyI2eP nd0rAdzRfrT1aG2iJGji4AnxQnAz1+qutArPm9bT0B7pDnw1rHgALRQVYySobi8e PGqnO9lEk1LmOb2sILfTbmw0K8C7kgNrpx1CWpjL5ApnnoSgm7NdldfHt9moVi4L R6Xu4FwBVTIEVzrJgu9iOrT30MZouNCAJVbDRm+ck3T2rQEj1OGgMsxgCKEEplV5 v3413vKcps1gaxSj0MQDokouAT4nhkyY2xcTVmqgHpcwYTjNOOBY0at0ulO/Oq1T xAgEljD3SkfvT6qdTNgVS3EMEfeTmuBsjbzyR6ptNIHmU1BBGullNNoOBkCblHQ= = Received: from nasanppmta05.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3wf7fbg213-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Sat, 24 Feb 2024 03:19:00 +0000 (GMT) Received: from nasanex01c.na.qualcomm.com (nasanex01c.na.qualcomm.com [10.45.79.139]) by NASANPPMTA05.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTPS id 41O3IxGo014766 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Sat, 24 
Feb 2024 03:18:59 GMT Received: from hu-apinski-lv.qualcomm.com (10.49.16.6) by nasanex01c.na.qualcomm.com (10.45.79.139) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 19:18:59 -0800 From: Andrew Pinski To: CC: Andrew Pinski Subject: [PATCH 1/2] aarch64: Use fmov s/d/hN, FP_CST for some vector CST [PR113856] Date: Fri, 23 Feb 2024 19:18:47 -0800 Message-ID: <20240224031848.3866630-1-quic_apinski@quicinc.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 X-Originating-IP: [10.49.16.6] X-ClientProxiedBy: nalasex01b.na.qualcomm.com (10.47.209.197) To nasanex01c.na.qualcomm.com (10.45.79.139) X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-ORIG-GUID: 5afvzipuZE9u83JU1G7VvFaOciIJdxhA X-Proofpoint-GUID: 5afvzipuZE9u83JU1G7VvFaOciIJdxhA X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.1011,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2024-02-23_08,2024-02-23_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 clxscore=1015 adultscore=0 bulkscore=0 suspectscore=0 malwarescore=0 mlxscore=0 spamscore=0 priorityscore=1501 mlxlogscore=999 phishscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2402120000 definitions=main-2402240025 X-Spam-Status: No, score=-13.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org 
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1791748776350036421
X-GMAIL-MSGID: 1791748776350036421

AArch64 can form some floating-point constants with the fmov instruction.
The scalar forms of fmov also zero the upper part of the destination
register, so they can be used for vector constants whose first element is
a non-zero constant that fmov can encode and whose remaining elements are
all zero.
This implements this "small" optimization so that these vector constants
no longer need to be loaded from memory.

Built and tested on aarch64-linux-gnu with no regressions.

	PR target/113856

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (struct simd_immediate_info): Add
	FMOV_SDH to insn_type. For scalar_float_mode constructor
	add insn_in.
	(aarch64_simd_valid_immediate): Catch `{fp, 0...}` vector_cst
	and return a simd_immediate_info which uses FMOV_SDH.
	(aarch64_output_simd_mov_immediate): Support outputting
	fmov for FMOV_SDH.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/fmov-zero-cst-1.c: New test.
	* gcc.target/aarch64/fmov-zero-cst-2.c: New test.

Signed-off-by: Andrew Pinski
---
 gcc/config/aarch64/aarch64.cc | 48 ++++++++++++++---
 .../gcc.target/aarch64/fmov-zero-cst-1.c | 52 +++++++++++++++++++
 .../gcc.target/aarch64/fmov-zero-cst-2.c | 19 +++++++
 3 files changed, 111 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-2.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 5dd0814f198..c4386591a9b 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -126,11 +126,11 @@ constexpr auto AARCH64_STATE_OUT = 1U << 2; /* Information about a legitimate vector immediate operand. 
*/ struct simd_immediate_info { - enum insn_type { MOV, MVN, INDEX, PTRUE }; + enum insn_type { MOV, FMOV_SDH, MVN, INDEX, PTRUE }; enum modifier_type { LSL, MSL }; simd_immediate_info () {} - simd_immediate_info (scalar_float_mode, rtx); + simd_immediate_info (scalar_float_mode, rtx, insn_type = MOV); simd_immediate_info (scalar_int_mode, unsigned HOST_WIDE_INT, insn_type = MOV, modifier_type = LSL, unsigned int = 0); @@ -145,7 +145,7 @@ struct simd_immediate_info union { - /* For MOV and MVN. */ + /* For MOV, FMOV_SDH and MVN. */ struct { /* The value of each element. */ @@ -173,9 +173,10 @@ struct simd_immediate_info /* Construct a floating-point immediate in which each element has mode ELT_MODE_IN and value VALUE_IN. */ inline simd_immediate_info -::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in) - : elt_mode (elt_mode_in), insn (MOV) +::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in, insn_type insn_in) + : elt_mode (elt_mode_in), insn (insn_in) { + gcc_assert (insn_in == MOV || insn_in == FMOV_SDH); u.mov.value = value_in; u.mov.modifier = LSL; u.mov.shift = 0; @@ -22932,6 +22933,35 @@ aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info, return true; } } + /* See if we can use fmov d0/s0/h0 ... for the constant. 
*/ + if (n_elts >= 1 + && (vec_flags & VEC_ADVSIMD) + && is_a <scalar_float_mode> (elt_mode, &elt_float_mode) + && !CONST_VECTOR_DUPLICATE_P (op)) + { + rtx elt = CONST_VECTOR_ENCODED_ELT (op, 0); + if (aarch64_float_const_zero_rtx_p (elt) + || aarch64_float_const_representable_p (elt)) + { + bool valid = true; + for (unsigned int i = 1; i < n_elts; i++) + { + rtx elt1 = CONST_VECTOR_ENCODED_ELT (op, i); + if (!aarch64_float_const_zero_rtx_p (elt1)) + { + valid = false; + break; + } + } + if (valid) + { + if (info) + *info = simd_immediate_info (elt_float_mode, elt, + simd_immediate_info::FMOV_SDH); + return true; + } + } + } /* If all elements in an SVE vector have the same value, we have a free choice between using the element mode and using the container mode. @@ -25121,7 +25151,8 @@ aarch64_output_simd_mov_immediate (rtx const_vector, unsigned width, if (GET_MODE_CLASS (info.elt_mode) == MODE_FLOAT) { - gcc_assert (info.insn == simd_immediate_info::MOV + gcc_assert ((info.insn == simd_immediate_info::MOV + || info.insn == simd_immediate_info::FMOV_SDH) && info.u.mov.shift == 0); /* For FP zero change it to a CONST_INT 0 and use the integer SIMD move immediate path. 
*/ @@ -25134,8 +25165,9 @@ aarch64_output_simd_mov_immediate (rtx const_vector, unsigned width, real_to_decimal_for_mode (float_buf, CONST_DOUBLE_REAL_VALUE (info.u.mov.value), buf_size, buf_size, 1, info.elt_mode); - - if (lane_count == 1) + if (info.insn == simd_immediate_info::FMOV_SDH) + snprintf (templ, sizeof (templ), "fmov\t%%%c0, %s", element_char, float_buf); + else if (lane_count == 1) snprintf (templ, sizeof (templ), "fmov\t%%d0, %s", float_buf); else snprintf (templ, sizeof (templ), "fmov\t%%0.%d%c, %s", diff --git a/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-1.c b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-1.c new file mode 100644 index 00000000000..9b13ef7b1ef --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-1.c @@ -0,0 +1,52 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcmodel=tiny" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ +/* PR target/113856 */ + +#define vect64 __attribute__((vector_size(8) )) +#define vect128 __attribute__((vector_size(16) )) + +/* +** f1: +** fmov s0, 1.0e\+0 +** ret +*/ +vect64 float f1() +{ + return (vect64 float){1.0f, 0}; +} + +/* +** f2: +** fmov s0, 1.0e\+0 +** ret +*/ +vect128 float f2() +{ + return (vect128 float){1.0f, 0, 0, 0}; +} + +/* f3 should only be done for -ffast-math. */ +/* +** f3: +** ldr q0, .LC[0-9]+ +** ret +*/ +vect128 float f3() +{ + return (vect128 float){1.0f, 0, -0.0f, 0.0f}; +} + +/* f4 cannot use fmov here. + Note this is checked here as {1.0, 0.0, 1.0, 0.0} + has CONST_VECTOR_DUPLICATE_P set + and is represented internally as: {1.0, 0.0}. 
*/ +/* +** f4: +** ldr q0, .LC[0-9]+ +** ret +*/ +vect128 float f4() +{ + return (vect128 float){1.0f, 0, 1.0f, 0.0f}; +} diff --git a/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-2.c b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-2.c new file mode 100644 index 00000000000..c97d85b68a9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-2.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcmodel=tiny -ffast-math" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ +/* PR target/113856 */ + +#define vect64 __attribute__((vector_size(8) )) +#define vect128 __attribute__((vector_size(16) )) + +/* f3 can be done with -ffast-math. */ +/* +** f3: +** fmov s0, 1.0e\+0 +** ret +*/ +vect128 float f3() +{ + return (vect128 float){1.0f, 0, -0.0f, 0.0f}; +} + From patchwork Sat Feb 24 03:18:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andrew Pinski (QUIC)" X-Patchwork-Id: 205769 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp968221dyb; Fri, 23 Feb 2024 19:19:51 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCUcxmHiqV8e4mGg+RKg3Yp7xoq9pIiJeB9YeOzM9KFOvPx2wWVEsWWeiY6YEMo07MHPqrHWM0bQJnqUFYS+KaEscbTRFA== X-Google-Smtp-Source: AGHT+IFRJO4eL9gokQf/GiKGaB74Uk4LVhiChjBmmy6MsJnxZjsXQT8IU4J4cGWp+0J4/VY5gd4A X-Received: by 2002:a0c:e387:0:b0:68f:e47d:a6e6 with SMTP id a7-20020a0ce387000000b0068fe47da6e6mr1392748qvl.62.1708744791768; Fri, 23 Feb 2024 19:19:51 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708744791; cv=pass; d=google.com; s=arc-20160816; b=mznd2SFeRrL1eYLwRS90hM6E9Dbb/YukPAn0EdPHQ6h4fkSWZtw+ntAvRZNTLr119c gJ388iPBsqoruecA2Mnb8x+pONLKRTFnJqRErweuIt8GA1DPbzNGugcGXKvOUlSQciaX WHh2GEUz3UY9f3RRs3t76ANEBtgk/pyFRl6Qf4BRzLr2RQ3h+GBWH86bdZVbvvSKKklc L7rEsmMG062Eopv0JA0aG1bxPXQDFxPxfGQ6fFtDvuJZdYkNFynT/7Zso4UVmhiTxe5I seXXONwx8sJHMDQAz6OAasfs1fsIsLabBWl0wkQAXCjlWsmnBSMAiz+miYdY0PMEczFu lsfQ== 
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=AKcXqXLxuLQHrTWtSLt9ObnxaC8XF2zxLwyB8FOxCSc=; fh=o3CEdS3yGN/eIa89Zn0Kf01WYLxTE9IjKuSQGVPeOHI=; b=xdjeYWDjTXjaIa0YCtogbesQFIP5/i1SYFZdcjMNqq1V8kzv0Y6afVK5bWTVXoAQkB 8CWJuMxUIQvUBbCwo8llW4KFDp4xGZETX9xdGV7WklIl/KrCeLIypo3kSNY987xM5NIF lnisF8IT6wkez6QF5hfhcBfRpLvpG7EtXbB/kEM6FXfsSvkd2nnbg2MnL4ePmx1LlzNw MTHVQCEcJtlooAKmz/cpSiI9caCEcagJwJkiLQ3335jc4HMoB4C0UZ2agCI0qDEzqtV0 rCaLKbyQV1pxyh9HMV3swYxheUVnWHa7WpbV9l77WXCS7vflJE6fYf48qlCH5E4fnNHA Jq6A==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@quicinc.com header.s=qcppdkim1 header.b=BudFkPR7; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=quicinc.com Received: from server2.sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id q1-20020ad45ca1000000b0068f4261e47dsi337406qvh.84.2024.02.23.19.19.51 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Feb 2024 19:19:51 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@quicinc.com header.s=qcppdkim1 header.b=BudFkPR7; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=quicinc.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 761B23858291 for ; Sat, 24 Feb 2024 03:19:51 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com [205.220.180.131]) by sourceware.org (Postfix) with ESMTPS id 78D233858407 for ; Sat, 24 Feb 2024 03:19:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 78D233858407 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=quicinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=quicinc.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 78D233858407 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=205.220.180.131 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708744746; cv=none; b=EL6TCovk5dsmpfzeWmWmzwB/VbxU8BAqXO9oq/Ix2K7QmFeiJBnNHbg8uGHpTS6w+iK9GKYjSyBakM7IlHWlCw3QkTwNSo9e4jliBtiCfXai1mNTNjxpuKXWV8GRohg9owfhSBRkvMVpXSySGOund5b8rcCxdxciU8/HSybtAZo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708744746; c=relaxed/simple; bh=ImYZ7jVRcyWJsIrMDMiIlYNWLATfiLh5sGDuHMgr2CM=; 
h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=AWtG+3+gZasGs58pNnYgaKFjRUnQJ6qnLNJRqRYjE/MuwBS8QgcgOBUYL6oO8tl2qQNiCzsvg6CTMheQjeqMBZmTqDnBBCvCP06vKM7VJH//TR4LVUSbqHrj0I/d7vm08wdI/hopUO4EQYfeRETSfYKBg8WPmsqGZAz+EogaxAc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from pps.filterd (m0279872.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.24/8.17.1.24) with ESMTP id 41O3GSeO030888 for ; Sat, 24 Feb 2024 03:19:01 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding:content-type; s= qcppdkim1; bh=AKcXqXLxuLQHrTWtSLt9ObnxaC8XF2zxLwyB8FOxCSc=; b=Bu dFkPR7sJXZIeDqJqhsV2yhWznW+UL/rKC9CiQgVwb0/J2sXfDVM57u4SfPSNMBy0 Y5AXthvHbrVCXcfEVltceDeRVSTpkjkaXv8hAR2san+yRMJmSgoMLkInI4P4gcMR IEOh942fJFsqrintCN4C9V16HUJUW9Cil86MQNxKXaZFErKFNiwn+1aPQzAZCXqF 7NcQtZ+syfEGccGe0rkZyVP5D0CCEGLv3l52n1AldBlqN6vJNFiB4utTjtp0DsjP l+qRZuVdwjoLkltkcNBZUXVrA2PShoYMdSOeaZawOepAiSjFs2f1yk4tXQQzdHvA X9lmpm8XTL+mVT8XN2Rg== Received: from nasanppmta01.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3wesgg1x68-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Sat, 24 Feb 2024 03:19:00 +0000 (GMT) Received: from nasanex01c.na.qualcomm.com (nasanex01c.na.qualcomm.com [10.45.79.139]) by NASANPPMTA01.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTPS id 41O3Ixds026517 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Sat, 24 Feb 2024 03:18:59 GMT Received: from hu-apinski-lv.qualcomm.com (10.49.16.6) by nasanex01c.na.qualcomm.com (10.45.79.139) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 19:18:59 -0800 From: Andrew Pinski To: CC: Andrew Pinski Subject: [PATCH 2/2] aarch64: Support `{1.0f, 1.0f, 0.0, 0.0}` CST forming with fmov with a smaller vector 
type. Date: Fri, 23 Feb 2024 19:18:48 -0800 Message-ID: <20240224031848.3866630-2-quic_apinski@quicinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240224031848.3866630-1-quic_apinski@quicinc.com> References: <20240224031848.3866630-1-quic_apinski@quicinc.com> MIME-Version: 1.0 X-Originating-IP: [10.49.16.6] X-ClientProxiedBy: nalasex01b.na.qualcomm.com (10.47.209.197) To nasanex01c.na.qualcomm.com (10.45.79.139) X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: s546313aMXxneW8Lf9vKtJc1xfnZ0yHe X-Proofpoint-ORIG-GUID: s546313aMXxneW8Lf9vKtJc1xfnZ0yHe X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.1011,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2024-02-23_08,2024-02-23_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 bulkscore=0 suspectscore=0 adultscore=0 spamscore=0 lowpriorityscore=0 impostorscore=0 mlxscore=0 malwarescore=0 clxscore=1015 priorityscore=1501 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2402120000 definitions=main-2402240025 X-Spam-Status: No, score=-13.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791748778748247104 X-GMAIL-MSGID: 1791748778748247104 This enables construction of V4SF CST like `{1.0f, 1.0f, 0.0f, 0.0f}` (and other fp enabled CSTs) by using `fmov v0.2s, 1.0` as the instruction is 
designed to zero out the other bits.
This is a small extension on top of the code that creates fmov for the case
where all but the first element are zero.

Built and tested for aarch64-linux-gnu with no regressions.

	PR target/113856

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (simd_immediate_info): Add bool to the
	float mode constructor. Document modifier field for FMOV_SDH.
	(aarch64_simd_valid_immediate): Recognize where the first half
	of the const float vect is the same.
	(aarch64_output_simd_mov_immediate): Handle the case where insn is
	FMOV_SDH and modifier is MSL.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/fmov-zero-cst-3.c: New test.

Signed-off-by: Andrew Pinski
---
 gcc/config/aarch64/aarch64.cc | 34 ++++++++++++++++---
 .../gcc.target/aarch64/fmov-zero-cst-3.c | 28 +++++++++++++++
 2 files changed, 57 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index c4386591a9b..89bd0c5e5a6 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -130,7 +130,7 @@ struct simd_immediate_info enum modifier_type { LSL, MSL }; simd_immediate_info () {} - simd_immediate_info (scalar_float_mode, rtx, insn_type = MOV); + simd_immediate_info (scalar_float_mode, rtx, insn_type = MOV, bool = false); simd_immediate_info (scalar_int_mode, unsigned HOST_WIDE_INT, insn_type = MOV, modifier_type = LSL, unsigned int = 0); @@ -153,6 +153,8 @@ struct simd_immediate_info /* The kind of shift modifier to use, and the number of bits to shift. This is (LSL, 0) if no shift is needed. */ + /* For FMOV_SDH, LSL selects the scalar fmov form, while MSL + selects the .4h/.2s vector fmov form. */ modifier_type modifier; unsigned int shift; } mov; @@ -173,12 +175,12 @@ struct simd_immediate_info /* Construct a floating-point immediate in which each element has mode ELT_MODE_IN and value VALUE_IN. 
*/ inline simd_immediate_info -::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in, insn_type insn_in) +::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in, insn_type insn_in, bool firsthalfsame) : elt_mode (elt_mode_in), insn (insn_in) { gcc_assert (insn_in == MOV || insn_in == FMOV_SDH); u.mov.value = value_in; - u.mov.modifier = LSL; + u.mov.modifier = firsthalfsame ? MSL : LSL; u.mov.shift = 0; } @@ -22944,10 +22946,23 @@ aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info, || aarch64_float_const_representable_p (elt)) { bool valid = true; + bool firsthalfsame = false; for (unsigned int i = 1; i < n_elts; i++) { rtx elt1 = CONST_VECTOR_ENCODED_ELT (op, i); if (!aarch64_float_const_zero_rtx_p (elt1)) + { + if (i == 1) + firsthalfsame = true; + if (!firsthalfsame + || i >= n_elts/2 + || !rtx_equal_p (elt, elt1)) + { + valid = false; + break; + } + } + else if (firsthalfsame && i < n_elts/2) { valid = false; break; @@ -22957,7 +22972,8 @@ aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info, { if (info) *info = simd_immediate_info (elt_float_mode, elt, - simd_immediate_info::FMOV_SDH); + simd_immediate_info::FMOV_SDH, + firsthalfsame); return true; } } @@ -25165,8 +25181,16 @@ aarch64_output_simd_mov_immediate (rtx const_vector, unsigned width, real_to_decimal_for_mode (float_buf, CONST_DOUBLE_REAL_VALUE (info.u.mov.value), buf_size, buf_size, 1, info.elt_mode); - if (info.insn == simd_immediate_info::FMOV_SDH) + if (info.insn == simd_immediate_info::FMOV_SDH + && info.u.mov.modifier == simd_immediate_info::LSL) snprintf (templ, sizeof (templ), "fmov\t%%%c0, %s", element_char, float_buf); + else if (info.insn == simd_immediate_info::FMOV_SDH + && info.u.mov.modifier == simd_immediate_info::MSL) + { + gcc_assert (element_char != 'd'); + gcc_assert (lane_count > 2); + snprintf (templ, sizeof (templ), "fmov\t%%0.%d%c, %s", lane_count/2, element_char, float_buf); + } else if (lane_count == 1) snprintf (templ, sizeof 
(templ), "fmov\t%%d0, %s", float_buf); else diff --git a/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c new file mode 100644 index 00000000000..7a78b6d3caf --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcmodel=tiny" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ +/* PR target/113856 */ + +#define vect64 __attribute__((vector_size(8) )) +#define vect128 __attribute__((vector_size(16) )) + +/* +** f2: +** fmov v0.2s, 1.0e\+0 +** ret +*/ +vect128 float f2() +{ + return (vect128 float){1.0f, 1.0f, 0, 0}; +} + +/* +** f3: +** ldr q0, \.LC[0-9]+ +** ret +*/ +vect128 float f3() +{ + return (vect128 float){1.0f, 1.0f, 1.0f, 0.0}; +} +
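
For readers checking the test expectations across the series, the lane patterns the two patches accept can be sketched in standalone C. This is only an illustrative model, not the GCC code: the names fmov_kind, is_positive_zero, same_bits and classify_fmov are hypothetical, and the sketch assumes the first element has already been found encodable as an fmov immediate (the job of aarch64_float_const_representable_p in the patch). It treats -0.0 as non-zero, which corresponds to the non-fast-math expectations in the tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of GCC.  */
enum fmov_kind
{
  FMOV_NONE,        /* Pattern not handled by this sketch.  */
  FMOV_SCALAR,      /* {x, 0, ..., 0}: scalar fmov, upper lanes zeroed.  */
  FMOV_HALF_VECTOR  /* {x, ..., x, 0, ..., 0}: fmov on the low half.  */
};

/* True only for +0.0f; -0.0f has the sign bit set and is rejected.  */
static bool
is_positive_zero (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits == 0;
}

/* Bitwise equality, so that +0.0 and -0.0 are treated as different.  */
static bool
same_bits (float a, float b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}

/* Classify the N-lane constant V.  Assumes V[0] is already known to be
   encodable as an fmov immediate; that check is not modelled here.  */
static enum fmov_kind
classify_fmov (const float *v, size_t n)
{
  assert (v != NULL);
  if (n < 2 || is_positive_zero (v[0]))
    return FMOV_NONE;

  /* Count how many leading lanes repeat the first element.  */
  size_t same = 1;
  while (same < n && same_bits (v[same], v[0]))
    same++;

  /* Every lane after the repeated prefix must be +0.0.  */
  for (size_t i = same; i < n; i++)
    if (!is_positive_zero (v[i]))
      return FMOV_NONE;

  if (same == 1)
    return FMOV_SCALAR;
  if (n >= 4 && same == n / 2)
    return FMOV_HALF_VECTOR;
  return FMOV_NONE;
}
```

Under these assumptions, {1.0f, 0, 0, 0} classifies as FMOV_SCALAR and {1.0f, 1.0f, 0, 0} as FMOV_HALF_VECTOR, while {1.0f, 0, -0.0f, 0}, {1.0f, 0, 1.0f, 0} and {1.0f, 1.0f, 1.0f, 0} all fall back to FMOV_NONE, which lines up with the cases the tests expect to be loaded from memory with ldr.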