From patchwork Fri Jan 5 07:43:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Andrew Pinski (QUIC)"
X-Patchwork-Id: 185254
From: Andrew Pinski
To:
CC: Andrew Pinski
Subject: [PATCHv2] aarch64/expr: Use ccmp when the outer expression is used twice [PR100942]
Date: Thu, 4 Jan 2024 23:43:30 -0800
Message-ID: <20240105074330.2309587-1-quic_apinski@quicinc.com>
X-Mailer: git-send-email 2.34.1
List-Id: Gcc-patches mailing list

Ccmp is not used if the result of the and/ior is used by both
a GIMPLE_COND and a GIMPLE_ASSIGN.
This improves the code generation here by using ccmp in this case.
Two changes are required. First, we need to allow the outer statement's
result to be used more than once.
The second change is that during the expansion of the gimple, we need
to try using ccmp. This is needed because we don't expand the SSA name
of the lhs but rather expand directly from the gimple.

A small note on the ccmp_4.c testcase: we should be able to generate
slightly better code than this patch does, but it is one extra
instruction compared to before.

Diff from v1:
* v2: Split out expand_gimple_assign_ssa so that we only need to handle
promotion once. Add ccmp_5.c testcase which was suggested. Change comment
on ccmp_candidate_p.

Bootstrapped and tested on aarch64-linux-gnu with no regressions.

	PR target/100942

gcc/ChangeLog:

	* ccmp.cc (ccmp_candidate_p): Add outer argument.
	Allow if the outer is true and the lhs is used more
	than once.
	(expand_ccmp_expr): Update call to ccmp_candidate_p.
	* cfgexpand.cc (expand_gimple_assign_ssa): New function, try using ccmp
	for binary assignments and extracted from ...
	(expand_gimple_stmt_1): Here.  Call expand_gimple_assign_ssa.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ccmp_3.c: New test.
	* gcc.target/aarch64/ccmp_4.c: New test.
	* gcc.target/aarch64/ccmp_5.c: New test.

Signed-off-by: Andrew Pinski
---
 gcc/ccmp.cc                               | 12 ++--
 gcc/cfgexpand.cc                          | 75 +++++++++++++++--------
 gcc/testsuite/gcc.target/aarch64/ccmp_3.c | 20 ++++++
 gcc/testsuite/gcc.target/aarch64/ccmp_4.c | 35 +++++++++++
 gcc/testsuite/gcc.target/aarch64/ccmp_5.c | 20 ++++++
 5 files changed, 132 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ccmp_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ccmp_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ccmp_5.c

diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
index 09d6b5595a4..7cb525addf4 100644
--- a/gcc/ccmp.cc
+++ b/gcc/ccmp.cc
@@ -90,9 +90,10 @@ ccmp_tree_comparison_p (tree t, basic_block bb)
    If all checks OK in expand_ccmp_expr, it emits insns in prep_seq, then
    insns in gen_seq.  */
 
-/* Check whether G is a potential conditional compare candidate.  */
+/* Check whether G is a potential conditional compare candidate; OUTER is true if
+   G is the outermost AND/IOR.  */
 static bool
-ccmp_candidate_p (gimple *g)
+ccmp_candidate_p (gimple *g, bool outer = false)
 {
   tree lhs, op0, op1;
   gimple *gs0, *gs1;
@@ -109,8 +110,9 @@ ccmp_candidate_p (gimple *g)
   lhs = gimple_assign_lhs (g);
   op0 = gimple_assign_rhs1 (g);
   op1 = gimple_assign_rhs2 (g);
-  if ((TREE_CODE (op0) != SSA_NAME) || (TREE_CODE (op1) != SSA_NAME)
-      || !has_single_use (lhs))
+  if ((TREE_CODE (op0) != SSA_NAME) || (TREE_CODE (op1) != SSA_NAME))
+    return false;
+  if (!outer && !has_single_use (lhs))
     return false;
 
   bb = gimple_bb (g);
@@ -284,7 +286,7 @@ expand_ccmp_expr (gimple *g, machine_mode mode)
   rtx_insn *last;
   rtx tmp;
 
-  if (!ccmp_candidate_p (g))
+  if (!ccmp_candidate_p (g, true))
     return NULL_RTX;
 
   last = get_last_insn ();
diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 1db22f0a1a3..293b9386192 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -74,6 +74,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "output.h"
 #include "builtins.h"
 #include "opts.h"
+#include "ccmp.h"
 
 /* Some systems use __main in a way incompatible with its use in gcc, in these
    cases use the macros NAME__MAIN to give a quoted symbol and SYMBOL__MAIN to
@@ -3858,6 +3859,50 @@ expand_clobber (tree lhs)
     }
 }
 
+/* A subroutine of expand_gimple_stmt_1, expanding one gimple assign statement
+   ASSIGN_STMT that has a SSA name on the lhs, using TARGET and TARGET_MODE
+   as the target if possible and mode for what mode we should expand into.  */
+static rtx
+expand_gimple_assign_ssa (gassign *assign_stmt,
+                          rtx target,
+                          machine_mode target_mode)
+{
+  rtx temp;
+  tree lhs = gimple_assign_lhs (assign_stmt);
+  /* Try to expand conditional compare.  */
+  if (targetm.gen_ccmp_first
+      && gimple_assign_rhs_class (assign_stmt) == GIMPLE_BINARY_RHS)
+    {
+      machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
+      gcc_checking_assert (targetm.gen_ccmp_next != NULL);
+      temp = expand_ccmp_expr (assign_stmt, mode);
+      if (temp)
+        return temp;
+    }
+
+  struct separate_ops ops;
+  ops.code = gimple_assign_rhs_code (assign_stmt);
+  ops.type = TREE_TYPE (lhs);
+  switch (get_gimple_rhs_class (ops.code))
+    {
+    case GIMPLE_TERNARY_RHS:
+      ops.op2 = gimple_assign_rhs3 (assign_stmt);
+      /* Fallthru */
+    case GIMPLE_BINARY_RHS:
+      ops.op1 = gimple_assign_rhs2 (assign_stmt);
+      /* Fallthru */
+    case GIMPLE_UNARY_RHS:
+      ops.op0 = gimple_assign_rhs1 (assign_stmt);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  ops.location = gimple_location (assign_stmt);
+
+  return expand_expr_real_2 (&ops, target, target_mode,
+                             EXPAND_NORMAL);
+}
+
 /* A subroutine of expand_gimple_stmt, expanding one gimple statement
    STMT that doesn't require special handling for outgoing edges.  That
    is no tailcalls and no GIMPLE_COND.  */
@@ -3971,37 +4016,17 @@ expand_gimple_stmt_1 (gimple *stmt)
           {
             rtx target, temp;
             bool nontemporal = gimple_assign_nontemporal_move_p (assign_stmt);
-            struct separate_ops ops;
             bool promoted = false;
 
             target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
             if (GET_CODE (target) == SUBREG && SUBREG_PROMOTED_VAR_P (target))
               promoted = true;
 
-            ops.code = gimple_assign_rhs_code (assign_stmt);
-            ops.type = TREE_TYPE (lhs);
-            switch (get_gimple_rhs_class (ops.code))
-              {
-              case GIMPLE_TERNARY_RHS:
-                ops.op2 = gimple_assign_rhs3 (assign_stmt);
-                /* Fallthru */
-              case GIMPLE_BINARY_RHS:
-                ops.op1 = gimple_assign_rhs2 (assign_stmt);
-                /* Fallthru */
-              case GIMPLE_UNARY_RHS:
-                ops.op0 = gimple_assign_rhs1 (assign_stmt);
-                break;
-              default:
-                gcc_unreachable ();
-              }
-            ops.location = gimple_location (stmt);
-
-	    /* If we want to use a nontemporal store, force the value to
-	       register first.  If we store into a promoted register,
-	       don't directly expand to target.  */
+            /* If we want to use a nontemporal store, force the value to
+               register first.  If we store into a promoted register,
+               don't directly expand to target.  */
             temp = nontemporal || promoted ? NULL_RTX : target;
-            temp = expand_expr_real_2 (&ops, temp, GET_MODE (target),
-                                       EXPAND_NORMAL);
+            temp = expand_gimple_assign_ssa (assign_stmt, temp, GET_MODE (target));
 
             if (temp == target)
               ;
@@ -4013,7 +4038,7 @@ expand_gimple_stmt_1 (gimple *stmt)
                 if (CONSTANT_P (temp) && GET_MODE (temp) == VOIDmode)
                   {
                     temp = convert_modes (GET_MODE (target),
-                                          TYPE_MODE (ops.type),
+                                          TYPE_MODE (TREE_TYPE (lhs)),
                                           temp, unsignedp);
                     temp = convert_modes (GET_MODE (SUBREG_REG (target)),
                                           GET_MODE (target), temp, unsignedp);
diff --git a/gcc/testsuite/gcc.target/aarch64/ccmp_3.c b/gcc/testsuite/gcc.target/aarch64/ccmp_3.c
new file mode 100644
index 00000000000..a2b47fbee14
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ccmp_3.c
@@ -0,0 +1,20 @@
+/* { dg-options "-O2" } */
+/* PR target/100942 */
+
+void foo(void);
+int f1(int a, int b)
+{
+  int c = a == 0 || b == 0;
+  if (c) foo();
+  return c;
+}
+
+/* We should get one cmp followed by ccmp and one cset. */
+/* { dg-final { scan-assembler "\tccmp\t" } } */
+/* { dg-final { scan-assembler-times "\tcset\t" 1 } } */
+/* { dg-final { scan-assembler-times "\tcmp\t" 1 } } */
+/* And not get 2 cmps and 2 (or more cset) and orr and a cbnz. */
+/* { dg-final { scan-assembler-not "\torr\t" } } */
+/* { dg-final { scan-assembler-not "\tcbnz\t" } } */
+/* { dg-final { scan-assembler-not "\tcbz\t" } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/ccmp_4.c b/gcc/testsuite/gcc.target/aarch64/ccmp_4.c
new file mode 100644
index 00000000000..bc0f57a7c59
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ccmp_4.c
@@ -0,0 +1,35 @@
+/* { dg-options "-O2" } */
+/* PR target/100942 */
+
+void foo(void);
+int f1(int a, int b, int d)
+{
+  int c = a < 8 || b < 9;
+  int e = d < 11 || c;
+  if (e) foo();
+  return c;
+}
+
+/*
+  We really should get:
+        cmp     w0, 7
+        ccmp    w1, 8, 4, gt
+        cset    w0, le
+        ccmp    w2, 10, 4, gt
+        ble     .L11
+
+  But we currently get:
+        cmp     w0, 7
+        ccmp    w1, 8, 4, gt
+        cset    w0, le
+        cmp     w0, 0
+        ccmp    w2, 10, 4, eq
+        ble     .L11
+  The middle cmp is not needed.
+ */
+
+/* We should end up with only one cmp and 2 ccmp and 1 cset but currently we get 2 cmp
+   though. */
+/* { dg-final { scan-assembler-times "\tccmp\t" 2 } } */
+/* { dg-final { scan-assembler-times "\tcset\t" 1 } } */
+/* { dg-final { scan-assembler-times "\tcmp\t" 1 { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/ccmp_5.c b/gcc/testsuite/gcc.target/aarch64/ccmp_5.c
new file mode 100644
index 00000000000..7e52ae4f322
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ccmp_5.c
@@ -0,0 +1,20 @@
+
+/* { dg-options "-O2" } */
+/* PR target/100942 */
+void f1(int a, int b, _Bool *x)
+{
+  x[0] = x[1] = a == 0 || b == 0;
+}
+
+void f2(int a, int b, int *x)
+{
+  x[0] = x[1] = a == 0 || b == 0;
+}
+
+
+/* Both functions should be using ccmp rather than 2 cset/orr. */
+/* { dg-final { scan-assembler-times "\tccmp\t" 2 } } */
+/* { dg-final { scan-assembler-times "\tcset\t" 2 } } */
+/* { dg-final { scan-assembler-times "\tcmp\t" 2 } } */
+/* { dg-final { scan-assembler-not "\torr\t" } } */
+