From patchwork Fri Mar 17 03:39:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 71068
To: gcc-patches@gcc.gnu.org
Cc: rguenther@suse.de, jeffreyalaw@gmail.com, segher@kernel.crashing.org,
 dje.gcc@gmail.com, linkw@gcc.gnu.org, guojiufu@linux.ibm.com,
 meissner@linux.ibm.com
Subject: [PATCH V5] Use reg mode to move sub blocks for parameters and returns
Date: Fri, 17 Mar 2023 11:39:52 +0800
Message-Id: <20230317033952.1549050-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list
From: Jiufu Guo
Reply-To: Jiufu Guo

Hi,

When assigning a parameter to a variable, or assigning a variable to
return value with struct
type, and the parameter/return is passed through registers, it would be
better to use the natural mode of the registers to move the content for
the assignment.

For example (similar to the code in PR65421):

typedef struct SA {double a[3];} A;
A ret_arg_pt (A *a) {return *a;} // on ppc64le, expect only 3 lfd(s)
A ret_arg (A a) {return a;} // just an empty function body
void st_arg (A a, A *p) {*p = a;} // only 3 stfd(s)

Compared with the previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609394.html
this version refines the code to eliminate redundant code in the
subroutine "move_sub_blocks".

Bootstrap and regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu)

	PR target/65421

gcc/ChangeLog:

	* cfgexpand.cc (expand_used_vars): Update to mark DECL_USEDBY_RETURN_P
	for returns.
	* expr.cc (move_sub_blocks): New function.
	(expand_assignment): Update assignment code about returns/parameters.
	* function.cc (assign_parm_setup_block): Update to mark
	DECL_REGS_TO_STACK_P for parameter.
	* tree-core.h (struct tree_decl_common): Add comment.
	* tree.h (DECL_USEDBY_RETURN_P): New define.
	(DECL_REGS_TO_STACK_P): New define.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr65421-1.c: New test.
	* gcc.target/powerpc/pr65421.c: New test.

---
 gcc/cfgexpand.cc                             | 14 +++++
 gcc/expr.cc                                  | 61 ++++++++++++++++++++
 gcc/function.cc                              |  3 +
 gcc/tree-core.h                              |  4 +-
 gcc/tree.h                                   |  9 +++
 gcc/testsuite/gcc.target/powerpc/pr65421-1.c |  6 ++
 gcc/testsuite/gcc.target/powerpc/pr65421.c   | 33 +++++++++++
 7 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 1a1b26b1c6c..eda4d85d140 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars)
         frame_phase = off ? align - off : 0;
     }
 
+  /* Collect VARs on returns.  */
+  if (DECL_RESULT (current_function_decl))
+    {
+      edge_iterator ei;
+      edge e;
+      FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
+        if (greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src)))
+          {
+            tree val = gimple_return_retval (ret);
+            if (val && VAR_P (val))
+              DECL_USEDBY_RETURN_P (val) = 1;
+          }
+    }
+
   /* Set TREE_USED on all variables in the local_decls.  */
   FOR_EACH_LOCAL_DECL (cfun, i, var)
     TREE_USED (var) = 1;
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 15be1c8db99..97a7be9542e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -5559,6 +5559,41 @@ mem_ref_refers_to_non_mem_p (tree ref)
   return non_mem_decl_p (base);
 }
 
+/* Sub routine of expand_assignment, invoked when assigning from a
+   parameter or assigning to a return val on struct type which may
+   be passed through registers.  The mode of register is used to
+   move the content for the assignment.
+
+   This routine generates code for expression FROM which is BLKmode,
+   and move the generated content to TO_RTX by sub-blocks in SUB_MODE.  */
+
+static void
+move_sub_blocks (rtx to_rtx, tree from, machine_mode sub_mode)
+{
+  gcc_assert (MEM_P (to_rtx));
+
+  HOST_WIDE_INT size = MEM_SIZE (to_rtx).to_constant ();
+  HOST_WIDE_INT sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
+  HOST_WIDE_INT len = size / sub_size;
+  gcc_assert (size % sub_size == 0);
+
+  push_temp_slots ();
+
+  rtx from_rtx = expand_expr (from, NULL_RTX, GET_MODE (to_rtx), EXPAND_NORMAL);
+  for (int i = 0; i < len; i++)
+    {
+      rtx temp = gen_reg_rtx (sub_mode);
+      rtx src = adjust_address (from_rtx, sub_mode, sub_size * i);
+      rtx dest = adjust_address (to_rtx, sub_mode, sub_size * i);
+      emit_move_insn (temp, src);
+      emit_move_insn (dest, temp);
+    }
+
+  preserve_temp_slots (to_rtx);
+  pop_temp_slots ();
+  return;
+}
+
 /* Expand an assignment that stores the value of FROM into TO.  If NONTEMPORAL
    is true, try generating a nontemporal store.  */
 
@@ -6045,6 +6080,32 @@ expand_assignment (tree to, tree from, bool nontemporal)
       return;
     }
 
+  /* If it is assigning from a struct param which may be passed via registers,
+     it would be better to use the register's mode to move sub-blocks for the
+     assignment.  */
+  if (TREE_CODE (from) == PARM_DECL && mode == BLKmode
+      && DECL_REGS_TO_STACK_P (from))
+    {
+      rtx parm = DECL_INCOMING_RTL (from);
+      machine_mode sub_mode
+        = REG_P (parm) ? word_mode : GET_MODE (XEXP (XVECEXP (parm, 0, 0), 0));
+      move_sub_blocks (to_rtx, from, sub_mode);
+      return;
+    }
+
+  /* If it is assigning to a struct var which will be returned, and the
+     function is returning via registers, it would be better to use the
+     register's mode to move sub-blocks for the assignment.  */
+  if (VAR_P (to) && DECL_USEDBY_RETURN_P (to) && mode == BLKmode
+      && TREE_CODE (from) != CONSTRUCTOR
+      && GET_CODE (DECL_RTL (DECL_RESULT (current_function_decl))) == PARALLEL)
+    {
+      rtx ret = DECL_RTL (DECL_RESULT (current_function_decl));
+      machine_mode sub_mode = GET_MODE (XEXP (XVECEXP (ret, 0, 0), 0));
+      move_sub_blocks (to_rtx, from, sub_mode);
+      return;
+    }
+
   /* Compute FROM and store the value in the rtx we got.  */
 
   push_temp_slots ();
diff --git a/gcc/function.cc b/gcc/function.cc
index cfc4d2f74af..b8cd3d89d22 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -2991,6 +2991,9 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 
   mem = validize_mem (copy_rtx (stack_parm));
 
+  if (MEM_P (mem) && size != 0 && size % UNITS_PER_WORD == 0)
+    DECL_REGS_TO_STACK_P (parm) = 1;
+
   /* Handle values in multiple non-contiguous locations.  */
   if (GET_CODE (entry_parm) == PARALLEL && !MEM_P (mem))
     emit_group_store (mem, entry_parm, data->arg.type, size);
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index fd2be57b78c..30656658431 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1808,7 +1808,9 @@ struct GTY(()) tree_decl_common {
      In VAR_DECL, PARM_DECL and RESULT_DECL, this is
      DECL_HAS_VALUE_EXPR_P.  */
   unsigned decl_flag_2 : 1;
-  /* In FIELD_DECL, this is DECL_PADDING_P.  */
+  /* In FIELD_DECL, this is DECL_PADDING_P
+     In VAR_DECL, this is DECL_USEDBY_RETURN_P
+     In PARM_DECL, this is DECL_REGS_TO_STACK_P.  */
   unsigned decl_flag_3 : 1;
   /* Logically, these two would go in a theoretical base shared by var and
      parm decl. */
diff --git a/gcc/tree.h b/gcc/tree.h
index 91375f9652f..356c6067bac 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3019,6 +3019,15 @@ extern void decl_value_expr_insert (tree, tree);
 #define DECL_PADDING_P(NODE) \
   (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_3)
 
+/* Used in a VAR_DECL to indicate that it is used by a return stmt.  */
+#define DECL_USEDBY_RETURN_P(NODE) \
+  (VAR_DECL_CHECK (NODE)->decl_common.decl_flag_3)
+
+/* Used in a PARM_DECL to indicate that it is struct parameter passed
+   by registers totally and stored to stack during setup.  */
+#define DECL_REGS_TO_STACK_P(NODE) \
+  (PARM_DECL_CHECK (NODE)->decl_common.decl_flag_3)
+
 /* Used in a FIELD_DECL to indicate whether this field is not a
    flexible array member. This is only valid for the last array type
    field of a structure.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421-1.c b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
new file mode 100644
index 00000000000..4e1f87f7939
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
@@ -0,0 +1,6 @@
+/* PR target/65421 */
+/* { dg-options "-O2" } */
+
+typedef struct LARGE {double a[4]; int arr[32];} LARGE;
+LARGE foo (LARGE a){return a;}
+/* { dg-final { scan-assembler-times {\mmemcpy\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421.c b/gcc/testsuite/gcc.target/powerpc/pr65421.c
new file mode 100644
index 00000000000..fd5ad542c64
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr65421.c
@@ -0,0 +1,33 @@
+/* PR target/65421 */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+
+typedef struct FLOATS
+{
+  double a[3];
+} FLOATS;
+
+/* 3 lfd */
+FLOATS ret_arg_pt (FLOATS *a){return *a;}
+/* { dg-final { scan-assembler-times {\mlfd\M} 3 } } */
+
+/* 3 stfd */
+void st_arg (FLOATS a, FLOATS *p) {*p = a;}
+/* { dg-final { scan-assembler-times {\mstfd\M} 3 } } */
+
+/* blr */
+FLOATS ret_arg (FLOATS a) {return a;}
+
+typedef struct MIX
+{
+  double a[2];
+  long l;
+} MIX;
+
+/* std 3 param regs to return slot */
+MIX ret_arg1 (MIX a) {return a;}
+/* { dg-final { scan-assembler-times {\mstd\M} 3 } } */
+
+/* count insns */
+/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 13 } } */
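
Note for reviewers (not part of the patch): the loop in move_sub_blocks may be easier to picture in plain C. The sketch below is only an analogy under stated assumptions. The helper name copy_by_sub_blocks is hypothetical, a fixed 8-byte uint64_t stands in for SUB_MODE, and memcpy through a local temporary stands in for the emit_move_insn pair that goes through a fresh pseudo register. It does not reproduce what the expander emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the struct in the pr65421.c test: three doubles.  */
typedef struct FLOATS
{
  double a[3];
} FLOATS;

/* Hypothetical helper, for illustration only.  Copy SIZE bytes from
   FROM to TO in 8-byte sub-blocks, each one passing through a
   temporary, in the way move_sub_blocks moves each piece through a
   register of SUB_MODE.  SIZE must be a multiple of the sub-block
   size, which mirrors the gcc_assert in the patch.  */
static void
copy_by_sub_blocks (void *to, const void *from, size_t size)
{
  const size_t sub_size = sizeof (uint64_t);
  assert (size % sub_size == 0);

  size_t len = size / sub_size;
  for (size_t i = 0; i < len; i++)
    {
      uint64_t temp;	/* Plays the role of the pseudo register.  */
      memcpy (&temp, (const char *) from + sub_size * i, sub_size);
      memcpy ((char *) to + sub_size * i, &temp, sub_size);
    }
}
```

For a FLOATS value, sizeof (FLOATS) is 24 on the usual 64-bit ABIs, so the loop runs three times. That lines up with the three lfd/stfd instructions the new test expects, although the real patch takes the sub-block mode from the incoming or return register rather than from a fixed integer type.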