From patchwork Mon Aug 14 05:41:55 2023
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 135132
To: gcc-patches@gcc.gnu.org
Cc: rguenther@suse.de, jeffreyalaw@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, linkw@gcc.gnu.org, bergner@linux.ibm.com, guojiufu@linux.ibm.com
Subject: [PATCH 1/2] light expander sra v0
Date: Mon, 14 Aug 2023 13:41:55 +0800
Message-Id: <20230814054156.2068718-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.25.1
From: Jiufu Guo

Hi,

There are a few PRs about issues with aggregate (struct) parameters and return values, such as PR 69143, PR 65421 and PR 108073. We could consider introducing a light SRA in the expander to handle parameters and returns of aggregate type when they are passed in registers. Accesses to the fields of such parameters or returns can then use the corresponding scalar registers directly.

As discussed: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619884.html

This is an initial patch for the light-expander-sra. Bootstrapped and regtested on x86_64-redhat-linux and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

BR,
Jeff (Jiufu Guo)

PR target/65421 PR target/69143 gcc/ChangeLog: * cfgexpand.cc (expand_shift): Extern declare. (struct access): New class. (struct expand_sra): New class. (expand_sra::build_access): New member function. (expand_sra::visit_base): Likewise. (expand_sra::analyze_default_stmt): Likewise. (expand_sra::analyze_assign): Likewise. (expand_sra::add_sra_candidate): Likewise. (expand_sra::collect_sra_candidates): Likewise. (expand_sra::valid_scalariable_accesses): Likewise. (expand_sra::prepare_expander_sra): Likewise. (expand_sra::expand_sra): Class constructor. (expand_sra::~expand_sra): Class destructor. (expand_sra::get_scalarized_rtx): New member function. (extract_one_reg): New function. (extract_sub_reg): New function. (expand_sra::scalarize_access): New member function.
(expand_sra::scalarize_accesses): New member function. (get_scalar_rtx_for_aggregate_expr): New function. (set_scalar_rtx_for_aggregate_access): New function. (set_scalar_rtx_for_returns): New function. (expand_return): Call get_scalar_rtx_for_aggregate_expr. (expand_debug_expr): Call get_scalar_rtx_for_aggregate_expr. (pass_expand::execute): Update to use the expand_sra. * expr.cc (get_scalar_rtx_for_aggregate_expr): Extern declare. (expand_assignment): Call get_scalar_rtx_for_aggregate_expr. (expand_expr_real): Call get_scalar_rtx_for_aggregate_expr. * function.cc (set_scalar_rtx_for_aggregate_access): Extern declare. (set_scalar_rtx_for_returns): Extern declare. (assign_parm_setup_block): Call set_scalar_rtx_for_aggregate_access. (assign_parms): Call set_scalar_rtx_for_aggregate_access. (expand_function_start): Call set_scalar_rtx_for_returns. * tree-sra.h (struct base_access): New class. (struct default_analyzer): New class. (scan_function): New function template. gcc/testsuite/ChangeLog: * g++.target/powerpc/pr102024.C: Updated. * gcc.target/powerpc/pr108073.c: New test. * gcc.target/powerpc/pr65421-1.c: New test. * gcc.target/powerpc/pr65421-2.c: New test. --- gcc/cfgexpand.cc | 478 ++++++++++++++++++- gcc/expr.cc | 15 +- gcc/function.cc | 28 +- gcc/tree-sra.h | 80 +++- gcc/testsuite/g++.target/powerpc/pr102024.C | 2 +- gcc/testsuite/gcc.target/powerpc/pr108073.c | 29 ++ gcc/testsuite/gcc.target/powerpc/pr65421-1.c | 6 + gcc/testsuite/gcc.target/powerpc/pr65421-2.c | 32 ++ 8 files changed, 660 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-1.c create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-2.c diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc index edf292cfbe95ac2711faee7769e839cb4edb0dd3..21a09ebac96bbcddc67da73c42f470c6d5f60e6c 100644 --- a/gcc/cfgexpand.cc +++ b/gcc/cfgexpand.cc @@ -74,6 +74,7 @@ along with GCC; see the file COPYING3. If not see #include "output.h" #include "builtins.h" #include "opts.h" +#include "tree-sra.h" /* Some systems use __main in a way incompatible with its use in gcc, in these cases use the macros NAME__MAIN to give a quoted symbol and SYMBOL__MAIN to @@ -97,6 +98,472 @@ static bool defer_stack_allocation (tree, bool); static void record_alignment_for_reg_var (unsigned int); +extern rtx +expand_shift (enum tree_code, machine_mode, rtx, poly_int64, rtx, int); + +/* For light SRA in expander about paramaters and returns. */ +struct access : public base_access +{ + /* The rtx for the access: link to incoming/returning register(s). */ + rtx rtx_val; +}; + +typedef struct access *access_p; + +struct expand_sra : public default_analyzer +{ + expand_sra (); + ~expand_sra (); + + /* Now use default APIs, no actions for + pre_analyze_stmt, analyze_return. */ + + /* overwrite analyze_default_stmt. */ + void analyze_default_stmt (gimple *); + + /* overwrite analyze phi,call,asm . */ + void analyze_phi (gphi *stmt) { analyze_default_stmt (stmt); }; + void analyze_call (gcall *stmt) { analyze_default_stmt (stmt); }; + void analyze_asm (gasm *stmt) { analyze_default_stmt (stmt); }; + /* overwrite analyze_assign. */ + void analyze_assign (gassign *); + + /* Compute the scalar rtx(s) for all access of BASE from a parrallel REGS. */ + bool scalarize_accesses (tree base, rtx regs); + /* Return the scalarized rtx for EXPR. 
*/ + rtx get_scalarized_rtx (tree expr); + +private: + void prepare_expander_sra (void); + + /* Return true if VAR is a candidate for SRA. */ + bool add_sra_candidate (tree var); + + /* Collect the parameter and returns with type which is suitable for + scalarization. */ + bool collect_sra_candidates (void); + + /* Return true if EXPR has interesting access to the sra candidates, + and created access, return false otherwise. */ + access_p build_access (tree expr, bool write); + + /* Check if the accesses of BASE are scalarizbale. + Now support the parms only with reading or returns only with writing. */ + bool valid_scalariable_accesses (vec *access_vec, bool is_parm); + + /* Compute the scalar rtx for one access ACC from a parrallel REGS. */ + bool scalarize_access (access_p acc, rtx regs); + + /* Callback of walk_stmt_load_store_addr_ops, used to remove + unscalarizable accesses. */ + static bool visit_base (gimple *, tree op, tree, void *data); + + /* Expr (tree) -> Scalarized value (rtx) map. */ + hash_map *expr_rtx_vec; + + /* Base (tree) -> Vector (vec *) map. */ + hash_map > *base_access_vec; +}; + +access_p +expand_sra::build_access (tree expr, bool write) +{ + enum tree_code code = TREE_CODE (expr); + if (code != VAR_DECL && code != PARM_DECL && code != COMPONENT_REF + && code != ARRAY_REF && code != ARRAY_RANGE_REF) + return NULL; + + HOST_WIDE_INT offset, size; + bool reverse; + tree base = get_ref_base_and_extent_hwi (expr, &offset, &size, &reverse); + if (!base || !DECL_P (base)) + return NULL; + if (storage_order_barrier_p (expr) || TREE_THIS_VOLATILE (expr)) + { + base_access_vec->remove (base); + return NULL; + } + + vec *access_vec = base_access_vec->get (base); + if (!access_vec) + return NULL; + + /* TODO: support reverse. */ + if (reverse || size <= 0 || offset + size > tree_to_shwi (DECL_SIZE (base))) + { + base_access_vec->remove (base); + return NULL; + } + + struct access *access = XNEWVEC (struct access, 1); + + memset (access, 0, sizeof (struct access)); + access->offset = offset; + access->size = size; + access->expr = expr; + access->write = write; + access->rtx_val = NULL_RTX; + + access_vec->safe_push (access); + + return access; +} + +bool +expand_sra::visit_base (gimple *, tree op, tree, void *data) +{ + op = get_base_address (op); + if (op && DECL_P (op)) + { + expand_sra *p = (expand_sra *) data; + p->base_access_vec->remove (op); + } + return false; +} + +void +expand_sra::analyze_default_stmt (gimple *stmt) +{ + if (base_access_vec && !base_access_vec->is_empty ()) + walk_stmt_load_store_addr_ops (stmt, this, visit_base, visit_base, + visit_base); +} + +void +expand_sra::analyze_assign (gassign *stmt) +{ + if (!base_access_vec || base_access_vec->is_empty ()) + return; + + if (gimple_assign_single_p (stmt) && !gimple_clobber_p (stmt)) + { + tree rhs = gimple_assign_rhs1 (stmt); + tree lhs = gimple_assign_lhs (stmt); + bool res_r = build_access (rhs, false); + bool res_l = build_access (lhs, true); + if (res_l && TREE_CODE (rhs) == CONSTRUCTOR) + base_access_vec->remove (get_base_address (lhs)); + + if (res_l || res_r) + return; + } + + analyze_default_stmt (stmt); +} + +/* Return true if VAR is a candidate for SRA. 
*/ + +bool +expand_sra::add_sra_candidate (tree var) +{ + tree type = TREE_TYPE (var); + + if (!AGGREGATE_TYPE_P (type) || !tree_fits_shwi_p (TYPE_SIZE (type)) + || tree_to_shwi (TYPE_SIZE (type)) == 0 || TREE_THIS_VOLATILE (var) + || is_va_list_type (type)) + return false; + gcc_assert (COMPLETE_TYPE_P (type)); + + base_access_vec->get_or_insert (var); + + return true; +} + +bool +expand_sra::collect_sra_candidates (void) +{ + bool ret = false; + + /* Collect parameters. */ + for (tree parm = DECL_ARGUMENTS (current_function_decl); parm; + parm = DECL_CHAIN (parm)) + ret |= add_sra_candidate (parm); + + /* Collect VARs on returns. */ + if (DECL_RESULT (current_function_decl)) + { + edge_iterator ei; + edge e; + FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) + if (greturn *r = safe_dyn_cast (*gsi_last_bb (e->src))) + { + tree val = gimple_return_retval (r); + /* To sclaraized the return, the return value should be only + writen, except this return stmt. + Then using 'true(write)' to create the access. */ + if (val && VAR_P (val)) + ret |= add_sra_candidate (val) && build_access (val, true); + } + } + + return ret; +} + +bool +expand_sra::valid_scalariable_accesses (vec *access_vec, bool is_parm) +{ + if (access_vec->is_empty ()) + return false; + + for (unsigned int j = 0; j < access_vec->length (); j++) + { + struct access *access = (*access_vec)[j]; + if (is_parm && access->write) + return false; + + if (!is_parm && !access->write) + return false; + } + + return true; +} + +void +expand_sra::prepare_expander_sra () +{ + if (optimize <= 0) + return; + + base_access_vec = new hash_map >; + expr_rtx_vec = new hash_map; + + collect_sra_candidates (); +} + +expand_sra::expand_sra () : expr_rtx_vec (NULL), base_access_vec (NULL) +{ + prepare_expander_sra (); +} + +expand_sra::~expand_sra () +{ + if (optimize <= 0) + return; + delete expr_rtx_vec; + expr_rtx_vec = NULL; + delete base_access_vec; + base_access_vec = NULL; +} + +rtx +expand_sra::get_scalarized_rtx (tree expr) +{ + if (!expr_rtx_vec) + return NULL_RTX; + rtx *val = expr_rtx_vec->get (expr); + return val ? *val : NULL_RTX; +} + +/* Get the register at INDEX from a parallel REGS. */ + +static rtx +extract_one_reg (rtx regs, int index) +{ + rtx orig_reg = XEXP (XVECEXP (regs, 0, index), 0); + if (!HARD_REGISTER_P (orig_reg)) + return orig_reg; + + /* Reading from param hard reg need to be moved to a temp. */ + rtx reg = gen_reg_rtx (GET_MODE (orig_reg)); + emit_move_insn (reg, orig_reg); + return reg; +} + +/* Get IMODE part from REG at OFF_BITS. 
*/ + +static rtx +extract_sub_reg (rtx orig_reg, int off_bits, machine_mode mode) +{ + scalar_int_mode imode; + if (!int_mode_for_mode (mode).exists (&imode)) + return NULL_RTX; + + machine_mode orig_mode = GET_MODE (orig_reg); + gcc_assert (GET_MODE_CLASS (orig_mode) == MODE_INT); + + poly_uint64 lowpart_off = subreg_lowpart_offset (imode, orig_mode); + int lowpart_off_bits = lowpart_off.to_constant () * BITS_PER_UNIT; + int shift_bits; + if (lowpart_off_bits >= off_bits) + shift_bits = lowpart_off_bits - off_bits; + else + shift_bits = off_bits - lowpart_off_bits; + + rtx reg = orig_reg; + if (shift_bits > 0) + reg = expand_shift (RSHIFT_EXPR, orig_mode, reg, shift_bits, NULL, 1); + + rtx subreg = gen_lowpart (imode, reg); + rtx result = gen_reg_rtx (imode); + emit_move_insn (result, subreg); + + if (mode != imode) + result = gen_lowpart (mode, result); + + return result; +} + +bool +expand_sra::scalarize_access (access_p acc, rtx regs) +{ + machine_mode expr_mode = TYPE_MODE (TREE_TYPE (acc->expr)); + + /* mode of mult registers. */ + if (expr_mode != BLKmode + && known_gt (acc->size, GET_MODE_BITSIZE (word_mode))) + return false; + + /* Compute the position of the access in the whole parallel rtx. */ + int start_index = -1; + int end_index = -1; + HOST_WIDE_INT left_bits = 0; + HOST_WIDE_INT right_bits = 0; + int cur_index = XEXP (XVECEXP (regs, 0, 0), 0) ? 0 : 1; + for (; cur_index < XVECLEN (regs, 0); cur_index++) + { + rtx slot = XVECEXP (regs, 0, cur_index); + HOST_WIDE_INT off = UINTVAL (XEXP (slot, 1)) * BITS_PER_UNIT; + machine_mode mode = GET_MODE (XEXP (slot, 0)); + HOST_WIDE_INT size = GET_MODE_BITSIZE (mode).to_constant (); + if (off <= acc->offset && off + size > acc->offset) + { + start_index = cur_index; + left_bits = acc->offset - off; + } + if (off + size >= acc->offset + acc->size) + { + end_index = cur_index; + right_bits = off + size - (acc->offset + acc->size); + break; + } + } + /* Invalid access possition: padding or outof bound. */ + if (start_index < 0 || end_index < 0) + return false; + + /* Need multi-registers in a parallel for the access. */ + if (expr_mode == BLKmode || end_index > start_index) + { + if (left_bits || right_bits) + return false; + + int num_words = end_index - start_index + 1; + rtx *tmps = XALLOCAVEC (rtx, num_words); + + int pos = 0; + HOST_WIDE_INT start; + start = UINTVAL (XEXP (XVECEXP (regs, 0, start_index), 1)); + /* Extract whole registers. */ + for (; pos < num_words; pos++) + { + int index = start_index + pos; + rtx reg = extract_one_reg (regs, index); + machine_mode mode = GET_MODE (reg); + HOST_WIDE_INT off; + off = UINTVAL (XEXP (XVECEXP (regs, 0, index), 1)) - start; + tmps[pos] = gen_rtx_EXPR_LIST (mode, reg, GEN_INT (off)); + } + + rtx reg = gen_rtx_PARALLEL (expr_mode, gen_rtvec_v (pos, tmps)); + acc->rtx_val = reg; + return true; + } + + /* Just need one reg for the access. */ + if (end_index == start_index && left_bits == 0 && right_bits == 0) + { + rtx reg = extract_one_reg (regs, start_index); + if (GET_MODE (reg) != expr_mode) + reg = gen_lowpart (expr_mode, reg); + + acc->rtx_val = reg; + return true; + } + + /* Need to extract part reg for the access. 
*/ + if (!acc->write && end_index == start_index + && (acc->size % BITS_PER_UNIT) == 0) + { + rtx orig_reg = XEXP (XVECEXP (regs, 0, start_index), 0); + acc->rtx_val = extract_sub_reg (orig_reg, left_bits, expr_mode); + if (acc->rtx_val) + return true; + } + + return false; +} + +bool +expand_sra::scalarize_accesses (tree base, rtx regs) +{ + if (!base_access_vec) + return false; + vec *access_vec = base_access_vec->get (base); + if (!access_vec) + return false; + bool is_parm = TREE_CODE (base) == PARM_DECL; + if (!valid_scalariable_accesses (access_vec, is_parm)) + return false; + + /* Go through each access, compute corresponding rtx(regs or subregs) + for the expression. */ + int n = access_vec->length (); + int cur_access_index = 0; + for (; cur_access_index < n; cur_access_index++) + if (!scalarize_access ((*access_vec)[cur_access_index], regs)) + break; + + /* Bind/map expr(tree) to sclarized rtx if all access scalarized. */ + if (cur_access_index == n) + for (int j = 0; j < n; j++) + { + access_p access = (*access_vec)[j]; + expr_rtx_vec->put (access->expr, access->rtx_val); + } + + return true; +} + +static expand_sra *current_sra = NULL; + +/* Check If there is an sra access for the expr. + Return the correspond scalar sym for the access. */ + +rtx +get_scalar_rtx_for_aggregate_expr (tree expr) +{ + return current_sra ? current_sra->get_scalarized_rtx (expr) : NULL_RTX; +} + +/* Compute/Set RTX registers for those accesses on BASE. */ + +void +set_scalar_rtx_for_aggregate_access (tree base, rtx regs) +{ + if (!current_sra) + return; + current_sra->scalarize_accesses (base, regs); +} + +void +set_scalar_rtx_for_returns () +{ + if (!current_sra) + return; + + tree res = DECL_RESULT (current_function_decl); + gcc_assert (res); + edge_iterator ei; + edge e; + FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) + if (greturn *r = safe_dyn_cast (*gsi_last_bb (e->src))) + { + tree val = gimple_return_retval (r); + if (val && VAR_P (val)) + current_sra->scalarize_accesses (val, DECL_RTL (res)); + } +} + /* Return an expression tree corresponding to the RHS of GIMPLE statement STMT. */ @@ -3778,7 +4245,8 @@ expand_return (tree retval) /* If we are returning the RESULT_DECL, then the value has already been stored into it, so we don't have to do anything special. */ - if (TREE_CODE (retval_rhs) == RESULT_DECL) + if (TREE_CODE (retval_rhs) == RESULT_DECL + || get_scalar_rtx_for_aggregate_expr (retval_rhs)) expand_value_return (result_rtl); /* If the result is an aggregate that is being returned in one (or more) @@ -4422,6 +4890,9 @@ expand_debug_expr (tree exp) int unsignedp = TYPE_UNSIGNED (TREE_TYPE (exp)); addr_space_t as; scalar_int_mode op0_mode, op1_mode, addr_mode; + rtx x = get_scalar_rtx_for_aggregate_expr (exp); + if (x) + return NULL_RTX;/* optimized out. */ switch (TREE_CODE_CLASS (TREE_CODE (exp))) { @@ -6624,6 +7095,9 @@ pass_expand::execute (function *fun) auto_bitmap forced_stack_vars; discover_nonconstant_array_refs (forced_stack_vars); + current_sra = new expand_sra; + scan_function (cfun, *current_sra); + /* Make sure all values used by the optimization passes have sane defaults. 
*/ reg_renumber = 0; @@ -7052,6 +7526,8 @@ pass_expand::execute (function *fun) loop_optimizer_finalize (); } + delete current_sra; + current_sra = NULL; timevar_pop (TV_POST_EXPAND); return 0; diff --git a/gcc/expr.cc b/gcc/expr.cc index 174f8acb269ab5450fc799516471d5a2bd9b9efa..53b48aba790d4dd8ade326a2b33a0c7ec3fffc47 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -100,6 +100,7 @@ static void do_tablejump (rtx, machine_mode, rtx, rtx, rtx, static rtx const_vector_from_tree (tree); static tree tree_expr_size (const_tree); static void convert_mode_scalar (rtx, rtx, int); +rtx get_scalar_rtx_for_aggregate_expr (tree); /* This is run to set up which modes can be used @@ -5618,11 +5619,12 @@ expand_assignment (tree to, tree from, bool nontemporal) Assignment of an array element at a constant index, and assignment of an array element in an unaligned packed structure field, has the same problem. Same for (partially) storing into a non-memory object. */ - if (handled_component_p (to) - || (TREE_CODE (to) == MEM_REF - && (REF_REVERSE_STORAGE_ORDER (to) - || mem_ref_refers_to_non_mem_p (to))) - || TREE_CODE (TREE_TYPE (to)) == ARRAY_TYPE) + if (!get_scalar_rtx_for_aggregate_expr (to) + && (handled_component_p (to) + || (TREE_CODE (to) == MEM_REF + && (REF_REVERSE_STORAGE_ORDER (to) + || mem_ref_refers_to_non_mem_p (to))) + || TREE_CODE (TREE_TYPE (to)) == ARRAY_TYPE)) { machine_mode mode1; poly_int64 bitsize, bitpos; @@ -9006,6 +9008,9 @@ expand_expr_real (tree exp, rtx target, machine_mode tmode, ret = CONST0_RTX (tmode); return ret ? ret : const0_rtx; } + rtx x = get_scalar_rtx_for_aggregate_expr (exp); + if (x) + return x; ret = expand_expr_real_1 (exp, target, tmode, modifier, alt_rtl, inner_reference_p); diff --git a/gcc/function.cc b/gcc/function.cc index dd2c1136e0725f55673f28e0eeaf4c91ad18e93f..7fe927bd36beac11466ca9fca12800892b57f0be 100644 --- a/gcc/function.cc +++ b/gcc/function.cc @@ -2740,6 +2740,9 @@ assign_parm_find_stack_rtl (tree parm, struct assign_parm_data_one *data) data->stack_parm = stack_parm; } +extern void set_scalar_rtx_for_aggregate_access (tree, rtx); +extern void set_scalar_rtx_for_returns (); + /* A subroutine of assign_parms. Adjust DATA->ENTRY_RTL such that it's always valid and contiguous. 
*/ @@ -3115,8 +3118,24 @@ assign_parm_setup_block (struct assign_parm_data_all *all, emit_move_insn (mem, entry_parm); } else - move_block_from_reg (REGNO (entry_parm), mem, - size_stored / UNITS_PER_WORD); + { + int regno = REGNO (entry_parm); + int nregs = size_stored / UNITS_PER_WORD; + move_block_from_reg (regno, mem, nregs); + + rtx *tmps = XALLOCAVEC (rtx, nregs); + machine_mode mode = word_mode; + HOST_WIDE_INT word_size = GET_MODE_SIZE (mode).to_constant (); + for (int i = 0; i < nregs; i++) + { + rtx reg = gen_rtx_REG (mode, regno + i); + rtx off = GEN_INT (word_size * i); + tmps[i] = gen_rtx_EXPR_LIST (VOIDmode, reg, off); + } + + rtx regs = gen_rtx_PARALLEL (BLKmode, gen_rtvec_v (nregs, tmps)); + set_scalar_rtx_for_aggregate_access (parm, regs); + } } else if (data->stack_parm == 0 && !TYPE_EMPTY_P (data->arg.type)) { @@ -3716,6 +3735,10 @@ assign_parms (tree fndecl) else set_decl_incoming_rtl (parm, data.entry_parm, false); + rtx incoming = DECL_INCOMING_RTL (parm); + if (GET_CODE (incoming) == PARALLEL) + set_scalar_rtx_for_aggregate_access (parm, incoming); + assign_parm_adjust_stack_rtl (&data); if (assign_parm_setup_block_p (&data)) @@ -5136,6 +5159,7 @@ expand_function_start (tree subr) { gcc_assert (GET_CODE (hard_reg) == PARALLEL); set_parm_rtl (res, gen_group_rtx (hard_reg)); + set_scalar_rtx_for_returns (); } } diff --git a/gcc/tree-sra.h b/gcc/tree-sra.h index f20266c46226f7840299a768cb575f6f92b54207..7af87bccf1b43badbc3f8a4c51a87c84d5020b9e 100644 --- a/gcc/tree-sra.h +++ b/gcc/tree-sra.h @@ -19,7 +19,85 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see . */ -bool type_internals_preclude_sra_p (tree type, const char **msg); +struct base_access +{ + /* Values returned by get_ref_base_and_extent, indicates the + OFFSET, SIZE and BASE of the access. */ + HOST_WIDE_INT offset; + HOST_WIDE_INT size; + tree base; + + /* The context expression of this access. */ + tree expr; + + /* Indicates this is a write access. */ + bool write : 1; + + /* Indicates if this access is made in reverse storage order. */ + bool reverse : 1; +}; + +/* Default template for sra_scan_function. */ + +struct default_analyzer +{ + /* Template analyze functions. */ + void analyze_phi (gphi *){}; + void pre_analyze_stmt (gimple *){}; + void analyze_return (greturn *){}; + void analyze_assign (gassign *){}; + void analyze_call (gcall *){}; + void analyze_asm (gasm *){}; + void analyze_default_stmt (gimple *){}; +}; + +/* Scan function and look for interesting expressions. */ + +template +void +scan_function (struct function *fun, analyzer &a) +{ + basic_block bb; + FOR_EACH_BB_FN (bb, fun) + { + for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi); + gsi_next (&gsi)) + a.analyze_phi (gsi.phi ()); + + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); + gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + a.pre_analyze_stmt (stmt); + + switch (gimple_code (stmt)) + { + case GIMPLE_RETURN: + a.analyze_return (as_a (stmt)); + break; + + case GIMPLE_ASSIGN: + a.analyze_assign (as_a (stmt)); + break; + + case GIMPLE_CALL: + a.analyze_call (as_a (stmt)); + break; + + case GIMPLE_ASM: + a.analyze_asm (as_a (stmt)); + break; + + default: + a.analyze_default_stmt (stmt); + break; + } + } + } +} + +bool +type_internals_preclude_sra_p (tree type, const char **msg); /* Return true iff TYPE is stdarg va_list type (which early SRA and IPA-SRA should leave alone). 
*/ diff --git a/gcc/testsuite/g++.target/powerpc/pr102024.C b/gcc/testsuite/g++.target/powerpc/pr102024.C index 769585052b507ad971868795f861106230c976e3..c8995cae707bb6e2e849275b823d2ba14d24a966 100644 --- a/gcc/testsuite/g++.target/powerpc/pr102024.C +++ b/gcc/testsuite/g++.target/powerpc/pr102024.C @@ -5,7 +5,7 @@ // Test that a zero-width bit field in an otherwise homogeneous aggregate // generates a psabi warning and passes arguments in GPRs. -// { dg-final { scan-assembler-times {\mstd\M} 4 } } +// { dg-final { scan-assembler-times {\mmtvsrd\M} 4 } } struct a_thing { diff --git a/gcc/testsuite/gcc.target/powerpc/pr108073.c b/gcc/testsuite/gcc.target/powerpc/pr108073.c new file mode 100644 index 0000000000000000000000000000000000000000..7dd1a4a326a181e0f35c9418af20a9bebabdfe4b --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr108073.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -save-temps" } */ + +typedef struct DF {double a[4]; short s1; short s2; short s3; short s4; } DF; +typedef struct SF {float a[4]; int i1; int i2; } SF; + +/* { dg-final { scan-assembler-times {\mmtvsrd\M} 3 {target { has_arch_ppc64 && has_arch_pwr8 } } } } */ +/* { dg-final { scan-assembler-not {\mlwz\M} {target { has_arch_ppc64 && has_arch_pwr8 } } } } */ +/* { dg-final { scan-assembler-not {\mlhz\M} {target { has_arch_ppc64 && has_arch_pwr8 } } } } */ +short __attribute__ ((noipa)) foo_hi (DF a, int flag){if (flag == 2)return a.s2+a.s3;return 0;} +int __attribute__ ((noipa)) foo_si (SF a, int flag){if (flag == 2)return a.i2+a.i1;return 0;} +double __attribute__ ((noipa)) foo_df (DF arg, int flag){if (flag == 2)return arg.a[3];else return 0.0;} +float __attribute__ ((noipa)) foo_sf (SF arg, int flag){if (flag == 2)return arg.a[2]; return 0;} +float __attribute__ ((noipa)) foo_sf1 (SF arg, int flag){if (flag == 2)return arg.a[1];return 0;} + +DF gdf = {{1.0,2.0,3.0,4.0}, 1, 2, 3, 4}; +SF gsf = {{1.0f,2.0f,3.0f,4.0f}, 1, 2}; + +int main() +{ + if (!(foo_hi (gdf, 2) == 5 && foo_si (gsf, 2) == 3 && foo_df (gdf, 2) == 4.0 + && foo_sf (gsf, 2) == 3.0 && foo_sf1 (gsf, 2) == 2.0)) + __builtin_abort (); + if (!(foo_hi (gdf, 1) == 0 && foo_si (gsf, 1) == 0 && foo_df (gdf, 1) == 0 + && foo_sf (gsf, 1) == 0 && foo_sf1 (gsf, 1) == 0)) + __builtin_abort (); + return 0; +} + diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421-1.c b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c new file mode 100644 index 0000000000000000000000000000000000000000..4e1f87f7939cbf1423772023ee392fc5200b6708 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c @@ -0,0 +1,6 @@ +/* PR target/65421 */ +/* { dg-options "-O2" } */ + +typedef struct LARGE {double a[4]; int arr[32];} LARGE; +LARGE foo (LARGE a){return a;} +/* { dg-final { scan-assembler-times {\mmemcpy\M} 1 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421-2.c b/gcc/testsuite/gcc.target/powerpc/pr65421-2.c new file mode 100644 index 0000000000000000000000000000000000000000..8a8e1a0e9962317ba2c0942af8891b3c51f4d3a4 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr65421-2.c @@ -0,0 +1,32 @@ +/* PR target/65421 */ +/* { dg-options "-O2" } */ +/* { dg-require-effective-target powerpc_elfv2 } */ +/* { dg-require-effective-target has_arch_ppc64 } */ + +typedef struct FLOATS +{ + double a[3]; +} FLOATS; + +/* 3 lfd after returns also optimized */ +/* FLOATS ret_arg_pt (FLOATS *a){return *a;} */ + +/* 3 stfd */ +void st_arg (FLOATS a, FLOATS *p) {*p = a;} +/* { dg-final { scan-assembler-times {\mstfd\M} 3 } } */ + +/* blr */ +FLOATS ret_arg 
(FLOATS a) {return a;} + +typedef struct MIX +{ + double a[2]; + long l; +} MIX; + +/* std 3 param regs to return slot */ +MIX ret_arg1 (MIX a) {return a;} +/* { dg-final { scan-assembler-times {\mstd\M} 3 } } */ + +/* count insns */ +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9 } } */
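
A note for readers following the patch above: expand_sra::scalarize_access essentially maps a field's bit range onto the registers of the incoming PARALLEL. The stand-alone C++ sketch below models only that index computation; it is illustrative and not part of the patch. RegSlot, locate_field and the all-64-bit-slot layout are invented here, and the layout is simplified (the real PARALLEL on powerpc64le would place the doubles in floating-point registers).

#include <cstdio>
#include <vector>

// Stand-in for one element of the incoming-argument PARALLEL:
// a register covering 'bits' bits of the aggregate at byte 'byte_off'.
struct RegSlot { unsigned byte_off; unsigned bits; };

// For a field at bit offset 'off' with bit size 'size', find which
// register(s) of the parallel cover it, mirroring the start_index /
// left_bits search in expand_sra::scalarize_access.
static void
locate_field (const std::vector<RegSlot> &regs, unsigned off, unsigned size)
{
  int start = -1, end = -1;
  unsigned left_bits = 0, right_bits = 0;
  for (unsigned i = 0; i < regs.size (); i++)
    {
      unsigned lo = regs[i].byte_off * 8, hi = lo + regs[i].bits;
      if (lo <= off && off < hi)
        {
          start = i;
          left_bits = off - lo;   /* shift needed inside the register */
        }
      if (hi >= off + size)
        {
          end = i;
          right_bits = hi - (off + size);
          break;
        }
    }
  std::printf ("bits [%u,%u): regs %d..%d, left %u, right %u\n",
               off, off + size, start, end, left_bits, right_bits);
}

int main ()
{
  // A DF-like struct (4 doubles + 4 shorts) modeled as five 64-bit slots.
  std::vector<RegSlot> parm = { {0, 64}, {8, 64}, {16, 64}, {24, 64}, {32, 64} };
  locate_field (parm, 32 * 8 + 16, 16);  // field s2: one register, shifted
  locate_field (parm, 0, 256);           // a[0..3]: spans four registers
}

A field that lands in a single slot with no leftover bits can be renamed to that register directly; an access that straddles slots or needs left_bits/right_bits handled falls back to the shifted sub-register or multi-register paths of the patch.
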
From patchwork Mon Aug 14 05:41:56 2023
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 135133
To: gcc-patches@gcc.gnu.org
Cc: rguenther@suse.de, jeffreyalaw@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, linkw@gcc.gnu.org, bergner@linux.ibm.com, guojiufu@linux.ibm.com
Subject: [PATCH 2/2] combine nonconstant_array walker and expander_sra walker
Date: Mon, 14 Aug 2023 13:41:56 +0800
Message-Id: <20230814054156.2068718-2-guojiufu@linux.ibm.com>
In-Reply-To: <20230814054156.2068718-1-guojiufu@linux.ibm.com>
References: <20230814054156.2068718-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.25.1
From: Jiufu Guo

Hi,

In the light-expander-sra, each statement in each basic block of a function needs to be analyzed, and a similar per-statement walk already exists for finding the variables that have to live on the stack. These per-stmt analyses can be combined into a single walk to improve cache locality.

Bootstrapped and regtested on x86_64-redhat-linux and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

BR,
Jeff (Jiufu Guo)

gcc/ChangeLog: * cfgexpand.cc (discover_nonconstant_array_refs): Deleted. (struct array_and_sra_walk): New class. (pass_expand::execute): Call scan_function on array_and_sra_walk. --- gcc/cfgexpand.cc | 104 +++++++++++++++++++++++------------------------ 1 file changed, 52 insertions(+), 52 deletions(-) diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc index 21a09ebac96bbcddc67da73c42f470c6d5f60e6c..dc3ebe45275cc4b1c0873b4c6e5f6cbe2491ab8c 100644 --- a/gcc/cfgexpand.cc +++ b/gcc/cfgexpand.cc @@ -6843,59 +6843,59 @@ avoid_type_punning_on_regs (tree t, bitmap forced_stack_vars) bitmap_set_bit (forced_stack_vars, DECL_UID (base)); } -/* RTL expansion is not able to compile array references with variable - offsets for arrays stored in single register. Discover such - expressions and mark variables as addressable to avoid this - scenario.
*/ +/* Beside light-sra, walk stmts to discover expressions of array references + with variable offsets for arrays and mark variables as addressable to + avoid to be stored in single register. */ -static void -discover_nonconstant_array_refs (bitmap forced_stack_vars) +struct array_and_sra_walk : public expand_sra { - basic_block bb; - gimple_stmt_iterator gsi; + array_and_sra_walk (bitmap map) : wi{}, forced_stack_vars (map) + { + wi.info = forced_stack_vars; + }; - walk_stmt_info wi = {}; - wi.info = forced_stack_vars; - FOR_EACH_BB_FN (bb, cfun) - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + void pre_analyze_stmt (gimple *stmt) + { + expand_sra::pre_analyze_stmt (stmt); + if (!is_gimple_debug (stmt)) + walk_gimple_op (stmt, discover_nonconstant_array_refs_r, &wi); + if (gimple_vdef (stmt)) { - gimple *stmt = gsi_stmt (gsi); - if (!is_gimple_debug (stmt)) + tree t = gimple_get_lhs (stmt); + if (t && REFERENCE_CLASS_P (t)) + avoid_type_punning_on_regs (t, forced_stack_vars); + } + } + + void analyze_call (gcall *call) + { + expand_sra::analyze_call (call); + if (gimple_call_internal_p (call)) + { + tree cand = NULL_TREE; + switch (gimple_call_internal_fn (call)) { - walk_gimple_op (stmt, discover_nonconstant_array_refs_r, &wi); - gcall *call = dyn_cast (stmt); - if (call && gimple_call_internal_p (call)) - { - tree cand = NULL_TREE; - switch (gimple_call_internal_fn (call)) - { - case IFN_LOAD_LANES: - /* The source must be a MEM. */ - cand = gimple_call_arg (call, 0); - break; - case IFN_STORE_LANES: - /* The destination must be a MEM. */ - cand = gimple_call_lhs (call); - break; - default: - break; - } - if (cand) - cand = get_base_address (cand); - if (cand - && DECL_P (cand) - && use_register_for_decl (cand)) - bitmap_set_bit (forced_stack_vars, DECL_UID (cand)); - } - if (gimple_vdef (stmt)) - { - tree t = gimple_get_lhs (stmt); - if (t && REFERENCE_CLASS_P (t)) - avoid_type_punning_on_regs (t, forced_stack_vars); - } + case IFN_LOAD_LANES: + /* The source must be a MEM. */ + cand = gimple_call_arg (call, 0); + break; + case IFN_STORE_LANES: + /* The destination must be a MEM. */ + cand = gimple_call_lhs (call); + break; + default: + break; } + if (cand) + cand = get_base_address (cand); + if (cand && DECL_P (cand) && use_register_for_decl (cand)) + bitmap_set_bit (forced_stack_vars, DECL_UID (cand)); } -} + }; + + walk_stmt_info wi; + bitmap forced_stack_vars; +}; /* This function sets crtl->args.internal_arg_pointer to a virtual register if DRAP is needed. Local register allocator will replace @@ -7091,12 +7091,12 @@ pass_expand::execute (function *fun) avoid_deep_ter_for_debug (gsi_stmt (gsi), 0); } - /* Mark arrays indexed with non-constant indices with TREE_ADDRESSABLE. */ + /* Mark arrays indexed with non-constant indices with TREE_ADDRESSABLE. + And scan expressions for possible SRA accesses. */ auto_bitmap forced_stack_vars; - discover_nonconstant_array_refs (forced_stack_vars); - - current_sra = new expand_sra; - scan_function (cfun, *current_sra); + array_and_sra_walk *walker = new array_and_sra_walk (forced_stack_vars); + current_sra = walker; + scan_function (cfun, *walker); /* Make sure all values used by the optimization passes have sane defaults. */ @@ -7526,7 +7526,7 @@ pass_expand::execute (function *fun) loop_optimizer_finalize (); } - delete current_sra; + delete walker; current_sra = NULL; timevar_pop (TV_POST_EXPAND);
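
To make the shape of this change easier to follow, here is a small stand-alone C++ model of the scan_function/analyzer pattern the two patches rely on. It is not GCC code: Stmt, scan, SraLikeAnalyzer and CombinedAnalyzer are invented stand-ins for gimple statements, scan_function, expand_sra and array_and_sra_walk. It illustrates the point the cover letter makes: deriving one analyzer from the other lets both per-statement analyses run in a single traversal.

#include <cstdio>
#include <vector>

// A stand-in for a gimple statement; 'kind' selects which hook fires.
enum class StmtKind { Assign, Call, Other };
struct Stmt { StmtKind kind; int id; };

// Counterpart of scan_function: one pass over all statements,
// dispatching to whatever hooks the analyzer provides.
template <typename Analyzer>
void scan (const std::vector<Stmt> &stmts, Analyzer &a)
{
  for (const Stmt &s : stmts)
    {
      a.pre_analyze_stmt (s);
      switch (s.kind)
        {
        case StmtKind::Assign: a.analyze_assign (s); break;
        case StmtKind::Call:   a.analyze_call (s);   break;
        default:               a.analyze_default_stmt (s); break;
        }
    }
}

// Counterpart of expand_sra: the first per-statement analysis.
struct SraLikeAnalyzer
{
  void pre_analyze_stmt (const Stmt &) {}
  void analyze_assign (const Stmt &s) { std::printf ("sra: assign %d\n", s.id); }
  void analyze_call (const Stmt &s)   { std::printf ("sra: call %d\n", s.id); }
  void analyze_default_stmt (const Stmt &) {}
};

// Counterpart of array_and_sra_walk: reuse the first analysis and run the
// second one from the shared hook, so both happen during a single walk.
struct CombinedAnalyzer : SraLikeAnalyzer
{
  void pre_analyze_stmt (const Stmt &s)
  {
    SraLikeAnalyzer::pre_analyze_stmt (s);
    std::printf ("array-ref check: stmt %d\n", s.id);  // second analysis
  }
};

int main ()
{
  std::vector<Stmt> body = { {StmtKind::Assign, 0}, {StmtKind::Call, 1},
                             {StmtKind::Other, 2} };
  CombinedAnalyzer walker;
  scan (body, walker);  // one traversal, both analyses
}

Because scan is templated on the concrete analyzer, the hook calls are resolved statically, so combining the walks adds no virtual-dispatch overhead; this matches how array_and_sra_walk simply shadows pre_analyze_stmt and analyze_call on top of expand_sra.
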