From patchwork Mon Aug 7 10:26:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Robin Dapp X-Patchwork-Id: 131784 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c44e:0:b0:3f2:4152:657d with SMTP id w14csp1351459vqr; Mon, 7 Aug 2023 03:27:51 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHUbTcHvYTkSOSjGTd4C7Nro4LHuAWuWyQupUOMDXOj1yAUkUCGJYxPB7ybp/VB/c8XJ9MA X-Received: by 2002:a17:906:1001:b0:993:6845:89d6 with SMTP id 1-20020a170906100100b00993684589d6mr8031587ejm.47.1691404070846; Mon, 07 Aug 2023 03:27:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691404070; cv=none; d=google.com; s=arc-20160816; b=iMYE3DwsY0GlDvufwNe0m7uoSL3EazhDDae0XRaS/9pDWXHHEINiRDNtuRWMWq2bE6 Cwjqwz2c0S9Gd/Sl+f8Vl99zSi7qcEadU7dEC3G96EHchcuVFUw/f4V2rHpWQ6IJDER+ 4hYNN7NgatGLmV6rdKygoz/c9eQFt79xU26hleFQao+jnoQ8rUqxPSry6ddINbt++lRF IKpacYuWyzRdqvMoYmNHMLf2N4waEnm76Ff0nBNFtb1PP/8i7e7N1PuesTUin3B9O/mG hLJ7JCCZX1MGBVmp15+XKAa0vQp/hUUPIqieTg86AAI1v+UGgFBl6b/fnDIZ7LtV6u9W nEvg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:to:subject:content-language:cc:user-agent :mime-version:date:message-id:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=EY/mt2adNXYap0BQN0ANeeoFUSoPxDXmGMUL+xzLqns=; fh=1ZJ4ytFedIrBTcWDFd7bKdwqFae67H37C7BZarC4Y+0=; b=f4YgJCPEIiga+A5A0JBeczS/7EkrHl3H24vrJGTSgDaQjdAFppEFC38yhUkfo7nNlD gqBWG5iqU90h032uu+8NWwT0gDedadMmOQN1m1j17uLEBW35C36pVeXnjCz0NjiIU6Lf Ge964Hxa7Saz10oGeucQbdRRyc5fVKv5wHIxkq5JQrKyzA6I9bwI7yvKMx/sTl2rQJa/ jXvjKGBCizdtwT9YicDPEP0OM6Yx5BfwBtuxQJvXQQhln4EqwwvhLa/DUYB7+LnDYhhB vz1fsBGSwCqE55Anh2Ffg7pMjcpWF+IzHTa5BS3YVbTDoTmxZ3sk6R+1jZ2DkPMntuhw k9Kw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=suL7LyRr; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id rp7-20020a170906d96700b0098e16f844f5si4203288ejb.715.2023.08.07.03.27.50 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 07 Aug 2023 03:27:50 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=suL7LyRr; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 688403858417 for ; Mon, 7 Aug 2023 10:27:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 688403858417 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691404069; bh=EY/mt2adNXYap0BQN0ANeeoFUSoPxDXmGMUL+xzLqns=; h=Date:Cc:Subject:To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=suL7LyRrBHLZZJ2E7mRTg4Dd66Fxt4M+ZoZfI/H7c/ZO/TW0aFEVU3VROqT1/fMeZ ia2L5zwQnkDuNiClLlzOUvfHozd+wh5SX87xhpVqWSB9KHwP1Ek7tytqYHcORdgC/o tN4ubKQAjWQ/GhGkVO1YWeq1XBOdVZjbByHtUfI0= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lj1-x22a.google.com (mail-lj1-x22a.google.com [IPv6:2a00:1450:4864:20::22a]) by sourceware.org (Postfix) with ESMTPS id ECA733858D28 for ; Mon, 7 Aug 2023 10:27:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ECA733858D28 Received: by mail-lj1-x22a.google.com with SMTP id 38308e7fff4ca-2b9db1de50cso65808821fa.3 for ; Mon, 07 Aug 2023 03:27:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691404021; x=1692008821; h=content-transfer-encoding:to:subject:from:content-language:cc :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=EY/mt2adNXYap0BQN0ANeeoFUSoPxDXmGMUL+xzLqns=; b=dZiluY6nJmcYam7tcmtaqiM51vfA5wVdgsfvhxTz8LiBtCt055+CvVeYRBiQWV/bZ2 zq7SQCVXgCKm9lZ6e+3OMUNdoMg9ctjbxzntGHCpXpudYggEGd/TFRcXR5MAmqhaa0tp Z2WFf/ycHQo3aXf7hVtuwHv3HR3X/Vi/h5GPJHje1jl4tqfBi1qCGntp0Q3E+T1qni8U FO2rA1ALPU7mJVBqEswXIzDMjyUh5JhfKSuZpjaTmJbRQW55DYF51gwxaJSzno4MkT5Z qLenyXtIrL+eQLqgdv8keq3f4wywn84wW20Ji7M3p0SxHurEuw6ur9yrXItnZTCORycQ IQfg== X-Gm-Message-State: AOJu0YxCEt5MEmqn5wwYLqB0lCIjELDkH4PPJrgAP2duXuTWdLa2dWbl Y/v9wd+7mShQ8p72gwGBhy4pBrXN50g= X-Received: by 2002:a2e:a163:0:b0:2b6:dd26:c02a with SMTP id u3-20020a2ea163000000b002b6dd26c02amr6163776ljl.14.1691404020822; Mon, 07 Aug 2023 03:27:00 -0700 (PDT) Received: from [192.168.1.23] (ip-046-005-130-086.um12.pools.vodafone-ip.de. [46.5.130.86]) by smtp.gmail.com with ESMTPSA id e14-20020a170906374e00b0099ce025f8ccsm746794ejc.186.2023.08.07.03.27.00 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 07 Aug 2023 03:27:00 -0700 (PDT) Message-ID: <5a90c8a9-1570-5af4-bfdc-19d097bfee6e@gmail.com> Date: Mon, 7 Aug 2023 12:26:59 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Cc: rdapp.gcc@gmail.com Content-Language: en-US Subject: [PATCH] fwprop: Allow UNARY_P and check register pressure. To: gcc-patches , "richard.sandiford" X-Spam-Status: No, score=-9.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Robin Dapp via Gcc-patches From: Robin Dapp Reply-To: Robin Dapp Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773565714685329157 X-GMAIL-MSGID: 1773565714685329157 Hi, originally inspired by the wish to transform vmv v3, a0 ; = vec_duplicate vadd.vv v1, v2, v3 into vadd.vx v1, v2, a0 via fwprop for riscv, this patch enables the forward propagation of UNARY_P sources. As this involves potentially replacing a vector register with a scalar register the ira_hoist_pressure machinery is used to calculate the change in register pressure. If the propagation would increase the pressure beyond the number of hard regs, we don't perform it. The regpressure commit this patch depends on is not yet pushed because I figured I'd post the code using it in case of further comments. The testsuite is unchanged on i386 and power10 but aarch64 has some new FAILs but I'm not terribly familiar with aarch64, hence some examples here: The following cases shrn-combine-1/2/3 seem worse as we emit one instruction more. Before: shrn v30.8b, v30.8h, 2 shrn2 v30.16b, v31.8h, 2 str q30, [x1, x3] After: ushr v30.8h, v30.8h, 2 ushr v31.8h, v31.8h, 2 uzp1 v31.16b, v30.16b, v31.16b str q31, [x1, x3] Here, one optimization already happens in fwprop so combine cannot do the same work it did before. vec-init-22-speed.c changes from sxth w0, w0 sxth w1, w1 fmov d31, x0 fmov d0, x1 to: dup v31.4h, w0 dup v0.4h, w1 which I hope has the same semantics and is shorter. Apart from that there are numerous check-asm testcases where a new, earlier propagation prevents a later, supposedly better propagation. One such example is from ld4_s8.c: (insn 11 10 12 2 (set (reg:DI 100) (neg:DI (reg:DI 102))) 385 {negdi2} (expr_list:REG_EQUAL (const_poly_int:DI [-576, -576]) (nil))) (insn 12 11 13 2 (set (reg/f:DI 99) (plus:DI (reg/v/f:DI 97 [ x0 ]) (reg:DI 100))) 153 {*adddi3_aarch64} (expr_list:REG_EQUAL (plus:DI (reg/v/f:DI 97 [ x0 ]) (const_poly_int:DI [-576, -576])) (nil))) (insn 13 12 14 2 (set (reg/v:VNx64QI 94 [ z0 ]) (unspec:VNx64QI [ (reg/v:VNx16BI 96 [ p0 ]) (mem:VNx64QI (reg/f:DI 99) [0 MEM [(signed char *)_1]+0 S[64, 64] A8]) ] UNSPEC_LDN)) 5885 {vec_mask_load_lanesvnx64qivnx16qi} (nil)) where we now do: propagating insn 11 into insn 12, replacing: (set (reg/f:DI 99) (plus:DI (reg/v/f:DI 97 [ x0 ]) (reg:DI 100))) successfully matched this instruction to subdi3: (set (reg/f:DI 99) (minus:DI (reg/v/f:DI 97 [ x0 ]) (reg:DI 102))) vs before: cannot propagate from insn 11 into insn 12: would increase complexity of pattern propagating insn 12 into insn 13, replacing: (set (reg/v:VNx64QI 94 [ z0 ]) (unspec:VNx64QI [ (reg/v:VNx16BI 96 [ p0 ]) (mem:VNx64QI (reg/f:DI 99) [0 MEM [(signed char *)_1]+0 S[64, 64] A8]) ] UNSPEC_LDN)) successfully matched this instruction to vec_mask_load_lanesvnx64qivnx16qi: (set (reg/v:VNx64QI 94 [ z0 ]) (unspec:VNx64QI [ (reg/v:VNx16BI 96 [ p0 ]) (mem:VNx64QI (plus:DI (reg/v/f:DI 97 [ x0 ]) (reg:DI 100)) [0 MEM [(signed char *)_1]+0 S[64, 64] A8]) ] UNSPEC_LDN)) All in all this seems like a general problem with earlier optimizations preventing later ones and surely all of those could be fixed individually. Still, the question remains if the general approach is useful or desired or if we not rather prevent more optimizations that we gain. Suggestions welcome. I have some riscv tests for it but didn't attach them yet in order to focus on the main part first. Regards Robin gcc/ChangeLog: * fwprop.cc (fwprop_propagation::profitable_p): Add unary handling. (fwprop_propagation::update_register_pressure): New function. (fwprop_propagation::register_pressure_high_p): New function (reg_single_def_for_src_p): Look through unary expressions. (try_fwprop_subst_pattern): Check register pressure. (forward_propagate_into): Call new function. (fwprop_init): Init register pressure. (fwprop_done): Clean up register pressure. (fwprop_insn): Add comment. --- gcc/fwprop.cc | 307 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 300 insertions(+), 7 deletions(-) diff --git a/gcc/fwprop.cc b/gcc/fwprop.cc index 0707a234726..413fe4e7335 100644 --- a/gcc/fwprop.cc +++ b/gcc/fwprop.cc @@ -36,6 +36,10 @@ along with GCC; see the file COPYING3. If not see #include "tree-pass.h" #include "rtl-iter.h" #include "target.h" +#include "dominance.h" + +#include "ira.h" +#include "regpressure.h" /* This pass does simple forward propagation and simplification when an operand of an insn can only come from a single def. This pass uses @@ -103,6 +107,10 @@ using namespace rtl_ssa; static int num_changes; +/* Keep track of which registers already increased the pressure to avoid double + booking. */ +sbitmap pressure_accounted; + /* Do not try to replace constant addresses or addresses of local and argument slots. These MEM expressions are made only once and inserted in many instructions, as well as being used to control symbol table @@ -181,6 +189,8 @@ namespace bool changed_mem_p () const { return result_flags & CHANGED_MEM; } bool folded_to_constants_p () const; bool profitable_p () const; + bool register_pressure_high_p (rtx, rtx, rtx_insn *, rtx_insn *) const; + void update_register_pressure (rtx, rtx, rtx_insn *, rtx_insn *) const; bool check_mem (int, rtx) final override; void note_simplification (int, uint16_t, rtx, rtx) final override; @@ -332,25 +342,243 @@ fwprop_propagation::profitable_p () const && (result_flags & PROFITABLE)) return true; - if (REG_P (to)) + /* Only continue with an unary operation if we consider register + pressure. */ + rtx what = copy_rtx (to); + if (UNARY_P (what) && flag_ira_hoist_pressure) + what = XEXP (what, 0); + + if (REG_P (what)) return true; - if (GET_CODE (to) == SUBREG - && REG_P (SUBREG_REG (to)) - && !paradoxical_subreg_p (to)) + if (GET_CODE (what) == SUBREG + && REG_P (SUBREG_REG (what)) + && !paradoxical_subreg_p (what)) return true; - if (CONSTANT_P (to)) + if (CONSTANT_P (what)) return true; return false; } -/* Check that X has a single def. */ +/* Check if the register pressure in any predecessor block of USE's block + until DEF's block is equal or higher to the number of hardregs in NU's + register class. */ +bool +fwprop_propagation::register_pressure_high_p (rtx nu, rtx old, rtx_insn *def, + rtx_insn *use) const +{ + enum reg_class nu_class, old_class; + int nu_nregs, old_nregs; + nu_class = regpressure_get_regno_pressure_class (REGNO (nu), &nu_nregs); + old_class + = regpressure_get_regno_pressure_class (REGNO (old), &old_nregs); + + if (nu_class == NO_REGS && old_class == NO_REGS) + return true; + + if (nu_class == old_class) + return false; + + basic_block bbfrom = BLOCK_FOR_INSN (def); + basic_block bbto = BLOCK_FOR_INSN (use); + + basic_block bb; + + sbitmap visited = sbitmap_alloc (last_basic_block_for_fn (cfun)); + bitmap_clear (visited); + auto_vec q; + q.safe_push (bbto); + + while (!q.is_empty ()) + { + bb = q.pop (); + + if (bitmap_bit_p (visited, bb->index)) + continue; + + /* Nothing to do if the register to be replaced is not live + in this BB. */ + if (bb != bbfrom && !regpressure_is_live_in (bb, REGNO (old))) + continue; + + /* Nothing to do if the replacement register is already live in + this BB. */ + if (regpressure_is_live_in (bb, REGNO (nu))) + continue; + + bitmap_set_bit (visited, bb->index); + + int press = regpressure_get (bb, nu_class); + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "pressure for reg %d in bb %d: %d\n", + REGNO (nu), bb->index, press); + + if (press + nu_nregs > ira_class_hard_regs_num[nu_class]) + return true; + + if (bb == bbfrom || bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)) + break; + + edge e; + edge_iterator ei; + + FOR_EACH_EDGE (e, ei, bb->preds) + { + basic_block pred = e->src; + q.safe_push (pred); + } + } + + sbitmap_free (visited); + + return false; +} + +/* Update the register pressure when propagating from DEF to USE + replacing OLD with NU. */ +void +fwprop_propagation::update_register_pressure (rtx nu, rtx old, rtx_insn *def, + rtx_insn *use) const +{ + basic_block bb; + + enum reg_class nu_class, old_class; + int nu_nregs, old_nregs; + nu_class = regpressure_get_regno_pressure_class (REGNO (nu), &nu_nregs); + old_class = regpressure_get_regno_pressure_class (REGNO (old), &old_nregs); + + if (nu_class == old_class) + return; + + basic_block bbfrom = BLOCK_FOR_INSN (def); + basic_block bbto = BLOCK_FOR_INSN (use); + + sbitmap visited = sbitmap_alloc (last_basic_block_for_fn (cfun)); + bitmap_clear (visited); + auto_vec q; + q.safe_push (bbto); + + /* Precompute the users. */ + auto_vec users; + df_ref duse; + bool should_decrease = true; + FOR_EACH_INSN_USE (duse, def) + { + rtx dreg = DF_REF_REAL_REG (duse); + if (REGNO (dreg) != REGNO (old)) + continue; + + df_ref op_ref = DF_REG_USE_CHAIN (REGNO (dreg)); + for (; op_ref; op_ref = DF_REF_NEXT_REG (op_ref)) + { + if (!DF_REF_INSN_INFO (op_ref)) + continue; + + rtx_insn *insn = DF_REF_INSN (op_ref); + + /* If there are other users in BBTO, never decrease the + pressure. */ + if (BLOCK_FOR_INSN (insn) == bbto && insn != use) + { + should_decrease = false; + break; + } + + users.safe_push (insn); + } + } + + while (!q.is_empty ()) + { + bb = q.pop (); + + if (bitmap_bit_p (visited, bb->index)) + continue; + + /* Nothing to do if the register to be replaced is not live + in this BB. */ + if (bb != bbfrom && !regpressure_is_live_in (bb, REGNO (old))) + continue; + + /* Nothing to do if the new register is already live in this BB. */ + if (regpressure_is_live_in (bb, REGNO (nu))) + continue; + + bitmap_set_bit (visited, bb->index); + + if (regpressure_is_live_in (bb, REGNO (old))) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "increasing pressure for " + "reg %d in bb %d by %d regs.\n", + REGNO (nu), bb->index, nu_nregs); + regpressure_set_live_in (bb, REGNO (nu)); + regpressure_increase (bb, nu_class, nu_nregs); + + if (should_decrease) + { + rtx_insn *user; + int ix; + bool should_decrease_local = true; + + /* If this BB dominates no BB where OLD is used (except BBTO) + the register pressure is decreased. */ + FOR_EACH_VEC_ELT(users, ix, user) + { + basic_block ubb = BLOCK_FOR_INSN (user); + if (ubb == bbto) + continue; + if (dominated_by_p (CDI_DOMINATORS, ubb, bb)) + { + should_decrease_local = false; + break; + } + } + + if (should_decrease_local) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "decreasing pressure for " + "reg %d in bb %d by %d regs.\n", + REGNO (nu), bb->index, old_nregs); + regpressure_clear_live_in (bb, REGNO (old)); + regpressure_decrease (bb, old_class, old_nregs); + } + } + } + + if (bb == bbfrom || bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)) + break; + + edge e; + edge_iterator ei; + + FOR_EACH_EDGE (e, ei, bb->preds) + { + basic_block pred = e->src; + q.safe_push (pred); + } + } + + sbitmap_free (visited); +} + +/* Check that X has a single def. Also look through unary operations if + we take register pressure into consideration. */ static bool reg_single_def_p (rtx x) { + if (UNARY_P (x) && flag_ira_hoist_pressure) + { + x = XEXP (x, 0); + + if (SUBREG_P (x) && !contains_paradoxical_subreg_p (x)) + x = SUBREG_REG (x); + } + return REG_P (x) && crtl->ssa->single_dominating_def (REGNO (x)); } @@ -490,6 +718,53 @@ try_fwprop_subst_pattern (obstack_watermark &attempt, insn_change &use_change, } } + if (ok && UNARY_P (src)) + { + /* Currently the register classes can only be different with a + unary operation like vec_duplicate. + In that case the propagation can increase the register pressure + on the way to the target BB. + Therefore, we first check if the pressure of the respective register + is already high, i.e. equals or exceeds the number of hardregs for + that class. If not, we allow the propagation and update the + pressure. */ + + rtx src1 = src; + + /* Strip unary operation and possible subreg. */ + src1 = XEXP (src, 0); + if (GET_CODE (src1) == SUBREG) + src1 = SUBREG_REG (src1); + + rtx_insn *def_rtl = def_insn->rtl (); + + if (REG_P (dest) && REG_P (src1)) + { + if (!bitmap_bit_p (pressure_accounted, REGNO (src1))) + { + if (!prop.register_pressure_high_p (src1, dest, def_rtl, use_rtl)) + { + prop.update_register_pressure (src1, dest, def_rtl, use_rtl); + bitmap_set_bit (pressure_accounted, REGNO (src1)); + } + else + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "cannot propagate from insn %d into" + " insn %d: %s\n", def_insn->uid (), + use_insn->uid (), + "would increase register pressure beyond number" + " of hard regs"); + ok = false; + } + } + else + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "pressure for reg %d has already" + " been accounted\n", REGNO (src1)); + } + } + if (!ok) { /* The pattern didn't match, but if all uses of SRC folded to @@ -859,7 +1134,7 @@ forward_propagate_into (use_info *use, bool reg_prop_only = false) if ((reg_prop_only || (def_loop != use_loop && !flow_loop_nested_p (use_loop, def_loop))) - && (!reg_single_def_p (dest) || !reg_single_def_p (src))) + && !reg_single_def_p (dest)) return false; /* Don't substitute into a non-local goto, this confuses CFG. */ @@ -890,6 +1165,14 @@ fwprop_init (void) df_analyze (); crtl->ssa = new rtl_ssa::function_info (cfun); + + /* Calculate register pressure for each basic block. */ + if (flag_ira_hoist_pressure) + { + regpressure_init (); + + pressure_accounted = sbitmap_alloc (DF_REG_SIZE (df)); + } } static void @@ -910,6 +1193,12 @@ fwprop_done (void) fprintf (dump_file, "\nNumber of successful forward propagations: %d\n\n", num_changes); + + if (flag_ira_hoist_pressure) + { + regpressure_cleanup (); + sbitmap_free (pressure_accounted); + } } /* Try to optimize INSN, returning true if something changes. @@ -919,6 +1208,10 @@ fwprop_done (void) static bool fwprop_insn (insn_info *insn, bool fwprop_addr_p) { + /* TODO: when rejecting propagations due to register pressure + we would actually like to try the propagation with most + potential = uses first instead of going through them + sequentially. */ for (use_info *use : insn->uses ()) { if (use->is_mem ())