From patchwork Sun Nov 12 12:08:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 164246
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH V3 2/7] ira: Switch to live_subreg data
Date: Sun, 12 Nov 2023 20:08:12 +0800
Message-Id: <20231112120817.2635864-3-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
In-Reply-To: <20231112120817.2635864-1-lehua.ding@rivai.ai>
References: <20231112120817.2635864-1-lehua.ding@rivai.ai>

This patch switches the use of live_reg data to live_subreg data.

gcc/ChangeLog:

        * ira-build.cc (create_bb_allocnos): Switch.
        (create_loop_allocnos): Ditto.
        * ira-color.cc (ira_loop_edge_freq): Ditto.
        * ira-emit.cc (generate_edge_moves): Ditto.
        (add_ranges_and_copies): Ditto.
        * ira-lives.cc (process_out_of_region_eh_regs): Ditto.
        (add_conflict_from_region_landing_pads): Ditto.
        (process_bb_node_lives): Ditto.
        * ira.cc (find_moveable_pseudos): Ditto.
        (interesting_dest_for_shprep_1): Ditto.
        (allocate_initial_values): Ditto.
        (ira): Ditto.

---
 gcc/ira-build.cc |  7 ++++---
 gcc/ira-color.cc |  8 ++++----
 gcc/ira-emit.cc  | 12 ++++++------
 gcc/ira-lives.cc |  7 ++++---
 gcc/ira.cc       | 16 +++++++++-------
 5 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
index 93e46033170..f931c6e304c 100644
--- a/gcc/ira-build.cc
+++ b/gcc/ira-build.cc
@@ -1919,7 +1919,8 @@ create_bb_allocnos (ira_loop_tree_node_t bb_node)
       create_insn_allocnos (PATTERN (insn), NULL, false);
   /* It might be a allocno living through from one subloop to
      another.  */
-  EXECUTE_IF_SET_IN_REG_SET (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, i, bi)
+  EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_IN (bb), FIRST_PSEUDO_REGISTER,
+                             i, bi)
     if (ira_curr_regno_allocno_map[i] == NULL)
       ira_create_allocno (i, false, ira_curr_loop_tree_node);
 }
@@ -1935,9 +1936,9 @@ create_loop_allocnos (edge e)
   bitmap_iterator bi;
   ira_loop_tree_node_t parent;
 
-  live_in_regs = df_get_live_in (e->dest);
+  live_in_regs = DF_LIVE_SUBREG_IN (e->dest);
   border_allocnos = ira_curr_loop_tree_node->border_allocnos;
-  EXECUTE_IF_SET_IN_REG_SET (df_get_live_out (e->src),
+  EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_OUT (e->src),
                              FIRST_PSEUDO_REGISTER, i, bi)
     if (bitmap_bit_p (live_in_regs, i))
       {
diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
index f2e8ea34152..4aa3e316282 100644
--- a/gcc/ira-color.cc
+++ b/gcc/ira-color.cc
@@ -2783,8 +2783,8 @@ ira_loop_edge_freq (ira_loop_tree_node_t loop_node, int regno, bool exit_p)
       FOR_EACH_EDGE (e, ei, loop_node->loop->header->preds)
         if (e->src != loop_node->loop->latch
             && (regno < 0
-                || (bitmap_bit_p (df_get_live_out (e->src), regno)
-                    && bitmap_bit_p (df_get_live_in (e->dest), regno))))
+                || (bitmap_bit_p (DF_LIVE_SUBREG_OUT (e->src), regno)
+                    && bitmap_bit_p (DF_LIVE_SUBREG_IN (e->dest), regno))))
           freq += EDGE_FREQUENCY (e);
     }
   else
@@ -2792,8 +2792,8 @@ ira_loop_edge_freq (ira_loop_tree_node_t loop_node, int regno, bool exit_p)
       auto_vec<edge> edges = get_loop_exit_edges (loop_node->loop);
       FOR_EACH_VEC_ELT (edges, i, e)
         if (regno < 0
-            || (bitmap_bit_p (df_get_live_out (e->src), regno)
-                && bitmap_bit_p (df_get_live_in (e->dest), regno)))
+            || (bitmap_bit_p (DF_LIVE_SUBREG_OUT (e->src), regno)
+                && bitmap_bit_p (DF_LIVE_SUBREG_IN (e->dest), regno)))
           freq += EDGE_FREQUENCY (e);
     }
 
diff --git a/gcc/ira-emit.cc b/gcc/ira-emit.cc
index bcc4f09f7c4..84ed482e568 100644
--- a/gcc/ira-emit.cc
+++ b/gcc/ira-emit.cc
@@ -510,8 +510,8 @@ generate_edge_moves (edge e)
     return;
   src_map = src_loop_node->regno_allocno_map;
   dest_map = dest_loop_node->regno_allocno_map;
-  regs_live_in_dest = df_get_live_in (e->dest);
-  regs_live_out_src = df_get_live_out (e->src);
+  regs_live_in_dest = DF_LIVE_SUBREG_IN (e->dest);
+  regs_live_out_src = DF_LIVE_SUBREG_OUT (e->src);
   EXECUTE_IF_SET_IN_REG_SET (regs_live_in_dest,
                              FIRST_PSEUDO_REGISTER, regno, bi)
     if (bitmap_bit_p (regs_live_out_src, regno))
@@ -1229,16 +1229,16 @@ add_ranges_and_copies (void)
          destination block) to use for searching allocnos by their
          regnos because of subsequent IR flattening.  */
       node = IRA_BB_NODE (bb)->parent;
-      bitmap_copy (live_through, df_get_live_in (bb));
+      bitmap_copy (live_through, DF_LIVE_SUBREG_IN (bb));
       add_range_and_copies_from_move_list
         (at_bb_start[bb->index], node, live_through, REG_FREQ_FROM_BB (bb));
-      bitmap_copy (live_through, df_get_live_out (bb));
+      bitmap_copy (live_through, DF_LIVE_SUBREG_OUT (bb));
       add_range_and_copies_from_move_list
         (at_bb_end[bb->index], node, live_through, REG_FREQ_FROM_BB (bb));
       FOR_EACH_EDGE (e, ei, bb->succs)
         {
-          bitmap_and (live_through,
-                      df_get_live_in (e->dest), df_get_live_out (bb));
+          bitmap_and (live_through, DF_LIVE_SUBREG_IN (e->dest),
+                      DF_LIVE_SUBREG_OUT (bb));
           add_range_and_copies_from_move_list
             ((move_t) e->aux, node, live_through,
              REG_FREQ_FROM_EDGE_FREQ (EDGE_FREQUENCY (e)));
diff --git a/gcc/ira-lives.cc b/gcc/ira-lives.cc
index 81af5c06460..05e2be12a26 100644
--- a/gcc/ira-lives.cc
+++ b/gcc/ira-lives.cc
@@ -1194,7 +1194,8 @@ process_out_of_region_eh_regs (basic_block bb)
   if (! eh_p)
     return;
 
-  EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), FIRST_PSEUDO_REGISTER, i, bi)
+  EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_OUT (bb), FIRST_PSEUDO_REGISTER, i,
+                            bi)
     {
       ira_allocno_t a = ira_curr_regno_allocno_map[i];
       for (int n = ALLOCNO_NUM_OBJECTS (a) - 1; n >= 0; n--)
@@ -1228,7 +1229,7 @@ add_conflict_from_region_landing_pads (eh_region region, ira_object_t obj,
       if ((landing_label = lp->landing_pad) != NULL
           && (landing_bb = BLOCK_FOR_INSN (landing_label)) != NULL
           && (region->type != ERT_CLEANUP
-              || bitmap_bit_p (df_get_live_in (landing_bb),
+              || bitmap_bit_p (DF_LIVE_SUBREG_IN (landing_bb),
                                ALLOCNO_REGNO (a))))
         {
           HARD_REG_SET new_conflict_regs
@@ -1265,7 +1266,7 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)
           high_pressure_start_point[ira_pressure_classes[i]] = -1;
         }
       curr_bb_node = loop_tree_node;
-      reg_live_out = df_get_live_out (bb);
+      reg_live_out = DF_LIVE_SUBREG_OUT (bb);
       sparseset_clear (objects_live);
       REG_SET_TO_HARD_REG_SET (hard_regs_live, reg_live_out);
       hard_regs_live &= ~(eliminable_regset | ira_no_alloc_regs);
diff --git a/gcc/ira.cc b/gcc/ira.cc
index d7530f01380..c7f27b17002 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -4735,8 +4735,8 @@ find_moveable_pseudos (void)
       bitmap_initialize (local, 0);
       bitmap_initialize (transp, 0);
       bitmap_initialize (moveable, 0);
-      bitmap_copy (live, df_get_live_out (bb));
-      bitmap_and_into (live, df_get_live_in (bb));
+      bitmap_copy (live, DF_LIVE_SUBREG_OUT (bb));
+      bitmap_and_into (live, DF_LIVE_SUBREG_IN (bb));
       bitmap_copy (transp, live);
       bitmap_clear (moveable);
       bitmap_clear (live);
@@ -5036,7 +5036,8 @@ interesting_dest_for_shprep_1 (rtx set, basic_block call_dom)
   rtx dest = SET_DEST (set);
   if (!REG_P (src) || !HARD_REGISTER_P (src)
       || !REG_P (dest) || HARD_REGISTER_P (dest)
-      || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
+      || (call_dom
+          && !bitmap_bit_p (DF_LIVE_SUBREG_IN (call_dom), REGNO (dest))))
     return NULL;
   return dest;
 }
@@ -5514,10 +5515,10 @@ allocate_initial_values (void)
           /* Update global register liveness information.  */
           FOR_EACH_BB_FN (bb, cfun)
             {
-              if (REGNO_REG_SET_P (df_get_live_in (bb), regno))
-                SET_REGNO_REG_SET (df_get_live_in (bb), new_regno);
-              if (REGNO_REG_SET_P (df_get_live_out (bb), regno))
-                SET_REGNO_REG_SET (df_get_live_out (bb), new_regno);
+              if (REGNO_REG_SET_P (DF_LIVE_SUBREG_IN (bb), regno))
+                SET_REGNO_REG_SET (DF_LIVE_SUBREG_IN (bb), new_regno);
+              if (REGNO_REG_SET_P (DF_LIVE_SUBREG_OUT (bb), regno))
+                SET_REGNO_REG_SET (DF_LIVE_SUBREG_OUT (bb), new_regno);
             }
         }
     }
@@ -5679,6 +5680,7 @@ ira (FILE *f)
   if (optimize > 1)
     df_remove_problem (df_live);
   gcc_checking_assert (df_live == NULL);
+  df_live_subreg_add_problem ();
 
   if (flag_checking)
     df->changeable_flags |= DF_VERIFY_SCHEDULED;

From patchwork Sun Nov 12 12:08:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 164249
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH V3 3/7] ira: Support subreg live range track
Date: Sun, 12 Nov 2023 20:08:13 +0800
Message-Id: <20231112120817.2635864-4-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
In-Reply-To: <20231112120817.2635864-1-lehua.ding@rivai.ai>
References: <20231112120817.2635864-1-lehua.ding@rivai.ai>

This patch adds support for tracking subreg liveness. It first changes
ira_object_t objects[2] to std::vector<ira_object_t> objects, which can
hold any number of objects. The vector records every subreg access in
the program, as well as the partial_in and partial_out sets of each
basic block's live in/out. It then changes how conflicts between
registers are detected: when object A conflicts with object B, the
offset and size of each object relative to the allocno it belongs to
are taken into account when computing the conflicting registers of the
two allocnos.

gcc/ChangeLog:

        * hard-reg-set.h (struct HARD_REG_SET): New shift operator.
        * ira-build.cc (ira_create_object): Adjust.
        (find_object): New.
        (find_object_anyway): New.
        (ira_create_allocno): Adjust.
        (get_range): New.
        (ira_copy_allocno_objects): New.
        (merge_hard_reg_conflicts): Adjust copy.
        (create_cap_allocno): Adjust.
        (find_subreg_p): New.
        (add_subregs): New.
        (create_insn_allocnos): Collect subreg.
        (create_bb_allocnos): Ditto.
        (move_allocno_live_ranges): Adjust.
        (copy_allocno_live_ranges): Adjust.
        (setup_min_max_allocno_live_range_point): Adjust.
        * ira-color.cc (INCLUDE_MAP): include map.
        (setup_left_conflict_sizes_p): Adjust conflict size.
        (setup_profitable_hard_regs): Adjust.
        (get_conflict_and_start_profitable_regs): Adjust.
        (check_hard_reg_p): Adjust conflict check.
        (assign_hard_reg): Adjust.
        (push_allocno_to_stack): Adjust conflict size.
        (improve_allocation): Adjust.
        * ira-conflicts.cc (record_object_conflict): Simplify.
        (build_object_conflicts): Adjust.
        (build_conflicts): Adjust.
        (print_allocno_conflicts): Adjust.
        * ira-emit.cc (modify_move_list): Adjust.
        * ira-int.h (struct ira_object): Adjust struct.
        (struct ira_allocno): Adjust struct.
        (ALLOCNO_NUM_OBJECTS): New accessor.
        (ALLOCNO_UNIT_SIZE): Ditto.
        (ALLOCNO_TRACK_SUBREG_P): Ditto.
        (ALLOCNO_NREGS): Ditto.
        (OBJECT_SUBWORD): Ditto.
        (OBJECT_INDEX): Ditto.
        (OBJECT_START): Ditto.
        (OBJECT_NREGS): Ditto.
        (find_object): Exported.
        (find_object_anyway): Ditto.
        (ira_copy_allocno_objects): Ditto.
        (has_subreg_object_p): Ditto.
        (get_full_object): Ditto.
        * ira-lives.cc (INCLUDE_VECTOR): Include vector.
        (add_onflict_hard_regs): New.
        (add_onflict_hard_reg): New.
        (make_hard_regno_dead): Adjust.
        (make_object_live): Adjust.
        (update_allocno_pressure_excess_length): Adjust.
        (make_object_dead): Adjust.
        (mark_pseudo_regno_live): Adjust.
        (add_subreg_point): New.
        (mark_pseudo_object_live): Adjust.
        (mark_pseudo_regno_subword_live): Adjust.
        (mark_pseudo_regno_subreg_live): Adjust.
        (mark_pseudo_regno_subregs_live): Adjust.
        (mark_pseudo_reg_live): Adjust.
        (mark_pseudo_regno_dead): Adjust.
        (mark_pseudo_object_dead): Adjust.
        (mark_pseudo_regno_subword_dead): Adjust.
        (mark_pseudo_regno_subreg_dead): Adjust.
        (mark_pseudo_reg_dead): Adjust.
        (process_single_reg_class_operands): Adjust.
        (process_out_of_region_eh_regs): Adjust.
        (add_conflict_from_region_landing_pads): Adjust.
        (process_bb_node_lives): Adjust.
        (class subreg_live_item): New class.
        (create_subregs_live_ranges): New function.
        (ira_create_allocno_live_ranges): Adjust.
        * ira.cc (check_allocation): Adjust.

---
 gcc/hard-reg-set.h   |  33 +++
 gcc/ira-build.cc     | 235 +++++++++++++++++---
 gcc/ira-color.cc     | 302 +++++++++++++++++---------
 gcc/ira-conflicts.cc |  48 ++---
 gcc/ira-emit.cc      |   2 +-
 gcc/ira-int.h        |  57 ++++-
 gcc/ira-lives.cc     | 500 ++++++++++++++++++++++++++++++++-----------
 gcc/ira.cc           |  52 ++---
 8 files changed, 907 insertions(+), 322 deletions(-)

diff --git a/gcc/hard-reg-set.h b/gcc/hard-reg-set.h
index b0bb9bce074..760eadba186 100644
--- a/gcc/hard-reg-set.h
+++ b/gcc/hard-reg-set.h
@@ -113,6 +113,39 @@ struct HARD_REG_SET
     return !operator== (other);
   }
 
+  HARD_REG_SET
+  operator>> (unsigned int shift_amount) const
+  {
+    if (shift_amount == 0)
+      return *this;
+
+    HARD_REG_SET res;
+    unsigned int total_bits = sizeof (HARD_REG_ELT_TYPE) * 8;
+    if (shift_amount >= total_bits)
+      {
+        unsigned int n_elt = shift_amount % total_bits;
+        shift_amount -= n_elt * total_bits;
+        for (unsigned int i = 0; i < ARRAY_SIZE (elts) - n_elt - 1; i += 1)
+          res.elts[i] = elts[i + n_elt];
+        /* clear upper n_elt elements.  */
+        for (unsigned int i = 0; i < n_elt; i += 1)
+          res.elts[ARRAY_SIZE (elts) - 1 - i] = 0;
+      }
+
+    if (shift_amount > 0)
+      {
+        /* The left bits of an element be shifted.  */
+        HARD_REG_ELT_TYPE left = 0;
+        /* Total bits of an element.  */
+        for (int i = ARRAY_SIZE (elts) - 1; i >= 0; --i)
+          {
+            res.elts[i] = (elts[i] >> shift_amount) | left;
+            left = elts[i] << (total_bits - shift_amount);
+          }
+      }
+    return res;
+  }
+
   HARD_REG_ELT_TYPE elts[HARD_REG_SET_LONGS];
 };
 typedef const HARD_REG_SET &const_hard_reg_set;
diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
index f931c6e304c..a32693e69e4 100644
--- a/gcc/ira-build.cc
+++ b/gcc/ira-build.cc
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "insn-config.h"
 #include "regs.h"
 #include "memmodel.h"
+#include "tm_p.h"
 #include "ira.h"
 #include "ira-int.h"
 #include "sparseset.h"
 #include "cfgloop.h"
+#include "subreg-live-range.h"
 
 static ira_copy_t find_allocno_copy (ira_allocno_t, ira_allocno_t, rtx_insn *,
                                      ira_loop_tree_node_t);
@@ -442,13 +444,12 @@ initiate_allocnos (void)
 
 /* Create and return an object corresponding to a new allocno A.  */
 static ira_object_t
-ira_create_object (ira_allocno_t a, int subword)
+ira_create_object (ira_allocno_t a, int start, int nregs)
 {
   enum reg_class aclass = ALLOCNO_CLASS (a);
   ira_object_t obj = object_pool.allocate ();
 
   OBJECT_ALLOCNO (obj) = a;
-  OBJECT_SUBWORD (obj) = subword;
   OBJECT_CONFLICT_ID (obj) = ira_objects_num;
   OBJECT_CONFLICT_VEC_P (obj) = false;
   OBJECT_CONFLICT_ARRAY (obj) = NULL;
@@ -460,12 +461,75 @@ ira_create_object (ira_allocno_t a, int subword)
   OBJECT_MIN (obj) = INT_MAX;
   OBJECT_MAX (obj) = -1;
   OBJECT_LIVE_RANGES (obj) = NULL;
+  OBJECT_START (obj) = start;
+  OBJECT_NREGS (obj) = nregs;
+  OBJECT_INDEX (obj) = ALLOCNO_NUM_OBJECTS (a);
 
   ira_object_id_map_vec.safe_push (obj);
   ira_object_id_map
     = ira_object_id_map_vec.address ();
   ira_objects_num = ira_object_id_map_vec.length ();
 
+  a->objects.push_back (obj);
+
+  return obj;
+}
+
+/* Return the object in allocno A which match START & NREGS.  */
+ira_object_t
+find_object (ira_allocno_t a, int start, int nregs)
+{
+  for (ira_object_t obj : a->objects)
+    {
+      if (OBJECT_START (obj) == start && OBJECT_NREGS (obj) == nregs)
+        return obj;
+    }
+  return NULL;
+}
+
+ira_object_t
+find_object (ira_allocno_t a, poly_int64 offset, poly_int64 size)
+{
+  enum reg_class aclass = ALLOCNO_CLASS (a);
+  machine_mode mode = ALLOCNO_MODE (a);
+  int nregs = ira_reg_class_max_nregs[aclass][mode];
+
+  if (!has_subreg_object_p (a)
+      || maybe_eq (GET_MODE_SIZE (ALLOCNO_MODE (a)), size))
+    return find_object (a, 0, nregs);
+
+  gcc_assert (maybe_lt (size, GET_MODE_SIZE (ALLOCNO_MODE (a)))
+              && maybe_le (offset + size, GET_MODE_SIZE (ALLOCNO_MODE (a))));
+
+  int subreg_start = -1;
+  int subreg_nregs = -1;
+  for (int i = 0; i < nregs; i += 1)
+    {
+      poly_int64 right = ALLOCNO_UNIT_SIZE (a) * (i + 1);
+      if (subreg_start < 0 && maybe_lt (offset, right))
+        {
+          subreg_start = i;
+        }
+      if (subreg_nregs < 0 && maybe_le (offset + size, right))
+        {
+          subreg_nregs = i + 1 - subreg_start;
+          break;
+        }
+    }
+  gcc_assert (subreg_start >= 0 && subreg_nregs > 0);
+  return find_object (a, subreg_start, subreg_nregs);
+}
+
+/* Return the object in allocno A which match START & NREGS.  Create when not
+   found.  */
+ira_object_t
+find_object_anyway (ira_allocno_t a, int start, int nregs)
+{
+  ira_object_t obj = find_object (a, start, nregs);
+  if (obj == NULL && ALLOCNO_TRACK_SUBREG_P (a))
+    obj = ira_create_object (a, start, nregs);
+
+  gcc_assert (obj != NULL);
   return obj;
 }
 
@@ -525,7 +589,8 @@ ira_create_allocno (int regno, bool cap_p,
   ALLOCNO_MEMORY_COST (a) = 0;
   ALLOCNO_UPDATED_MEMORY_COST (a) = 0;
   ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) = 0;
-  ALLOCNO_NUM_OBJECTS (a) = 0;
+  ALLOCNO_UNIT_SIZE (a) = 0;
+  ALLOCNO_TRACK_SUBREG_P (a) = false;
 
   ALLOCNO_ADD_DATA (a) = NULL;
   allocno_vec.safe_push (a);
@@ -549,6 +614,51 @@ ira_set_allocno_class (ira_allocno_t a, enum reg_class aclass)
       OBJECT_CONFLICT_HARD_REGS (obj) |= ~reg_class_contents[aclass];
       OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= ~reg_class_contents[aclass];
     }
+
+  if (aclass == NO_REGS)
+    return;
+  /* SET the unit_size of one register.  */
+  machine_mode mode = ALLOCNO_MODE (a);
+  int nregs = ira_reg_class_max_nregs[aclass][mode];
+  if (nregs == 2 && maybe_eq (GET_MODE_SIZE (mode), nregs * UNITS_PER_WORD))
+    {
+      ALLOCNO_UNIT_SIZE (a) = UNITS_PER_WORD;
+      ALLOCNO_TRACK_SUBREG_P (a) = true;
+      return;
+    }
+}
+
+/* Return the subreg range of rtx SUBREG.  */
+static subreg_range
+get_range (rtx subreg)
+{
+  gcc_assert (read_modify_subreg_p (subreg));
+  rtx reg = SUBREG_REG (subreg);
+  machine_mode reg_mode = GET_MODE (reg);
+
+  machine_mode subreg_mode = GET_MODE (subreg);
+  int nblocks = get_nblocks (reg_mode);
+  poly_int64 unit_size = REGMODE_NATURAL_SIZE (reg_mode);
+
+  poly_int64 offset = SUBREG_BYTE (subreg);
+  poly_int64 left = offset + GET_MODE_SIZE (subreg_mode);
+
+  int subreg_start = -1;
+  int subreg_nblocks = -1;
+  for (int i = 0; i < nblocks; i += 1)
+    {
+      poly_int64 right = unit_size * (i + 1);
+      if (subreg_start < 0 && maybe_lt (offset, right))
+        subreg_start = i;
+      if (subreg_nblocks < 0 && maybe_le (left, right))
+        {
+          subreg_nblocks = i + 1 - subreg_start;
+          break;
+        }
+    }
+  gcc_assert (subreg_start >= 0 && subreg_nblocks > 0);
+
+  return subreg_range (subreg_start, subreg_start + subreg_nblocks);
 }
 
 /* Determine the number of objects we should associate with allocno A
@@ -558,15 +668,37 @@ ira_create_allocno_objects (ira_allocno_t a)
 {
   machine_mode mode = ALLOCNO_MODE (a);
   enum reg_class aclass = ALLOCNO_CLASS (a);
-  int n = ira_reg_class_max_nregs[aclass][mode];
-  int i;
+  int nregs = ira_reg_class_max_nregs[aclass][mode];
 
-  if (n != 2 || maybe_ne (GET_MODE_SIZE (mode), n * UNITS_PER_WORD))
-    n = 1;
+  ira_create_object (a, 0, nregs);
 
-  ALLOCNO_NUM_OBJECTS (a) = n;
-  for (i = 0; i < n; i++)
-    ALLOCNO_OBJECT (a, i) = ira_create_object (a, i);
+  if (aclass == NO_REGS || !ALLOCNO_TRACK_SUBREG_P (a) || a->subregs.empty ())
+    return;
+
+  int nblocks = get_nblocks (ALLOCNO_MODE (a));
+  int times = nblocks / ALLOCNO_NREGS (a);
+  gcc_assert (times >= 1 && nblocks % ALLOCNO_NREGS (a) == 0);
+
+  for (const auto &range : a->subregs)
+    {
+      int start = range.start / times;
+      int end = CEIL (range.end, times);
+      if (find_object (a, start, end - start) != NULL)
+        continue;
+      ira_create_object (a, start, end - start);
+    }
+
+  a->subregs.clear ();
+}
+
+/* Copy the objects from FROM to TO.  */
+void
+ira_copy_allocno_objects (ira_allocno_t to, ira_allocno_t from)
+{
+  ira_allocno_object_iterator oi;
+  ira_object_t obj;
+  FOR_EACH_ALLOCNO_OBJECT (from, obj, oi)
+    ira_create_object (to, OBJECT_START (obj), OBJECT_NREGS (obj));
 }
 
 /* For each allocno, set ALLOCNO_NUM_OBJECTS and create the
@@ -590,11 +722,11 @@ merge_hard_reg_conflicts (ira_allocno_t from, ira_allocno_t to,
                           bool total_only)
 {
   int i;
-  gcc_assert (ALLOCNO_NUM_OBJECTS (to) == ALLOCNO_NUM_OBJECTS (from));
-  for (i = 0; i < ALLOCNO_NUM_OBJECTS (to); i++)
+  for (i = 0; i < ALLOCNO_NUM_OBJECTS (from); i++)
     {
       ira_object_t from_obj = ALLOCNO_OBJECT (from, i);
-      ira_object_t to_obj = ALLOCNO_OBJECT (to, i);
+      ira_object_t to_obj = find_object_anyway (to, OBJECT_START (from_obj),
+                                                OBJECT_NREGS (from_obj));
 
       if (!total_only)
         OBJECT_CONFLICT_HARD_REGS (to_obj)
@@ -888,7 +1020,7 @@ create_cap_allocno (ira_allocno_t a)
   ALLOCNO_WMODE (cap) = ALLOCNO_WMODE (a);
   aclass = ALLOCNO_CLASS (a);
   ira_set_allocno_class (cap, aclass);
-  ira_create_allocno_objects (cap);
+  ira_copy_allocno_objects (cap, a);
   ALLOCNO_CAP_MEMBER (cap) = a;
   ALLOCNO_CAP (a) = cap;
   ALLOCNO_CLASS_COST (cap) = ALLOCNO_CLASS_COST (a);
@@ -1830,6 +1962,26 @@ ira_traverse_loop_tree (bool bb_p, ira_loop_tree_node_t loop_node,
 /* The basic block currently being processed.  */
 static basic_block curr_bb;
 
+/* Return true if A's subregs has a subreg with same SIZE and OFFSET.  */
+static bool
+find_subreg_p (ira_allocno_t a, const subreg_range &r)
+{
+  for (const auto &item : a->subregs)
+    if (item.start == r.start && item.end == r.end)
+      return true;
+  return false;
+}
+
+/* Return start and nregs subregs from DF_LIVE_SUBREG.  */
+static void
+add_subregs (ira_allocno_t a, const subreg_ranges &sr)
+{
+  gcc_assert (get_nblocks (ALLOCNO_MODE (a)) == (unsigned) sr.max);
+  for (const subreg_range &r : sr.ranges)
+    if (!find_subreg_p (a, r))
+      a->subregs.push_back (r);
+}
+
 /* This recursive function creates allocnos corresponding to
    pseudo-registers containing in X.  True OUTPUT_P means that X is
    an lvalue.  OUTER corresponds to the parent expression of X.  */
@@ -1859,6 +2011,14 @@ create_insn_allocnos (rtx x, rtx outer, bool output_p)
                 }
             }
 
+          /* Collect subreg reference.  */
+          if (outer != NULL && read_modify_subreg_p (outer))
+            {
+              const subreg_range r = get_range (outer);
+              if (!find_subreg_p (a, r))
+                a->subregs.push_back (r);
+            }
+
           ALLOCNO_NREFS (a)++;
           ALLOCNO_FREQ (a) += REG_FREQ_FROM_BB (curr_bb);
           if (output_p)
@@ -1919,10 +2079,28 @@ create_bb_allocnos (ira_loop_tree_node_t bb_node)
       create_insn_allocnos (PATTERN (insn), NULL, false);
   /* It might be a allocno living through from one subloop to
      another.  */
-  EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_IN (bb), FIRST_PSEUDO_REGISTER,
+  EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_FULL_IN (bb), FIRST_PSEUDO_REGISTER,
                              i, bi)
     if (ira_curr_regno_allocno_map[i] == NULL)
       ira_create_allocno (i, false, ira_curr_loop_tree_node);
+
+  EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_PARTIAL_IN (bb),
+                             FIRST_PSEUDO_REGISTER, i, bi)
+    {
+      if (ira_curr_regno_allocno_map[i] == NULL)
+        ira_create_allocno (i, false, ira_curr_loop_tree_node);
+      add_subregs (ira_curr_regno_allocno_map[i],
+                   DF_LIVE_SUBREG_RANGE_IN (bb)->lives.at (i));
+    }
+
+  EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_PARTIAL_OUT (bb),
+                             FIRST_PSEUDO_REGISTER, i, bi)
+    {
+      if (ira_curr_regno_allocno_map[i] == NULL)
+        ira_create_allocno (i, false, ira_curr_loop_tree_node);
+      add_subregs (ira_curr_regno_allocno_map[i],
+                   DF_LIVE_SUBREG_RANGE_OUT (bb)->lives.at (i));
+    }
 }
 
 /* Create allocnos corresponding to pseudo-registers living on edge E
@@ -2137,20 +2315,20 @@ move_allocno_live_ranges (ira_allocno_t from, ira_allocno_t to)
   int i;
   int n = ALLOCNO_NUM_OBJECTS (from);
 
-  gcc_assert (n == ALLOCNO_NUM_OBJECTS (to));
-
   for (i = 0; i < n; i++)
     {
       ira_object_t from_obj = ALLOCNO_OBJECT (from, i);
-      ira_object_t to_obj = ALLOCNO_OBJECT (to, i);
+      ira_object_t to_obj = find_object_anyway (to, OBJECT_START (from_obj),
+                                                OBJECT_NREGS (from_obj));
       live_range_t lr = OBJECT_LIVE_RANGES (from_obj);
 
       if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL)
         {
           fprintf (ira_dump_file,
-                   "      Moving ranges of a%dr%d to a%dr%d: ",
+                   "      Moving ranges of a%dr%d_obj%d to a%dr%d_obj%d: ",
                    ALLOCNO_NUM (from), ALLOCNO_REGNO (from),
-                   ALLOCNO_NUM (to), ALLOCNO_REGNO (to));
+                   OBJECT_INDEX (from_obj), ALLOCNO_NUM (to),
+                   ALLOCNO_REGNO (to), OBJECT_INDEX (to_obj));
           ira_print_live_range_list (ira_dump_file, lr);
         }
       change_object_in_range_list (lr, to_obj);
@@ -2166,12 +2344,11 @@ copy_allocno_live_ranges (ira_allocno_t from, ira_allocno_t to)
   int i;
   int n = ALLOCNO_NUM_OBJECTS (from);
 
-  gcc_assert (n == ALLOCNO_NUM_OBJECTS (to));
-
   for (i = 0; i < n; i++)
     {
       ira_object_t from_obj = ALLOCNO_OBJECT (from, i);
-      ira_object_t to_obj = ALLOCNO_OBJECT (to, i);
+      ira_object_t to_obj = find_object_anyway (to, OBJECT_START (from_obj),
+                                                OBJECT_NREGS (from_obj));
       live_range_t lr = OBJECT_LIVE_RANGES (from_obj);
 
       if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL)
@@ -2783,15 +2960,17 @@ setup_min_max_allocno_live_range_point (void)
               ira_assert (OBJECT_LIVE_RANGES (obj) == NULL);
               OBJECT_MAX (obj) = 0;
               OBJECT_MIN (obj) = 1;
-              continue;
             }
           ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL);
           /* Accumulation of range info.
*/ if (ALLOCNO_CAP (a) != NULL) { - for (cap = ALLOCNO_CAP (a); cap != NULL; cap = ALLOCNO_CAP (cap)) + for (cap = ALLOCNO_CAP (a); cap != NULL; + cap = ALLOCNO_CAP (cap)) { - ira_object_t cap_obj = ALLOCNO_OBJECT (cap, j); + ira_object_t cap_obj = find_object (cap, OBJECT_START (obj), + OBJECT_NREGS (obj)); + gcc_assert (cap_obj != NULL); if (OBJECT_MAX (cap_obj) < OBJECT_MAX (obj)) OBJECT_MAX (cap_obj) = OBJECT_MAX (obj); if (OBJECT_MIN (cap_obj) > OBJECT_MIN (obj)) @@ -2802,7 +2981,9 @@ setup_min_max_allocno_live_range_point (void) if ((parent = ALLOCNO_LOOP_TREE_NODE (a)->parent) == NULL) continue; parent_a = parent->regno_allocno_map[i]; - parent_obj = ALLOCNO_OBJECT (parent_a, j); + parent_obj + = find_object (parent_a, OBJECT_START (obj), OBJECT_NREGS (obj)); + gcc_assert (parent_obj != NULL); if (OBJECT_MAX (parent_obj) < OBJECT_MAX (obj)) OBJECT_MAX (parent_obj) = OBJECT_MAX (obj); if (OBJECT_MIN (parent_obj) > OBJECT_MIN (obj)) diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc index 4aa3e316282..8aed25144b9 100644 --- a/gcc/ira-color.cc +++ b/gcc/ira-color.cc @@ -19,6 +19,7 @@ along with GCC; see the file COPYING3. If not see . */ #include "config.h" +#define INCLUDE_MAP #include "system.h" #include "coretypes.h" #include "backend.h" @@ -852,18 +853,17 @@ setup_left_conflict_sizes_p (ira_allocno_t a) node_preorder_num = node->preorder_num; node_set = node->hard_regs->set; node_check_tick++; + /* Collect conflict objects. 
*/ + std::map<int, bitmap> allocno_conflict_regs; for (k = 0; k < nobj; k++) { ira_object_t obj = ALLOCNO_OBJECT (a, k); ira_object_t conflict_obj; ira_object_conflict_iterator oci; - + FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { - int size; - ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); - allocno_hard_regs_node_t conflict_node, temp_node; - HARD_REG_SET conflict_node_set; + ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); allocno_color_data_t conflict_data; conflict_data = ALLOCNO_COLOR_DATA (conflict_a); @@ -872,6 +872,24 @@ setup_left_conflict_sizes_p (ira_allocno_t a) conflict_data ->profitable_hard_regs)) continue; + int num = ALLOCNO_NUM (conflict_a); + if (allocno_conflict_regs.count (num) == 0) + allocno_conflict_regs.insert ({num, ira_allocate_bitmap ()}); + bitmap_head temp; + bitmap_initialize (&temp, &reg_obstack); + bitmap_set_range (&temp, OBJECT_START (conflict_obj), + OBJECT_NREGS (conflict_obj)); + bitmap_and_compl_into (&temp, allocno_conflict_regs.at (num)); + int size = bitmap_count_bits (&temp); + bitmap_clear (&temp); + if (size == 0) + continue; + + bitmap_set_range (allocno_conflict_regs.at (num), + OBJECT_START (conflict_obj), + OBJECT_NREGS (conflict_obj)); + allocno_hard_regs_node_t conflict_node, temp_node; + HARD_REG_SET conflict_node_set; conflict_node = conflict_data->hard_regs_node; conflict_node_set = conflict_node->hard_regs->set; if (hard_reg_set_subset_p (node_set, conflict_node_set)) @@ -886,14 +904,13 @@ setup_left_conflict_sizes_p (ira_allocno_t a) temp_node->check = node_check_tick; temp_node->conflict_size = 0; } - size = (ira_reg_class_max_nregs - [ALLOCNO_CLASS (conflict_a)][ALLOCNO_MODE (conflict_a)]); - if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1) - /* We will deal with the subwords individually. */ - size = 1; temp_node->conflict_size += size; } } + /* Setup conflict nregs of ALLOCNO.
*/ + for (auto &kv : allocno_conflict_regs) + ira_free_bitmap (kv.second); + for (i = 0; i < data->hard_regs_subnodes_num; i++) { allocno_hard_regs_node_t temp_node; @@ -1031,7 +1048,7 @@ static void setup_profitable_hard_regs (void) { unsigned int i; - int j, k, nobj, hard_regno, nregs, class_size; + int j, k, nobj, hard_regno, class_size; ira_allocno_t a; bitmap_iterator bi; enum reg_class aclass; @@ -1076,7 +1093,6 @@ setup_profitable_hard_regs (void) || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0) continue; mode = ALLOCNO_MODE (a); - nregs = hard_regno_nregs (hard_regno, mode); nobj = ALLOCNO_NUM_OBJECTS (a); for (k = 0; k < nobj; k++) { @@ -1088,24 +1104,39 @@ setup_profitable_hard_regs (void) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); - /* We can process the conflict allocno repeatedly with - the same result. */ - if (nregs == nobj && nregs > 1) + if (!has_subreg_object_p (a)) { - int num = OBJECT_SUBWORD (conflict_obj); - - if (REG_WORDS_BIG_ENDIAN) - CLEAR_HARD_REG_BIT - (ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, - hard_regno + nobj - num - 1); - else - CLEAR_HARD_REG_BIT - (ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, - hard_regno + num); + ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs + &= ~ira_reg_mode_hard_regset[hard_regno][mode]; + continue; + } + + /* Clear all hard regs occupied by obj. 
*/ + if (REG_WORDS_BIG_ENDIAN) + { + int start_regno + = hard_regno + ALLOCNO_NREGS (a) - 1 - OBJECT_START (obj); + for (int i = 0; i < OBJECT_NREGS (obj); i += 1) + { + int regno = start_regno - i; + if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER) + CLEAR_HARD_REG_BIT ( + ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, + regno); + } } else - ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs - &= ~ira_reg_mode_hard_regset[hard_regno][mode]; + { + int start_regno = hard_regno + OBJECT_START (obj); + for (int i = 0; i < OBJECT_NREGS (obj); i += 1) + { + int regno = start_regno + i; + if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER) + CLEAR_HARD_REG_BIT ( + ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, + regno); + } + } } } } @@ -1677,18 +1708,25 @@ update_conflict_hard_regno_costs (int *costs, enum reg_class aclass, aligned. */ static inline void get_conflict_and_start_profitable_regs (ira_allocno_t a, bool retry_p, - HARD_REG_SET *conflict_regs, + HARD_REG_SET *start_conflict_regs, HARD_REG_SET *start_profitable_regs) { int i, nwords; ira_object_t obj; nwords = ALLOCNO_NUM_OBJECTS (a); - for (i = 0; i < nwords; i++) - { - obj = ALLOCNO_OBJECT (a, i); - conflict_regs[i] = OBJECT_TOTAL_CONFLICT_HARD_REGS (obj); - } + CLEAR_HARD_REG_SET (*start_conflict_regs); + if (has_subreg_object_p (a)) + for (i = 0; i < nwords; i++) + { + obj = ALLOCNO_OBJECT (a, i); + for (int j = 0; j < OBJECT_NREGS (obj); j += 1) + *start_conflict_regs |= OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) + >> (OBJECT_START (obj) + j); + } + else + *start_conflict_regs + = OBJECT_TOTAL_CONFLICT_HARD_REGS (get_full_object (a)); if (retry_p) *start_profitable_regs = (reg_class_contents[ALLOCNO_CLASS (a)] @@ -1702,9 +1740,9 @@ get_conflict_and_start_profitable_regs (ira_allocno_t a, bool retry_p, PROFITABLE_REGS and whose objects have CONFLICT_REGS. 
*/ static inline bool check_hard_reg_p (ira_allocno_t a, int hard_regno, - HARD_REG_SET *conflict_regs, HARD_REG_SET profitable_regs) + HARD_REG_SET start_conflict_regs, + HARD_REG_SET profitable_regs) { - int j, nwords, nregs; enum reg_class aclass; machine_mode mode; @@ -1716,28 +1754,17 @@ check_hard_reg_p (ira_allocno_t a, int hard_regno, /* Checking only profitable hard regs. */ if (! TEST_HARD_REG_BIT (profitable_regs, hard_regno)) return false; - nregs = hard_regno_nregs (hard_regno, mode); - nwords = ALLOCNO_NUM_OBJECTS (a); - for (j = 0; j < nregs; j++) + + if (has_subreg_object_p (a)) + return !TEST_HARD_REG_BIT (start_conflict_regs, hard_regno); + else { - int k; - int set_to_test_start = 0, set_to_test_end = nwords; - - if (nregs == nwords) - { - if (REG_WORDS_BIG_ENDIAN) - set_to_test_start = nwords - j - 1; - else - set_to_test_start = j; - set_to_test_end = set_to_test_start + 1; - } - for (k = set_to_test_start; k < set_to_test_end; k++) - if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)) - break; - if (k != set_to_test_end) - break; + int nregs = hard_regno_nregs (hard_regno, mode); + for (int i = 0; i < nregs; i += 1) + if (TEST_HARD_REG_BIT (start_conflict_regs, hard_regno + i)) + return false; + return true; } - return j == nregs; } /* Return number of registers needed to be saved and restored at @@ -1945,7 +1972,7 @@ spill_soft_conflicts (ira_allocno_t a, bitmap allocnos_to_spill, static bool assign_hard_reg (ira_allocno_t a, bool retry_p) { - HARD_REG_SET conflicting_regs[2], profitable_hard_regs; + HARD_REG_SET start_conflicting_regs, profitable_hard_regs; int i, j, hard_regno, best_hard_regno, class_size; int cost, mem_cost, min_cost, full_cost, min_full_cost, nwords, word; int *a_costs; @@ -1962,8 +1989,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) HARD_REG_SET soft_conflict_regs = {}; ira_assert (! 
ALLOCNO_ASSIGNED_P (a)); - get_conflict_and_start_profitable_regs (a, retry_p, - conflicting_regs, + get_conflict_and_start_profitable_regs (a, retry_p, &start_conflicting_regs, &profitable_hard_regs); aclass = ALLOCNO_CLASS (a); class_size = ira_class_hard_regs_num[aclass]; @@ -2041,7 +2067,6 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) (hard_regno, ALLOCNO_MODE (conflict_a), reg_class_contents[aclass]))) { - int n_objects = ALLOCNO_NUM_OBJECTS (conflict_a); int conflict_nregs; mode = ALLOCNO_MODE (conflict_a); @@ -2076,24 +2101,95 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) note_conflict (r); } } + else if (has_subreg_object_p (a)) + { + /* Set start_conflicting_regs if that cause obj and + conflict_obj overlap. the overlap position: + +--------------+ + | conflict_obj | + +--------------+ + + +-----------+ +-----------+ + | obj | ... | obj | + +-----------+ +-----------+ + + Point: A B C + + the hard regs from A to C point will cause overlap. + For REG_WORDS_BIG_ENDIAN: + A = hard_regno + ALLOCNO_NREGS (conflict_a) - 1 + - OBJECT_START (conflict_obj) + - OBJECT_NREGS (obj) + 1 + C = A + OBJECT_NREGS (obj) + + OBJECT_NREGS (conflict_obj) - 2 + For !REG_WORDS_BIG_ENDIAN: + A = hard_regno + OBJECT_START (conflict_obj) + - OBJECT_NREGS (obj) + 1 + C = A + OBJECT_NREGS (obj) + + OBJECT_NREGS (conflict_obj) - 2 + */ + int start_regno; + int conflict_allocno_nregs, conflict_object_nregs, + conflict_object_start; + if (has_subreg_object_p (conflict_a)) + { + conflict_allocno_nregs = ALLOCNO_NREGS (conflict_a); + conflict_object_nregs = OBJECT_NREGS (conflict_obj); + conflict_object_start = OBJECT_START (conflict_obj); + } + else + { + conflict_allocno_nregs = conflict_object_nregs + = hard_regno_nregs (hard_regno, mode); + conflict_object_start = 0; + } + if (REG_WORDS_BIG_ENDIAN) + { + int A = hard_regno + conflict_allocno_nregs - 1 + - conflict_object_start - OBJECT_NREGS (obj) + + 1; + start_regno = A + OBJECT_NREGS (obj) - 1 + + OBJECT_START (obj) - 
ALLOCNO_NREGS (a) + + 1; + } + else + { + int A = hard_regno + conflict_object_start + - OBJECT_NREGS (obj) + 1; + start_regno = A - OBJECT_START (obj); + } + + for (int i = 0; + i <= OBJECT_NREGS (obj) + conflict_object_nregs - 2; + i += 1) + { + int regno = start_regno + i; + if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER) + SET_HARD_REG_BIT (start_conflicting_regs, regno); + } + if (hard_reg_set_subset_p (profitable_hard_regs, + start_conflicting_regs)) + goto fail; + } else { - if (conflict_nregs == n_objects && conflict_nregs > 1) + if (has_subreg_object_p (conflict_a)) { - int num = OBJECT_SUBWORD (conflict_obj); - - if (REG_WORDS_BIG_ENDIAN) - SET_HARD_REG_BIT (conflicting_regs[word], - hard_regno + n_objects - num - 1); - else - SET_HARD_REG_BIT (conflicting_regs[word], - hard_regno + num); + int start_hard_regno + = REG_WORDS_BIG_ENDIAN + ? hard_regno + ALLOCNO_NREGS (conflict_a) + - OBJECT_START (conflict_obj) + : hard_regno + OBJECT_START (conflict_obj); + for (int i = 0; i < OBJECT_NREGS (conflict_obj); + i += 1) + SET_HARD_REG_BIT (start_conflicting_regs, + start_hard_regno + i); } else - conflicting_regs[word] + start_conflicting_regs |= ira_reg_mode_hard_regset[hard_regno][mode]; if (hard_reg_set_subset_p (profitable_hard_regs, - conflicting_regs[word])) + start_conflicting_regs)) goto fail; } } @@ -2160,8 +2256,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) && FIRST_STACK_REG <= hard_regno && hard_regno <= LAST_STACK_REG) continue; #endif - if (! 
check_hard_reg_p (a, hard_regno, - conflicting_regs, profitable_hard_regs)) + if (!check_hard_reg_p (a, hard_regno, start_conflicting_regs, + profitable_hard_regs)) continue; cost = costs[i]; full_cost = full_costs[i]; @@ -2667,21 +2763,16 @@ push_allocno_to_stack (ira_allocno_t a) { enum reg_class aclass; allocno_color_data_t data, conflict_data; - int size, i, n = ALLOCNO_NUM_OBJECTS (a); - + int i, n = ALLOCNO_NUM_OBJECTS (a); + data = ALLOCNO_COLOR_DATA (a); data->in_graph_p = false; allocno_stack_vec.safe_push (a); aclass = ALLOCNO_CLASS (a); if (aclass == NO_REGS) return; - size = ira_reg_class_max_nregs[aclass][ALLOCNO_MODE (a)]; - if (n > 1) - { - /* We will deal with the subwords individually. */ - gcc_assert (size == ALLOCNO_NUM_OBJECTS (a)); - size = 1; - } + /* Already collected conflict objects. */ + std::map<int, bitmap> allocno_conflict_regs; for (i = 0; i < n; i++) { ira_object_t obj = ALLOCNO_OBJECT (a, i); @@ -2706,6 +2797,21 @@ push_allocno_to_stack (ira_allocno_t a) continue; ira_assert (bitmap_bit_p (coloring_allocno_bitmap, ALLOCNO_NUM (conflict_a))); + + int num = ALLOCNO_NUM (conflict_a); + if (allocno_conflict_regs.count (num) == 0) + allocno_conflict_regs.insert ({num, ira_allocate_bitmap ()}); + bitmap_head temp; + bitmap_initialize (&temp, &reg_obstack); + bitmap_set_range (&temp, OBJECT_START (obj), OBJECT_NREGS (obj)); + bitmap_and_compl_into (&temp, allocno_conflict_regs.at (num)); + int size = bitmap_count_bits (&temp); + bitmap_clear (&temp); + if (size == 0) + continue; + + bitmap_set_range (allocno_conflict_regs.at (num), OBJECT_START (obj), + OBJECT_NREGS (obj)); if (update_left_conflict_sizes_p (conflict_a, a, size)) { delete_allocno_from_bucket @@ -2721,6 +2827,9 @@ push_allocno_to_stack (ira_allocno_t a) } } + + for (auto &kv : allocno_conflict_regs) + ira_free_bitmap (kv.second); } /* Put ALLOCNO onto the coloring stack and remove it from its bucket.
@@ -3154,7 +3263,7 @@ improve_allocation (void) machine_mode mode; int *allocno_costs; int costs[FIRST_PSEUDO_REGISTER]; - HARD_REG_SET conflicting_regs[2], profitable_hard_regs; + HARD_REG_SET start_conflicting_regs, profitable_hard_regs; ira_allocno_t a; bitmap_iterator bi; int saved_nregs; @@ -3193,7 +3302,7 @@ improve_allocation (void) - allocno_copy_cost_saving (a, hregno)); try_p = false; get_conflict_and_start_profitable_regs (a, false, - conflicting_regs, + &start_conflicting_regs, &profitable_hard_regs); class_size = ira_class_hard_regs_num[aclass]; mode = ALLOCNO_MODE (a); @@ -3202,8 +3311,8 @@ improve_allocation (void) for (j = 0; j < class_size; j++) { hregno = ira_class_hard_regs[aclass][j]; - if (! check_hard_reg_p (a, hregno, - conflicting_regs, profitable_hard_regs)) + if (!check_hard_reg_p (a, hregno, start_conflicting_regs, + profitable_hard_regs)) continue; ira_assert (ira_class_hard_reg_index[aclass][hregno] == j); k = allocno_costs == NULL ? 0 : j; @@ -3287,16 +3396,15 @@ improve_allocation (void) } conflict_nregs = hard_regno_nregs (conflict_hregno, ALLOCNO_MODE (conflict_a)); - auto note_conflict = [&](int r) - { - if (check_hard_reg_p (a, r, - conflicting_regs, profitable_hard_regs)) - { - if (spill_a) - SET_HARD_REG_BIT (soft_conflict_regs, r); - costs[r] += spill_cost; - } - }; + auto note_conflict = [&] (int r) { + if (check_hard_reg_p (a, r, start_conflicting_regs, + profitable_hard_regs)) + { + if (spill_a) + SET_HARD_REG_BIT (soft_conflict_regs, r); + costs[r] += spill_cost; + } + }; for (r = conflict_hregno; r >= 0 && (int) end_hard_regno (mode, r) > conflict_hregno; r--) @@ -3314,8 +3422,8 @@ improve_allocation (void) for (j = 0; j < class_size; j++) { hregno = ira_class_hard_regs[aclass][j]; - if (check_hard_reg_p (a, hregno, - conflicting_regs, profitable_hard_regs) + if (check_hard_reg_p (a, hregno, start_conflicting_regs, + profitable_hard_regs) && min_cost > costs[hregno]) { best = hregno; diff --git a/gcc/ira-conflicts.cc 
b/gcc/ira-conflicts.cc index a4d93c8d734..0585ad10043 100644 --- a/gcc/ira-conflicts.cc +++ b/gcc/ira-conflicts.cc @@ -60,23 +60,8 @@ static IRA_INT_TYPE **conflicts; static void record_object_conflict (ira_object_t obj1, ira_object_t obj2) { - ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); - ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); - int w1 = OBJECT_SUBWORD (obj1); - int w2 = OBJECT_SUBWORD (obj2); - int id1, id2; - - /* Canonicalize the conflict. If two identically-numbered words - conflict, always record this as a conflict between words 0. That - is the only information we need, and it is easier to test for if - it is collected in each allocno's lowest-order object. */ - if (w1 == w2 && w1 > 0) - { - obj1 = ALLOCNO_OBJECT (a1, 0); - obj2 = ALLOCNO_OBJECT (a2, 0); - } - id1 = OBJECT_CONFLICT_ID (obj1); - id2 = OBJECT_CONFLICT_ID (obj2); + int id1 = OBJECT_CONFLICT_ID (obj1); + int id2 = OBJECT_CONFLICT_ID (obj2); SET_MINMAX_SET_BIT (conflicts[id1], id2, OBJECT_MIN (obj1), OBJECT_MAX (obj1)); @@ -606,8 +591,8 @@ build_object_conflicts (ira_object_t obj) if (parent_a == NULL) return; ira_assert (ALLOCNO_CLASS (a) == ALLOCNO_CLASS (parent_a)); - ira_assert (ALLOCNO_NUM_OBJECTS (a) == ALLOCNO_NUM_OBJECTS (parent_a)); - parent_obj = ALLOCNO_OBJECT (parent_a, OBJECT_SUBWORD (obj)); + parent_obj + = find_object_anyway (parent_a, OBJECT_START (obj), OBJECT_NREGS (obj)); parent_num = OBJECT_CONFLICT_ID (parent_obj); parent_min = OBJECT_MIN (parent_obj); parent_max = OBJECT_MAX (parent_obj); @@ -616,7 +601,6 @@ build_object_conflicts (ira_object_t obj) { ira_object_t another_obj = ira_object_id_map[i]; ira_allocno_t another_a = OBJECT_ALLOCNO (another_obj); - int another_word = OBJECT_SUBWORD (another_obj); ira_assert (ira_reg_classes_intersect_p [ALLOCNO_CLASS (a)][ALLOCNO_CLASS (another_a)]); @@ -627,11 +611,11 @@ build_object_conflicts (ira_object_t obj) ira_assert (ALLOCNO_NUM (another_parent_a) >= 0); ira_assert (ALLOCNO_CLASS (another_a) == ALLOCNO_CLASS 
(another_parent_a)); - ira_assert (ALLOCNO_NUM_OBJECTS (another_a) - == ALLOCNO_NUM_OBJECTS (another_parent_a)); SET_MINMAX_SET_BIT (conflicts[parent_num], - OBJECT_CONFLICT_ID (ALLOCNO_OBJECT (another_parent_a, - another_word)), + OBJECT_CONFLICT_ID ( + find_object_anyway (another_parent_a, + OBJECT_START (another_obj), + OBJECT_NREGS (another_obj))), parent_min, parent_max); } } @@ -659,9 +643,10 @@ build_conflicts (void) build_object_conflicts (obj); for (cap = ALLOCNO_CAP (a); cap != NULL; cap = ALLOCNO_CAP (cap)) { - ira_object_t cap_obj = ALLOCNO_OBJECT (cap, j); - gcc_assert (ALLOCNO_NUM_OBJECTS (cap) == ALLOCNO_NUM_OBJECTS (a)); - build_object_conflicts (cap_obj); + ira_object_t cap_obj + = find_object_anyway (cap, OBJECT_START (obj), + OBJECT_NREGS (obj)); + build_object_conflicts (cap_obj); } } } @@ -736,7 +721,8 @@ print_allocno_conflicts (FILE * file, bool reg_p, ira_allocno_t a) } if (n > 1) - fprintf (file, "\n;; subobject %d:", i); + fprintf (file, "\n;; subobject s%d,n%d,f%d:", OBJECT_START (obj), + OBJECT_NREGS (obj), ALLOCNO_NREGS (a)); FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); @@ -746,8 +732,10 @@ print_allocno_conflicts (FILE * file, bool reg_p, ira_allocno_t a) { fprintf (file, " a%d(r%d", ALLOCNO_NUM (conflict_a), ALLOCNO_REGNO (conflict_a)); - if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1) - fprintf (file, ",w%d", OBJECT_SUBWORD (conflict_obj)); + if (has_subreg_object_p (conflict_a)) + fprintf (file, ",s%d,n%d,f%d", OBJECT_START (conflict_obj), + OBJECT_NREGS (conflict_obj), + ALLOCNO_NREGS (conflict_a)); if ((bb = ALLOCNO_LOOP_TREE_NODE (conflict_a)->bb) != NULL) fprintf (file, ",b%d", bb->index); else diff --git a/gcc/ira-emit.cc b/gcc/ira-emit.cc index 84ed482e568..9dc7f3c655e 100644 --- a/gcc/ira-emit.cc +++ b/gcc/ira-emit.cc @@ -854,7 +854,7 @@ modify_move_list (move_t list) ALLOCNO_MODE (new_allocno) = ALLOCNO_MODE (set_move->to); ira_set_allocno_class (new_allocno, 
ALLOCNO_CLASS (set_move->to)); - ira_create_allocno_objects (new_allocno); + ira_copy_allocno_objects (new_allocno, set_move->to); ALLOCNO_ASSIGNED_P (new_allocno) = true; ALLOCNO_HARD_REGNO (new_allocno) = -1; ALLOCNO_EMIT_DATA (new_allocno)->reg diff --git a/gcc/ira-int.h b/gcc/ira-int.h index 0685e1f4e8d..9095a8227f7 100644 --- a/gcc/ira-int.h +++ b/gcc/ira-int.h @@ -23,6 +23,8 @@ along with GCC; see the file COPYING3. If not see #include "recog.h" #include "function-abi.h" +#include <vector> +#include "subreg-live-range.h" /* To provide consistency in naming, all IRA external variables, functions, common typedefs start with prefix ira_. */ @@ -222,11 +224,13 @@ extern int ira_max_point; extern live_range_t *ira_start_point_ranges, *ira_finish_point_ranges; /* A structure representing conflict information for an allocno - (or one of its subwords). */ + (or one of its subregs). */ struct ira_object { /* The allocno associated with this record. */ ira_allocno_t allocno; + /* Index in allocno->objects array. */ + unsigned int index; /* Vector of accumulated conflicting conflict_redords with NULL end marker (if OBJECT_CONFLICT_VEC_P is true) or conflict bit vector otherwise. */ @@ -236,10 +240,9 @@ struct ira_object ranges in the list are not intersected and ordered by decreasing their program points*. */ live_range_t live_ranges; - /* The subword within ALLOCNO which is represented by this object. - Zero means the lowest-order subword (or the entire allocno in case - it is not being tracked in subwords). */ - int subword; + /* OBJECT occupies registers [start, start + nregs) of its + ALLOCNO. */ + int start, nregs; /* Allocated size of the conflicts array. */ unsigned int conflicts_array_size; /* A unique number for every instance of this structure, which is used @@ -295,6 +298,11 @@ struct ira_allocno reload (at this point pseudo-register has only one allocno) which did not get stack slot yet.
*/ signed int hard_regno : 16; + /* Unit size of one register allocated for the allocno. Only used to + compute the start and nregs of the subregs being tracked. */ + poly_int64 unit_size; + /* True if subreg live ranges need to be tracked for the allocno. */ + bool track_subreg_p; /* A bitmask of the ABIs used by calls that occur while the allocno is live. */ unsigned int crossed_calls_abis : NUM_ABI_IDS; @@ -353,8 +361,6 @@ struct ira_allocno register class living at the point than number of hard-registers of the class available for the allocation. */ int excess_pressure_points_num; - /* The number of objects tracked in the following array. */ - int num_objects; /* Accumulated frequency of calls which given allocno intersects. */ int call_freq; @@ -387,8 +393,11 @@ struct ira_allocno /* An array of structures describing conflict information and live ranges for each object associated with the allocno. There may be more than one such object in cases where the allocno represents a - multi-word register. */ - ira_object_t objects[2]; + multi-hardreg pseudo. */ + std::vector<ira_object_t> objects; + /* An array of structures describing the subreg start and subreg end for + this allocno. */ + std::vector<subreg_range> subregs; /* Registers clobbered by intersected calls. */ HARD_REG_SET crossed_calls_clobbered_regs; /* Array of usage costs (accumulated and the one updated during @@ -468,8 +477,12 @@ struct ira_allocno #define ALLOCNO_EXCESS_PRESSURE_POINTS_NUM(A) \ ((A)->excess_pressure_points_num) #define ALLOCNO_OBJECT(A,N) ((A)->objects[N]) -#define ALLOCNO_NUM_OBJECTS(A) ((A)->num_objects) +#define ALLOCNO_NUM_OBJECTS(A) ((int) (A)->objects.size ()) #define ALLOCNO_ADD_DATA(A) ((A)->add_data) +#define ALLOCNO_UNIT_SIZE(A) ((A)->unit_size) +#define ALLOCNO_TRACK_SUBREG_P(A) ((A)->track_subreg_p) +#define ALLOCNO_NREGS(A) \ + (ira_reg_class_max_nregs[ALLOCNO_CLASS (A)][ALLOCNO_MODE (A)]) /* Typedef for pointer to the subsequent structure.
*/ typedef struct ira_emit_data *ira_emit_data_t; @@ -511,7 +524,7 @@ allocno_emit_reg (ira_allocno_t a) } #define OBJECT_ALLOCNO(O) ((O)->allocno) -#define OBJECT_SUBWORD(O) ((O)->subword) +#define OBJECT_INDEX(O) ((O)->index) #define OBJECT_CONFLICT_ARRAY(O) ((O)->conflicts_array) #define OBJECT_CONFLICT_VEC(O) ((ira_object_t *)(O)->conflicts_array) #define OBJECT_CONFLICT_BITVEC(O) ((IRA_INT_TYPE *)(O)->conflicts_array) @@ -524,6 +537,8 @@ allocno_emit_reg (ira_allocno_t a) #define OBJECT_MAX(O) ((O)->max) #define OBJECT_CONFLICT_ID(O) ((O)->id) #define OBJECT_LIVE_RANGES(O) ((O)->live_ranges) +#define OBJECT_START(O) ((O)->start) +#define OBJECT_NREGS(O) ((O)->nregs) /* Map regno -> allocnos with given regno (see comments for allocno member `next_regno_allocno'). */ @@ -1041,6 +1056,12 @@ extern void ira_free_cost_vector (int *, reg_class_t); extern void ira_flattening (int, int); extern bool ira_build (void); extern void ira_destroy (void); +extern ira_object_t +find_object (ira_allocno_t, int, int); +extern ira_object_t find_object (ira_allocno_t, poly_int64, poly_int64); +ira_object_t +find_object_anyway (ira_allocno_t a, int start, int nregs); +extern void ira_copy_allocno_objects (ira_allocno_t, ira_allocno_t); /* ira-costs.cc */ extern void ira_init_costs_once (void); @@ -1708,4 +1729,18 @@ ira_caller_save_loop_spill_p (ira_allocno_t a, ira_allocno_t subloop_a, return call_cost && call_cost >= spill_cost; } +/* Return true if allocno A has subreg object. */ +inline bool +has_subreg_object_p (ira_allocno_t a) +{ + return ALLOCNO_NUM_OBJECTS (a) > 1; +} + +/* Return the full object of allocno A. */ +inline ira_object_t +get_full_object (ira_allocno_t a) +{ + return find_object (a, 0, ALLOCNO_NREGS (a)); +} + #endif /* GCC_IRA_INT_H */ diff --git a/gcc/ira-lives.cc b/gcc/ira-lives.cc index 05e2be12a26..9ca9e5548da 100644 --- a/gcc/ira-lives.cc +++ b/gcc/ira-lives.cc @@ -19,6 +19,7 @@ along with GCC; see the file COPYING3. If not see . 
*/ #include "config.h" +#define INCLUDE_VECTOR #include "system.h" #include "coretypes.h" #include "backend.h" @@ -35,6 +36,7 @@ along with GCC; see the file COPYING3. If not see #include "sparseset.h" #include "function-abi.h" #include "except.h" +#include "subreg-live-range.h" /* The code in this file is similar to one in global but the code works on the allocno basis and creates live ranges instead of @@ -91,6 +93,9 @@ static alternative_mask preferred_alternatives; we should not add a conflict with the copy's destination operand. */ static rtx ignore_reg_for_conflicts; +/* Store def/use point of has_subreg_object_p register. */ +static class subregs_live_points *subreg_live_points; + /* Record hard register REGNO as now being live. */ static void make_hard_regno_live (int regno) @@ -98,6 +103,33 @@ make_hard_regno_live (int regno) SET_HARD_REG_BIT (hard_regs_live, regno); } +/* Update conflict hard regs of ALLOCNO a for current live part. */ +static void +add_onflict_hard_regs (ira_allocno_t a, HARD_REG_SET regs) +{ + gcc_assert (has_subreg_object_p (a)); + + if (subreg_live_points->subreg_live_ranges.count (ALLOCNO_NUM (a)) == 0) + return; + + for (const subreg_range &r : + subreg_live_points->subreg_live_ranges.at (ALLOCNO_NUM (a)).ranges) + { + ira_object_t obj = find_object_anyway (a, r.start, r.end - r.start); + OBJECT_CONFLICT_HARD_REGS (obj) |= regs; + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= regs; + } +} + +static void +add_onflict_hard_reg (ira_allocno_t a, unsigned int regno) +{ + HARD_REG_SET set; + CLEAR_HARD_REG_SET (set); + SET_HARD_REG_BIT (set, regno); + add_onflict_hard_regs (a, set); +} + /* Process the definition of hard register REGNO. This updates hard_regs_live and hard reg conflict information for living allocnos. 
*/ static void @@ -113,8 +145,13 @@ make_hard_regno_dead (int regno) == (unsigned int) ALLOCNO_REGNO (OBJECT_ALLOCNO (obj))) continue; - SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); - SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); + if (has_subreg_object_p (OBJECT_ALLOCNO (obj))) + add_onflict_hard_reg (OBJECT_ALLOCNO (obj), regno); + else + { + SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); + SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); + } } CLEAR_HARD_REG_BIT (hard_regs_live, regno); } @@ -127,9 +164,29 @@ make_object_live (ira_object_t obj) sparseset_set_bit (objects_live, OBJECT_CONFLICT_ID (obj)); live_range_t lr = OBJECT_LIVE_RANGES (obj); - if (lr == NULL - || (lr->finish != curr_point && lr->finish + 1 != curr_point)) - ira_add_live_range_to_object (obj, curr_point, -1); + if (lr == NULL || (lr->finish != curr_point && lr->finish + 1 != curr_point)) + { + ira_add_live_range_to_object (obj, curr_point, -1); + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) + { + fprintf (ira_dump_file, + " add new live_range for a%d(r%d): [%d...-1]\n", + ALLOCNO_NUM (OBJECT_ALLOCNO (obj)), + ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)), curr_point); + } + } + else + { + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) + { + fprintf ( + ira_dump_file, + " use old live_range for a%d(r%d): [%d...%d], curr: %d\n", + ALLOCNO_NUM (OBJECT_ALLOCNO (obj)), + ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)), lr->start, lr->finish, + curr_point); + } + } } /* Update ALLOCNO_EXCESS_PRESSURE_POINTS_NUM for the allocno @@ -140,7 +197,6 @@ update_allocno_pressure_excess_length (ira_object_t obj) ira_allocno_t a = OBJECT_ALLOCNO (obj); int start, i; enum reg_class aclass, pclass, cl; - live_range_t p; aclass = ALLOCNO_CLASS (a); pclass = ira_pressure_class_translate[aclass]; @@ -152,10 +208,18 @@ update_allocno_pressure_excess_length (ira_object_t obj) continue; if (high_pressure_start_point[cl] < 0) continue; - p = 
OBJECT_LIVE_RANGES (obj); - ira_assert (p != NULL); - start = (high_pressure_start_point[cl] > p->start - ? high_pressure_start_point[cl] : p->start); + int start_point; + if (has_subreg_object_p (a)) + start_point = subreg_live_points->get_start_point (ALLOCNO_NUM (a)); + else + { + live_range_t p = OBJECT_LIVE_RANGES (obj); + ira_assert (p != NULL); + start_point = p->start; + } + start = (high_pressure_start_point[cl] > start_point + ? high_pressure_start_point[cl] + : start_point); ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) += curr_point - start + 1; } } @@ -201,6 +265,14 @@ make_object_dead (ira_object_t obj) CLEAR_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); lr = OBJECT_LIVE_RANGES (obj); + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) + { + fprintf (ira_dump_file, + " finish a live_range a%d(r%d): [%d...%d] => [%d...%d]\n", + ALLOCNO_NUM (OBJECT_ALLOCNO (obj)), + ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)), lr->start, lr->finish, + lr->start, curr_point); + } ira_assert (lr != NULL); lr->finish = curr_point; update_allocno_pressure_excess_length (obj); @@ -295,77 +367,144 @@ pseudo_regno_single_word_and_live_p (int regno) return sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj)); } -/* Mark the pseudo register REGNO as live. Update all information about - live ranges and register pressure. */ +/* Collect the point which the OBJ be def/use. 
*/ static void -mark_pseudo_regno_live (int regno) +add_subreg_point (ira_object_t obj, bool is_def, bool is_dec = true) { - ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - enum reg_class pclass; - int i, n, nregs; - - if (a == NULL) - return; + ira_allocno_t a = OBJECT_ALLOCNO (obj); + if (is_def) + { + OBJECT_CONFLICT_HARD_REGS (obj) |= hard_regs_live; + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= hard_regs_live; + if (is_dec) + { + enum reg_class pclass + = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + dec_register_pressure (pclass, ALLOCNO_NREGS (a)); + } + update_allocno_pressure_excess_length (obj); + } + else + { + enum reg_class pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + inc_register_pressure (pclass, ALLOCNO_NREGS (a)); + } - /* Invalidate because it is referenced. */ - allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + subreg_range r = subreg_range ( + {OBJECT_START (obj), OBJECT_START (obj) + OBJECT_NREGS (obj)}); + subreg_live_points->add_point (ALLOCNO_NUM (a), ALLOCNO_NREGS (a), r, is_def, + curr_point); - n = ALLOCNO_NUM_OBJECTS (a); - pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; - if (n > 1) + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) { - /* We track every subobject separately. */ - gcc_assert (nregs == n); - nregs = 1; + fprintf (ira_dump_file, " %s a%d(r%d", is_def ? 
"def" : "use", + ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); + if (ALLOCNO_CLASS (a) != NO_REGS + && ALLOCNO_NREGS (a) != OBJECT_NREGS (obj)) + fprintf (ira_dump_file, " [subreg: start %d, nregs %d]", + OBJECT_START (obj), OBJECT_NREGS (obj)); + else + fprintf (ira_dump_file, " [full: nregs %d]", OBJECT_NREGS (obj)); + fprintf (ira_dump_file, ") at point %d\n", curr_point); } - for (i = 0; i < n; i++) - { - ira_object_t obj = ALLOCNO_OBJECT (a, i); + gcc_assert (has_subreg_object_p (a)); + gcc_assert (subreg_live_points->subreg_live_ranges.count (ALLOCNO_NUM (a)) + != 0); + + const subreg_ranges &sr + = subreg_live_points->subreg_live_ranges.at (ALLOCNO_NUM (a)); + ira_object_t main_obj = find_object (a, 0, ALLOCNO_NREGS (a)); + gcc_assert (main_obj != NULL); + if (sr.empty_p () + && sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (main_obj))) + sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (main_obj)); + else if (!sr.empty_p () + && !sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (main_obj))) + sparseset_set_bit (objects_live, OBJECT_CONFLICT_ID (main_obj)); +} + +/* Mark the object OBJ as live. */ +static void +mark_pseudo_object_live (ira_allocno_t a, ira_object_t obj) +{ + /* Invalidate because it is referenced. */ + allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + if (has_subreg_object_p (a)) + add_subreg_point (obj, false); + else + { if (sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) - continue; + return; - inc_register_pressure (pclass, nregs); + enum reg_class pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + inc_register_pressure (pclass, ALLOCNO_NREGS (a)); make_object_live (obj); } } +/* Mark the pseudo register REGNO as live. Update all information about + live ranges and register pressure. 
*/ +static void +mark_pseudo_regno_live (int regno) +{ + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; + + if (a == NULL) + return; + + int nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; + ira_object_t obj = find_object (a, 0, nregs); + gcc_assert (obj != NULL); + + mark_pseudo_object_live (a, obj); +} + /* Like mark_pseudo_regno_live, but try to only mark one subword of the pseudo as live. SUBWORD indicates which; a value of 0 indicates the low part. */ static void -mark_pseudo_regno_subword_live (int regno, int subword) +mark_pseudo_regno_subreg_live (int regno, rtx subreg) { ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - int n; - enum reg_class pclass; - ira_object_t obj; if (a == NULL) return; - /* Invalidate because it is referenced. */ - allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + ira_object_t obj + = find_object (a, SUBREG_BYTE (subreg), GET_MODE_SIZE (GET_MODE (subreg))); + gcc_assert (obj != NULL); + + mark_pseudo_object_live (a, obj); +} - n = ALLOCNO_NUM_OBJECTS (a); - if (n == 1) +/* Mark objects in subreg ranges SR as live. Update all information about + live ranges and register pressure. 
*/ +static void +mark_pseudo_regno_subregs_live (int regno, const subreg_ranges &sr) +{ + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; + if (a == NULL) + return; + + if (!ALLOCNO_TRACK_SUBREG_P (a)) { mark_pseudo_regno_live (regno); return; } - pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - gcc_assert - (n == ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]); - obj = ALLOCNO_OBJECT (a, subword); - - if (sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) - return; - - inc_register_pressure (pclass, 1); - make_object_live (obj); + int times = sr.max / ALLOCNO_NREGS (a); + gcc_assert (sr.max >= ALLOCNO_NREGS (a) + && times * ALLOCNO_NREGS (a) == sr.max); + for (const subreg_range &range : sr.ranges) + { + int start = range.start / times; + int end = CEIL (range.end, times); + ira_object_t obj = find_object (a, start, end - start); + gcc_assert (obj != NULL); + mark_pseudo_object_live (a, obj); + } } /* Mark the register REG as live. Store a 1 in hard_regs_live for @@ -403,10 +542,7 @@ static void mark_pseudo_reg_live (rtx orig_reg, unsigned regno) { if (read_modify_subreg_p (orig_reg)) - { - mark_pseudo_regno_subword_live (regno, - subreg_lowpart_p (orig_reg) ? 0 : 1); - } + mark_pseudo_regno_subreg_live (regno, orig_reg); else mark_pseudo_regno_live (regno); } @@ -427,72 +563,59 @@ mark_ref_live (df_ref ref) mark_hard_reg_live (reg); } -/* Mark the pseudo register REGNO as dead. Update all information about - live ranges and register pressure. */ +/* Mark object as dead. */ static void -mark_pseudo_regno_dead (int regno) +mark_pseudo_object_dead (ira_allocno_t a, ira_object_t obj) { - ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - int n, i, nregs; - enum reg_class cl; - - if (a == NULL) - return; - /* Invalidate because it is referenced. 
*/ allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; - n = ALLOCNO_NUM_OBJECTS (a); - cl = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; - if (n > 1) - { - /* We track every subobject separately. */ - gcc_assert (nregs == n); - nregs = 1; - } - for (i = 0; i < n; i++) + if (has_subreg_object_p (a)) + add_subreg_point (obj, true); + else { - ira_object_t obj = ALLOCNO_OBJECT (a, i); if (!sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) - continue; + return; - dec_register_pressure (cl, nregs); + enum reg_class cl = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + dec_register_pressure (cl, ALLOCNO_NREGS (a)); make_object_dead (obj); } } -/* Like mark_pseudo_regno_dead, but called when we know that only part of the - register dies. SUBWORD indicates which; a value of 0 indicates the low part. */ +/* Mark the pseudo register REGNO as dead. Update all information about + live ranges and register pressure. */ static void -mark_pseudo_regno_subword_dead (int regno, int subword) +mark_pseudo_regno_dead (int regno) { ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - int n; - enum reg_class cl; - ira_object_t obj; if (a == NULL) return; - /* Invalidate because it is referenced. */ - allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + int nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; + ira_object_t obj = find_object (a, 0, nregs); + gcc_assert (obj != NULL); - n = ALLOCNO_NUM_OBJECTS (a); - if (n == 1) - /* The allocno as a whole doesn't die in this case. */ - return; + mark_pseudo_object_dead (a, obj); +} - cl = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - gcc_assert - (n == ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]); +/* Like mark_pseudo_regno_dead, but called when we know that only part of the + register dies. SUBWORD indicates which; a value of 0 indicates the low part. 
+ */ +static void +mark_pseudo_regno_subreg_dead (int regno, rtx subreg) +{ + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - obj = ALLOCNO_OBJECT (a, subword); - if (!sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) + if (a == NULL) return; - dec_register_pressure (cl, 1); - make_object_dead (obj); + ira_object_t obj + = find_object (a, SUBREG_BYTE (subreg), GET_MODE_SIZE (GET_MODE (subreg))); + gcc_assert (obj != NULL); + + mark_pseudo_object_dead (a, obj); } /* Process the definition of hard register REG. This updates hard_regs_live @@ -528,10 +651,7 @@ static void mark_pseudo_reg_dead (rtx orig_reg, unsigned regno) { if (read_modify_subreg_p (orig_reg)) - { - mark_pseudo_regno_subword_dead (regno, - subreg_lowpart_p (orig_reg) ? 0 : 1); - } + mark_pseudo_regno_subreg_dead (regno, orig_reg); else mark_pseudo_regno_dead (regno); } @@ -1059,8 +1179,15 @@ process_single_reg_class_operands (bool in_p, int freq) /* We could increase costs of A instead of making it conflicting with the hard register. But it works worse because it will be spilled in reload in anyway. 
*/ - OBJECT_CONFLICT_HARD_REGS (obj) |= reg_class_contents[cl]; - OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= reg_class_contents[cl]; + if (has_subreg_object_p (a)) + add_onflict_hard_regs (OBJECT_ALLOCNO (obj), + reg_class_contents[cl]); + else + { + OBJECT_CONFLICT_HARD_REGS (obj) |= reg_class_contents[cl]; + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) + |= reg_class_contents[cl]; + } } } } @@ -1198,17 +1325,15 @@ process_out_of_region_eh_regs (basic_block bb) bi) { ira_allocno_t a = ira_curr_regno_allocno_map[i]; - for (int n = ALLOCNO_NUM_OBJECTS (a) - 1; n >= 0; n--) + ira_object_t obj = find_object (a, 0, ALLOCNO_NREGS (a)); + for (int k = 0;; k++) { - ira_object_t obj = ALLOCNO_OBJECT (a, n); - for (int k = 0; ; k++) - { - unsigned int regno = EH_RETURN_DATA_REGNO (k); - if (regno == INVALID_REGNUM) - break; - SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); - SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); - } + unsigned int regno = EH_RETURN_DATA_REGNO (k); + if (regno == INVALID_REGNUM) + break; + + SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); + SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); } } } @@ -1234,8 +1359,13 @@ add_conflict_from_region_landing_pads (eh_region region, ira_object_t obj, { HARD_REG_SET new_conflict_regs = callee_abi.mode_clobbers (ALLOCNO_MODE (a)); - OBJECT_CONFLICT_HARD_REGS (obj) |= new_conflict_regs; - OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= new_conflict_regs; + if (has_subreg_object_p (a)) + add_onflict_hard_regs (a, new_conflict_regs); + else + { + OBJECT_CONFLICT_HARD_REGS (obj) |= new_conflict_regs; + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= new_conflict_regs; + } return; } } @@ -1260,6 +1390,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) bb = loop_tree_node->bb; if (bb != NULL) { + if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) + fprintf (ira_dump_file, "\n BB exit(l%d): point = %d\n", + loop_tree_node->parent->loop_num, curr_point); + for (i 
= 0; i < ira_pressure_classes_num; i++) { curr_reg_pressure[ira_pressure_classes[i]] = 0; @@ -1268,6 +1402,7 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) curr_bb_node = loop_tree_node; reg_live_out = DF_LIVE_SUBREG_OUT (bb); sparseset_clear (objects_live); + subreg_live_points->clear_live_ranges (); REG_SET_TO_HARD_REG_SET (hard_regs_live, reg_live_out); hard_regs_live &= ~(eliminable_regset | ira_no_alloc_regs); for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) @@ -1291,9 +1426,17 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) <= ira_class_hard_regs_num[cl]); } } - EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi) + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_FULL_OUT (bb), + FIRST_PSEUDO_REGISTER, j, bi) mark_pseudo_regno_live (j); + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_PARTIAL_OUT (bb), + FIRST_PSEUDO_REGISTER, j, bi) + { + mark_pseudo_regno_subregs_live ( + j, DF_LIVE_SUBREG_RANGE_OUT (bb)->lives.at (j)); + } + #ifdef EH_RETURN_DATA_REGNO process_out_of_region_eh_regs (bb); #endif @@ -1408,8 +1551,18 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) && (find_reg_note (insn, REG_SETJMP, NULL_RTX) != NULL_RTX))) { - SET_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj)); - SET_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)); + if (has_subreg_object_p (a)) + { + HARD_REG_SET regs; + SET_HARD_REG_SET (regs); + add_onflict_hard_regs (a, regs); + } + else + { + SET_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj)); + SET_HARD_REG_SET ( + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)); + } } eh_region r; if (can_throw_internal (insn) @@ -1455,7 +1608,14 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) /* Mark each used value as live. 
*/ FOR_EACH_INSN_USE (use, insn) - mark_ref_live (use); + { + unsigned regno = DF_REF_REGNO (use); + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; + if (a && has_subreg_object_p (a) + && DF_REF_FLAGS (use) & (DF_REF_READ_WRITE | DF_REF_SUBREG)) + continue; + mark_ref_live (use); + } process_single_reg_class_operands (true, freq); @@ -1485,6 +1645,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) } ignore_reg_for_conflicts = NULL_RTX; + if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) + fprintf (ira_dump_file, "\n BB head(l%d): point = %d\n", + loop_tree_node->parent->loop_num, curr_point); + if (bb_has_eh_pred (bb)) for (j = 0; ; ++j) { @@ -1538,10 +1702,15 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) } EXECUTE_IF_SET_IN_SPARSESET (objects_live, i) - make_object_dead (ira_object_id_map[i]); + { + ira_object_t obj = ira_object_id_map[i]; + if (has_subreg_object_p (OBJECT_ALLOCNO (obj))) + add_subreg_point (obj, true, false); + else + make_object_dead (obj); + } curr_point++; - } /* Propagate register pressure to upper loop tree nodes. */ if (loop_tree_node != ira_loop_tree_root) @@ -1742,6 +1911,86 @@ ira_debug_live_ranges (void) print_live_ranges (stderr); } +class subreg_live_item +{ +public: + subreg_ranges subreg; + int start, finish; +}; + +/* Create subreg live ranges from objects def/use point info. 
+static void +create_subregs_live_ranges () +{ + for (const auto &subreg_point_it : subreg_live_points->subreg_points) + { + unsigned int allocno_num = subreg_point_it.first; + const class live_points &points = subreg_point_it.second; + ira_allocno_t a = ira_allocnos[allocno_num]; + std::vector<subreg_live_item> temps; + gcc_assert (has_subreg_object_p (a)); + for (const auto &point_it : points.points) + { + int point = point_it.first; + const live_point &regs = point_it.second; + gcc_assert (temps.empty () || temps.back ().finish <= point); + if (!regs.use_reg.empty_p ()) + { + if (temps.empty ()) + temps.push_back ({regs.use_reg, point, -1}); + else if (temps.back ().finish == -1) + { + if (!temps.back ().subreg.same_p (regs.use_reg)) + { + if (temps.back ().start == point) + temps.back ().subreg.add_ranges (regs.use_reg); + else + { + temps.back ().finish = point - 1; + + subreg_ranges temp = regs.use_reg; + temp.add_ranges (temps.back ().subreg); + temps.push_back ({temp, point, -1}); + } + } + } + else if (temps.back ().subreg.same_p (regs.use_reg) + && (temps.back ().finish == point + || temps.back ().finish + 1 == point)) + temps.back ().finish = -1; + else + temps.push_back ({regs.use_reg, point, -1}); + } + if (!regs.def_reg.empty_p ()) + { + gcc_assert (!temps.empty ()); + if (regs.def_reg.include_ranges_p (temps.back ().subreg)) + temps.back ().finish = point; + else if (temps.back ().subreg.include_ranges_p (regs.def_reg)) + { + temps.back ().finish = point; + + subreg_ranges diff = temps.back ().subreg; + diff.remove_ranges (regs.def_reg); + temps.push_back ({diff, point + 1, -1}); + } + else + gcc_unreachable (); + } + } + for (const subreg_live_item &item : temps) + for (const subreg_range &r : item.subreg.ranges) + { + ira_object_t obj = find_object_anyway (a, r.start, r.end - r.start); + live_range_t lr = OBJECT_LIVE_RANGES (obj); + if (lr != NULL && lr->finish + 1 == item.start) + lr->finish = item.finish; + else + ira_add_live_range_to_object (obj, item.start, 
item.finish); + } + } +} + /* The main entry function creates live ranges, set up CONFLICT_HARD_REGS and TOTAL_CONFLICT_HARD_REGS for objects, and calculate register pressure info. */ @@ -1755,13 +2004,20 @@ ira_create_allocno_live_ranges (void) allocno_saved_at_call = (int *) ira_allocate (ira_allocnos_num * sizeof (int)); memset (allocno_saved_at_call, 0, ira_allocnos_num * sizeof (int)); + subreg_live_points = new subregs_live_points (); ira_traverse_loop_tree (true, ira_loop_tree_root, NULL, process_bb_node_lives); ira_max_point = curr_point; + create_subregs_live_ranges (); create_start_finish_chains (); if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) - print_live_ranges (ira_dump_file); + { + fprintf (ira_dump_file, ";; subreg live points:\n"); + subreg_live_points->dump (ira_dump_file); + print_live_ranges (ira_dump_file); + } /* Clean up. */ + delete subreg_live_points; ira_free (allocno_saved_at_call); sparseset_free (objects_live); sparseset_free (allocnos_processed); diff --git a/gcc/ira.cc b/gcc/ira.cc index c7f27b17002..9ea57d3b1ea 100644 --- a/gcc/ira.cc +++ b/gcc/ira.cc @@ -2623,7 +2623,7 @@ static void check_allocation (void) { ira_allocno_t a; - int hard_regno, nregs, conflict_nregs; + int hard_regno; ira_allocno_iterator ai; FOR_EACH_ALLOCNO (a, ai) @@ -2634,28 +2634,18 @@ check_allocation (void) if (ALLOCNO_CAP_MEMBER (a) != NULL || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0) continue; - nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a)); - if (nregs == 1) - /* We allocated a single hard register. */ - n = 1; - else if (n > 1) - /* We allocated multiple hard registers, and we will test - conflicts in a granularity of single hard regs. 
*/ - nregs = 1; for (i = 0; i < n; i++) { ira_object_t obj = ALLOCNO_OBJECT (a, i); ira_object_t conflict_obj; ira_object_conflict_iterator oci; - int this_regno = hard_regno; - if (n > 1) - { - if (REG_WORDS_BIG_ENDIAN) - this_regno += n - i - 1; - else - this_regno += i; - } + int this_regno; + if (REG_WORDS_BIG_ENDIAN) + this_regno = hard_regno + ALLOCNO_NREGS (a) - 1 - OBJECT_START (obj) + - OBJECT_NREGS (obj) + 1; + else + this_regno = hard_regno + OBJECT_START (obj); FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); @@ -2665,24 +2655,18 @@ check_allocation (void) if (ira_soft_conflict (a, conflict_a)) continue; - conflict_nregs = hard_regno_nregs (conflict_hard_regno, - ALLOCNO_MODE (conflict_a)); - - if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1 - && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a)) - { - if (REG_WORDS_BIG_ENDIAN) - conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a) - - OBJECT_SUBWORD (conflict_obj) - 1); - else - conflict_hard_regno += OBJECT_SUBWORD (conflict_obj); - conflict_nregs = 1; - } + if (REG_WORDS_BIG_ENDIAN) + conflict_hard_regno = conflict_hard_regno + + ALLOCNO_NREGS (conflict_a) - 1 + - OBJECT_START (conflict_obj) + - OBJECT_NREGS (conflict_obj) + 1; + else + conflict_hard_regno + = conflict_hard_regno + OBJECT_START (conflict_obj); - if ((conflict_hard_regno <= this_regno - && this_regno < conflict_hard_regno + conflict_nregs) - || (this_regno <= conflict_hard_regno - && conflict_hard_regno < this_regno + nregs)) + if (!(this_regno + OBJECT_NREGS (obj) <= conflict_hard_regno + || conflict_hard_regno + OBJECT_NREGS (conflict_obj) + <= this_regno)) { fprintf (stderr, "bad allocation for %d and %d\n", ALLOCNO_REGNO (a), ALLOCNO_REGNO (conflict_a));
From patchwork Sun Nov 12 12:08:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 164247
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH V3 4/7] ira: Support subreg copy
Date: Sun, 12 Nov 2023 20:08:14 +0800
Message-Id: <20231112120817.2635864-5-lehua.ding@rivai.ai>
In-Reply-To: <20231112120817.2635864-1-lehua.ding@rivai.ai>
References: <20231112120817.2635864-1-lehua.ding@rivai.ai>

This patch changes copy creation so that copies connect objects instead of
allocnos.

gcc/ChangeLog:

	* ira-build.cc (find_allocno_copy): Removed.
	(find_object): New.
	(ira_create_copy): Adjust.
	(add_allocno_copy_to_list): Adjust.
	(swap_allocno_copy_ends_if_necessary): Adjust.
	(ira_add_allocno_copy): Adjust.
	(print_copy): Adjust.
	(print_allocno_copies): Adjust.
	(ira_flattening): Adjust.
	* ira-color.cc (INCLUDE_VECTOR): Include vector.
	(struct allocno_color_data): Adjust.
	(struct allocno_hard_regs_subnode): Adjust.
	(form_allocno_hard_regs_nodes_forest): Adjust.
	(update_left_conflict_sizes_p): Adjust.
	(struct update_cost_queue_elem): Adjust.
	(queue_update_cost): Adjust.
	(get_next_update_cost): Adjust.
	(update_costs_from_allocno): Adjust.
	(update_conflict_hard_regno_costs): Adjust.
	(assign_hard_reg): Adjust.
	(objects_conflict_by_live_ranges_p): New.
	(allocno_thread_conflict_p): Adjust.
	(object_thread_conflict_p): Ditto.
	(merge_threads): Ditto.
	(form_threads_from_copies): Ditto.
	(form_threads_from_bucket): Ditto.
	(form_threads_from_colorable_allocno): Ditto.
	(init_allocno_threads): Ditto.
	(add_allocno_to_bucket): Ditto.
	(delete_allocno_from_bucket): Ditto.
	(allocno_copy_cost_saving): Ditto.
	(color_allocnos): Ditto.
	(color_pass): Ditto.
	(update_curr_costs): Ditto.
	(coalesce_allocnos): Ditto.
	(ira_reuse_stack_slot): Ditto.
	(ira_initiate_assign): Ditto.
	(ira_finish_assign): Ditto.
	* ira-conflicts.cc (allocnos_conflict_for_copy_p): Ditto.
	(REG_SUBREG_P): Ditto.
	(subreg_move_p): New.
	(regs_non_conflict_for_copy_p): New.
	(subreg_reg_align_and_times_p): New.
	(process_regs_for_copy): Ditto.
	(add_insn_allocno_copies): Ditto.
	(propagate_copies): Ditto.
	* ira-emit.cc (add_range_and_copies_from_move_list): Ditto.
	* ira-int.h (struct ira_allocno_copy): Ditto.
	(ira_add_allocno_copy): Ditto.
	(find_object): Exported.
(subreg_move_p): Exported. * ira.cc (print_redundant_copies): Exported. --- gcc/ira-build.cc | 154 +++++++----- gcc/ira-color.cc | 541 +++++++++++++++++++++++++++++++------------ gcc/ira-conflicts.cc | 173 +++++++++++--- gcc/ira-emit.cc | 10 +- gcc/ira-int.h | 10 +- gcc/ira.cc | 5 +- 6 files changed, 646 insertions(+), 247 deletions(-) diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc index a32693e69e4..13f0f7336ed 100644 --- a/gcc/ira-build.cc +++ b/gcc/ira-build.cc @@ -36,9 +36,6 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "subreg-live-range.h" -static ira_copy_t find_allocno_copy (ira_allocno_t, ira_allocno_t, rtx_insn *, - ira_loop_tree_node_t); - /* The root of the loop tree corresponding to the all function. */ ira_loop_tree_node_t ira_loop_tree_root; @@ -520,6 +517,16 @@ find_object (ira_allocno_t a, poly_int64 offset, poly_int64 size) return find_object (a, subreg_start, subreg_nregs); } +/* Return object in allocno A for REG. */ +ira_object_t +find_object (ira_allocno_t a, rtx reg) +{ + if (has_subreg_object_p (a) && read_modify_subreg_p (reg)) + return find_object (a, SUBREG_BYTE (reg), GET_MODE_SIZE (GET_MODE (reg))); + else + return find_object (a, 0, ALLOCNO_NREGS (a)); +} + /* Return the object in allocno A which match START & NREGS. Create when not found. */ ira_object_t @@ -1503,27 +1510,36 @@ initiate_copies (void) /* Return copy connecting A1 and A2 and originated from INSN of LOOP_TREE_NODE if any. 
*/ static ira_copy_t -find_allocno_copy (ira_allocno_t a1, ira_allocno_t a2, rtx_insn *insn, +find_allocno_copy (ira_object_t obj1, ira_object_t obj2, rtx_insn *insn, ira_loop_tree_node_t loop_tree_node) { ira_copy_t cp, next_cp; - ira_allocno_t another_a; + ira_object_t another_obj; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); for (cp = ALLOCNO_COPIES (a1); cp != NULL; cp = next_cp) { - if (cp->first == a1) + ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second); + if (first_a == a1) { next_cp = cp->next_first_allocno_copy; - another_a = cp->second; + if (cp->first == obj1) + another_obj = cp->second; + else + continue; } - else if (cp->second == a1) + else if (second_a == a1) { next_cp = cp->next_second_allocno_copy; - another_a = cp->first; + if (cp->second == obj1) + another_obj = cp->first; + else + continue; } else gcc_unreachable (); - if (another_a == a2 && cp->insn == insn + if (another_obj == obj2 && cp->insn == insn && cp->loop_tree_node == loop_tree_node) return cp; } @@ -1533,7 +1549,7 @@ find_allocno_copy (ira_allocno_t a1, ira_allocno_t a2, rtx_insn *insn, /* Create and return copy with given attributes LOOP_TREE_NODE, FIRST, SECOND, FREQ, CONSTRAINT_P, and INSN. 
*/ ira_copy_t -ira_create_copy (ira_allocno_t first, ira_allocno_t second, int freq, +ira_create_copy (ira_object_t first, ira_object_t second, int freq, bool constraint_p, rtx_insn *insn, ira_loop_tree_node_t loop_tree_node) { @@ -1557,28 +1573,29 @@ ira_create_copy (ira_allocno_t first, ira_allocno_t second, int freq, static void add_allocno_copy_to_list (ira_copy_t cp) { - ira_allocno_t first = cp->first, second = cp->second; + ira_object_t first = cp->first, second = cp->second; + ira_allocno_t a1 = OBJECT_ALLOCNO (first), a2 = OBJECT_ALLOCNO (second); cp->prev_first_allocno_copy = NULL; cp->prev_second_allocno_copy = NULL; - cp->next_first_allocno_copy = ALLOCNO_COPIES (first); + cp->next_first_allocno_copy = ALLOCNO_COPIES (a1); if (cp->next_first_allocno_copy != NULL) { - if (cp->next_first_allocno_copy->first == first) + if (OBJECT_ALLOCNO (cp->next_first_allocno_copy->first) == a1) cp->next_first_allocno_copy->prev_first_allocno_copy = cp; else cp->next_first_allocno_copy->prev_second_allocno_copy = cp; } - cp->next_second_allocno_copy = ALLOCNO_COPIES (second); + cp->next_second_allocno_copy = ALLOCNO_COPIES (a2); if (cp->next_second_allocno_copy != NULL) { - if (cp->next_second_allocno_copy->second == second) + if (OBJECT_ALLOCNO (cp->next_second_allocno_copy->second) == a2) cp->next_second_allocno_copy->prev_second_allocno_copy = cp; else cp->next_second_allocno_copy->prev_first_allocno_copy = cp; } - ALLOCNO_COPIES (first) = cp; - ALLOCNO_COPIES (second) = cp; + ALLOCNO_COPIES (a1) = cp; + ALLOCNO_COPIES (a2) = cp; } /* Make a copy CP a canonical copy where number of the @@ -1586,7 +1603,8 @@ add_allocno_copy_to_list (ira_copy_t cp) static void swap_allocno_copy_ends_if_necessary (ira_copy_t cp) { - if (ALLOCNO_NUM (cp->first) <= ALLOCNO_NUM (cp->second)) + if (ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first)) + <= ALLOCNO_NUM (OBJECT_ALLOCNO (cp->second))) return; std::swap (cp->first, cp->second); @@ -1595,11 +1613,10 @@ swap_allocno_copy_ends_if_necessary 
(ira_copy_t cp) } /* Create (or update frequency if the copy already exists) and return - the copy of allocnos FIRST and SECOND with frequency FREQ - corresponding to move insn INSN (if any) and originated from - LOOP_TREE_NODE. */ + the copy of objects FIRST and SECOND with frequency FREQ corresponding to + move insn INSN (if any) and originated from LOOP_TREE_NODE. */ ira_copy_t -ira_add_allocno_copy (ira_allocno_t first, ira_allocno_t second, int freq, +ira_add_allocno_copy (ira_object_t first, ira_object_t second, int freq, bool constraint_p, rtx_insn *insn, ira_loop_tree_node_t loop_tree_node) { @@ -1618,15 +1635,38 @@ ira_add_allocno_copy (ira_allocno_t first, ira_allocno_t second, int freq, return cp; } +/* Create (or update frequency if the copy already exists) and return + the copy of allocnos FIRST and SECOND with frequency FREQ + corresponding to move insn INSN (if any) and originated from + LOOP_TREE_NODE. */ +ira_copy_t +ira_add_allocno_copy (ira_allocno_t first, ira_allocno_t second, int freq, + bool constraint_p, rtx_insn *insn, + ira_loop_tree_node_t loop_tree_node) +{ + ira_object_t obj1 = get_full_object (first); + ira_object_t obj2 = get_full_object (second); + gcc_assert (obj1 != NULL && obj2 != NULL); + return ira_add_allocno_copy (obj1, obj2, freq, constraint_p, insn, + loop_tree_node); +} + /* Print info about copy CP into file F. */ static void print_copy (FILE *f, ira_copy_t cp) { - fprintf (f, " cp%d:a%d(r%d)<->a%d(r%d)@%d:%s\n", cp->num, - ALLOCNO_NUM (cp->first), ALLOCNO_REGNO (cp->first), - ALLOCNO_NUM (cp->second), ALLOCNO_REGNO (cp->second), cp->freq, - cp->insn != NULL - ? "move" : cp->constraint_p ? 
"constraint" : "shuffle"); + ira_allocno_t a1 = OBJECT_ALLOCNO (cp->first); + ira_allocno_t a2 = OBJECT_ALLOCNO (cp->second); + fprintf (f, " cp%d:a%d(r%d", cp->num, ALLOCNO_NUM (a1), ALLOCNO_REGNO (a1)); + if (ALLOCNO_NREGS (a1) != OBJECT_NREGS (cp->first)) + fprintf (f, "_obj%d", OBJECT_INDEX (cp->first)); + fprintf (f, ")<->a%d(r%d", ALLOCNO_NUM (a2), ALLOCNO_REGNO (a2)); + if (ALLOCNO_NREGS (a2) != OBJECT_NREGS (cp->second)) + fprintf (f, "_obj%d", OBJECT_INDEX (cp->second)); + fprintf (f, ")@%d:%s\n", cp->freq, + cp->insn != NULL ? "move" + : cp->constraint_p ? "constraint" + : "shuffle"); } DEBUG_FUNCTION void @@ -1673,24 +1713,25 @@ ira_debug_copies (void) static void print_allocno_copies (FILE *f, ira_allocno_t a) { - ira_allocno_t another_a; + ira_object_t another_obj; ira_copy_t cp, next_cp; fprintf (f, " a%d(r%d):", ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { - if (cp->first == a) + if (OBJECT_ALLOCNO (cp->first) == a) { next_cp = cp->next_first_allocno_copy; - another_a = cp->second; + another_obj = cp->second; } - else if (cp->second == a) + else if (OBJECT_ALLOCNO (cp->second) == a) { next_cp = cp->next_second_allocno_copy; - another_a = cp->first; + another_obj = cp->first; } else gcc_unreachable (); + ira_allocno_t another_a = OBJECT_ALLOCNO (another_obj); fprintf (f, " cp%d:a%d(r%d)@%d", cp->num, ALLOCNO_NUM (another_a), ALLOCNO_REGNO (another_a), cp->freq); } @@ -3480,25 +3521,21 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) copies. 
*/ FOR_EACH_COPY (cp, ci) { - if (ALLOCNO_CAP_MEMBER (cp->first) != NULL - || ALLOCNO_CAP_MEMBER (cp->second) != NULL) + ira_allocno_t a1 = OBJECT_ALLOCNO (cp->first); + ira_allocno_t a2 = OBJECT_ALLOCNO (cp->second); + if (ALLOCNO_CAP_MEMBER (a1) != NULL || ALLOCNO_CAP_MEMBER (a2) != NULL) { if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) - fprintf - (ira_dump_file, " Remove cp%d:%c%dr%d-%c%dr%d\n", - cp->num, ALLOCNO_CAP_MEMBER (cp->first) != NULL ? 'c' : 'a', - ALLOCNO_NUM (cp->first), - REGNO (allocno_emit_reg (cp->first)), - ALLOCNO_CAP_MEMBER (cp->second) != NULL ? 'c' : 'a', - ALLOCNO_NUM (cp->second), - REGNO (allocno_emit_reg (cp->second))); + fprintf (ira_dump_file, " Remove cp%d:%c%dr%d-%c%dr%d\n", + cp->num, ALLOCNO_CAP_MEMBER (a1) != NULL ? 'c' : 'a', + ALLOCNO_NUM (a1), REGNO (allocno_emit_reg (a1)), + ALLOCNO_CAP_MEMBER (a2) != NULL ? 'c' : 'a', + ALLOCNO_NUM (a2), REGNO (allocno_emit_reg (a2))); cp->loop_tree_node = NULL; continue; } - first - = regno_top_level_allocno_map[REGNO (allocno_emit_reg (cp->first))]; - second - = regno_top_level_allocno_map[REGNO (allocno_emit_reg (cp->second))]; + first = regno_top_level_allocno_map[REGNO (allocno_emit_reg (a1))]; + second = regno_top_level_allocno_map[REGNO (allocno_emit_reg (a2))]; node = cp->loop_tree_node; if (node == NULL) keep_p = true; /* It copy generated in ira-emit.cc. */ @@ -3506,8 +3543,8 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) { /* Check that the copy was not propagated from level on which we will have different pseudos. 
*/ - node_first = node->regno_allocno_map[ALLOCNO_REGNO (cp->first)]; - node_second = node->regno_allocno_map[ALLOCNO_REGNO (cp->second)]; + node_first = node->regno_allocno_map[ALLOCNO_REGNO (a1)]; + node_second = node->regno_allocno_map[ALLOCNO_REGNO (a2)]; keep_p = ((REGNO (allocno_emit_reg (first)) == REGNO (allocno_emit_reg (node_first))) && (REGNO (allocno_emit_reg (second)) @@ -3516,18 +3553,18 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) if (keep_p) { cp->loop_tree_node = ira_loop_tree_root; - cp->first = first; - cp->second = second; + cp->first = find_object_anyway (first, OBJECT_START (cp->first), + OBJECT_NREGS (cp->first)); + cp->second = find_object_anyway (second, OBJECT_START (cp->second), + OBJECT_NREGS (cp->second)); } else { cp->loop_tree_node = NULL; if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) fprintf (ira_dump_file, " Remove cp%d:a%dr%d-a%dr%d\n", - cp->num, ALLOCNO_NUM (cp->first), - REGNO (allocno_emit_reg (cp->first)), - ALLOCNO_NUM (cp->second), - REGNO (allocno_emit_reg (cp->second))); + cp->num, ALLOCNO_NUM (a1), REGNO (allocno_emit_reg (a1)), + ALLOCNO_NUM (a2), REGNO (allocno_emit_reg (a2))); } } /* Remove unnecessary allocnos on lower levels of the loop tree. */ @@ -3563,9 +3600,10 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) finish_copy (cp); continue; } - ira_assert - (ALLOCNO_LOOP_TREE_NODE (cp->first) == ira_loop_tree_root - && ALLOCNO_LOOP_TREE_NODE (cp->second) == ira_loop_tree_root); + ira_assert (ALLOCNO_LOOP_TREE_NODE (OBJECT_ALLOCNO (cp->first)) + == ira_loop_tree_root + && ALLOCNO_LOOP_TREE_NODE (OBJECT_ALLOCNO (cp->second)) + == ira_loop_tree_root); add_allocno_copy_to_list (cp); swap_allocno_copy_ends_if_necessary (cp); } diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc index 8aed25144b9..099312bcdb3 100644 --- a/gcc/ira-color.cc +++ b/gcc/ira-color.cc @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3. 
If not see #include "config.h" #define INCLUDE_MAP +#define INCLUDE_VECTOR #include "system.h" #include "coretypes.h" #include "backend.h" @@ -150,11 +151,18 @@ struct allocno_color_data struct update_cost_record *update_cost_records; /* Threads. We collect allocnos connected by copies into threads and try to assign hard regs to allocnos by threads. */ - /* Allocno representing all thread. */ - ira_allocno_t first_thread_allocno; + /* The head objects for all thread. */ + ira_object_t *first_thread_objects; /* Allocnos in thread forms a cycle list through the following member. */ - ira_allocno_t next_thread_allocno; + ira_object_t *next_thread_objects; + /* The allocno all thread shared. */ + ira_allocno_t first_thread_allocno; + /* The offset start relative to the first_thread_allocno. */ + int first_thread_offset; + /* All allocnos belong to the thread. */ + bitmap thread_allocnos; + /* The freq sum of all thread allocno. */ /* All thread frequency. Defined only for first thread allocno. */ int thread_freq; /* Sum of frequencies of hard register preferences of the allocno. */ @@ -188,6 +196,9 @@ static bitmap coloring_allocno_bitmap; allocnos. */ static bitmap consideration_allocno_bitmap; +/* Bitmap of allocnos which is not trivially colorable. */ +static bitmap uncolorable_allocno_set; + /* All allocnos sorted according their priorities. */ static ira_allocno_t *sorted_allocnos; @@ -647,9 +658,13 @@ struct allocno_hard_regs_subnode Overall conflict size is left_conflict_subnodes_size + MIN (max_node_impact - left_conflict_subnodes_size, - left_conflict_size) + left_conflict_size) + Use MIN here to ensure that the total conflict does not exceed + max_node_impact. */ + /* The total conflict size of subnodes. */ short left_conflict_subnodes_size; + /* The maximum number of registers that the current node can use. 
*/ short max_node_impact; }; @@ -758,6 +773,8 @@ form_allocno_hard_regs_nodes_forest (void) collect_allocno_hard_regs_cover (hard_regs_roots, allocno_data->profitable_hard_regs); allocno_hard_regs_node = NULL; + /* Find the ancestor node in forest which cover all nodes. The ancestor is + a smallest superset of profitable_hard_regs. */ for (j = 0; hard_regs_node_vec.iterate (j, &node); j++) allocno_hard_regs_node = (j == 0 @@ -990,6 +1007,8 @@ update_left_conflict_sizes_p (ira_allocno_t a, removed_node->hard_regs->set)); start = node_preorder_num * allocno_hard_regs_nodes_num; i = allocno_hard_regs_subnode_index[start + removed_node->preorder_num]; + /* i < 0 means removed_node is parent of node instead of node is the parent of + removed_node. */ if (i < 0) i = 0; subnodes = allocno_hard_regs_subnodes + data->hard_regs_subnodes_start; @@ -999,6 +1018,7 @@ update_left_conflict_sizes_p (ira_allocno_t a, - subnodes[i].left_conflict_subnodes_size, subnodes[i].left_conflict_size)); subnodes[i].left_conflict_size -= size; + /* Update all ancestors for subnode i. */ for (;;) { conflict_size @@ -1242,6 +1262,9 @@ struct update_cost_queue_elem connecting this allocno to the one being allocated. */ int divisor; + /* Hard register regno assigned to current ALLOCNO. */ + int hard_regno; + /* Allocno from which we started chaining costs of connected allocnos. */ ira_allocno_t start; @@ -1308,7 +1331,7 @@ start_update_cost (void) /* Add (ALLOCNO, START, FROM, DIVISOR) to the end of update_cost_queue, unless ALLOCNO is already in the queue, or has NO_REGS class. 
*/ static inline void -queue_update_cost (ira_allocno_t allocno, ira_allocno_t start, +queue_update_cost (ira_allocno_t allocno, int hard_regno, ira_allocno_t start, ira_allocno_t from, int divisor) { struct update_cost_queue_elem *elem; @@ -1317,6 +1340,7 @@ queue_update_cost (ira_allocno_t allocno, ira_allocno_t start, if (elem->check != update_cost_check && ALLOCNO_CLASS (allocno) != NO_REGS) { + elem->hard_regno = hard_regno; elem->check = update_cost_check; elem->start = start; elem->from = from; @@ -1334,8 +1358,8 @@ queue_update_cost (ira_allocno_t allocno, ira_allocno_t start, false if the queue was empty, otherwise make (*ALLOCNO, *START, *FROM, *DIVISOR) describe the removed element. */ static inline bool -get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *start, - ira_allocno_t *from, int *divisor) +get_next_update_cost (ira_allocno_t *allocno, int *hard_regno, + ira_allocno_t *start, ira_allocno_t *from, int *divisor) { struct update_cost_queue_elem *elem; @@ -1348,6 +1372,8 @@ get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *start, *from = elem->from; *divisor = elem->divisor; update_cost_queue = elem->next; + if (hard_regno != NULL) + *hard_regno = elem->hard_regno; return true; } @@ -1449,31 +1475,41 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, enum reg_class rclass, aclass; ira_allocno_t another_allocno, start = allocno, from = NULL; ira_copy_t cp, next_cp; + ira_object_t another_obj; + unsigned int obj_index1, obj_index2; rclass = REGNO_REG_CLASS (hard_regno); do { + gcc_assert (hard_regno >= 0); mode = ALLOCNO_MODE (allocno); ira_init_register_move_cost_if_necessary (mode); for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp) { - if (cp->first == allocno) + if (OBJECT_ALLOCNO (cp->first) == allocno) { + obj_index1 = OBJECT_INDEX (cp->first); + obj_index2 = OBJECT_INDEX (cp->second); next_cp = cp->next_first_allocno_copy; - another_allocno = cp->second; + another_obj = cp->second; } - else if 
(cp->second == allocno) + else if (OBJECT_ALLOCNO (cp->second) == allocno) { + obj_index1 = OBJECT_INDEX (cp->second); + obj_index2 = OBJECT_INDEX (cp->first); next_cp = cp->next_second_allocno_copy; - another_allocno = cp->first; + another_obj = cp->first; } else gcc_unreachable (); + another_allocno = OBJECT_ALLOCNO (another_obj); if (another_allocno == from || (ALLOCNO_COLOR_DATA (another_allocno) != NULL - && (ALLOCNO_COLOR_DATA (allocno)->first_thread_allocno - != ALLOCNO_COLOR_DATA (another_allocno)->first_thread_allocno))) + && (ALLOCNO_COLOR_DATA (allocno) + ->first_thread_objects[obj_index1] + != ALLOCNO_COLOR_DATA (another_allocno) + ->first_thread_objects[obj_index2]))) continue; aclass = ALLOCNO_CLASS (another_allocno); @@ -1482,6 +1518,8 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, || ALLOCNO_ASSIGNED_P (another_allocno)) continue; + ira_allocno_t first_allocno = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_allocno = OBJECT_ALLOCNO (cp->second); /* If we have different modes use the smallest one. It is a sub-register move. It is hard to predict what LRA will reload (the pseudo or its sub-register) but LRA @@ -1489,14 +1527,21 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, register classes bigger modes might be invalid, e.g. DImode for AREG on x86. For such cases the register move cost will be maximal. */ - mode = narrower_subreg_mode (ALLOCNO_MODE (cp->first), - ALLOCNO_MODE (cp->second)); + mode = narrower_subreg_mode (ALLOCNO_MODE (first_allocno), + ALLOCNO_MODE (second_allocno)); ira_init_register_move_cost_if_necessary (mode); - cost = (cp->second == allocno - ? ira_register_move_cost[mode][rclass][aclass] - : ira_register_move_cost[mode][aclass][rclass]); + cost = (second_allocno == allocno + ? ira_register_move_cost[mode][rclass][aclass] + : ira_register_move_cost[mode][aclass][rclass]); + /* Adjust the hard regno for another_allocno for subreg copy. 
*/ + int start_regno = hard_regno; + if (cp->insn && subreg_move_p (cp->first, cp->second)) + { + int diff = OBJECT_START (cp->first) - OBJECT_START (cp->second); + start_regno += (first_allocno == allocno ? diff : -diff); + } if (decr_p) cost = -cost; @@ -1505,25 +1550,30 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, if (internal_flag_ira_verbose > 5 && ira_dump_file != NULL) fprintf (ira_dump_file, - " a%dr%d (hr%d): update cost by %d, conflict cost by %d\n", - ALLOCNO_NUM (another_allocno), ALLOCNO_REGNO (another_allocno), - hard_regno, update_cost, update_conflict_cost); + " a%dr%d (hr%d): update cost by %d, conflict " + "cost by %d\n", + ALLOCNO_NUM (another_allocno), + ALLOCNO_REGNO (another_allocno), start_regno, update_cost, + update_conflict_cost); if (update_cost == 0) continue; - if (! update_allocno_cost (another_allocno, hard_regno, - update_cost, update_conflict_cost)) + if (start_regno < 0 + || (start_regno + ALLOCNO_NREGS (another_allocno)) + > FIRST_PSEUDO_REGISTER + || !update_allocno_cost (another_allocno, start_regno, + update_cost, update_conflict_cost)) continue; - queue_update_cost (another_allocno, start, allocno, + queue_update_cost (another_allocno, start_regno, start, allocno, divisor * COST_HOP_DIVISOR); if (record_p && ALLOCNO_COLOR_DATA (another_allocno) != NULL) ALLOCNO_COLOR_DATA (another_allocno)->update_cost_records - = get_update_cost_record (hard_regno, divisor, - ALLOCNO_COLOR_DATA (another_allocno) - ->update_cost_records); + = get_update_cost_record ( + start_regno, divisor, + ALLOCNO_COLOR_DATA (another_allocno)->update_cost_records); } - } - while (get_next_update_cost (&allocno, &start, &from, &divisor)); + } while ( + get_next_update_cost (&allocno, &hard_regno, &start, &from, &divisor)); } /* Decrease preferred ALLOCNO hard register costs and costs of @@ -1632,23 +1682,25 @@ update_conflict_hard_regno_costs (int *costs, enum reg_class aclass, enum reg_class another_aclass; ira_allocno_t allocno, 
another_allocno, start, from; ira_copy_t cp, next_cp; + ira_object_t another_obj; - while (get_next_update_cost (&allocno, &start, &from, &divisor)) + while (get_next_update_cost (&allocno, NULL, &start, &from, &divisor)) for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp) { - if (cp->first == allocno) + if (OBJECT_ALLOCNO (cp->first) == allocno) { next_cp = cp->next_first_allocno_copy; - another_allocno = cp->second; + another_obj = cp->second; } - else if (cp->second == allocno) + else if (OBJECT_ALLOCNO (cp->second) == allocno) { next_cp = cp->next_second_allocno_copy; - another_allocno = cp->first; + another_obj = cp->first; } else gcc_unreachable (); + another_allocno = OBJECT_ALLOCNO (another_obj); another_aclass = ALLOCNO_CLASS (another_allocno); if (another_allocno == from || ALLOCNO_ASSIGNED_P (another_allocno) @@ -1696,7 +1748,8 @@ update_conflict_hard_regno_costs (int *costs, enum reg_class aclass, * COST_HOP_DIVISOR * COST_HOP_DIVISOR * COST_HOP_DIVISOR)) - queue_update_cost (another_allocno, start, from, divisor * COST_HOP_DIVISOR); + queue_update_cost (another_allocno, -1, start, from, + divisor * COST_HOP_DIVISOR); } } @@ -2034,6 +2087,11 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); + + if (ALLOCNO_COLOR_DATA (a)->first_thread_allocno + == ALLOCNO_COLOR_DATA (conflict_a)->first_thread_allocno) + continue; + enum reg_class conflict_aclass; allocno_color_data_t data = ALLOCNO_COLOR_DATA (conflict_a); @@ -2225,7 +2283,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) continue; full_costs[j] -= conflict_costs[k]; } - queue_update_cost (conflict_a, conflict_a, NULL, COST_HOP_DIVISOR); + queue_update_cost (conflict_a, -1, conflict_a, NULL, + COST_HOP_DIVISOR); } } } @@ -2239,7 +2298,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) if (! 
retry_p) { start_update_cost (); - queue_update_cost (a, a, NULL, COST_HOP_DIVISOR); + queue_update_cost (a, -1, a, NULL, COST_HOP_DIVISOR); update_conflict_hard_regno_costs (full_costs, aclass, false); } min_cost = min_full_cost = INT_MAX; @@ -2264,17 +2323,17 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) if (!HONOR_REG_ALLOC_ORDER) { if ((saved_nregs = calculate_saved_nregs (hard_regno, mode)) != 0) - /* We need to save/restore the hard register in - epilogue/prologue. Therefore we increase the cost. */ - { - rclass = REGNO_REG_CLASS (hard_regno); - add_cost = ((ira_memory_move_cost[mode][rclass][0] - + ira_memory_move_cost[mode][rclass][1]) + /* We need to save/restore the hard register in + epilogue/prologue. Therefore we increase the cost. */ + { + rclass = REGNO_REG_CLASS (hard_regno); + add_cost = ((ira_memory_move_cost[mode][rclass][0] + + ira_memory_move_cost[mode][rclass][1]) * saved_nregs / hard_regno_nregs (hard_regno, mode) - 1); - cost += add_cost; - full_cost += add_cost; - } + cost += add_cost; + full_cost += add_cost; + } } if (min_cost > cost) min_cost = cost; @@ -2393,54 +2452,173 @@ copy_freq_compare_func (const void *v1p, const void *v2p) return cp1->num - cp2->num; } - +/* Return true if object OBJ1 conflict with OBJ2. */ +static bool +objects_conflict_by_live_ranges_p (ira_object_t obj1, ira_object_t obj2) +{ + rtx reg1, reg2; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + if (a1 == a2) + return false; + reg1 = regno_reg_rtx[ALLOCNO_REGNO (a1)]; + reg2 = regno_reg_rtx[ALLOCNO_REGNO (a2)]; + if (reg1 != NULL && reg2 != NULL + && ORIGINAL_REGNO (reg1) == ORIGINAL_REGNO (reg2)) + return false; + + /* We don't keep live ranges for caps because they can be quite big. + Use ranges of non-cap allocno from which caps are created. 
*/ + a1 = get_cap_member (a1); + a2 = get_cap_member (a2); + + obj1 = find_object (a1, OBJECT_START (obj1), OBJECT_NREGS (obj1)); + obj2 = find_object (a2, OBJECT_START (obj2), OBJECT_NREGS (obj2)); + return ira_live_ranges_intersect_p (OBJECT_LIVE_RANGES (obj1), + OBJECT_LIVE_RANGES (obj2)); +} -/* Return true if any allocno from thread of A1 conflicts with any - allocno from thread A2. */ +/* Return true if any object from thread of OBJ1 conflicts with any + object from thread OBJ2. */ static bool -allocno_thread_conflict_p (ira_allocno_t a1, ira_allocno_t a2) +object_thread_conflict_p (ira_object_t obj1, ira_object_t obj2) { - ira_allocno_t a, conflict_a; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + + gcc_assert ( + obj1 != obj2 + && ALLOCNO_COLOR_DATA (a1)->first_thread_objects[OBJECT_INDEX (obj1)] + == obj1 + && ALLOCNO_COLOR_DATA (a2)->first_thread_objects[OBJECT_INDEX (obj2)] + == obj2); + + ira_allocno_t first_thread_allocno1 + = ALLOCNO_COLOR_DATA (a1)->first_thread_allocno; + ira_allocno_t first_thread_allocno2 + = ALLOCNO_COLOR_DATA (a2)->first_thread_allocno; + + int offset + = (ALLOCNO_COLOR_DATA (a1)->first_thread_offset + OBJECT_START (obj1)) + - (ALLOCNO_COLOR_DATA (a2)->first_thread_offset + OBJECT_START (obj2)); + + /* Update first_thread_allocno and thread_allocnos info. 
*/ + bitmap thread_allocnos1 + = ALLOCNO_COLOR_DATA (first_thread_allocno1)->thread_allocnos; + bitmap thread_allocnos2 + = ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_allocnos; + gcc_assert (!bitmap_empty_p (thread_allocnos1) + && !bitmap_empty_p (thread_allocnos2)); + std::vector<ira_object_t> thread_objects_2; - for (a = ALLOCNO_COLOR_DATA (a2)->next_thread_allocno;; - a = ALLOCNO_COLOR_DATA (a)->next_thread_allocno) + unsigned int i; + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP (thread_allocnos2, 0, i, bi) { - for (conflict_a = ALLOCNO_COLOR_DATA (a1)->next_thread_allocno;; - conflict_a = ALLOCNO_COLOR_DATA (conflict_a)->next_thread_allocno) - { - if (allocnos_conflict_by_live_ranges_p (a, conflict_a)) - return true; - if (conflict_a == a1) - break; - } - if (a == a2) - break; + ira_allocno_object_iterator oi; + ira_object_t obj; + FOR_EACH_ALLOCNO_OBJECT (ira_allocnos[i], obj, oi) + thread_objects_2.push_back (obj); + } + + EXECUTE_IF_SET_IN_BITMAP (thread_allocnos1, 0, i, bi) + { + ira_allocno_object_iterator oi; + ira_object_t obj; + ira_allocno_t a = ira_allocnos[i]; + FOR_EACH_ALLOCNO_OBJECT (ira_allocnos[i], obj, oi) + for (ira_object_t other_obj : thread_objects_2) + { + int thread_start1 = ALLOCNO_COLOR_DATA (a)->first_thread_offset + + OBJECT_START (obj); + int thread_start2 = ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (other_obj)) + ->first_thread_offset + + offset + OBJECT_START (other_obj); + if (!(thread_start1 + OBJECT_NREGS (obj) <= thread_start2 + || thread_start2 + OBJECT_NREGS (other_obj) <= thread_start1) + && objects_conflict_by_live_ranges_p (obj, other_obj)) + return true; + } } + return false; } -/* Merge two threads given correspondingly by their first allocnos T1 - and T2 (more accurately merging T2 into T1). */ +/* Merge two threads given correspondingly by their first objects OBJ1 + and OBJ2 (more accurately merging OBJ2 into OBJ1). 
*/ static void -merge_threads (ira_allocno_t t1, ira_allocno_t t2) +merge_threads (ira_object_t obj1, ira_object_t obj2) { - ira_allocno_t a, next, last; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + + gcc_assert ( + obj1 != obj2 + && ALLOCNO_COLOR_DATA (a1)->first_thread_objects[OBJECT_INDEX (obj1)] + == obj1 + && ALLOCNO_COLOR_DATA (a2)->first_thread_objects[OBJECT_INDEX (obj2)] + == obj2); + + ira_allocno_t first_thread_allocno1 + = ALLOCNO_COLOR_DATA (a1)->first_thread_allocno; + ira_allocno_t first_thread_allocno2 + = ALLOCNO_COLOR_DATA (a2)->first_thread_allocno; + + gcc_assert (first_thread_allocno1 != first_thread_allocno2); - gcc_assert (t1 != t2 - && ALLOCNO_COLOR_DATA (t1)->first_thread_allocno == t1 - && ALLOCNO_COLOR_DATA (t2)->first_thread_allocno == t2); - for (last = t2, a = ALLOCNO_COLOR_DATA (t2)->next_thread_allocno;; - a = ALLOCNO_COLOR_DATA (a)->next_thread_allocno) + int offset + = (ALLOCNO_COLOR_DATA (a1)->first_thread_offset + OBJECT_START (obj1)) + - (ALLOCNO_COLOR_DATA (a2)->first_thread_offset + OBJECT_START (obj2)); + + /* Update first_thread_allocno and thread_allocnos info. */ + unsigned int i; + bitmap_iterator bi; + bitmap thread_allocnos2 + = ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_allocnos; + bitmap thread_allocnos1 + = ALLOCNO_COLOR_DATA (first_thread_allocno1)->thread_allocnos; + gcc_assert (!bitmap_empty_p (thread_allocnos1) + && !bitmap_empty_p (thread_allocnos2)); + EXECUTE_IF_SET_IN_BITMAP (thread_allocnos2, 0, i, bi) + { + ira_allocno_t a = ira_allocnos[i]; + gcc_assert (ALLOCNO_COLOR_DATA (a)->first_thread_allocno + == first_thread_allocno2); + /* Update first_thread_allocno and first_thread_offset filed. 
*/ + ALLOCNO_COLOR_DATA (a)->first_thread_allocno = first_thread_allocno1; + ALLOCNO_COLOR_DATA (a)->first_thread_offset += offset; + bitmap_set_bit (thread_allocnos1, i); + } + bitmap_clear (thread_allocnos2); + ira_free_bitmap (thread_allocnos2); + ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_allocnos = NULL; + + ira_object_t last_obj = obj2; + for (ira_object_t next_obj + = ALLOCNO_COLOR_DATA (a2)->next_thread_objects[OBJECT_INDEX (obj2)]; + ; next_obj = ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (next_obj)) + ->next_thread_objects[OBJECT_INDEX (next_obj)]) { - ALLOCNO_COLOR_DATA (a)->first_thread_allocno = t1; - if (a == t2) + ira_allocno_t next_a = OBJECT_ALLOCNO (next_obj); + ALLOCNO_COLOR_DATA (next_a)->first_thread_objects[OBJECT_INDEX (next_obj)] + = obj1; + gcc_assert (ALLOCNO_COLOR_DATA (next_a)->first_thread_allocno + == first_thread_allocno1); + gcc_assert (bitmap_bit_p (thread_allocnos1, ALLOCNO_NUM (next_a))); + if (next_obj == obj2) break; - last = a; + last_obj = next_obj; } - next = ALLOCNO_COLOR_DATA (t1)->next_thread_allocno; - ALLOCNO_COLOR_DATA (t1)->next_thread_allocno = t2; - ALLOCNO_COLOR_DATA (last)->next_thread_allocno = next; - ALLOCNO_COLOR_DATA (t1)->thread_freq += ALLOCNO_COLOR_DATA (t2)->thread_freq; + /* Add OBJ2's threads chain to OBJ1. */ + ira_object_t temp_obj + = ALLOCNO_COLOR_DATA (a1)->next_thread_objects[OBJECT_INDEX (obj1)]; + ALLOCNO_COLOR_DATA (a1)->next_thread_objects[OBJECT_INDEX (obj1)] = obj2; + ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (last_obj)) + ->next_thread_objects[OBJECT_INDEX (last_obj)] + = temp_obj; + + ALLOCNO_COLOR_DATA (first_thread_allocno1)->thread_freq + += ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_freq; } /* Create threads by processing CP_NUM copies from sorted copies. 
We @@ -2448,7 +2626,6 @@ merge_threads (ira_allocno_t t1, ira_allocno_t t2) static void form_threads_from_copies (int cp_num) { - ira_allocno_t a, thread1, thread2; ira_copy_t cp; qsort (sorted_copies, cp_num, sizeof (ira_copy_t), copy_freq_compare_func); @@ -2457,33 +2634,43 @@ form_threads_from_copies (int cp_num) for (int i = 0; i < cp_num; i++) { cp = sorted_copies[i]; - thread1 = ALLOCNO_COLOR_DATA (cp->first)->first_thread_allocno; - thread2 = ALLOCNO_COLOR_DATA (cp->second)->first_thread_allocno; - if (thread1 == thread2) + ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second); + ira_object_t thread1 = ALLOCNO_COLOR_DATA (first_a) + ->first_thread_objects[OBJECT_INDEX (cp->first)]; + ira_object_t thread2 + = ALLOCNO_COLOR_DATA (second_a) + ->first_thread_objects[OBJECT_INDEX (cp->second)]; + if (thread1 == thread2 + || ALLOCNO_COLOR_DATA (first_a)->first_thread_allocno + == ALLOCNO_COLOR_DATA (second_a)->first_thread_allocno) continue; - if (! 
allocno_thread_conflict_p (thread1, thread2)) + if (!object_thread_conflict_p (thread1, thread2)) { if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) - fprintf - (ira_dump_file, - " Forming thread by copy %d:a%dr%d-a%dr%d (freq=%d):\n", - cp->num, ALLOCNO_NUM (cp->first), ALLOCNO_REGNO (cp->first), - ALLOCNO_NUM (cp->second), ALLOCNO_REGNO (cp->second), - cp->freq); + fprintf ( + ira_dump_file, + " Forming thread by copy %d:a%dr%d-a%dr%d (freq=%d):\n", + cp->num, ALLOCNO_NUM (first_a), ALLOCNO_REGNO (first_a), + ALLOCNO_NUM (second_a), ALLOCNO_REGNO (second_a), cp->freq); merge_threads (thread1, thread2); if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) { - thread1 = ALLOCNO_COLOR_DATA (thread1)->first_thread_allocno; - fprintf (ira_dump_file, " Result (freq=%d): a%dr%d(%d)", - ALLOCNO_COLOR_DATA (thread1)->thread_freq, - ALLOCNO_NUM (thread1), ALLOCNO_REGNO (thread1), - ALLOCNO_FREQ (thread1)); - for (a = ALLOCNO_COLOR_DATA (thread1)->next_thread_allocno; - a != thread1; - a = ALLOCNO_COLOR_DATA (a)->next_thread_allocno) - fprintf (ira_dump_file, " a%dr%d(%d)", - ALLOCNO_NUM (a), ALLOCNO_REGNO (a), - ALLOCNO_FREQ (a)); + ira_allocno_t a1 = OBJECT_ALLOCNO (thread1); + ira_allocno_t first_thread_allocno + = ALLOCNO_COLOR_DATA (a1)->first_thread_allocno; + fprintf (ira_dump_file, " Result (freq=%d):", + ALLOCNO_COLOR_DATA (first_thread_allocno)->thread_freq); + unsigned int i; + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP ( + ALLOCNO_COLOR_DATA (first_thread_allocno)->thread_allocnos, 0, + i, bi) + { + ira_allocno_t a = ira_allocnos[i]; + fprintf (ira_dump_file, " a%dr%d(%d)", ALLOCNO_NUM (a), + ALLOCNO_REGNO (a), ALLOCNO_FREQ (a)); + } fprintf (ira_dump_file, "\n"); } } @@ -2503,13 +2690,27 @@ form_threads_from_bucket (ira_allocno_t bucket) { for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { - if (cp->first == a) + bool intersect_p = hard_reg_set_intersect_p ( + ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (cp->first)) + ->profitable_hard_regs, 
+                                        ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (cp->second))
+                                          ->profitable_hard_regs);
+          if (OBJECT_ALLOCNO (cp->first) == a)
             {
               next_cp = cp->next_first_allocno_copy;
+              if (!intersect_p)
+                continue;
+              sorted_copies[cp_num++] = cp;
+            }
+          else if (OBJECT_ALLOCNO (cp->second) == a)
+            {
+              next_cp = cp->next_second_allocno_copy;
+              if (!intersect_p
+                  || !bitmap_bit_p (uncolorable_allocno_set,
+                                    ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first))))
+                continue;
               sorted_copies[cp_num++] = cp;
             }
-          else if (cp->second == a)
-            next_cp = cp->next_second_allocno_copy;
           else
             gcc_unreachable ();
         }
@@ -2531,15 +2732,15 @@ form_threads_from_colorable_allocno (ira_allocno_t a)
              ALLOCNO_NUM (a), ALLOCNO_REGNO (a));
   for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
     {
-      if (cp->first == a)
+      if (OBJECT_ALLOCNO (cp->first) == a)
         {
           next_cp = cp->next_first_allocno_copy;
-          another_a = cp->second;
+          another_a = OBJECT_ALLOCNO (cp->second);
         }
-      else if (cp->second == a)
+      else if (OBJECT_ALLOCNO (cp->second) == a)
         {
           next_cp = cp->next_second_allocno_copy;
-          another_a = cp->first;
+          another_a = OBJECT_ALLOCNO (cp->first);
         }
       else
         gcc_unreachable ();
@@ -2564,8 +2765,16 @@ init_allocno_threads (void)
     {
       a = ira_allocnos[j];
       /* Set up initial thread data: */
-      ALLOCNO_COLOR_DATA (a)->first_thread_allocno
-        = ALLOCNO_COLOR_DATA (a)->next_thread_allocno = a;
+      for (int i = 0; i < ALLOCNO_NUM_OBJECTS (a); i += 1)
+        {
+          ira_object_t obj = ALLOCNO_OBJECT (a, i);
+          ALLOCNO_COLOR_DATA (a)->first_thread_objects[i]
+            = ALLOCNO_COLOR_DATA (a)->next_thread_objects[i] = obj;
+        }
+      ALLOCNO_COLOR_DATA (a)->first_thread_allocno = a;
+      ALLOCNO_COLOR_DATA (a)->first_thread_offset = 0;
+      ALLOCNO_COLOR_DATA (a)->thread_allocnos = ira_allocate_bitmap ();
+      bitmap_set_bit (ALLOCNO_COLOR_DATA (a)->thread_allocnos, ALLOCNO_NUM (a));
       ALLOCNO_COLOR_DATA (a)->thread_freq = ALLOCNO_FREQ (a);
       ALLOCNO_COLOR_DATA (a)->hard_reg_prefs = 0;
       for (pref = ALLOCNO_PREFS (a); pref != NULL; pref = pref->next_pref)
@@ -2608,6 +2817,9 @@ add_allocno_to_bucket (ira_allocno_t a, ira_allocno_t *bucket_ptr)
   ira_allocno_t first_a;
   allocno_color_data_t data;
 
+  if (bucket_ptr == &uncolorable_allocno_bucket)
+    bitmap_set_bit (uncolorable_allocno_set, ALLOCNO_NUM (a));
+
   if (bucket_ptr == &uncolorable_allocno_bucket
       && ALLOCNO_CLASS (a) != NO_REGS)
     {
@@ -2734,6 +2946,9 @@ delete_allocno_from_bucket (ira_allocno_t allocno, ira_allocno_t *bucket_ptr)
 {
   ira_allocno_t prev_allocno, next_allocno;
 
+  if (bucket_ptr == &uncolorable_allocno_bucket)
+    bitmap_clear_bit (uncolorable_allocno_set, ALLOCNO_NUM (allocno));
+
   if (bucket_ptr == &uncolorable_allocno_bucket
       && ALLOCNO_CLASS (allocno) != NO_REGS)
     {
@@ -3227,16 +3442,23 @@ allocno_copy_cost_saving (ira_allocno_t allocno, int hard_regno)
   rclass = ALLOCNO_CLASS (allocno);
   for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp)
     {
-      if (cp->first == allocno)
+      if (OBJECT_ALLOCNO (cp->first) == allocno)
         {
           next_cp = cp->next_first_allocno_copy;
-          if (ALLOCNO_HARD_REGNO (cp->second) != hard_regno)
+          ira_allocno_t another_a = OBJECT_ALLOCNO (cp->second);
+          if (ALLOCNO_HARD_REGNO (another_a) > -1
+              && hard_regno + OBJECT_START (cp->first)
+                   != ALLOCNO_HARD_REGNO (another_a)
+                        + OBJECT_START (cp->second))
             continue;
         }
-      else if (cp->second == allocno)
+      else if (OBJECT_ALLOCNO (cp->second) == allocno)
         {
           next_cp = cp->next_second_allocno_copy;
-          if (ALLOCNO_HARD_REGNO (cp->first) != hard_regno)
+          ira_allocno_t another_a = OBJECT_ALLOCNO (cp->first);
+          if (ALLOCNO_HARD_REGNO (another_a) > -1
+              && hard_regno + OBJECT_START (cp->second)
+                   != ALLOCNO_HARD_REGNO (another_a) + OBJECT_START (cp->first))
             continue;
         }
       else
@@ -3643,6 +3865,7 @@ color_allocnos (void)
   /* Put the allocnos into the corresponding buckets.  */
   colorable_allocno_bucket = NULL;
   uncolorable_allocno_bucket = NULL;
+  bitmap_clear (uncolorable_allocno_set);
   EXECUTE_IF_SET_IN_BITMAP (coloring_allocno_bitmap, 0, i, bi)
     {
       a = ira_allocnos[i];
@@ -3740,10 +3963,12 @@ color_pass (ira_loop_tree_node_t loop_tree_node)
   bitmap_copy (coloring_allocno_bitmap, loop_tree_node->all_allocnos);
   bitmap_copy (consideration_allocno_bitmap, coloring_allocno_bitmap);
   n = 0;
+  size_t obj_n = 0;
   EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi)
     {
       a = ira_allocnos[j];
       n++;
+      obj_n += ALLOCNO_NUM_OBJECTS (a);
       if (! ALLOCNO_ASSIGNED_P (a))
         continue;
       bitmap_clear_bit (coloring_allocno_bitmap, ALLOCNO_NUM (a));
@@ -3752,20 +3977,29 @@ color_pass (ira_loop_tree_node_t loop_tree_node)
     = (allocno_color_data_t) ira_allocate (sizeof (struct allocno_color_data)
                                            * n);
   memset (allocno_color_data, 0, sizeof (struct allocno_color_data) * n);
+  ira_object_t *thread_objects
+    = (ira_object_t *) ira_allocate (sizeof (ira_object_t *) * obj_n * 2);
+  memset (thread_objects, 0, sizeof (ira_object_t *) * obj_n * 2);
   curr_allocno_process = 0;
   n = 0;
+  size_t obj_offset = 0;
   EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi)
     {
       a = ira_allocnos[j];
       ALLOCNO_ADD_DATA (a) = allocno_color_data + n;
+      ALLOCNO_COLOR_DATA (a)->first_thread_objects
+        = thread_objects + obj_offset;
+      obj_offset += ALLOCNO_NUM_OBJECTS (a);
+      ALLOCNO_COLOR_DATA (a)->next_thread_objects = thread_objects + obj_offset;
+      obj_offset += ALLOCNO_NUM_OBJECTS (a);
       n++;
     }
+  gcc_assert (obj_n * 2 == obj_offset);
   init_allocno_threads ();
   /* Color all mentioned allocnos including transparent ones.  */
   color_allocnos ();
   /* Process caps.  They are processed just once.  */
-  if (flag_ira_region == IRA_REGION_MIXED
-      || flag_ira_region == IRA_REGION_ALL)
+  if (flag_ira_region == IRA_REGION_MIXED || flag_ira_region == IRA_REGION_ALL)
     EXECUTE_IF_SET_IN_BITMAP (loop_tree_node->all_allocnos, 0, j, bi)
       {
         a = ira_allocnos[j];
@@ -3881,12 +4115,22 @@ color_pass (ira_loop_tree_node_t loop_tree_node)
             }
         }
     }
-  ira_free (allocno_color_data);
   EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi)
     {
       a = ira_allocnos[j];
+      gcc_assert (a != NULL);
+      ALLOCNO_COLOR_DATA (a)->first_thread_objects = NULL;
+      ALLOCNO_COLOR_DATA (a)->next_thread_objects = NULL;
+      if (ALLOCNO_COLOR_DATA (a)->thread_allocnos != NULL)
+        {
+          bitmap_clear (ALLOCNO_COLOR_DATA (a)->thread_allocnos);
+          ira_free_bitmap (ALLOCNO_COLOR_DATA (a)->thread_allocnos);
+          ALLOCNO_COLOR_DATA (a)->thread_allocnos = NULL;
+        }
       ALLOCNO_ADD_DATA (a) = NULL;
     }
+  ira_free (allocno_color_data);
+  ira_free (thread_objects);
 }
 
 /* Initialize the common data for coloring and calls functions to do
@@ -4080,15 +4324,17 @@ update_curr_costs (ira_allocno_t a)
   ira_init_register_move_cost_if_necessary (mode);
   for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
     {
-      if (cp->first == a)
+      ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first);
+      ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second);
+      if (first_a == a)
         {
           next_cp = cp->next_first_allocno_copy;
-          another_a = cp->second;
+          another_a = second_a;
         }
-      else if (cp->second == a)
+      else if (second_a == a)
         {
           next_cp = cp->next_second_allocno_copy;
-          another_a = cp->first;
+          another_a = first_a;
         }
       else
         gcc_unreachable ();
@@ -4100,9 +4346,8 @@ update_curr_costs (ira_allocno_t a)
       i = ira_class_hard_reg_index[aclass][hard_regno];
       if (i < 0)
         continue;
-      cost = (cp->first == a
-              ? ira_register_move_cost[mode][rclass][aclass]
-              : ira_register_move_cost[mode][aclass][rclass]);
+      cost = (first_a == a ? ira_register_move_cost[mode][rclass][aclass]
+                           : ira_register_move_cost[mode][aclass][rclass]);
       ira_allocate_and_set_or_copy_costs
         (&ALLOCNO_UPDATED_HARD_REG_COSTS (a), aclass, ALLOCNO_CLASS_COST (a),
          ALLOCNO_HARD_REG_COSTS (a));
@@ -4349,21 +4594,23 @@ coalesce_allocnos (void)
         continue;
       for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
         {
-          if (cp->first == a)
+          ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first);
+          ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second);
+          if (first_a == a)
             {
               next_cp = cp->next_first_allocno_copy;
-              regno = ALLOCNO_REGNO (cp->second);
+              regno = ALLOCNO_REGNO (second_a);
               /* For priority coloring we coalesce allocnos only with
                  the same allocno class not with intersected allocno
                  classes as it were possible.  It is done for
                  simplicity.  */
               if ((cp->insn != NULL || cp->constraint_p)
-                  && ALLOCNO_ASSIGNED_P (cp->second)
-                  && ALLOCNO_HARD_REGNO (cp->second) < 0
-                  && ! ira_equiv_no_lvalue_p (regno))
+                  && ALLOCNO_ASSIGNED_P (second_a)
+                  && ALLOCNO_HARD_REGNO (second_a) < 0
+                  && !ira_equiv_no_lvalue_p (regno))
                 sorted_copies[cp_num++] = cp;
             }
-          else if (cp->second == a)
+          else if (second_a == a)
             next_cp = cp->next_second_allocno_copy;
           else
             gcc_unreachable ();
@@ -4376,17 +4623,18 @@ coalesce_allocnos (void)
       for (i = 0; i < cp_num; i++)
         {
           cp = sorted_copies[i];
-          if (! coalesced_allocno_conflict_p (cp->first, cp->second))
+          ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first);
+          ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second);
+          if (!coalesced_allocno_conflict_p (first_a, second_a))
             {
               allocno_coalesced_p = true;
               if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
-                fprintf
-                  (ira_dump_file,
-                   " Coalescing copy %d:a%dr%d-a%dr%d (freq=%d)\n",
-                   cp->num, ALLOCNO_NUM (cp->first), ALLOCNO_REGNO (cp->first),
-                   ALLOCNO_NUM (cp->second), ALLOCNO_REGNO (cp->second),
-                   cp->freq);
-              merge_allocnos (cp->first, cp->second);
+                fprintf (ira_dump_file,
+                         " Coalescing copy %d:a%dr%d-a%dr%d (freq=%d)\n",
+                         cp->num, ALLOCNO_NUM (first_a),
+                         ALLOCNO_REGNO (first_a), ALLOCNO_NUM (second_a),
+                         ALLOCNO_REGNO (second_a), cp->freq);
+              merge_allocnos (first_a, second_a);
               i++;
               break;
             }
@@ -4395,8 +4643,11 @@ coalesce_allocnos (void)
       for (n = 0; i < cp_num; i++)
         {
           cp = sorted_copies[i];
-          if (allocno_coalesce_data[ALLOCNO_NUM (cp->first)].first
-              != allocno_coalesce_data[ALLOCNO_NUM (cp->second)].first)
+          if (allocno_coalesce_data[ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first))]
+                .first
+              != allocno_coalesce_data[ALLOCNO_NUM (
+                                         OBJECT_ALLOCNO (cp->second))]
+                   .first)
             sorted_copies[n++] = cp;
         }
       cp_num = n;
@@ -5070,15 +5321,15 @@ ira_reuse_stack_slot (int regno, poly_uint64 inherent_size,
            cp != NULL;
            cp = next_cp)
         {
-          if (cp->first == allocno)
+          if (OBJECT_ALLOCNO (cp->first) == allocno)
             {
               next_cp = cp->next_first_allocno_copy;
-              another_allocno = cp->second;
+              another_allocno = OBJECT_ALLOCNO (cp->second);
             }
-          else if (cp->second == allocno)
+          else if (OBJECT_ALLOCNO (cp->second) == allocno)
             {
               next_cp = cp->next_second_allocno_copy;
-              another_allocno = cp->first;
+              another_allocno = OBJECT_ALLOCNO (cp->first);
             }
           else
             gcc_unreachable ();
@@ -5274,6 +5525,7 @@ ira_initiate_assign (void)
     = (ira_allocno_t *) ira_allocate (sizeof (ira_allocno_t)
                                       * ira_allocnos_num);
   consideration_allocno_bitmap = ira_allocate_bitmap ();
+  uncolorable_allocno_set = ira_allocate_bitmap ();
   initiate_cost_update ();
   allocno_priorities = (int *) ira_allocate (sizeof (int) * ira_allocnos_num);
   sorted_copies = (ira_copy_t *) ira_allocate (ira_copies_num
@@ -5286,6 +5538,7 @@ ira_finish_assign (void)
 {
   ira_free (sorted_allocnos);
   ira_free_bitmap (consideration_allocno_bitmap);
+  ira_free_bitmap (uncolorable_allocno_set);
   finish_cost_update ();
   ira_free (allocno_priorities);
   ira_free (sorted_copies);
diff --git a/gcc/ira-conflicts.cc b/gcc/ira-conflicts.cc
index 0585ad10043..7aeed7202ce 100644
--- a/gcc/ira-conflicts.cc
+++ b/gcc/ira-conflicts.cc
@@ -173,25 +173,115 @@ build_conflict_bit_table (void)
   sparseset_free (objects_live);
   return true;
 }
-
-/* Return true iff allocnos A1 and A2 cannot be allocated to the same
-   register due to conflicts.  */
-static bool
-allocnos_conflict_for_copy_p (ira_allocno_t a1, ira_allocno_t a2)
+/* Check that X is REG or SUBREG of REG.  */
+#define REG_SUBREG_P(x) \
+  (REG_P (x) || (GET_CODE (x) == SUBREG && REG_P (SUBREG_REG (x))))
+
+/* Return true if OBJ1 and OBJ2 can be a move INSN.  */
+bool
+subreg_move_p (ira_object_t obj1, ira_object_t obj2)
 {
-  /* Due to the fact that we canonicalize conflicts (see
-     record_object_conflict), we only need to test for conflicts of
-     the lowest order words.  */
-  ira_object_t obj1 = ALLOCNO_OBJECT (a1, 0);
-  ira_object_t obj2 = ALLOCNO_OBJECT (a2, 0);
+  ira_allocno_t a1 = OBJECT_ALLOCNO (obj1);
+  ira_allocno_t a2 = OBJECT_ALLOCNO (obj2);
+  return ALLOCNO_CLASS (a1) != NO_REGS && ALLOCNO_CLASS (a2) != NO_REGS
+         && (ALLOCNO_TRACK_SUBREG_P (a1) || ALLOCNO_TRACK_SUBREG_P (a2))
+         && OBJECT_NREGS (obj1) == OBJECT_NREGS (obj2)
+         && (OBJECT_NREGS (obj1) != ALLOCNO_NREGS (a1)
+             || OBJECT_NREGS (obj2) != ALLOCNO_NREGS (a2));
+}
 
-  return OBJECTS_CONFLICT_P (obj1, obj2);
+/* Return true if ORIG_DEST_REG and ORIG_SRC_REG can be a move INSN.  */
+bool
+subreg_move_p (rtx orig_dest_reg, rtx orig_src_reg)
+{
+  gcc_assert (REG_SUBREG_P (orig_dest_reg) && REG_SUBREG_P (orig_src_reg));
+  rtx reg1
+    = SUBREG_P (orig_dest_reg) ? SUBREG_REG (orig_dest_reg) : orig_dest_reg;
+  rtx reg2 = SUBREG_P (orig_src_reg) ? SUBREG_REG (orig_src_reg) : orig_src_reg;
+  if (HARD_REGISTER_P (reg1) || HARD_REGISTER_P (reg2))
+    return false;
+  ira_allocno_t a1 = ira_curr_regno_allocno_map[REGNO (reg1)];
+  ira_allocno_t a2 = ira_curr_regno_allocno_map[REGNO (reg2)];
+  ira_object_t obj1 = find_object (a1, orig_dest_reg);
+  ira_object_t obj2 = find_object (a2, orig_src_reg);
+  return subreg_move_p (obj1, obj2);
 }
 
-/* Check that X is REG or SUBREG of REG.  */
-#define REG_SUBREG_P(x) \
-  (REG_P (x) || (GET_CODE (x) == SUBREG && REG_P (SUBREG_REG (x))))
+/* Return true if OBJ1 and OBJ2 can be allocated to the same register.  */
+static bool
+regs_non_conflict_for_copy_p (ira_object_t obj1, ira_object_t obj2,
+                              bool is_move, bool offset_equal)
+{
+  ira_allocno_t a1 = OBJECT_ALLOCNO (obj1);
+  ira_allocno_t a2 = OBJECT_ALLOCNO (obj2);
+  if (is_move && subreg_move_p (obj1, obj2))
+    {
+      if (OBJECTS_CONFLICT_P (obj1, obj2))
+        return false;
+      /* Assume a1 is allocated to hard register `OBJECT_START (obj2)` and a2
+         to hard register `OBJECT_START (obj1)`, so that both objects use the
+         same hard register `OBJECT_START (obj1) + OBJECT_START (obj2)`.  */
+      int start_regno1 = OBJECT_START (obj2);
+      int start_regno2 = OBJECT_START (obj1);
+
+      ira_object_t obj_a, obj_b;
+      ira_allocno_object_iterator oi_a, oi_b;
+      FOR_EACH_ALLOCNO_OBJECT (a1, obj_a, oi_a)
+        FOR_EACH_ALLOCNO_OBJECT (a2, obj_b, oi_b)
+          /* If a conflict between a1 and a2 prevents the allocation assumed
+             above, then obj1 and obj2 cannot form a copy.  */
+          if (OBJECTS_CONFLICT_P (obj_a, obj_b)
+              && !(start_regno1 + OBJECT_START (obj_a) + OBJECT_NREGS (obj_a)
+                     <= (start_regno2 + OBJECT_START (obj_b))
+                   || start_regno2 + OBJECT_START (obj_b) + OBJECT_NREGS (obj_b)
+                        <= (start_regno1 + OBJECT_START (obj_a))))
+            return false;
+
+      return true;
+    }
+  else
+    {
+      /* In the normal case, make sure full_obj1 and full_obj2 can be allocated
+         to the same register.  */
+      ira_object_t full_obj1 = find_object (a1, 0, ALLOCNO_NREGS (a1));
+      ira_object_t full_obj2 = find_object (a2, 0, ALLOCNO_NREGS (a2));
+      return !OBJECTS_CONFLICT_P (full_obj1, full_obj2) && offset_equal;
+    }
+}
+
+/* Return true if the offset and size of ORIG_REG are both multiples of
+   ALLOCNO_UNIT_SIZE (A).  This is used to forbid creating a copy for rtl like
+   the following, which contains a subreg move (from
+   testsuite/gcc.dg/vect/vect-simd-20.c on AArch64).  Suppose they are all
+   allocated to the fourth register, that is, pseudo 127 is allocated to w4 and
+   pseudo 149 is allocated to x4 and x5.  Then the third instruction can be
+   safely deleted without affecting the result of pseudo 149.  But when the
+   second instruction is executed, the upper 32 bits of x4 are set to 0 (the
+   behavior of the add instruction); that is, pseudo 149 is modified and its
+   bits 32~63 are set to 0, which is not the desired result.
+
+     (set (reg:SI 127)
+          (subreg:SI (reg:TI 149) 0))
+     ...
+     (set (reg:SI 127)
+          (plus:SI (reg:SI 127)
+                   (reg:SI 180)))
+     ...
+     (set (zero_extract:DI (subreg:DI (reg:TI 149) 0)
+                           (const_int 32 [0x20])
+                           (const_int 0 [0]))
+          (subreg:DI (reg:SI 127) 0))  */
+static bool
+subreg_reg_align_and_times_p (ira_allocno_t a, rtx orig_reg)
+{
+  if (!has_subreg_object_p (a) || !SUBREG_P (orig_reg))
+    return true;
+
+  return multiple_p (SUBREG_BYTE (orig_reg), ALLOCNO_UNIT_SIZE (a))
+         && multiple_p (GET_MODE_SIZE (GET_MODE (orig_reg)),
+                        ALLOCNO_UNIT_SIZE (a));
+}
 
 /* Return X if X is a REG, otherwise it should be SUBREG of REG and
    the function returns the reg in this case.  *OFFSET will be set to
@@ -237,8 +327,9 @@ get_freq_for_shuffle_copy (int freq)
    SINGLE_INPUT_OP_HAS_CSTR_P is only meaningful when constraint_p
    is true, see function ira_get_dup_out_num for its meaning.  */
 static bool
-process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p, rtx_insn *insn,
-                       int freq, bool single_input_op_has_cstr_p = true)
+process_regs_for_copy (rtx orig_reg1, rtx orig_reg2, bool constraint_p,
+                       rtx_insn *insn, int freq,
+                       bool single_input_op_has_cstr_p = true)
 {
   int allocno_preferenced_hard_regno, index, offset1, offset2;
   int cost, conflict_cost, move_cost;
@@ -248,10 +339,10 @@ process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p, rtx_insn *insn,
   machine_mode mode;
   ira_copy_t cp;
 
-  gcc_assert (REG_SUBREG_P (reg1) && REG_SUBREG_P (reg2));
-  only_regs_p = REG_P (reg1) && REG_P (reg2);
-  reg1 = go_through_subreg (reg1, &offset1);
-  reg2 = go_through_subreg (reg2, &offset2);
+  gcc_assert (REG_SUBREG_P (orig_reg1) && REG_SUBREG_P (orig_reg2));
+  only_regs_p = REG_P (orig_reg1) && REG_P (orig_reg2);
+  rtx reg1 = go_through_subreg (orig_reg1, &offset1);
+  rtx reg2 = go_through_subreg (orig_reg2, &offset2);
   /* Set up hard regno preferenced by allocno.  If allocno gets the
      hard regno the copy (or potential move) insn will be removed.  */
   if (HARD_REGISTER_P (reg1))
@@ -270,13 +361,17 @@ process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p, rtx_insn *insn,
     {
       ira_allocno_t a1 = ira_curr_regno_allocno_map[REGNO (reg1)];
       ira_allocno_t a2 = ira_curr_regno_allocno_map[REGNO (reg2)];
+      ira_object_t obj1 = find_object (a1, orig_reg1);
+      ira_object_t obj2 = find_object (a2, orig_reg2);
 
-      if (!allocnos_conflict_for_copy_p (a1, a2)
-          && offset1 == offset2
+      if (subreg_reg_align_and_times_p (a1, orig_reg1)
+          && subreg_reg_align_and_times_p (a2, orig_reg2)
+          && regs_non_conflict_for_copy_p (obj1, obj2, insn != NULL,
+                                           offset1 == offset2)
           && ordered_p (GET_MODE_PRECISION (ALLOCNO_MODE (a1)),
                         GET_MODE_PRECISION (ALLOCNO_MODE (a2))))
         {
-          cp = ira_add_allocno_copy (a1, a2, freq, constraint_p, insn,
+          cp = ira_add_allocno_copy (obj1, obj2, freq, constraint_p, insn,
                                      ira_curr_loop_tree_node);
           bitmap_set_bit (ira_curr_loop_tree_node->local_copies, cp->num);
           return true;
@@ -438,16 +533,15 @@ add_insn_allocno_copies (rtx_insn *insn)
   freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
   if (freq == 0)
     freq = 1;
-  if ((set = single_set (insn)) != NULL_RTX
-      && REG_SUBREG_P (SET_DEST (set)) && REG_SUBREG_P (SET_SRC (set))
-      && ! side_effects_p (set)
-      && find_reg_note (insn, REG_DEAD,
-                        REG_P (SET_SRC (set))
-                        ? SET_SRC (set)
-                        : SUBREG_REG (SET_SRC (set))) != NULL_RTX)
+  if ((set = single_set (insn)) != NULL_RTX && REG_SUBREG_P (SET_DEST (set))
+      && REG_SUBREG_P (SET_SRC (set)) && !side_effects_p (set)
+      && (find_reg_note (insn, REG_DEAD,
+                         REG_P (SET_SRC (set)) ? SET_SRC (set)
+                                               : SUBREG_REG (SET_SRC (set)))
+            != NULL_RTX
+          || subreg_move_p (SET_DEST (set), SET_SRC (set))))
     {
-      process_regs_for_copy (SET_SRC (set), SET_DEST (set),
-                             false, insn, freq);
+      process_regs_for_copy (SET_SRC (set), SET_DEST (set), false, insn, freq);
       return;
     }
   /* Fast check of possibility of constraint or shuffle copies.  If
@@ -521,16 +615,23 @@ propagate_copies (void)
 
   FOR_EACH_COPY (cp, ci)
     {
-      a1 = cp->first;
-      a2 = cp->second;
+      a1 = OBJECT_ALLOCNO (cp->first);
+      a2 = OBJECT_ALLOCNO (cp->second);
       if (ALLOCNO_LOOP_TREE_NODE (a1) == ira_loop_tree_root)
         continue;
       ira_assert ((ALLOCNO_LOOP_TREE_NODE (a2) != ira_loop_tree_root));
       parent_a1 = ira_parent_or_cap_allocno (a1);
       parent_a2 = ira_parent_or_cap_allocno (a2);
+      ira_object_t parent_obj1
+        = find_object_anyway (parent_a1, OBJECT_START (cp->first),
+                              OBJECT_NREGS (cp->first));
+      ira_object_t parent_obj2
+        = find_object_anyway (parent_a2, OBJECT_START (cp->second),
+                              OBJECT_NREGS (cp->second));
       ira_assert (parent_a1 != NULL && parent_a2 != NULL);
-      if (! allocnos_conflict_for_copy_p (parent_a1, parent_a2))
-        ira_add_allocno_copy (parent_a1, parent_a2, cp->freq,
+      if (regs_non_conflict_for_copy_p (parent_obj1, parent_obj2,
+                                        cp->insn != NULL, true))
+        ira_add_allocno_copy (parent_obj1, parent_obj2, cp->freq,
                               cp->constraint_p, cp->insn, cp->loop_tree_node);
     }
 }
diff --git a/gcc/ira-emit.cc b/gcc/ira-emit.cc
index 9dc7f3c655e..30ff46980f5 100644
--- a/gcc/ira-emit.cc
+++ b/gcc/ira-emit.cc
@@ -1129,11 +1129,11 @@ add_range_and_copies_from_move_list (move_t list, ira_loop_tree_node_t node,
       update_costs (to, false, freq);
       cp = ira_add_allocno_copy (from, to, freq, false, move->insn, NULL);
       if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
-        fprintf (ira_dump_file, " Adding cp%d:a%dr%d-a%dr%d\n",
-                 cp->num, ALLOCNO_NUM (cp->first),
-                 REGNO (allocno_emit_reg (cp->first)),
-                 ALLOCNO_NUM (cp->second),
-                 REGNO (allocno_emit_reg (cp->second)));
+        fprintf (ira_dump_file, " Adding cp%d:a%dr%d-a%dr%d\n", cp->num,
+                 ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first)),
+                 REGNO (allocno_emit_reg (OBJECT_ALLOCNO (cp->first))),
+                 ALLOCNO_NUM (OBJECT_ALLOCNO (cp->second)),
+                 REGNO (allocno_emit_reg (OBJECT_ALLOCNO (cp->second))));
 
       nr = ALLOCNO_NUM_OBJECTS (from);
       for (i = 0; i < nr; i++)
diff --git a/gcc/ira-int.h b/gcc/ira-int.h
index 9095a8227f7..963e533e448 100644
--- a/gcc/ira-int.h
+++ b/gcc/ira-int.h
@@ -594,9 +594,9 @@ struct ira_allocno_copy
 {
   /* The unique order number of the copy node starting with 0.  */
   int num;
-  /* Allocnos connected by the copy.  The first allocno should have
+  /* Objects connected by the copy.  The first allocno should have
      smaller order number than the second one.  */
-  ira_allocno_t first, second;
+  ira_object_t first, second;
   /* Execution frequency of the copy.  */
   int freq;
   bool constraint_p;
@@ -1046,6 +1046,9 @@ extern void ira_remove_allocno_prefs (ira_allocno_t);
 extern ira_copy_t ira_create_copy (ira_allocno_t, ira_allocno_t,
                                    int, bool, rtx_insn *,
                                    ira_loop_tree_node_t);
+extern ira_copy_t
+ira_add_allocno_copy (ira_object_t, ira_object_t, int, bool, rtx_insn *,
+                      ira_loop_tree_node_t);
 extern ira_copy_t ira_add_allocno_copy (ira_allocno_t, ira_allocno_t,
                                         int, bool, rtx_insn *,
                                         ira_loop_tree_node_t);
@@ -1059,6 +1062,7 @@ extern void ira_destroy (void);
 extern ira_object_t find_object (ira_allocno_t, int, int);
 extern ira_object_t find_object (ira_allocno_t, poly_int64, poly_int64);
+extern ira_object_t find_object (ira_allocno_t, rtx);
 ira_object_t find_object_anyway (ira_allocno_t a, int start, int nregs);
 extern void ira_copy_allocno_objects (ira_allocno_t, ira_allocno_t);
@@ -1087,6 +1091,8 @@ extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *,
 /* ira-conflicts.cc */
 extern void ira_debug_conflicts (bool);
 extern void ira_build_conflicts (void);
+extern bool subreg_move_p (ira_object_t, ira_object_t);
+extern bool subreg_move_p (rtx, rtx);
 
 /* ira-color.cc */
 extern ira_allocno_t ira_soft_conflict (ira_allocno_t, ira_allocno_t);
diff --git a/gcc/ira.cc b/gcc/ira.cc
index 9ea57d3b1ea..280ca47a999 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -2853,14 +2853,15 @@ print_redundant_copies (void)
       if (hard_regno >= 0)
         continue;
       for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
-        if (cp->first == a)
+        if (OBJECT_ALLOCNO (cp->first) == a)
           next_cp = cp->next_first_allocno_copy;
         else
           {
             next_cp = cp->next_second_allocno_copy;
             if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL
                 && cp->insn != NULL_RTX
-                && ALLOCNO_HARD_REGNO (cp->first) == hard_regno)
+                && ALLOCNO_HARD_REGNO (OBJECT_ALLOCNO (cp->first))
+                     == hard_regno)
               fprintf (ira_dump_file,
                        " Redundant move from %d(freq %d):%d\n",
                        INSN_UID (cp->insn), cp->freq, hard_regno);

From patchwork Sun Nov 12 12:08:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 164250
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH V3 5/7] ira: Add all nregs >= 2 pseudos to tracke subreg list
Date: Sun, 12 Nov 2023 20:08:15 +0800
Message-Id: <20231112120817.2635864-6-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
In-Reply-To: <20231112120817.2635864-1-lehua.ding@rivai.ai>
References: <20231112120817.2635864-1-lehua.ding@rivai.ai>

This patch relaxes the subreg tracking capability so that it applies to all
subreg registers.

gcc/ChangeLog:

        * ira-build.cc (get_reg_unit_size): New.
        (has_same_nregs): New.
        (ira_set_allocno_class): Adjust.

---
 gcc/ira-build.cc | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
index 13f0f7336ed..f88aeeeeaef 100644
--- a/gcc/ira-build.cc
+++ b/gcc/ira-build.cc
@@ -607,6 +607,37 @@ ira_create_allocno (int regno, bool cap_p,
   return a;
 }
 
+/* Return the size of a single register of allocno A.  */
+static poly_int64
+get_reg_unit_size (ira_allocno_t a)
+{
+  enum reg_class aclass = ALLOCNO_CLASS (a);
+  gcc_assert (aclass != NO_REGS);
+  machine_mode mode = ALLOCNO_MODE (a);
+  int nregs = ALLOCNO_NREGS (a);
+  poly_int64 block_size = REGMODE_NATURAL_SIZE (mode);
+  int nblocks = get_nblocks (mode);
+  gcc_assert (nblocks % nregs == 0);
+  return block_size * (nblocks / nregs);
+}
+
+/* Return true if TARGET_CLASS_MAX_NREGS and TARGET_HARD_REGNO_NREGS give the
+   same result.  Note that some targets may not implement these two hooks
+   consistently, and such cases need to be debugged step by step.  For example,
+   for V3x1DI mode on AArch64, TARGET_CLASS_MAX_NREGS returns 2 but
+   TARGET_HARD_REGNO_NREGS returns 3.  They conflict and need to be fixed in
+   the AArch64 hooks.  */
+static bool
+has_same_nregs (ira_allocno_t a)
+{
+  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (REGNO_REG_CLASS (i) != NO_REGS
+        && reg_class_subset_p (REGNO_REG_CLASS (i), ALLOCNO_CLASS (a))
+        && ALLOCNO_NREGS (a) != hard_regno_nregs (i, ALLOCNO_MODE (a)))
+      return false;
+  return true;
+}
+
 /* Set up register class for A and update its conflict hard registers.  */
 
 void
@@ -624,12 +655,12 @@ ira_set_allocno_class (ira_allocno_t a, enum reg_class aclass)
   if (aclass == NO_REGS)
     return;
 
-  /* SET the unit_size of one register.  */
-  machine_mode mode = ALLOCNO_MODE (a);
-  int nregs = ira_reg_class_max_nregs[aclass][mode];
-  if (nregs == 2 && maybe_eq (GET_MODE_SIZE (mode), nregs * UNITS_PER_WORD))
+  gcc_assert (!ALLOCNO_TRACK_SUBREG_P (a));
+  /* Set the unit size and the track_subreg_p flag for pseudos that occupy
+     multiple hard regs.  */
+  if (ALLOCNO_NREGS (a) > 1 && has_same_nregs (a))
     {
-      ALLOCNO_UNIT_SIZE (a) = UNITS_PER_WORD;
+      ALLOCNO_UNIT_SIZE (a) = get_reg_unit_size (a);
       ALLOCNO_TRACK_SUBREG_P (a) = true;
       return;
     }

From patchwork Sun Nov 12 12:08:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 164248
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH V3 6/7] lra: Switch to live_subreg data flow
Date: Sun, 12 Nov 2023 20:08:16 +0800
Message-Id: <20231112120817.2635864-7-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
In-Reply-To: <20231112120817.2635864-1-lehua.ding@rivai.ai>
References: <20231112120817.2635864-1-lehua.ding@rivai.ai>

This patch switches LRA from the live_reg data to the live_subreg data.
The situation is more complicated than in IRA, because LRA also modifies
this data and the live_subreg data is recalculated.

gcc/ChangeLog:

	* lra-coalesce.cc (update_live_info): Adjust to new live subreg data.
	(lra_coalesce): Ditto.
	* lra-constraints.cc (update_ebb_live_info): Ditto.
	(get_live_on_other_edges): Ditto.
	(inherit_in_ebb): Ditto.
	(lra_inheritance): Ditto.
	(fix_bb_live_info): Ditto.
	(remove_inheritance_pseudos): Ditto.
	* lra-int.h: Include subreg-live-range.h.
	* lra-lives.cc (class bb_data_pseudos): Adjust to new live subreg data.
	(make_hard_regno_live): Ditto.
	(make_hard_regno_dead): Ditto.
	(mark_regno_live): Ditto.
	(mark_regno_dead): Ditto.
	(live_trans_fun): Ditto.
	(live_con_fun_0): Ditto.
	(live_con_fun_n): Ditto.
	(initiate_live_solver): Ditto.
	(finish_live_solver): Ditto.
	(process_bb_lives): Ditto.
	(lra_create_live_ranges_1): Ditto.
	* lra-remat.cc (dump_candidates_and_remat_bb_data): Ditto.
	(calculate_livein_cands): Ditto.
	(do_remat): Ditto.
	* lra-spills.cc (spill_pseudos): Ditto.

---
 gcc/lra-coalesce.cc    |  20 ++-
 gcc/lra-constraints.cc |  93 +++++++++---
 gcc/lra-int.h          |   2 +
 gcc/lra-lives.cc       | 328 ++++++++++++++++++++++++++++++++---------
 gcc/lra-remat.cc       |  13 +-
 gcc/lra-spills.cc      |  22 ++-
 6 files changed, 374 insertions(+), 104 deletions(-)

diff --git a/gcc/lra-coalesce.cc b/gcc/lra-coalesce.cc index 04a5bbd714b..abfc54f1cc2 100644 --- a/gcc/lra-coalesce.cc +++ b/gcc/lra-coalesce.cc @@ -188,19 +188,25 @@ static bitmap_head used_pseudos_bitmap; /* Set up USED_PSEUDOS_BITMAP, and update LR_BITMAP (a BB live info bitmap).
*/ static void -update_live_info (bitmap lr_bitmap) +update_live_info (bitmap all, bitmap full, bitmap partial) { unsigned int j; bitmap_iterator bi; bitmap_clear (&used_pseudos_bitmap); - EXECUTE_IF_AND_IN_BITMAP (&coalesced_pseudos_bitmap, lr_bitmap, + EXECUTE_IF_AND_IN_BITMAP (&coalesced_pseudos_bitmap, all, FIRST_PSEUDO_REGISTER, j, bi) bitmap_set_bit (&used_pseudos_bitmap, first_coalesced_pseudo[j]); if (! bitmap_empty_p (&used_pseudos_bitmap)) { - bitmap_and_compl_into (lr_bitmap, &coalesced_pseudos_bitmap); - bitmap_ior_into (lr_bitmap, &used_pseudos_bitmap); + bitmap_and_compl_into (all, &coalesced_pseudos_bitmap); + bitmap_ior_into (all, &used_pseudos_bitmap); + + bitmap_and_compl_into (full, &coalesced_pseudos_bitmap); + bitmap_ior_and_compl_into (full, &used_pseudos_bitmap, partial); + + bitmap_and_compl_into (partial, &coalesced_pseudos_bitmap); + bitmap_ior_and_compl_into (partial, &used_pseudos_bitmap, full); } } @@ -303,8 +309,10 @@ lra_coalesce (void) bitmap_initialize (&used_pseudos_bitmap, &reg_obstack); FOR_EACH_BB_FN (bb, cfun) { - update_live_info (df_get_live_in (bb)); - update_live_info (df_get_live_out (bb)); + update_live_info (DF_LIVE_SUBREG_IN (bb), DF_LIVE_SUBREG_FULL_IN (bb), + DF_LIVE_SUBREG_PARTIAL_IN (bb)); + update_live_info (DF_LIVE_SUBREG_OUT (bb), DF_LIVE_SUBREG_FULL_OUT (bb), + DF_LIVE_SUBREG_PARTIAL_OUT (bb)); FOR_BB_INSNS_SAFE (bb, insn, next) if (INSN_P (insn) && bitmap_bit_p (&involved_insns_bitmap, INSN_UID (insn))) diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc index 0607c8be7cb..c3ad846b97b 100644 --- a/gcc/lra-constraints.cc +++ b/gcc/lra-constraints.cc @@ -6571,34 +6571,75 @@ update_ebb_live_info (rtx_insn *head, rtx_insn *tail) { if (prev_bb != NULL) { - /* Update df_get_live_in (prev_bb): */ + /* Update subreg live (prev_bb): */ + bitmap subreg_all_in = DF_LIVE_SUBREG_IN (prev_bb); + bitmap subreg_full_in = DF_LIVE_SUBREG_FULL_IN (prev_bb); + bitmap subreg_partial_in = DF_LIVE_SUBREG_PARTIAL_IN (prev_bb); +
subregs_live *range_in = DF_LIVE_SUBREG_RANGE_IN (prev_bb); EXECUTE_IF_SET_IN_BITMAP (&check_only_regs, 0, j, bi) if (bitmap_bit_p (&live_regs, j)) - bitmap_set_bit (df_get_live_in (prev_bb), j); - else - bitmap_clear_bit (df_get_live_in (prev_bb), j); + { + bitmap_set_bit (subreg_all_in, j); + bitmap_set_bit (subreg_full_in, j); + if (bitmap_bit_p (subreg_partial_in, j)) + { + bitmap_clear_bit (subreg_partial_in, j); + range_in->remove_live (j); + } + } + else if (bitmap_bit_p (subreg_all_in, j)) + { + bitmap_clear_bit (subreg_all_in, j); + bitmap_clear_bit (subreg_full_in, j); + if (bitmap_bit_p (subreg_partial_in, j)) + { + bitmap_clear_bit (subreg_partial_in, j); + range_in->remove_live (j); + } + } } + bitmap subreg_all_out = DF_LIVE_SUBREG_OUT (curr_bb); if (curr_bb != last_bb) { - /* Update df_get_live_out (curr_bb): */ + /* Update subreg live (curr_bb): */ + bitmap subreg_all_out = DF_LIVE_SUBREG_OUT (curr_bb); + bitmap subreg_full_out = DF_LIVE_SUBREG_FULL_OUT (curr_bb); + bitmap subreg_partial_out = DF_LIVE_SUBREG_PARTIAL_OUT (curr_bb); + subregs_live *range_out = DF_LIVE_SUBREG_RANGE_OUT (curr_bb); EXECUTE_IF_SET_IN_BITMAP (&check_only_regs, 0, j, bi) { live_p = bitmap_bit_p (&live_regs, j); if (! 
live_p) FOR_EACH_EDGE (e, ei, curr_bb->succs) - if (bitmap_bit_p (df_get_live_in (e->dest), j)) + if (bitmap_bit_p (DF_LIVE_SUBREG_IN (e->dest), j)) { live_p = true; break; } if (live_p) - bitmap_set_bit (df_get_live_out (curr_bb), j); - else - bitmap_clear_bit (df_get_live_out (curr_bb), j); + { + bitmap_set_bit (subreg_all_out, j); + bitmap_set_bit (subreg_full_out, j); + if (bitmap_bit_p (subreg_partial_out, j)) + { + bitmap_clear_bit (subreg_partial_out, j); + range_out->remove_live (j); + } + } + else if (bitmap_bit_p (subreg_all_out, j)) + { + bitmap_clear_bit (subreg_all_out, j); + bitmap_clear_bit (subreg_full_out, j); + if (bitmap_bit_p (subreg_partial_out, j)) + { + bitmap_clear_bit (subreg_partial_out, j); + range_out->remove_live (j); + } + } } } prev_bb = curr_bb; - bitmap_and (&live_regs, &check_only_regs, df_get_live_out (curr_bb)); + bitmap_and (&live_regs, &check_only_regs, subreg_all_out); } if (! NONDEBUG_INSN_P (curr_insn)) continue; @@ -6715,7 +6756,7 @@ get_live_on_other_edges (basic_block from, basic_block to, bitmap res) bitmap_clear (res); FOR_EACH_EDGE (e, ei, from->succs) if (e->dest != to) - bitmap_ior_into (res, df_get_live_in (e->dest)); + bitmap_ior_into (res, DF_LIVE_SUBREG_IN (e->dest)); last = get_last_insertion_point (from); if (! JUMP_P (last)) return; @@ -6787,7 +6828,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn *tail) { /* We are at the end of BB. Add qualified living pseudos for potential splitting. */ - to_process = df_get_live_out (curr_bb); + to_process = DF_LIVE_SUBREG_OUT (curr_bb); if (last_processed_bb != NULL) { /* We are somewhere in the middle of EBB. */ @@ -7159,7 +7200,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn *tail) { /* We reached the beginning of the current block -- do rest of spliting in the current BB. */ - to_process = df_get_live_in (curr_bb); + to_process = DF_LIVE_SUBREG_IN (curr_bb); if (BLOCK_FOR_INSN (head) != curr_bb) { /* We are somewhere in the middle of EBB. 
*/ @@ -7236,7 +7277,7 @@ lra_inheritance (void) fprintf (lra_dump_file, "EBB"); /* Form a EBB starting with BB. */ bitmap_clear (&ebb_global_regs); - bitmap_ior_into (&ebb_global_regs, df_get_live_in (bb)); + bitmap_ior_into (&ebb_global_regs, DF_LIVE_SUBREG_IN (bb)); for (;;) { if (lra_dump_file != NULL) @@ -7252,7 +7293,7 @@ lra_inheritance (void) break; bb = bb->next_bb; } - bitmap_ior_into (&ebb_global_regs, df_get_live_out (bb)); + bitmap_ior_into (&ebb_global_regs, DF_LIVE_SUBREG_OUT (bb)); if (lra_dump_file != NULL) fprintf (lra_dump_file, "\n"); if (inherit_in_ebb (BB_HEAD (start_bb), BB_END (bb))) @@ -7281,15 +7322,23 @@ int lra_undo_inheritance_iter; /* Fix BB live info LIVE after removing pseudos created on pass doing inheritance/split which are REMOVED_PSEUDOS. */ static void -fix_bb_live_info (bitmap live, bitmap removed_pseudos) +fix_bb_live_info (bitmap all, bitmap full, bitmap partial, + bitmap removed_pseudos) { unsigned int regno; bitmap_iterator bi; EXECUTE_IF_SET_IN_BITMAP (removed_pseudos, 0, regno, bi) - if (bitmap_clear_bit (live, regno) - && REG_P (lra_reg_info[regno].restore_rtx)) - bitmap_set_bit (live, REGNO (lra_reg_info[regno].restore_rtx)); + { + if (bitmap_clear_bit (all, regno) + && REG_P (lra_reg_info[regno].restore_rtx)) + { + bitmap_set_bit (all, REGNO (lra_reg_info[regno].restore_rtx)); + bitmap_clear_bit (full, regno); + bitmap_set_bit (full, REGNO (lra_reg_info[regno].restore_rtx)); + gcc_assert (!bitmap_bit_p (partial, regno)); + } + } } /* Return regno of the (subreg of) REG. Otherwise, return a negative @@ -7355,8 +7404,10 @@ remove_inheritance_pseudos (bitmap remove_pseudos) constraint pass. 
*/ FOR_EACH_BB_FN (bb, cfun) { - fix_bb_live_info (df_get_live_in (bb), remove_pseudos); - fix_bb_live_info (df_get_live_out (bb), remove_pseudos); + fix_bb_live_info (DF_LIVE_SUBREG_IN (bb), DF_LIVE_SUBREG_FULL_IN (bb), + DF_LIVE_SUBREG_PARTIAL_IN (bb), remove_pseudos); + fix_bb_live_info (DF_LIVE_SUBREG_OUT (bb), DF_LIVE_SUBREG_FULL_OUT (bb), + DF_LIVE_SUBREG_PARTIAL_OUT (bb), remove_pseudos); FOR_BB_INSNS_REVERSE (bb, curr_insn) { if (! INSN_P (curr_insn)) diff --git a/gcc/lra-int.h b/gcc/lra-int.h index d0752c2ae50..678377d9ec6 100644 --- a/gcc/lra-int.h +++ b/gcc/lra-int.h @@ -21,6 +21,8 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_LRA_INT_H #define GCC_LRA_INT_H +#include "subreg-live-range.h" + #define lra_assert(c) gcc_checking_assert (c) /* The parameter used to prevent infinite reloading for an insn. Each diff --git a/gcc/lra-lives.cc b/gcc/lra-lives.cc index f60e564da82..d93921ad302 100644 --- a/gcc/lra-lives.cc +++ b/gcc/lra-lives.cc @@ -272,8 +272,26 @@ update_pseudo_point (int regno, int point, enum point_type type) } } -/* The corresponding bitmaps of BB currently being processed. */ -static bitmap bb_killed_pseudos, bb_gen_pseudos; +/* Structure describing local BB data used for pseudo + live-analysis. */ +class bb_data_pseudos : public basic_block_subreg_live_info +{ +public: + /* Basic block about which the below data are. */ + basic_block bb; +}; + +/* Array for all BB data. Indexed by the corresponding BB index. */ +typedef class bb_data_pseudos *bb_data_t; + +/* All basic block data are referred through the following array. */ +static bb_data_t bb_data; + +/* The corresponding basic block info of BB currently being processed. */ +static bb_data_t curr_bb_info; + +/* Flag mean curr function has subreg ref need be tracked. */ +static bool has_subreg_live_p; /* Record hard register REGNO as now being live. It updates living hard regs and START_LIVING. 
*/ @@ -287,7 +305,7 @@ make_hard_regno_live (int regno) SET_HARD_REG_BIT (hard_regs_live, regno); sparseset_set_bit (start_living, regno); if (fixed_regs[regno] || TEST_HARD_REG_BIT (hard_regs_spilled_into, regno)) - bitmap_set_bit (bb_gen_pseudos, regno); + bitmap_set_bit (&curr_bb_info->full_use, regno); } /* Process the definition of hard register REGNO. This updates @@ -310,8 +328,8 @@ make_hard_regno_dead (int regno) sparseset_set_bit (start_dying, regno); if (fixed_regs[regno] || TEST_HARD_REG_BIT (hard_regs_spilled_into, regno)) { - bitmap_clear_bit (bb_gen_pseudos, regno); - bitmap_set_bit (bb_killed_pseudos, regno); + bitmap_clear_bit (&curr_bb_info->full_use, regno); + bitmap_set_bit (&curr_bb_info->full_def, regno); } } @@ -355,7 +373,9 @@ mark_regno_live (int regno, machine_mode mode) else { mark_pseudo_live (regno); - bitmap_set_bit (bb_gen_pseudos, regno); + bitmap_set_bit (&curr_bb_info->full_use, regno); + gcc_assert (!bitmap_bit_p (&curr_bb_info->partial_use, regno)); + gcc_assert (!bitmap_bit_p (&curr_bb_info->partial_def, regno)); } } @@ -375,8 +395,10 @@ mark_regno_dead (int regno, machine_mode mode) else { mark_pseudo_dead (regno); - bitmap_clear_bit (bb_gen_pseudos, regno); - bitmap_set_bit (bb_killed_pseudos, regno); + bitmap_clear_bit (&curr_bb_info->full_use, regno); + bitmap_set_bit (&curr_bb_info->full_def, regno); + gcc_assert (!bitmap_bit_p (&curr_bb_info->partial_use, regno)); + gcc_assert (!bitmap_bit_p (&curr_bb_info->partial_def, regno)); } } @@ -387,23 +409,6 @@ mark_regno_dead (int regno, machine_mode mode) border. That might be a consequence of some global transformations in LRA, e.g. PIC pseudo reuse or rematerialization. */ -/* Structure describing local BB data used for pseudo - live-analysis. */ -class bb_data_pseudos -{ -public: - /* Basic block about which the below data are. */ - basic_block bb; - bitmap_head killed_pseudos; /* pseudos killed in the BB. */ - bitmap_head gen_pseudos; /* pseudos generated in the BB. 
*/ -}; - -/* Array for all BB data. Indexed by the corresponding BB index. */ -typedef class bb_data_pseudos *bb_data_t; - -/* All basic block data are referred through the following array. */ -static bb_data_t bb_data; - /* Two small functions for access to the bb data. */ static inline bb_data_t get_bb_data (basic_block bb) @@ -430,13 +435,93 @@ static bool live_trans_fun (int bb_index) { basic_block bb = get_bb_data_by_index (bb_index)->bb; - bitmap bb_liveout = df_get_live_out (bb); - bitmap bb_livein = df_get_live_in (bb); + bitmap full_out = DF_LIVE_SUBREG_FULL_OUT (bb); + bitmap full_in = DF_LIVE_SUBREG_FULL_IN (bb); + bitmap partial_out = DF_LIVE_SUBREG_PARTIAL_OUT (bb); + bitmap partial_in = DF_LIVE_SUBREG_PARTIAL_IN (bb); + subregs_live *range_out = DF_LIVE_SUBREG_RANGE_OUT (bb); + subregs_live *range_in = DF_LIVE_SUBREG_RANGE_IN (bb); bb_data_t bb_info = get_bb_data (bb); - bitmap_and_compl (&temp_bitmap, bb_liveout, &all_hard_regs_bitmap); - return bitmap_ior_and_compl (bb_livein, &bb_info->gen_pseudos, - &temp_bitmap, &bb_info->killed_pseudos); + if (!has_subreg_live_p) + { + bitmap_and_compl (&temp_bitmap, full_out, &all_hard_regs_bitmap); + return bitmap_ior_and_compl (full_in, &bb_info->full_use, &temp_bitmap, + &bb_info->full_def); + } + + /* If there has subreg live need be tracked. 
*/ + unsigned int regno; + bitmap_iterator bi; + bool changed = false; + bitmap_head temp_full_out; + bitmap_head temp_partial_out; + bitmap_head temp_partial_be_full_out; + bitmap_head all_def; + subregs_live temp_range_out; + bitmap_initialize (&temp_full_out, &reg_obstack); + bitmap_initialize (&temp_partial_out, &reg_obstack); + bitmap_initialize (&temp_partial_be_full_out, &reg_obstack); + bitmap_initialize (&all_def, &reg_obstack); + + bitmap_and_compl (&temp_full_out, full_out, &all_hard_regs_bitmap); + + bitmap_ior (&all_def, &bb_info->full_def, &bb_info->partial_def); + + bitmap_and (&temp_partial_out, &temp_full_out, &bb_info->partial_def); + EXECUTE_IF_SET_IN_BITMAP (&temp_partial_out, FIRST_PSEUDO_REGISTER, regno, bi) + { + subreg_ranges temp (bb_info->range_def->lives.at (regno).max); + temp.make_full (); + temp.remove_ranges (bb_info->range_def->lives.at (regno)); + temp_range_out.add_ranges (regno, temp); + } + bitmap_ior_and_compl_into (&temp_partial_out, partial_out, &all_def); + EXECUTE_IF_AND_COMPL_IN_BITMAP (partial_out, &all_def, FIRST_PSEUDO_REGISTER, + regno, bi) + { + temp_range_out.add_ranges (regno, range_out->lives.at (regno)); + } + EXECUTE_IF_AND_IN_BITMAP (partial_out, &bb_info->partial_def, 0, regno, bi) + { + subreg_ranges temp = range_out->lives.at (regno); + temp.remove_ranges (bb_info->range_def->lives.at (regno)); + if (!temp.empty_p ()) + { + bitmap_set_bit (&temp_partial_out, regno); + temp_range_out.add_ranges (regno, temp); + } + } + + temp_range_out.add_lives (*bb_info->range_use); + EXECUTE_IF_AND_IN_BITMAP (&temp_partial_out, &bb_info->partial_use, 0, regno, + bi) + { + subreg_ranges temp = temp_range_out.lives.at (regno); + temp.add_ranges (bb_info->range_use->lives.at (regno)); + if (temp.full_p ()) + { + bitmap_set_bit (&temp_partial_be_full_out, regno); + temp_range_out.remove_live (regno); + } + } + + bitmap_ior_and_compl_into (&temp_partial_be_full_out, &temp_full_out, + &all_def); + changed + |= bitmap_ior (full_in,
&temp_partial_be_full_out, &bb_info->full_use); + + bitmap_ior_into (&temp_partial_out, &bb_info->partial_use); + changed |= bitmap_and_compl (partial_in, &temp_partial_out, + &temp_partial_be_full_out); + changed |= range_in->copy_lives (temp_range_out); + + bitmap_clear (&temp_full_out); + bitmap_clear (&temp_partial_out); + bitmap_clear (&temp_partial_be_full_out); + bitmap_clear (&all_def); + + return changed; } /* The confluence function used by the DF equation solver to set up @@ -444,7 +529,9 @@ live_trans_fun (int bb_index) static void live_con_fun_0 (basic_block bb) { - bitmap_and_into (df_get_live_out (bb), &all_hard_regs_bitmap); + bitmap_and_into (DF_LIVE_SUBREG_OUT (bb), &all_hard_regs_bitmap); + bitmap_and_into (DF_LIVE_SUBREG_FULL_OUT (bb), &all_hard_regs_bitmap); + bitmap_and_into (DF_LIVE_SUBREG_PARTIAL_OUT (bb), &all_hard_regs_bitmap); } /* The confluence function used by the DF equation solver to propagate @@ -456,13 +543,77 @@ live_con_fun_0 (basic_block bb) static bool live_con_fun_n (edge e) { - basic_block bb = e->src; - basic_block dest = e->dest; - bitmap bb_liveout = df_get_live_out (bb); - bitmap dest_livein = df_get_live_in (dest); + class df_live_subreg_bb_info *src_bb_info + = df_live_subreg_get_bb_info (e->src->index); + class df_live_subreg_bb_info *dest_bb_info + = df_live_subreg_get_bb_info (e->dest->index); + + if (!has_subreg_live_p) + { + return bitmap_ior_and_compl_into (&src_bb_info->full_out, + &dest_bb_info->full_in, + &all_hard_regs_bitmap); + } + + /* If there has subreg live need be tracked. Calculation formula: + temp_full mean: + 1. partial in out/in, full in other in/out + 2. 
partial in out and in, and merged range is full + temp_range mean: + the range of regno which partial live + src_bb_info->partial_out = (src_bb_info->partial_out | + dest_bb_info->partial_in) & ~temp_full src_bb_info->range_out = copy + (temp_range) src_bb_info->full_out |= dest_bb_info->full_in | temp_full + */ + subregs_live temp_range; + temp_range.add_lives (*src_bb_info->range_out); + temp_range.add_lives (*dest_bb_info->range_in); + + bitmap_head temp_partial_all; + bitmap_initialize (&temp_partial_all, &bitmap_default_obstack); + bitmap_ior (&temp_partial_all, &src_bb_info->partial_out, + &dest_bb_info->partial_in); + + bitmap_head temp_full; + bitmap_initialize (&temp_full, &bitmap_default_obstack); + + /* Collect regno that become full after merge src_bb_info->partial_out + and dest_bb_info->partial_in. */ + unsigned int regno; + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP (&temp_partial_all, FIRST_PSEUDO_REGISTER, regno, bi) + { + if (bitmap_bit_p (&src_bb_info->full_out, regno) + || bitmap_bit_p (&dest_bb_info->full_in, regno)) + { + bitmap_set_bit (&temp_full, regno); + temp_range.remove_live (regno); + continue; + } + else if (!bitmap_bit_p (&src_bb_info->partial_out, regno) + || !bitmap_bit_p (&dest_bb_info->partial_in, regno)) + continue; + + subreg_ranges temp = src_bb_info->range_out->lives.at (regno); + temp.add_ranges (dest_bb_info->range_in->lives.at (regno)); + if (temp.full_p ()) + { + bitmap_set_bit (&temp_full, regno); + temp_range.remove_live (regno); + } + } + + /* Calculating src_bb_info->partial_out and src_bb_info->range_out. */ + bool changed = bitmap_and_compl (&src_bb_info->partial_out, &temp_partial_all, + &temp_full); + changed |= src_bb_info->range_out->copy_lives (temp_range); - return bitmap_ior_and_compl_into (bb_liveout, - dest_livein, &all_hard_regs_bitmap); + /* Calculating src_bb_info->full_out.
*/ + bitmap_ior_and_compl_into (&temp_full, &dest_bb_info->full_in, + &all_hard_regs_bitmap); + changed |= bitmap_ior_into (&src_bb_info->full_out, &temp_full); + + return changed; } /* Indexes of all function blocks. */ @@ -483,8 +634,12 @@ initiate_live_solver (void) { bb_data_t bb_info = get_bb_data (bb); bb_info->bb = bb; - bitmap_initialize (&bb_info->killed_pseudos, &reg_obstack); - bitmap_initialize (&bb_info->gen_pseudos, &reg_obstack); + bitmap_initialize (&bb_info->full_def, &reg_obstack); + bitmap_initialize (&bb_info->partial_def, &reg_obstack); + bitmap_initialize (&bb_info->full_use, &reg_obstack); + bitmap_initialize (&bb_info->partial_use, &reg_obstack); + bb_info->range_def = new subregs_live (); + bb_info->range_use = new subregs_live (); bitmap_set_bit (&all_blocks, bb->index); } } @@ -499,8 +654,12 @@ finish_live_solver (void) FOR_ALL_BB_FN (bb, cfun) { bb_data_t bb_info = get_bb_data (bb); - bitmap_clear (&bb_info->killed_pseudos); - bitmap_clear (&bb_info->gen_pseudos); + bitmap_clear (&bb_info->full_def); + bitmap_clear (&bb_info->partial_def); + bitmap_clear (&bb_info->full_use); + bitmap_clear (&bb_info->partial_use); + delete bb_info->range_def; + delete bb_info->range_use; } free (bb_data); bitmap_clear (&all_hard_regs_bitmap); @@ -663,7 +822,7 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) /* Only has a meaningful value once we've seen a call.
*/ function_abi last_call_abi = default_function_abi; - reg_live_out = df_get_live_out (bb); + reg_live_out = DF_LIVE_SUBREG_OUT (bb); sparseset_clear (pseudos_live); sparseset_clear (pseudos_live_through_calls); sparseset_clear (pseudos_live_through_setjumps); @@ -675,10 +834,13 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) mark_pseudo_live (j); } - bb_gen_pseudos = &get_bb_data (bb)->gen_pseudos; - bb_killed_pseudos = &get_bb_data (bb)->killed_pseudos; - bitmap_clear (bb_gen_pseudos); - bitmap_clear (bb_killed_pseudos); + curr_bb_info = get_bb_data (bb); + bitmap_clear (&curr_bb_info->full_use); + bitmap_clear (&curr_bb_info->partial_use); + bitmap_clear (&curr_bb_info->full_def); + bitmap_clear (&curr_bb_info->partial_def); + curr_bb_info->range_use->clear (); + curr_bb_info->range_def->clear (); freq = REG_FREQ_FROM_BB (bb); if (lra_dump_file != NULL) @@ -1101,16 +1263,16 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) bool live_change_p = false; /* Check if bb border live info was changed. */ unsigned int live_pseudos_num = 0; - EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), - FIRST_PSEUDO_REGISTER, j, bi) + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_IN (bb), FIRST_PSEUDO_REGISTER, j, + bi) { live_pseudos_num++; - if (! sparseset_bit_p (pseudos_live, j)) + if (!sparseset_bit_p (pseudos_live, j)) { live_change_p = true; if (lra_dump_file != NULL) - fprintf (lra_dump_file, - " r%d is removed as live at bb%d start\n", j, bb->index); + fprintf (lra_dump_file, " r%d is removed as live at bb%d start\n", + j, bb->index); break; } } @@ -1120,9 +1282,9 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) live_change_p = true; if (lra_dump_file != NULL) EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j) - if (! 
bitmap_bit_p (df_get_live_in (bb), j)) - fprintf (lra_dump_file, - " r%d is added to live at bb%d start\n", j, bb->index); + if (!bitmap_bit_p (DF_LIVE_SUBREG_IN (bb), j)) + fprintf (lra_dump_file, " r%d is added to live at bb%d start\n", j, + bb->index); } /* See if we'll need an increment at the end of this basic block. An increment is needed if the PSEUDOS_LIVE set is not empty, @@ -1135,8 +1297,9 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) mark_pseudo_dead (i); } - EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, j, bi) - { + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_IN (bb), FIRST_PSEUDO_REGISTER, j, + bi) + { if (sparseset_cardinality (pseudos_live_through_calls) == 0) break; if (sparseset_bit_p (pseudos_live_through_calls, j)) @@ -1151,7 +1314,7 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) if (!TEST_HARD_REG_BIT (hard_regs_spilled_into, i)) continue; - if (bitmap_bit_p (df_get_live_in (bb), i)) + if (bitmap_bit_p (DF_LIVE_SUBREG_IN (bb), i)) continue; live_change_p = true; @@ -1159,7 +1322,8 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) fprintf (lra_dump_file, " hard reg r%d is added to live at bb%d start\n", i, bb->index); - bitmap_set_bit (df_get_live_in (bb), i); + bitmap_set_bit (DF_LIVE_SUBREG_IN (bb), i); + bitmap_set_bit (DF_LIVE_SUBREG_FULL_IN (bb), i); } if (need_curr_point_incr) @@ -1425,10 +1589,24 @@ lra_create_live_ranges_1 (bool all_p, bool dead_insn_p) disappear, e.g. pseudos with used equivalences. 
*/ FOR_EACH_BB_FN (bb, cfun) { - bitmap_clear_range (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, + bitmap_clear_range (DF_LIVE_SUBREG_IN (bb), FIRST_PSEUDO_REGISTER, + max_regno - FIRST_PSEUDO_REGISTER); + bitmap_clear_range (DF_LIVE_SUBREG_FULL_IN (bb), + FIRST_PSEUDO_REGISTER, max_regno - FIRST_PSEUDO_REGISTER); - bitmap_clear_range (df_get_live_out (bb), FIRST_PSEUDO_REGISTER, + bitmap_clear_range (DF_LIVE_SUBREG_PARTIAL_IN (bb), + FIRST_PSEUDO_REGISTER, max_regno - FIRST_PSEUDO_REGISTER); + bitmap_clear_range (DF_LIVE_SUBREG_OUT (bb), FIRST_PSEUDO_REGISTER, + max_regno - FIRST_PSEUDO_REGISTER); + bitmap_clear_range (DF_LIVE_SUBREG_FULL_OUT (bb), + FIRST_PSEUDO_REGISTER, + max_regno - FIRST_PSEUDO_REGISTER); + bitmap_clear_range (DF_LIVE_SUBREG_PARTIAL_OUT (bb), + FIRST_PSEUDO_REGISTER, + max_regno - FIRST_PSEUDO_REGISTER); + DF_LIVE_SUBREG_RANGE_IN (bb)->clear (); + DF_LIVE_SUBREG_RANGE_OUT (bb)->clear (); } /* As we did not change CFG since LRA start we can use DF-infrastructure solver to solve live data flow problem. 
*/ @@ -1441,6 +1619,8 @@ lra_create_live_ranges_1 (bool all_p, bool dead_insn_p) (DF_BACKWARD, NULL, live_con_fun_0, live_con_fun_n, live_trans_fun, &all_blocks, df_get_postorder (DF_BACKWARD), df_get_n_blocks (DF_BACKWARD)); + df_live_subreg_finalize (&all_blocks); + if (lra_dump_file != NULL) { fprintf (lra_dump_file, @@ -1449,16 +1629,28 @@ lra_create_live_ranges_1 (bool all_p, bool dead_insn_p) FOR_EACH_BB_FN (bb, cfun) { bb_data_t bb_info = get_bb_data (bb); - bitmap bb_livein = df_get_live_in (bb); - bitmap bb_liveout = df_get_live_out (bb); fprintf (lra_dump_file, "\nBB %d:\n", bb->index); - lra_dump_bitmap_with_title (" gen:", - &bb_info->gen_pseudos, bb->index); - lra_dump_bitmap_with_title (" killed:", - &bb_info->killed_pseudos, bb->index); - lra_dump_bitmap_with_title (" livein:", bb_livein, bb->index); - lra_dump_bitmap_with_title (" liveout:", bb_liveout, bb->index); + lra_dump_bitmap_with_title (" full use", &bb_info->full_use, + bb->index); + lra_dump_bitmap_with_title (" partial use", + &bb_info->partial_use, bb->index); + lra_dump_bitmap_with_title (" full def", &bb_info->full_def, + bb->index); + lra_dump_bitmap_with_title (" partial def", + &bb_info->partial_def, bb->index); + lra_dump_bitmap_with_title (" live in full", + DF_LIVE_SUBREG_FULL_IN (bb), + bb->index); + lra_dump_bitmap_with_title (" live in partial", + DF_LIVE_SUBREG_PARTIAL_IN (bb), + bb->index); + lra_dump_bitmap_with_title (" live out full", + DF_LIVE_SUBREG_FULL_OUT (bb), + bb->index); + lra_dump_bitmap_with_title (" live out partial", + DF_LIVE_SUBREG_PARTIAL_OUT (bb), + bb->index); } } } diff --git a/gcc/lra-remat.cc b/gcc/lra-remat.cc index 681dcf36331..26d3da07b00 100644 --- a/gcc/lra-remat.cc +++ b/gcc/lra-remat.cc @@ -556,11 +556,11 @@ dump_candidates_and_remat_bb_data (void) fprintf (lra_dump_file, "\nBB %d:\n", bb->index); /* Livein */ fprintf (lra_dump_file, " register live in:"); - dump_regset (df_get_live_in (bb), lra_dump_file); + dump_regset (DF_LIVE_SUBREG_IN 
(bb), lra_dump_file); putc ('\n', lra_dump_file); /* Liveout */ fprintf (lra_dump_file, " register live out:"); - dump_regset (df_get_live_out (bb), lra_dump_file); + dump_regset (DF_LIVE_SUBREG_OUT (bb), lra_dump_file); putc ('\n', lra_dump_file); /* Changed/dead regs: */ fprintf (lra_dump_file, " changed regs:"); @@ -727,7 +727,7 @@ calculate_livein_cands (void) FOR_EACH_BB_FN (bb, cfun) { - bitmap livein_regs = df_get_live_in (bb); + bitmap livein_regs = DF_LIVE_SUBREG_IN (bb); bitmap livein_cands = &get_remat_bb_data (bb)->livein_cands; for (unsigned int i = 0; i < cands_num; i++) { @@ -1064,11 +1064,10 @@ do_remat (void) FOR_EACH_BB_FN (bb, cfun) { CLEAR_HARD_REG_SET (live_hard_regs); - EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), 0, regno, bi) + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_IN (bb), 0, regno, bi) { - int hard_regno = regno < FIRST_PSEUDO_REGISTER - ? regno - : reg_renumber[regno]; + int hard_regno + = regno < FIRST_PSEUDO_REGISTER ? regno : reg_renumber[regno]; if (hard_regno >= 0) SET_HARD_REG_BIT (live_hard_regs, hard_regno); } diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc index a663a1931e3..d38a2ffe2a7 100644 --- a/gcc/lra-spills.cc +++ b/gcc/lra-spills.cc @@ -566,8 +566,26 @@ spill_pseudos (void) "Debug insn #%u is reset because it referenced " "removed pseudo\n", INSN_UID (insn)); } - bitmap_and_compl_into (df_get_live_in (bb), spilled_pseudos); - bitmap_and_compl_into (df_get_live_out (bb), spilled_pseudos); + unsigned int regno; + bitmap_iterator bi; + + bitmap_and_compl_into (DF_LIVE_SUBREG_IN (bb), spilled_pseudos); + bitmap_and_compl_into (DF_LIVE_SUBREG_FULL_IN (bb), spilled_pseudos); + bitmap partial_in = DF_LIVE_SUBREG_PARTIAL_IN (bb); + subregs_live *range_in = DF_LIVE_SUBREG_RANGE_IN (bb); + EXECUTE_IF_AND_IN_BITMAP (partial_in, spilled_pseudos, + FIRST_PSEUDO_REGISTER, regno, bi) + range_in->remove_live (regno); + bitmap_and_compl_into (partial_in, spilled_pseudos); + + bitmap_and_compl_into (DF_LIVE_SUBREG_OUT (bb), 
spilled_pseudos); + bitmap_and_compl_into (DF_LIVE_SUBREG_FULL_OUT (bb), spilled_pseudos); + bitmap partial_out = DF_LIVE_SUBREG_PARTIAL_OUT (bb); + subregs_live *range_out = DF_LIVE_SUBREG_RANGE_OUT (bb); + EXECUTE_IF_AND_IN_BITMAP (partial_out, spilled_pseudos, + FIRST_PSEUDO_REGISTER, regno, bi) + range_out->remove_live (regno); + bitmap_and_compl_into (partial_out, spilled_pseudos); } } } From patchwork Sun Nov 12 12:08:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lehua Ding X-Patchwork-Id: 164251 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp660238vqg; Sun, 12 Nov 2023 04:10:57 -0800 (PST) X-Google-Smtp-Source: AGHT+IHemO9qed9bDeUTz3lXeII46Dyb3K1/IMu7zXmPYPK7gZU0G8MpMh323qTuR5W/sMSBmkS9 X-Received: by 2002:a05:6870:2424:b0:1ef:aba1:1995 with SMTP id n36-20020a056870242400b001efaba11995mr5300429oap.59.1699791057447; Sun, 12 Nov 2023 04:10:57 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1699791057; cv=pass; d=google.com; s=arc-20160816; b=0RNJZzEsIT3SdB7KtKvBMexkAzoNKcn/mjegZUpA/eXVxTV//+lfL7DvvuDcRGaHHg qN318GDV70nYXNQzYP6Tmv23rKNDANRcoBDuhVI+2F6bImk+ryFMnWjwrFuVneOFU1N3 RsPC87xAoUg5+umr5LWFnq6jiSkMntEcHUFBpVxPuh3LkDzTybQQRW5F8duyDqv4AVAA g/5BYHG/mJa409cQufHJMCy57nIuFKf18BORnCz7JbcHALvxlLPSE8cT0rAD6SrzkqWU CZxTVT0SScCVDusmz28+vmcelYhs+dylpN1dx5imydotR7tgScerHvJOi3cJuSmwmmQM pgjg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:arc-filter:dmarc-filter :delivered-to; bh=viwnp7foIOfy+VU8zUP68Chx+HH0VVG3vPIyl7KoIRw=; fh=9Ok8HNl3eD0lUFF4nhUPZJmQfyAUbHnIPw/rSVNIfK0=; b=w2uNq9vDvdkP9Slkn3d+MzYHgpy3BmXjJT5xjHhORR6XJ46X0nLPOx33J67SWowrBO 
smtp.remote-ip=54.204.34.129 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699790935; cv=none; b=gx0LdakkrxI1w8uWw2BsIUIhmlsXx6KFFA+WCKryBQ8ErK6kV1MHNcFNcvyV1hB8l5/7jiHQj7bIIlnkUB+Wfj9YPU2bvvaRvStIDDRTceK0v/iyy1UgWeDj1tF2cn7zfwH8sa93iW5roGsmW1eLJLTIFO155gNFbHlor35P5R4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699790935; c=relaxed/simple; bh=pikDaNVnhaUuEwHpU12Zq79oUSz8HqKIm17PBR9EzUI=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=qJgJt7KewO+FNMgDwXQEDV3wvgrBwokt6Ab9y6CnNre1d8SVh+DBDvo//MFXPxRqamXpD/ETTI6rxRh2QaegiYWwHfmGYcMBUn+ShUoH0hnWnYKZgpSD07IN1YYvs58z+xJfXt+SXREgg7AT9UwedOg6P3iN8Wdgo9kVFb0isyU= ARC-Authentication-Results: i=1; server2.sourceware.org X-QQ-mid: bizesmtp67t1699790923tun14vrq Received: from rios-cad121.hadoop.rioslab.org ( [58.60.1.9]) by bizesmtp.qq.com (ESMTP) with id ; Sun, 12 Nov 2023 20:08:42 +0800 (CST) X-QQ-SSF: 01400000000000C0F000000A0000000 X-QQ-FEAT: +ynUkgUhZJn1Mg+n5EqnxXtvSxHuKDSXt3Re/YbaM2vTfCcDaO15ZrC79t4D2 kvcsIFJVyoBORVvKqimoVk4SkF2TMBhCVOTHZUG+797o0gJ7WT4BLfbV2urkQWch1AclDhQ zUWFA9LZM+Wc06mmhHJout4jktL5V2J7AJ/sefjTrvAZ4/9FQmTpk2QrctVAap1tWAwriaK F6HETGfOfWQGMBHoQrDaUpFYARDO3Ld6LrghETbjHTqko/08+DapliCeKqGlQjJvadmTdLZ KCPMOD755nlirDpCPlwVTdpbMRkkjXjCcV7uCnhJZpXXQt9tsyEbXkP8AfkEOFRqodlxsBg viRVNJspGuEyk3miDD6qJ4ypQfFMWD/4J1o36nv3WGOAyq3gSEOERB9g13qJ//K3jcTLQu1 R+PYLyNC9lQ= X-QQ-GoodBg: 2 X-BIZMAIL-ID: 6025235424158987513 From: Lehua Ding To: gcc-patches@gcc.gnu.org Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai Subject: [PATCH V3 7/7] lra: Support subreg live range track and conflict detect Date: Sun, 12 Nov 2023 20:08:17 +0800 Message-Id: <20231112120817.2635864-8-lehua.ding@rivai.ai> X-Mailer: git-send-email 2.36.3 In-Reply-To: <20231112120817.2635864-1-lehua.ding@rivai.ai> References: <20231112120817.2635864-1-lehua.ding@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz6a-0 
This patch adds subreg liveness tracking to the LRA pass, so that LRA's live ranges agree with IRA's register allocation scheme. Some logic is duplicated between the two passes; it may be unified in the future.

gcc/ChangeLog: * ira-build.cc (setup_pseudos_has_subreg_object): Collect new data for lra to use. (ira_build): Ditto. * lra-assigns.cc (set_offset_conflicts): New function. (setup_live_pseudos_and_spill_after_risky_transforms): Adjust. (lra_assign): Ditto. * lra-constraints.cc (process_alt_operands): Ditto. * lra-int.h (GCC_LRA_INT_H): Ditto. (struct lra_live_range): Ditto. (struct lra_insn_reg): Ditto. (get_range_hard_regs): New. (get_nregs): New. (has_subreg_object_p): New. * lra-lives.cc (INCLUDE_VECTOR): Adjust. (lra_live_range_pool): Ditto. (create_live_range): Ditto. (lra_merge_live_ranges): Ditto. (update_pseudo_point): Ditto. (mark_regno_live): Ditto. (mark_regno_dead): Ditto. (process_bb_lives): Ditto. (remove_some_program_points_and_update_live_ranges): Ditto. (lra_print_live_range_list): Ditto. (class subreg_live_item): New. (create_subregs_live_ranges): New. (lra_create_live_ranges_1): Ditto. * lra.cc (get_range_blocks): Ditto. (get_range_hard_regs): Ditto. (new_insn_reg): Ditto. (collect_non_operand_hard_regs): Ditto.
(initialize_lra_reg_info_element): Ditto. (reg_same_range_p): New. (add_regs_to_insn_regno_info): Adjust. --- gcc/ira-build.cc | 31 ++++ gcc/lra-assigns.cc | 111 ++++++++++++-- gcc/lra-constraints.cc | 18 ++- gcc/lra-int.h | 31 ++++ gcc/lra-lives.cc | 340 ++++++++++++++++++++++++++++++++++------- gcc/lra.cc | 139 +++++++++++++++-- 6 files changed, 585 insertions(+), 85 deletions(-) diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc index f88aeeeeaef..bb29627d375 100644 --- a/gcc/ira-build.cc +++ b/gcc/ira-build.cc @@ -95,6 +95,9 @@ int ira_copies_num; basic block. */ static int last_basic_block_before_change; +/* Record the pseudos that have subreg objects. Used by the LRA pass. */ +bitmap_head pseudos_has_subreg_object; + /* Initialize some members in loop tree node NODE. Use LOOP_NUM for the member loop_num. */ static void @@ -3711,6 +3714,33 @@ update_conflict_hard_reg_costs (void) } } +/* Set up pseudos_has_subreg_object. */ +static void +setup_pseudos_has_subreg_object () +{ + bitmap_initialize (&pseudos_has_subreg_object, &reg_obstack); + ira_allocno_t a; + ira_allocno_iterator ai; + FOR_EACH_ALLOCNO (a, ai) + if (has_subreg_object_p (a)) + { + bitmap_set_bit (&pseudos_has_subreg_object, ALLOCNO_REGNO (a)); + if (ira_dump_file != NULL) + { + fprintf (ira_dump_file, + " a%d(r%d, nregs: %d) has subreg objects:\n", + ALLOCNO_NUM (a), ALLOCNO_REGNO (a), ALLOCNO_NREGS (a)); + ira_allocno_object_iterator oi; + ira_object_t obj; + FOR_EACH_ALLOCNO_OBJECT (a, obj, oi) + fprintf (ira_dump_file, " object %d: start: %d, nregs: %d\n", + OBJECT_INDEX (obj), OBJECT_START (obj), + OBJECT_NREGS (obj)); + fprintf (ira_dump_file, "\n"); + } + } +} + /* Create a internal representation (IR) for IRA (allocnos, copies, loop tree nodes). 
The function returns TRUE if we generate loop structure (besides nodes representing all function and the basic @@ -3731,6 +3761,7 @@ ira_build (void) create_allocnos (); ira_costs (); create_allocno_objects (); + setup_pseudos_has_subreg_object (); ira_create_allocno_live_ranges (); remove_unnecessary_regions (false); ira_compress_allocno_live_ranges (); diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc index d2ebcfd5056..6588a740162 100644 --- a/gcc/lra-assigns.cc +++ b/gcc/lra-assigns.cc @@ -1131,6 +1131,52 @@ assign_hard_regno (int hard_regno, int regno) /* Array used for sorting different pseudos. */ static int *sorted_pseudos; +/* The detailed conflict offsets when two live ranges conflict. Used to record + partial conflicts. */ +static bitmap_head live_range_conflicts; + +/* Set the conflict offsets of the two registers REGNO1 and REGNO2. Use the + regno with the bigger nregs as the base. */ +static void +set_offset_conflicts (int regno1, int regno2) +{ + gcc_assert (reg_renumber[regno1] >= 0 && reg_renumber[regno2] >= 0); + int nregs1 = get_nregs (regno1); + int nregs2 = get_nregs (regno2); + if (nregs1 < nregs2) + { + std::swap (nregs1, nregs2); + std::swap (regno1, regno2); + } + + lra_live_range_t r1 = lra_reg_info[regno1].live_ranges; + lra_live_range_t r2 = lra_reg_info[regno2].live_ranges; + int total = nregs1; + + bitmap_clear (&live_range_conflicts); + while (r1 != NULL && r2 != NULL) + { + if (r1->start > r2->finish) + r1 = r1->next; + else if (r2->start > r1->finish) + r2 = r2->next; + else + { + for (const subreg_range &range1 : r1->subreg.ranges) + for (const subreg_range &range2 : r2->subreg.ranges) + /* Record all overlapping offsets. 
*/ + for (int i = range1.start - (range2.end - range2.start) + 1; + i < range1.end; i++) + if (i >= 0 && i < total) + bitmap_set_bit (&live_range_conflicts, i); + if (r1->finish < r2->finish) + r1 = r1->next; + else + r2 = r2->next; + } + } +} + /* The constraints pass is allowed to create equivalences between pseudos that make the current allocation "incorrect" (in the sense that pseudos are assigned to hard registers from their own conflict @@ -1226,19 +1272,56 @@ setup_live_pseudos_and_spill_after_risky_transforms (bitmap the same hard register. */ || hard_regno != reg_renumber[conflict_regno]) { - int conflict_hard_regno = reg_renumber[conflict_regno]; - - biggest_mode = lra_reg_info[conflict_regno].biggest_mode; - biggest_nregs = hard_regno_nregs (conflict_hard_regno, - biggest_mode); - nregs_diff - = (biggest_nregs - - hard_regno_nregs (conflict_hard_regno, - PSEUDO_REGNO_MODE (conflict_regno))); - add_to_hard_reg_set (&conflict_set, - biggest_mode, - conflict_hard_regno - - (WORDS_BIG_ENDIAN ? nregs_diff : 0)); + if (hard_regno >= 0 && reg_renumber[conflict_regno] >= 0 + && (has_subreg_object_p (regno) + || has_subreg_object_p (conflict_regno))) + { + int nregs1 = get_nregs (regno); + int nregs2 = get_nregs (conflict_regno); + /* Quick check: the two hard register ranges do not overlap at all. */ + if (hard_regno + nregs1 <= reg_renumber[conflict_regno] + || reg_renumber[conflict_regno] + nregs2 <= hard_regno) + continue; + + /* The ranges partially overlap; check whether the overlap is ok. */ + set_offset_conflicts (regno, conflict_regno); + if (nregs1 >= nregs2) + EXECUTE_IF_SET_IN_BITMAP (&live_range_conflicts, 0, k, bi) + { + int start_regno + = WORDS_BIG_ENDIAN + ? reg_renumber[conflict_regno] + nregs2 + k - nregs1 + : reg_renumber[conflict_regno] - k; + if (start_regno >= 0 && hard_regno == start_regno) + SET_HARD_REG_BIT (conflict_set, start_regno); + } + else + EXECUTE_IF_SET_IN_BITMAP (&live_range_conflicts, 0, k, bi) + { + int start_regno + = WORDS_BIG_ENDIAN + ? 
reg_renumber[conflict_regno] + nregs2 - k - nregs1 + : reg_renumber[conflict_regno] + k; + if (start_regno < FIRST_PSEUDO_REGISTER + && hard_regno == start_regno) + SET_HARD_REG_BIT (conflict_set, start_regno); + } + } + else + { + int conflict_hard_regno = reg_renumber[conflict_regno]; + + biggest_mode = lra_reg_info[conflict_regno].biggest_mode; + biggest_nregs + = hard_regno_nregs (conflict_hard_regno, biggest_mode); + nregs_diff + = (biggest_nregs + - hard_regno_nregs (conflict_hard_regno, + PSEUDO_REGNO_MODE (conflict_regno))); + add_to_hard_reg_set (&conflict_set, biggest_mode, + conflict_hard_regno + - (WORDS_BIG_ENDIAN ? nregs_diff : 0)); + } } if (! overlaps_hard_reg_set_p (conflict_set, mode, hard_regno)) { @@ -1637,7 +1720,9 @@ lra_assign (bool &fails_p) init_regno_assign_info (); bitmap_initialize (&all_spilled_pseudos, &reg_obstack); create_live_range_start_chains (); + bitmap_initialize (&live_range_conflicts, &reg_obstack); setup_live_pseudos_and_spill_after_risky_transforms (&all_spilled_pseudos); + bitmap_clear (&live_range_conflicts); if (! lra_hard_reg_split_p && ! lra_asm_error_p && flag_checking) /* Check correctness of allocation but only when there are no hard reg splits and asm errors as in the case of errors explicit insns involving diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc index c3ad846b97b..912d0c3feec 100644 --- a/gcc/lra-constraints.cc +++ b/gcc/lra-constraints.cc @@ -2363,13 +2363,19 @@ process_alt_operands (int only_alternative) { /* We should reject matching of an early clobber operand if the matching operand is - not dying in the insn. */ - if (!TEST_BIT (curr_static_id->operand[m] - .early_clobber_alts, nalt) + not dying in the insn. But for a subreg of a pseudo + whose subreg liveness is tracked in IRA, there is no + REG_DEAD note; in that case we treat the matching as + ok. 
*/ + if (!TEST_BIT ( + curr_static_id->operand[m].early_clobber_alts, + nalt) || operand_reg[nop] == NULL_RTX - || (find_regno_note (curr_insn, REG_DEAD, - REGNO (op)) - || REGNO (op) == REGNO (operand_reg[m]))) + || find_regno_note (curr_insn, REG_DEAD, REGNO (op)) + || (read_modify_subreg_p ( + *curr_id->operand_loc[nop]) + && has_subreg_object_p (REGNO (op))) + || REGNO (op) == REGNO (operand_reg[m])) match_p = true; } if (match_p) diff --git a/gcc/lra-int.h b/gcc/lra-int.h index 678377d9ec6..5a97bd61475 100644 --- a/gcc/lra-int.h +++ b/gcc/lra-int.h @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_LRA_INT_H #define GCC_LRA_INT_H +#include "lra.h" #include "subreg-live-range.h" #define lra_assert(c) gcc_checking_assert (c) @@ -48,6 +49,8 @@ struct lra_live_range lra_live_range_t next; /* Pointer to structures with the same start. */ lra_live_range_t start_next; + /* Object whose live range is described by given structure. */ + subreg_ranges subreg; }; typedef struct lra_copy *lra_copy_t; @@ -110,6 +113,8 @@ public: /* The biggest size mode in which each pseudo reg is referred in whole function (possibly via subreg). */ machine_mode biggest_mode; + /* The real reg MODE. */ + machine_mode reg_mode; /* Live ranges of the pseudo. */ lra_live_range_t live_ranges; /* This member is set up in lra-lives.cc for subsequent @@ -161,6 +166,12 @@ struct lra_insn_reg unsigned int subreg_p : 1; /* The corresponding regno of the register. */ int regno; + /* The start and end of current ref of blocks, remember the use/def can be + a normal subreg. */ + int start, end; + /* The start and end of current ref of hard regs, remember the use/def can be + a normal subreg. */ + int start_reg, end_reg; /* Next reg info of the same insn. 
*/ struct lra_insn_reg *next; }; @@ -332,6 +343,8 @@ extern struct lra_insn_reg *lra_get_insn_regs (int); extern void lra_free_copies (void); extern void lra_create_copy (int, int, int); extern lra_copy_t lra_get_copy (int); +extern subreg_range +get_range_hard_regs (int regno, const subreg_range &r); extern int lra_new_regno_start; extern int lra_constraint_new_regno_start; @@ -533,4 +546,22 @@ lra_assign_reg_val (int from, int to) lra_reg_info[to].offset = lra_reg_info[from].offset; } +/* Return the number of regs of REGNO. */ +inline int +get_nregs (int regno) +{ + enum reg_class aclass = lra_get_allocno_class (regno); + gcc_assert (aclass != NO_REGS); + int nregs = ira_reg_class_max_nregs[aclass][lra_reg_info[regno].reg_mode]; + return nregs; +} + +extern bitmap_head pseudos_has_subreg_object; +/* Return true if pseudo REGNO has subreg live ranges. */ +inline bool +has_subreg_object_p (int regno) +{ + return bitmap_bit_p (&pseudos_has_subreg_object, regno); +} + #endif /* GCC_LRA_INT_H */ diff --git a/gcc/lra-lives.cc b/gcc/lra-lives.cc index d93921ad302..8a7c653fb09 100644 --- a/gcc/lra-lives.cc +++ b/gcc/lra-lives.cc @@ -26,6 +26,7 @@ along with GCC; see the file COPYING3. If not see stack memory slots to spilled pseudos. */ #include "config.h" +#define INCLUDE_VECTOR #include "system.h" #include "coretypes.h" #include "backend.h" @@ -97,6 +98,9 @@ static bitmap_head temp_bitmap; /* Pool for pseudo live ranges. */ static object_allocator<lra_live_range> lra_live_range_pool ("live ranges"); +/* Store def/use points of has_subreg_object_p registers. */ +static class subregs_live_points *live_points; + /* Free live range list LR. */ static void free_live_range_list (lra_live_range_t lr) @@ -113,16 +117,26 @@ free_live_range_list (lra_live_range_t lr) /* Create and return pseudo live range with given attributes. 
*/ static lra_live_range_t -create_live_range (int regno, int start, int finish, lra_live_range_t next) +create_live_range (int regno, const subreg_ranges &sr, int start, int finish, + lra_live_range_t next) { lra_live_range_t p = lra_live_range_pool.allocate (); p->regno = regno; p->start = start; p->finish = finish; p->next = next; + p->subreg = sr; return p; } +static lra_live_range_t +create_live_range (int regno, int start, int finish, lra_live_range_t next) +{ + subreg_ranges sr = subreg_ranges (1); + sr.add_range (1, subreg_range (0, 1)); + return create_live_range (regno, sr, start, finish, next); +} + /* Copy live range R and return the result. */ static lra_live_range_t copy_live_range (lra_live_range_t r) @@ -164,7 +178,8 @@ lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2) if (r1->start < r2->start) std::swap (r1, r2); - if (r1->start == r2->finish + 1) + if (r1->start == r2->finish + 1 + && (r1->regno != r2->regno || r1->subreg.same_p (r2->subreg))) { /* Joint ranges: merge r1 and r2 into r1. */ r1->start = r2->start; @@ -174,7 +189,8 @@ lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2) } else { - gcc_assert (r2->finish + 1 < r1->start); + gcc_assert (r2->finish + 1 < r1->start + || !r1->subreg.same_p (r2->subreg)); /* Add r1 to the result. */ if (first == NULL) first = last = r1; @@ -237,6 +253,10 @@ sparseset_contains_pseudos_p (sparseset a) return false; } +static void +update_pseudo_point (int regno, const subreg_range &range, int point, + enum point_type type); + /* Mark pseudo REGNO as living or dying at program point POINT, depending on whether TYPE is a definition or a use. If this is the first reference to REGNO that we've encountered, then create a new live range for it. */ @@ -249,27 +269,78 @@ update_pseudo_point (int regno, int point, enum point_type type) /* Don't compute points for hard registers. 
*/ if (HARD_REGISTER_NUM_P (regno)) return; + if (!complete_info_p && lra_get_regno_hard_regno (regno) >= 0) + return; - if (complete_info_p || lra_get_regno_hard_regno (regno) < 0) + if (has_subreg_object_p (regno)) { - if (type == DEF_POINT) - { - if (sparseset_bit_p (pseudos_live, regno)) - { - p = lra_reg_info[regno].live_ranges; - lra_assert (p != NULL); - p->finish = point; - } - } - else /* USE_POINT */ + update_pseudo_point (regno, subreg_range (0, get_nregs (regno)), point, + type); + return; + } + + if (type == DEF_POINT) + { + if (sparseset_bit_p (pseudos_live, regno)) { - if (!sparseset_bit_p (pseudos_live, regno) - && ((p = lra_reg_info[regno].live_ranges) == NULL - || (p->finish != point && p->finish + 1 != point))) - lra_reg_info[regno].live_ranges - = create_live_range (regno, point, -1, p); + p = lra_reg_info[regno].live_ranges; + lra_assert (p != NULL); + p->finish = point; } } + else /* USE_POINT */ + { + if (!sparseset_bit_p (pseudos_live, regno) + && ((p = lra_reg_info[regno].live_ranges) == NULL + || (p->finish != point && p->finish + 1 != point))) + lra_reg_info[regno].live_ranges + = create_live_range (regno, point, -1, p); + } +} + +/* Like the above update_pseudo_point but for has_subreg_object_p REGNO. */ +static void +update_pseudo_point (int regno, const subreg_range &range, int point, + enum point_type type) +{ + /* Don't compute points for hard registers. */ + if (HARD_REGISTER_NUM_P (regno)) + return; + + if (!complete_info_p && lra_get_regno_hard_regno (regno) >= 0) + { + if (has_subreg_object_p (regno)) + live_points->add_range (regno, get_nregs (regno), range, + type == DEF_POINT); + return; + } + + if (!has_subreg_object_p (regno)) + { + update_pseudo_point (regno, point, type); + return; + } + + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, " %s r%d", + type == DEF_POINT ? 
"def" : "use", regno); + fprintf (lra_dump_file, "[subreg: start %d, nregs: %d]", range.start, + range.end - range.start); + fprintf (lra_dump_file, " at point %d\n", point); + } + + live_points->add_point (regno, get_nregs (regno), range, type == DEF_POINT, + point); +} + +/* Update each range in SR. */ +static void +update_pseudo_point (int regno, const subreg_ranges sr, int point, + enum point_type type) +{ + for (const subreg_range &range : sr.ranges) + update_pseudo_point (regno, range, point, type); } /* Structure describing local BB data used for pseudo @@ -354,12 +425,18 @@ mark_pseudo_dead (int regno) if (!sparseset_bit_p (pseudos_live, regno)) return; + /* Just return if regno have partial subreg live for subreg access. */ + if (has_subreg_object_p (regno) && !live_points->empty_live_p (regno)) + return; + sparseset_clear_bit (pseudos_live, regno); sparseset_set_bit (start_dying, regno); } +static void +mark_regno_live (int regno, const subreg_range &range, machine_mode mode); /* Mark register REGNO (pseudo or hard register) in MODE as being live - and update BB_GEN_PSEUDOS. */ + and update CURR_BB_INFO. */ static void mark_regno_live (int regno, machine_mode mode) { @@ -370,6 +447,11 @@ mark_regno_live (int regno, machine_mode mode) for (last = end_hard_regno (mode, regno); regno < last; regno++) make_hard_regno_live (regno); } + else if (has_subreg_object_p (regno)) + { + machine_mode mode = lra_reg_info[regno].reg_mode; + mark_regno_live (regno, subreg_range (0, get_nregs (regno)), mode); + } else { mark_pseudo_live (regno); @@ -379,9 +461,26 @@ mark_regno_live (int regno, machine_mode mode) } } +/* Like the above mark_regno_dead but for has_subreg_object_p REGNO. 
*/ +static void +mark_regno_live (int regno, const subreg_range &range, machine_mode mode) +{ + if (HARD_REGISTER_NUM_P (regno) || !has_subreg_object_p (regno)) + mark_regno_live (regno, mode); + else + { + mark_pseudo_live (regno); + machine_mode mode = lra_reg_info[regno].reg_mode; + if (!range.full_p (get_nregs (regno))) + has_subreg_live_p = true; + add_subreg_range (curr_bb_info, regno, mode, range, false); + } +} +static void +mark_regno_dead (int regno, const subreg_range &range, machine_mode mode); /* Mark register REGNO (pseudo or hard register) in MODE as being dead - and update BB_GEN_PSEUDOS and BB_KILLED_PSEUDOS. */ + and update CURR_BB_INFO. */ static void mark_regno_dead (int regno, machine_mode mode) { @@ -392,6 +491,12 @@ mark_regno_dead (int regno, machine_mode mode) for (last = end_hard_regno (mode, regno); regno < last; regno++) make_hard_regno_dead (regno); } + else if (has_subreg_object_p (regno)) + { + machine_mode mode = lra_reg_info[regno].reg_mode; + subreg_range range = subreg_range (0, get_nregs (regno)); + mark_regno_dead (regno, range, mode); + } else { mark_pseudo_dead (regno); @@ -402,7 +507,22 @@ mark_regno_dead (int regno, machine_mode mode) } } - +/* Like the above mark_regno_dead but for has_subreg_object_p REGNO. */ +static void +mark_regno_dead (int regno, const subreg_range &range, machine_mode mode) +{ + if (HARD_REGISTER_NUM_P (regno) || !has_subreg_object_p (regno)) + mark_regno_dead (regno, mode); + else + { + mark_pseudo_dead (regno); + machine_mode mode = lra_reg_info[regno].reg_mode; + if (!range.full_p (get_nregs (regno))) + has_subreg_live_p = true; + remove_subreg_range (curr_bb_info, regno, mode, range); + add_subreg_range (curr_bb_info, regno, mode, range, true); + } +} /* This page contains code for making global live analysis of pseudos. 
The code works only when pseudo live info is changed on a BB @@ -823,6 +943,8 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) function_abi last_call_abi = default_function_abi; reg_live_out = DF_LIVE_SUBREG_OUT (bb); + bitmap reg_live_partial_out = DF_LIVE_SUBREG_PARTIAL_OUT (bb); + subregs_live *range_out = DF_LIVE_SUBREG_RANGE_OUT (bb); sparseset_clear (pseudos_live); sparseset_clear (pseudos_live_through_calls); sparseset_clear (pseudos_live_through_setjumps); @@ -830,7 +952,12 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) hard_regs_live &= ~eliminable_regset; EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi) { - update_pseudo_point (j, curr_point, USE_POINT); + if (bitmap_bit_p (reg_live_partial_out, j) && has_subreg_object_p (j)) + for (const subreg_range &r : range_out->lives.at (j).ranges) + update_pseudo_point (j, get_range_hard_regs (j, r), curr_point, + USE_POINT); + else + update_pseudo_point (j, curr_point, USE_POINT); mark_pseudo_live (j); } @@ -1023,8 +1150,11 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) for (reg = curr_id->regs; reg != NULL; reg = reg->next) if (reg->type != OP_IN) { - update_pseudo_point (reg->regno, curr_point, USE_POINT); - mark_regno_live (reg->regno, reg->biggest_mode); + const subreg_range &range = subreg_range (reg->start, reg->end); + update_pseudo_point (reg->regno, + get_range_hard_regs (reg->regno, range), + curr_point, USE_POINT); + mark_regno_live (reg->regno, range, reg->biggest_mode); /* ??? Should be a no-op for unused registers. */ check_pseudos_live_through_calls (reg->regno, last_call_abi); } @@ -1045,17 +1175,20 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) /* See which defined values die here. */ for (reg = curr_id->regs; reg != NULL; reg = reg->next) - if (reg->type != OP_IN - && ! reg_early_clobber_p (reg, n_alt) && ! 
reg->subreg_p) + if (reg->type != OP_IN && !reg_early_clobber_p (reg, n_alt) + && (!reg->subreg_p || has_subreg_object_p (reg->regno))) { + const subreg_range &range = subreg_range (reg->start, reg->end); if (reg->type == OP_OUT) - update_pseudo_point (reg->regno, curr_point, DEF_POINT); - mark_regno_dead (reg->regno, reg->biggest_mode); + update_pseudo_point (reg->regno, + get_range_hard_regs (reg->regno, range), + curr_point, DEF_POINT); + mark_regno_dead (reg->regno, range, reg->biggest_mode); } for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) - if (reg->type != OP_IN - && ! reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p) + if (reg->type != OP_IN && !reg_early_clobber_p (reg, n_alt) + && !reg->subreg_p) make_hard_regno_dead (reg->regno); if (curr_id->arg_hard_regs != NULL) @@ -1086,7 +1219,7 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) /* Increment the current program point if we must. */ if (sparseset_contains_pseudos_p (unused_set) - || sparseset_contains_pseudos_p (start_dying)) + || sparseset_contains_pseudos_p (start_dying) || has_subreg_live_p) next_program_point (curr_point, freq); /* If we removed the source reg from a simple register copy from the @@ -1107,9 +1240,12 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) for (reg = curr_id->regs; reg != NULL; reg = reg->next) if (reg->type != OP_OUT) { + const subreg_range &range = subreg_range (reg->start, reg->end); if (reg->type == OP_IN) - update_pseudo_point (reg->regno, curr_point, USE_POINT); - mark_regno_live (reg->regno, reg->biggest_mode); + update_pseudo_point (reg->regno, + get_range_hard_regs (reg->regno, range), + curr_point, USE_POINT); + mark_regno_live (reg->regno, range, reg->biggest_mode); check_pseudos_live_through_calls (reg->regno, last_call_abi); } @@ -1129,22 +1265,25 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) /* Mark early clobber outputs dead. 
*/ for (reg = curr_id->regs; reg != NULL; reg = reg->next) - if (reg->type != OP_IN - && reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p) + if (reg->type != OP_IN && reg_early_clobber_p (reg, n_alt) + && (!reg->subreg_p || has_subreg_object_p (reg->regno))) { + const subreg_range &range = subreg_range (reg->start, reg->end); if (reg->type == OP_OUT) - update_pseudo_point (reg->regno, curr_point, DEF_POINT); - mark_regno_dead (reg->regno, reg->biggest_mode); + update_pseudo_point (reg->regno, + get_range_hard_regs (reg->regno, range), + curr_point, DEF_POINT); + mark_regno_dead (reg->regno, range, reg->biggest_mode); /* We're done processing inputs, so make sure early clobber operands that are both inputs and outputs are still live. */ if (reg->type == OP_INOUT) - mark_regno_live (reg->regno, reg->biggest_mode); + mark_regno_live (reg->regno, range, reg->biggest_mode); } for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) - if (reg->type != OP_IN - && reg_early_clobber_p (reg, n_alt) && ! reg->subreg_p) + if (reg->type != OP_IN && reg_early_clobber_p (reg, n_alt) + && !reg->subreg_p) { struct lra_insn_reg *reg2; @@ -1160,7 +1299,7 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) /* Increment the current program point if we must. */ if (sparseset_contains_pseudos_p (dead_set) - || sparseset_contains_pseudos_p (start_dying)) + || sparseset_contains_pseudos_p (start_dying) || has_subreg_live_p) next_program_point (curr_point, freq); /* Update notes. 
*/ @@ -1293,13 +1432,17 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i) { - update_pseudo_point (i, curr_point, DEF_POINT); + if (has_subreg_object_p (i)) + update_pseudo_point (i, live_points->subreg_live_ranges.at (i), + curr_point, DEF_POINT); + else + update_pseudo_point (i, curr_point, DEF_POINT); mark_pseudo_dead (i); } - EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_IN (bb), FIRST_PSEUDO_REGISTER, j, - bi) - { + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_IN (bb), FIRST_PSEUDO_REGISTER, j, + bi) + { if (sparseset_cardinality (pseudos_live_through_calls) == 0) break; if (sparseset_bit_p (pseudos_live_through_calls, j)) @@ -1400,7 +1543,8 @@ remove_some_program_points_and_update_live_ranges (void) next_r = r->next; r->start = map[r->start]; r->finish = map[r->finish]; - if (prev_r == NULL || prev_r->start > r->finish + 1) + if (prev_r == NULL || prev_r->start > r->finish + 1 + || !prev_r->subreg.same_p (r->subreg)) { prev_r = r; continue; @@ -1418,8 +1562,18 @@ remove_some_program_points_and_update_live_ranges (void) void lra_print_live_range_list (FILE *f, lra_live_range_t r) { - for (; r != NULL; r = r->next) - fprintf (f, " [%d..%d]", r->start, r->finish); + if (r != NULL && has_subreg_object_p (r->regno)) + { + for (; r != NULL; r = r->next) + { + fprintf (f, " [%d..%d]{", r->start, r->finish); + r->subreg.dump (f); + fprintf (f, "}"); + } + } + else + for (; r != NULL; r = r->next) + fprintf (f, " [%d..%d]", r->start, r->finish); fprintf (f, "\n"); } @@ -1492,7 +1646,84 @@ compress_live_ranges (void) } } - +/* Use to temp record subregs live range in create_subregs_live_ranges function. + */ +class subreg_live_item +{ +public: + subreg_ranges subreg; + int start, finish; +}; + +/* Create subreg live ranges from objects def/use point info. 
*/ +static void +create_subregs_live_ranges () +{ + for (const auto &subreg_point_it : live_points->subreg_points) + { + unsigned int regno = subreg_point_it.first; + const class live_points &points = subreg_point_it.second; + class lra_reg *reg_info = &lra_reg_info[regno]; + std::vector<subreg_live_item> temps; + gcc_assert (has_subreg_object_p (regno)); + for (const auto &point_it : points.points) + { + int point = point_it.first; + const live_point &regs = point_it.second; + gcc_assert (temps.empty () || temps.back ().finish <= point); + if (!regs.use_reg.empty_p ()) + { + if (temps.empty ()) + temps.push_back ({regs.use_reg, point, -1}); + else if (temps.back ().finish == -1) + { + if (!temps.back ().subreg.same_p (regs.use_reg)) + { + if (temps.back ().start == point) + temps.back ().subreg.add_ranges (regs.use_reg); + else + { + temps.back ().finish = point - 1; + + subreg_ranges temp = regs.use_reg; + temp.add_ranges (temps.back ().subreg); + temps.push_back ({temp, point, -1}); + } + } + } + else if (temps.back ().subreg.same_p (regs.use_reg) + && (temps.back ().finish == point + || temps.back ().finish + 1 == point)) + temps.back ().finish = -1; + else + temps.push_back ({regs.use_reg, point, -1}); + } + if (!regs.def_reg.empty_p ()) + { + gcc_assert (!temps.empty ()); + if (regs.def_reg.include_ranges_p (temps.back ().subreg)) + temps.back ().finish = point; + else if (temps.back ().subreg.include_ranges_p (regs.def_reg)) + { + temps.back ().finish = point; + + subreg_ranges diff = temps.back ().subreg; + diff.remove_ranges (regs.def_reg); + temps.push_back ({diff, point + 1, -1}); + } + else + gcc_unreachable (); + } + } + + gcc_assert (reg_info->live_ranges == NULL); + + for (const subreg_live_item &item : temps) + reg_info->live_ranges + = create_live_range (regno, item.subreg, item.start, item.finish, + reg_info->live_ranges); + } +} /* The number of the current live range pass. 
*/ int lra_live_range_iter; @@ -1573,6 +1804,8 @@ lra_create_live_ranges_1 (bool all_p, bool dead_insn_p) int n = inverted_rev_post_order_compute (cfun, rpo); lra_assert (n == n_basic_blocks_for_fn (cfun)); bb_live_change_p = false; + has_subreg_live_p = false; + live_points = new subregs_live_points (); for (i = 0; i < n; ++i) { bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]); @@ -1655,9 +1888,14 @@ lra_create_live_ranges_1 (bool all_p, bool dead_insn_p) } } lra_live_max_point = curr_point; + create_subregs_live_ranges (); if (lra_dump_file != NULL) - print_live_ranges (lra_dump_file); + { + live_points->dump (lra_dump_file); + print_live_ranges (lra_dump_file); + } /* Clean up. */ + delete live_points; sparseset_free (unused_set); sparseset_free (dead_set); sparseset_free (start_dying); diff --git a/gcc/lra.cc b/gcc/lra.cc index bcc00ff7d6b..23fc0daf1ed 100644 --- a/gcc/lra.cc +++ b/gcc/lra.cc @@ -566,6 +566,54 @@ lra_asm_insn_error (rtx_insn *insn) /* Pools for insn reg info. */ object_allocator<lra_insn_reg> lra_insn_reg_pool ("insn regs"); +/* Return the subreg range of rtx SUBREG in blocks. */ +static subreg_range +get_range_blocks (int regno, bool subreg_p, machine_mode reg_mode, + poly_int64 offset, poly_int64 size) +{ + gcc_assert (has_subreg_object_p (regno)); + int nblocks = get_nblocks (reg_mode); + if (!subreg_p) + return subreg_range (0, nblocks); + + poly_int64 unit_size = REGMODE_NATURAL_SIZE (reg_mode); + poly_int64 left = offset + size; + + int subreg_start = -1; + int subreg_nregs = -1; + for (int i = 0; i < nblocks; i += 1) + { + poly_int64 right = unit_size * (i + 1); + if (subreg_start < 0 && maybe_lt (offset, right)) + subreg_start = i; + if (subreg_nregs < 0 && maybe_le (left, right)) + { + subreg_nregs = i + 1 - subreg_start; + break; + } + } + gcc_assert (subreg_start >= 0 && subreg_nregs > 0); + return subreg_range (subreg_start, subreg_start + subreg_nregs); +} + +/* Return the subreg range of rtx SUBREG in hard regs. 
*/ +subreg_range +get_range_hard_regs (int regno, const subreg_range &r) +{ + if (!has_subreg_object_p (regno) || lra_reg_info[regno].reg_mode == VOIDmode) + return subreg_range (0, 1); + enum reg_class aclass = lra_get_allocno_class (regno); + gcc_assert (aclass != NO_REGS); + int nregs = ira_reg_class_max_nregs[aclass][lra_reg_info[regno].reg_mode]; + int nblocks = get_nblocks (lra_reg_info[regno].reg_mode); + int times = nblocks / nregs; + gcc_assert (nblocks >= nregs && times * nregs == nblocks); + int start = r.start / times; + int end = CEIL (r.end, times); + + return subreg_range (start, end); +} + /* Create LRA insn related info about a reference to REGNO in INSN with TYPE (in/out/inout), biggest reference mode MODE, flag that it is reference through subreg (SUBREG_P), and reference to the next @@ -573,21 +621,49 @@ object_allocator<lra_insn_reg> lra_insn_reg_pool ("insn regs"); alternatives in which it can be early clobbered are given by EARLY_CLOBBER_ALTS. */ static struct lra_insn_reg * -new_insn_reg (rtx_insn *insn, int regno, enum op_type type, - machine_mode mode, bool subreg_p, - alternative_mask early_clobber_alts, +new_insn_reg (rtx_insn *insn, int regno, enum op_type type, poly_int64 size, + poly_int64 offset, machine_mode mode, machine_mode reg_mode, + bool subreg_p, alternative_mask early_clobber_alts, struct lra_insn_reg *next) { lra_insn_reg *ir = lra_insn_reg_pool.allocate (); ir->type = type; ir->biggest_mode = mode; - if (NONDEBUG_INSN_P (insn) - && partial_subreg_p (lra_reg_info[regno].biggest_mode, mode)) - lra_reg_info[regno].biggest_mode = mode; + if (NONDEBUG_INSN_P (insn)) + { + if (partial_subreg_p (lra_reg_info[regno].biggest_mode, mode)) + { + lra_reg_info[regno].biggest_mode = mode; + } + + if (regno >= FIRST_PSEUDO_REGISTER) + { + if (lra_reg_info[regno].reg_mode == VOIDmode) + lra_reg_info[regno].reg_mode = reg_mode; + else + gcc_assert (maybe_eq (GET_MODE_SIZE (lra_reg_info[regno].reg_mode), + GET_MODE_SIZE (reg_mode))); + } + } ir->subreg_p 
= subreg_p; ir->early_clobber_alts = early_clobber_alts; ir->regno = regno; ir->next = next; + if (has_subreg_object_p (regno)) + { + const subreg_range &r + = get_range_blocks (regno, subreg_p, reg_mode, offset, size); + ir->start = r.start; + ir->end = r.end; + const subreg_range &r_hard = get_range_hard_regs (regno, r); + ir->start_reg = r_hard.start; + ir->end_reg = r_hard.end; + } + else + { + ir->start = 0; + ir->end = 1; + } return ir; } @@ -887,11 +963,18 @@ collect_non_operand_hard_regs (rtx_insn *insn, rtx *x, return list; mode = GET_MODE (op); subreg_p = false; + poly_int64 size = GET_MODE_SIZE (mode); + poly_int64 offset = 0; if (code == SUBREG) { mode = wider_subreg_mode (op); if (read_modify_subreg_p (op)) - subreg_p = true; + { + offset = SUBREG_BYTE (op); + subreg_p = true; + } + else + size = GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))); op = SUBREG_REG (op); code = GET_CODE (op); } @@ -925,7 +1008,8 @@ collect_non_operand_hard_regs (rtx_insn *insn, rtx *x, && ! (FIRST_STACK_REG <= regno && regno <= LAST_STACK_REG)); #endif - list = new_insn_reg (data->insn, regno, type, mode, subreg_p, + list = new_insn_reg (data->insn, regno, type, size, offset, mode, + GET_MODE (op), subreg_p, early_clobber ? ALL_ALTERNATIVES : 0, list); } } @@ -1354,6 +1438,7 @@ initialize_lra_reg_info_element (int i) lra_reg_info[i].preferred_hard_regno_profit1 = 0; lra_reg_info[i].preferred_hard_regno_profit2 = 0; lra_reg_info[i].biggest_mode = VOIDmode; + lra_reg_info[i].reg_mode = VOIDmode; lra_reg_info[i].live_ranges = NULL; lra_reg_info[i].nrefs = lra_reg_info[i].freq = 0; lra_reg_info[i].last_reload = 0; @@ -1459,7 +1544,21 @@ lra_get_copy (int n) return copy_vec[n]; } - +/* Return true if REG occupied the same blocks as OFFSET + SIZE subreg. 
*/ +static bool +reg_same_range_p (lra_insn_reg *reg, poly_int64 offset, poly_int64 size, + bool subreg_p) +{ + if (has_subreg_object_p (reg->regno)) + { + const subreg_range &r + = get_range_blocks (reg->regno, subreg_p, + lra_reg_info[reg->regno].reg_mode, offset, size); + return r.start == reg->start && r.end == reg->end; + } + else + return true; +} /* This page contains code dealing with info about registers in insns. */ @@ -1483,11 +1582,18 @@ add_regs_to_insn_regno_info (lra_insn_recog_data_t data, rtx x, code = GET_CODE (x); mode = GET_MODE (x); subreg_p = false; + poly_int64 size = GET_MODE_SIZE (mode); + poly_int64 offset = 0; if (GET_CODE (x) == SUBREG) { mode = wider_subreg_mode (x); if (read_modify_subreg_p (x)) - subreg_p = true; + { + offset = SUBREG_BYTE (x); + subreg_p = true; + } + else + size = GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))); x = SUBREG_REG (x); code = GET_CODE (x); } @@ -1499,7 +1605,8 @@ add_regs_to_insn_regno_info (lra_insn_recog_data_t data, rtx x, expand_reg_info (); if (bitmap_set_bit (&lra_reg_info[regno].insn_bitmap, INSN_UID (insn))) { - data->regs = new_insn_reg (data->insn, regno, type, mode, subreg_p, + data->regs = new_insn_reg (data->insn, regno, type, size, offset, + mode, GET_MODE (x), subreg_p, early_clobber_alts, data->regs); return; } @@ -1508,12 +1615,14 @@ add_regs_to_insn_regno_info (lra_insn_recog_data_t data, rtx x, for (curr = data->regs; curr != NULL; curr = curr->next) if (curr->regno == regno) { - if (curr->subreg_p != subreg_p || curr->biggest_mode != mode) + if (!reg_same_range_p (curr, offset, size, subreg_p) + || curr->biggest_mode != mode) /* The info cannot be integrated into the found structure. */ - data->regs = new_insn_reg (data->insn, regno, type, mode, - subreg_p, early_clobber_alts, - data->regs); + data->regs + = new_insn_reg (data->insn, regno, type, size, offset, mode, + GET_MODE (x), subreg_p, early_clobber_alts, + data->regs); else { if (curr->type != type)