From patchwork Wed Nov 8 03:47:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 162875
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH 1/7] ira: Refactor the handling of register conflicts to make it more general
Date: Wed, 8 Nov 2023 11:47:34 +0800
Message-Id: <20231108034740.834590-2-lehua.ding@rivai.ai>
In-Reply-To: <20231108034740.834590-1-lehua.ding@rivai.ai>
References: <20231108034740.834590-1-lehua.ding@rivai.ai>

This patch does not make any functional changes.  It mainly refactors
two parts:

1. The ira_allocno's objects field is expanded to a scalable array, and
   multi-word pseudo registers are split and tracked only when necessary.

2. Since the objects array has been expanded, more subreg objects will
   pass through later, rather than the previous fixed two.  It is
   therefore necessary to change how a conflict between two objects is
   detected: the registers occupied by an object are mapped back to the
   first register of its allocno, and the conflict test is done there.

gcc/ChangeLog:

	* hard-reg-set.h (struct HARD_REG_SET): Add operator>>.
	* ira-build.cc (init_object_start_and_nregs): New func.
	(find_object): Ditto.
	(ira_create_allocno): Adjust.
	(ira_set_allocno_class): Set subreg info.
	(ira_create_allocno_objects): Adjust.
	(init_regs_with_subreg): Collect access in subreg.
	(ira_build): Call init_regs_with_subreg.
	(ira_destroy): Clear regs_with_subreg.
	* ira-color.cc (setup_profitable_hard_regs): Adjust.
	(get_conflict_and_start_profitable_regs): Adjust.
	(check_hard_reg_p): Adjust.
	(assign_hard_reg): Adjust.
	(improve_allocation): Adjust.
	* ira-int.h (struct ira_object): Adjust fields.
	(struct ira_allocno): Adjust objects field.
	(ALLOCNO_NUM_OBJECTS): Adjust.
	(ALLOCNO_UNIT_SIZE): New.
	(ALLOCNO_TRACK_SUBREG_P): New.
	(ALLOCNO_NREGS): New.
	(OBJECT_SIZE): New.
	(OBJECT_OFFSET): New.
	(OBJECT_START): New.
	(OBJECT_NREGS): New.
	(find_object): New.
	(has_subreg_object_p): New.
	(get_full_object): New.
	* ira.cc (check_allocation): Adjust.
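
The idea of mapping an object's conflicts back to the first register of
its allocno can be shown in isolation.  The following is a minimal
standalone sketch, not the code in this patch: RegSet, Object,
shift_right and start_conflicts are hypothetical names, RegSet is a
two-word stand-in for HARD_REG_SET, and only the !REG_WORDS_BIG_ENDIAN
layout is modelled.  Each object's conflict set is shifted right by the
object's register offset inside the allocno, so bit R of the result
means "the allocno cannot start at hard register R".

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for HARD_REG_SET: bit N of the set lives in
// word N / 64, at bit position N % 64.
constexpr std::size_t kWords = 2;
constexpr unsigned kBitsPerWord = 64;
using RegSet = std::array<std::uint64_t, kWords>;

// Logical right shift of the whole multi-word set by SHIFT bits.
// Bits shifted in at the top are zero; shifting by the full width or
// more yields the empty set.
RegSet shift_right (const RegSet &in, unsigned shift)
{
  RegSet out{}; // value-initialised to all zeros
  const unsigned word_shift = shift / kBitsPerWord;
  const unsigned bit_shift = shift % kBitsPerWord;
  for (std::size_t i = 0; i + word_shift < kWords; ++i)
    {
      std::uint64_t v = in[i + word_shift] >> bit_shift;
      // Pull in the low bits of the next higher word, if there is one.
      if (bit_shift != 0 && i + word_shift + 1 < kWords)
        v |= in[i + word_shift + 1] << (kBitsPerWord - bit_shift);
      out[i] = v;
    }
  return out;
}

// Assumed model of a tracked object: it covers registers
// [start, start + nregs) of its allocno and has its own conflict set.
struct Object
{
  int start;
  int nregs;
  RegSet conflicts;
};

// Fold every object's conflicts onto the allocno's first register.
// If register (start + j) of the allocno would land on a conflicting
// hard register H, then start register H - (start + j) is unusable.
RegSet start_conflicts (const std::vector<Object> &objects)
{
  RegSet result{};
  for (const Object &obj : objects)
    for (int j = 0; j < obj.nregs; ++j)
      {
        const RegSet shifted
          = shift_right (obj.conflicts,
                         static_cast<unsigned> (obj.start + j));
        for (std::size_t w = 0; w < kWords; ++w)
          result[w] |= shifted[w];
      }
  return result;
}
```

For example, with two one-register objects at offsets 0 and 1 that both
conflict with hard register 5, start_conflicts marks start registers 5
and 4 as unusable.  A single bit test on the folded set then answers
whether a candidate start register is free for the whole allocno.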
--- gcc/hard-reg-set.h | 33 +++++++ gcc/ira-build.cc | 106 +++++++++++++++++++- gcc/ira-color.cc | 234 ++++++++++++++++++++++++++++++--------------- gcc/ira-int.h | 45 ++++++++- gcc/ira.cc | 52 ++++------ 5 files changed, 349 insertions(+), 121 deletions(-) diff --git a/gcc/hard-reg-set.h b/gcc/hard-reg-set.h index b0bb9bce074..760eadba186 100644 --- a/gcc/hard-reg-set.h +++ b/gcc/hard-reg-set.h @@ -113,6 +113,39 @@ struct HARD_REG_SET return !operator== (other); } + HARD_REG_SET + operator>> (unsigned int shift_amount) const + { + if (shift_amount == 0) + return *this; + + HARD_REG_SET res; + unsigned int total_bits = sizeof (HARD_REG_ELT_TYPE) * 8; + if (shift_amount >= total_bits) + { + unsigned int n_elt = shift_amount % total_bits; + shift_amount -= n_elt * total_bits; + for (unsigned int i = 0; i < ARRAY_SIZE (elts) - n_elt - 1; i += 1) + res.elts[i] = elts[i + n_elt]; + /* clear upper n_elt elements. */ + for (unsigned int i = 0; i < n_elt; i += 1) + res.elts[ARRAY_SIZE (elts) - 1 - i] = 0; + } + + if (shift_amount > 0) + { + /* The left bits of an element be shifted. */ + HARD_REG_ELT_TYPE left = 0; + /* Total bits of an element. */ + for (int i = ARRAY_SIZE (elts); i >= 0; --i) + { + res.elts[i] = (elts[i] >> shift_amount) | left; + left = elts[i] << (total_bits - shift_amount); + } + } + return res; + } + HARD_REG_ELT_TYPE elts[HARD_REG_SET_LONGS]; }; typedef const HARD_REG_SET &const_hard_reg_set; diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc index 93e46033170..07aba27c1c9 100644 --- a/gcc/ira-build.cc +++ b/gcc/ira-build.cc @@ -440,6 +440,40 @@ initiate_allocnos (void) memset (ira_regno_allocno_map, 0, max_reg_num () * sizeof (ira_allocno_t)); } +/* Update OBJ's start and nregs field according A and OBJ info. 
*/ +static void +init_object_start_and_nregs (ira_allocno_t a, ira_object_t obj) +{ + enum reg_class aclass = ALLOCNO_CLASS (a); + gcc_assert (aclass != NO_REGS); + + machine_mode mode = ALLOCNO_MODE (a); + int nregs = ira_reg_class_max_nregs[aclass][mode]; + if (ALLOCNO_TRACK_SUBREG_P (a)) + { + poly_int64 end = OBJECT_OFFSET (obj) + OBJECT_SIZE (obj); + for (int i = 0; i < nregs; i += 1) + { + poly_int64 right = ALLOCNO_UNIT_SIZE (a) * (i + 1); + if (OBJECT_START (obj) < 0 && maybe_lt (OBJECT_OFFSET (obj), right)) + { + OBJECT_START (obj) = i; + } + if (OBJECT_NREGS (obj) < 0 && maybe_le (end, right)) + { + OBJECT_NREGS (obj) = i + 1 - OBJECT_START (obj); + break; + } + } + gcc_assert (OBJECT_START (obj) >= 0 && OBJECT_NREGS (obj) > 0); + } + else + { + OBJECT_START (obj) = 0; + OBJECT_NREGS (obj) = nregs; + } +} + /* Create and return an object corresponding to a new allocno A. */ static ira_object_t ira_create_object (ira_allocno_t a, int subword) @@ -460,15 +494,36 @@ ira_create_object (ira_allocno_t a, int subword) OBJECT_MIN (obj) = INT_MAX; OBJECT_MAX (obj) = -1; OBJECT_LIVE_RANGES (obj) = NULL; + OBJECT_SIZE (obj) = UNITS_PER_WORD; + OBJECT_OFFSET (obj) = subword * UNITS_PER_WORD; + OBJECT_START (obj) = -1; + OBJECT_NREGS (obj) = -1; ira_object_id_map_vec.safe_push (obj); ira_object_id_map = ira_object_id_map_vec.address (); ira_objects_num = ira_object_id_map_vec.length (); + if (aclass != NO_REGS) + init_object_start_and_nregs (a, obj); + + a->objects.push_back (obj); + return obj; } +/* Return the object in allocno A which match START & NREGS. */ +ira_object_t +find_object (ira_allocno_t a, int start, int nregs) +{ + for (ira_object_t obj : a->objects) + { + if (OBJECT_START (obj) == start && OBJECT_NREGS (obj) == nregs) + return obj; + } + return NULL; +} + /* Create and return the allocno corresponding to REGNO in LOOP_TREE_NODE. Add the allocno to the list of allocnos with the same regno if CAP_P is FALSE. 
*/ @@ -525,7 +580,8 @@ ira_create_allocno (int regno, bool cap_p, ALLOCNO_MEMORY_COST (a) = 0; ALLOCNO_UPDATED_MEMORY_COST (a) = 0; ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) = 0; - ALLOCNO_NUM_OBJECTS (a) = 0; + ALLOCNO_UNIT_SIZE (a) = 0; + ALLOCNO_TRACK_SUBREG_P (a) = false; ALLOCNO_ADD_DATA (a) = NULL; allocno_vec.safe_push (a); @@ -535,6 +591,9 @@ ira_create_allocno (int regno, bool cap_p, return a; } +/* Record the regs referenced by subreg. */ +static bitmap_head regs_with_subreg; + /* Set up register class for A and update its conflict hard registers. */ void @@ -549,6 +608,19 @@ ira_set_allocno_class (ira_allocno_t a, enum reg_class aclass) OBJECT_CONFLICT_HARD_REGS (obj) |= ~reg_class_contents[aclass]; OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= ~reg_class_contents[aclass]; } + + if (aclass == NO_REGS) + return; + /* SET the unit_size of one register. */ + machine_mode mode = ALLOCNO_MODE (a); + int nregs = ira_reg_class_max_nregs[aclass][mode]; + if (nregs == 2 && maybe_eq (GET_MODE_SIZE (mode), nregs * UNITS_PER_WORD) + && bitmap_bit_p (®s_with_subreg, ALLOCNO_REGNO (a))) + { + ALLOCNO_UNIT_SIZE (a) = UNITS_PER_WORD; + ALLOCNO_TRACK_SUBREG_P (a) = true; + return; + } } /* Determine the number of objects we should associate with allocno A @@ -561,12 +633,12 @@ ira_create_allocno_objects (ira_allocno_t a) int n = ira_reg_class_max_nregs[aclass][mode]; int i; - if (n != 2 || maybe_ne (GET_MODE_SIZE (mode), n * UNITS_PER_WORD)) + if (n != 2 || maybe_ne (GET_MODE_SIZE (mode), n * UNITS_PER_WORD) + || !bitmap_bit_p (®s_with_subreg, ALLOCNO_REGNO (a))) n = 1; - ALLOCNO_NUM_OBJECTS (a) = n; for (i = 0; i < n; i++) - ALLOCNO_OBJECT (a, i) = ira_create_object (a, i); + ira_create_object (a, i); } /* For each allocno, set ALLOCNO_NUM_OBJECTS and create the @@ -3460,6 +3532,30 @@ update_conflict_hard_reg_costs (void) } } +/* Traverse all instructions to determine which ones have access through subreg. 
+ */ +static void +init_regs_with_subreg () +{ + bitmap_initialize (®s_with_subreg, ®_obstack); + basic_block bb; + rtx_insn *insn; + df_ref def, use; + FOR_ALL_BB_FN (bb, cfun) + FOR_BB_INSNS (bb, insn) + { + if (!NONDEBUG_INSN_P (insn)) + continue; + df_insn_info *insn_info = DF_INSN_INFO_GET (insn); + FOR_EACH_INSN_INFO_DEF (def, insn_info) + if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_SUBREG)) + bitmap_set_bit (®s_with_subreg, DF_REF_REGNO (def)); + FOR_EACH_INSN_INFO_USE (use, insn_info) + if (DF_REF_FLAGS (use) & (DF_REF_PARTIAL | DF_REF_SUBREG)) + bitmap_set_bit (®s_with_subreg, DF_REF_REGNO (use)); + } +} + /* Create a internal representation (IR) for IRA (allocnos, copies, loop tree nodes). The function returns TRUE if we generate loop structure (besides nodes representing all function and the basic @@ -3475,6 +3571,7 @@ ira_build (void) initiate_allocnos (); initiate_prefs (); initiate_copies (); + init_regs_with_subreg (); create_loop_tree_nodes (); form_loop_tree (); create_allocnos (); @@ -3565,4 +3662,5 @@ ira_destroy (void) finish_allocnos (); finish_cost_vectors (); ira_finish_allocno_live_ranges (); + bitmap_clear (®s_with_subreg); } diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc index f2e8ea34152..6af8318e5f5 100644 --- a/gcc/ira-color.cc +++ b/gcc/ira-color.cc @@ -1031,7 +1031,7 @@ static void setup_profitable_hard_regs (void) { unsigned int i; - int j, k, nobj, hard_regno, nregs, class_size; + int j, k, nobj, hard_regno, class_size; ira_allocno_t a; bitmap_iterator bi; enum reg_class aclass; @@ -1076,7 +1076,6 @@ setup_profitable_hard_regs (void) || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0) continue; mode = ALLOCNO_MODE (a); - nregs = hard_regno_nregs (hard_regno, mode); nobj = ALLOCNO_NUM_OBJECTS (a); for (k = 0; k < nobj; k++) { @@ -1088,24 +1087,39 @@ setup_profitable_hard_regs (void) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); - /* We can process the conflict allocno repeatedly with - the same result. 
*/ - if (nregs == nobj && nregs > 1) + if (!has_subreg_object_p (a)) { - int num = OBJECT_SUBWORD (conflict_obj); - - if (REG_WORDS_BIG_ENDIAN) - CLEAR_HARD_REG_BIT - (ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, - hard_regno + nobj - num - 1); - else - CLEAR_HARD_REG_BIT - (ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, - hard_regno + num); + ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs + &= ~ira_reg_mode_hard_regset[hard_regno][mode]; + continue; + } + + /* Clear all hard regs occupied by obj. */ + if (REG_WORDS_BIG_ENDIAN) + { + int start_regno + = hard_regno + ALLOCNO_NREGS (a) - 1 - OBJECT_START (obj); + for (int i = 0; i < OBJECT_NREGS (obj); i += 1) + { + int regno = start_regno - i; + if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER) + CLEAR_HARD_REG_BIT ( + ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, + regno); + } } else - ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs - &= ~ira_reg_mode_hard_regset[hard_regno][mode]; + { + int start_regno = hard_regno + OBJECT_START (obj); + for (int i = 0; i < OBJECT_NREGS (obj); i += 1) + { + int regno = start_regno + i; + if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER) + CLEAR_HARD_REG_BIT ( + ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs, + regno); + } + } } } } @@ -1677,18 +1691,25 @@ update_conflict_hard_regno_costs (int *costs, enum reg_class aclass, aligned. 
*/ static inline void get_conflict_and_start_profitable_regs (ira_allocno_t a, bool retry_p, - HARD_REG_SET *conflict_regs, + HARD_REG_SET *start_conflict_regs, HARD_REG_SET *start_profitable_regs) { int i, nwords; ira_object_t obj; nwords = ALLOCNO_NUM_OBJECTS (a); - for (i = 0; i < nwords; i++) - { - obj = ALLOCNO_OBJECT (a, i); - conflict_regs[i] = OBJECT_TOTAL_CONFLICT_HARD_REGS (obj); - } + CLEAR_HARD_REG_SET (*start_conflict_regs); + if (has_subreg_object_p (a)) + for (i = 0; i < nwords; i++) + { + obj = ALLOCNO_OBJECT (a, i); + for (int j = 0; j < OBJECT_NREGS (obj); j += 1) + *start_conflict_regs |= OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) + >> (OBJECT_START (obj) + j); + } + else + *start_conflict_regs + = OBJECT_TOTAL_CONFLICT_HARD_REGS (get_full_object (a)); if (retry_p) *start_profitable_regs = (reg_class_contents[ALLOCNO_CLASS (a)] @@ -1702,9 +1723,9 @@ get_conflict_and_start_profitable_regs (ira_allocno_t a, bool retry_p, PROFITABLE_REGS and whose objects have CONFLICT_REGS. */ static inline bool check_hard_reg_p (ira_allocno_t a, int hard_regno, - HARD_REG_SET *conflict_regs, HARD_REG_SET profitable_regs) + HARD_REG_SET start_conflict_regs, + HARD_REG_SET profitable_regs) { - int j, nwords, nregs; enum reg_class aclass; machine_mode mode; @@ -1716,28 +1737,17 @@ check_hard_reg_p (ira_allocno_t a, int hard_regno, /* Checking only profitable hard regs. */ if (! 
TEST_HARD_REG_BIT (profitable_regs, hard_regno)) return false; - nregs = hard_regno_nregs (hard_regno, mode); - nwords = ALLOCNO_NUM_OBJECTS (a); - for (j = 0; j < nregs; j++) + + if (has_subreg_object_p (a)) + return !TEST_HARD_REG_BIT (start_conflict_regs, hard_regno); + else { - int k; - int set_to_test_start = 0, set_to_test_end = nwords; - - if (nregs == nwords) - { - if (REG_WORDS_BIG_ENDIAN) - set_to_test_start = nwords - j - 1; - else - set_to_test_start = j; - set_to_test_end = set_to_test_start + 1; - } - for (k = set_to_test_start; k < set_to_test_end; k++) - if (TEST_HARD_REG_BIT (conflict_regs[k], hard_regno + j)) - break; - if (k != set_to_test_end) - break; + int nregs = hard_regno_nregs (hard_regno, mode); + for (int i = 0; i < nregs; i += 1) + if (TEST_HARD_REG_BIT (start_conflict_regs, hard_regno + i)) + return false; + return true; } - return j == nregs; } /* Return number of registers needed to be saved and restored at @@ -1945,7 +1955,7 @@ spill_soft_conflicts (ira_allocno_t a, bitmap allocnos_to_spill, static bool assign_hard_reg (ira_allocno_t a, bool retry_p) { - HARD_REG_SET conflicting_regs[2], profitable_hard_regs; + HARD_REG_SET start_conflicting_regs, profitable_hard_regs; int i, j, hard_regno, best_hard_regno, class_size; int cost, mem_cost, min_cost, full_cost, min_full_cost, nwords, word; int *a_costs; @@ -1962,8 +1972,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) HARD_REG_SET soft_conflict_regs = {}; ira_assert (! 
ALLOCNO_ASSIGNED_P (a)); - get_conflict_and_start_profitable_regs (a, retry_p, - conflicting_regs, + get_conflict_and_start_profitable_regs (a, retry_p, &start_conflicting_regs, &profitable_hard_regs); aclass = ALLOCNO_CLASS (a); class_size = ira_class_hard_regs_num[aclass]; @@ -2041,7 +2050,6 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) (hard_regno, ALLOCNO_MODE (conflict_a), reg_class_contents[aclass]))) { - int n_objects = ALLOCNO_NUM_OBJECTS (conflict_a); int conflict_nregs; mode = ALLOCNO_MODE (conflict_a); @@ -2076,24 +2084,95 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) note_conflict (r); } } + else if (has_subreg_object_p (a)) + { + /* Set start_conflicting_regs if that cause obj and + conflict_obj overlap. the overlap position: + +--------------+ + | conflict_obj | + +--------------+ + + +-----------+ +-----------+ + | obj | ... | obj | + +-----------+ +-----------+ + + Point: A B C + + the hard regs from A to C point will cause overlap. + For REG_WORDS_BIG_ENDIAN: + A = hard_regno + ALLOCNO_NREGS (conflict_a) - 1 + - OBJECT_START (conflict_obj) + - OBJECT_NREGS (obj) + 1 + C = A + OBJECT_NREGS (obj) + + OBJECT_NREGS (conflict_obj) - 2 + For !REG_WORDS_BIG_ENDIAN: + A = hard_regno + OBJECT_START (conflict_obj) + - OBJECT_NREGS (obj) + 1 + C = A + OBJECT_NREGS (obj) + + OBJECT_NREGS (conflict_obj) - 2 + */ + int start_regno; + int conflict_allocno_nregs, conflict_object_nregs, + conflict_object_start; + if (has_subreg_object_p (conflict_a)) + { + conflict_allocno_nregs = ALLOCNO_NREGS (conflict_a); + conflict_object_nregs = OBJECT_NREGS (conflict_obj); + conflict_object_start = OBJECT_START (conflict_obj); + } + else + { + conflict_allocno_nregs = conflict_object_nregs + = hard_regno_nregs (hard_regno, mode); + conflict_object_start = 0; + } + if (REG_WORDS_BIG_ENDIAN) + { + int A = hard_regno + conflict_allocno_nregs - 1 + - conflict_object_start - OBJECT_NREGS (obj) + + 1; + start_regno = A + OBJECT_NREGS (obj) - 1 + + OBJECT_START (obj) - 
ALLOCNO_NREGS (a) + + 1; + } + else + { + int A = hard_regno + conflict_object_start + - OBJECT_NREGS (obj) + 1; + start_regno = A - OBJECT_START (obj); + } + + for (int i = 0; + i <= OBJECT_NREGS (obj) + conflict_object_nregs - 2; + i += 1) + { + int regno = start_regno + i; + if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER) + SET_HARD_REG_BIT (start_conflicting_regs, regno); + } + if (hard_reg_set_subset_p (profitable_hard_regs, + start_conflicting_regs)) + goto fail; + } else { - if (conflict_nregs == n_objects && conflict_nregs > 1) + if (has_subreg_object_p (conflict_a)) { - int num = OBJECT_SUBWORD (conflict_obj); - - if (REG_WORDS_BIG_ENDIAN) - SET_HARD_REG_BIT (conflicting_regs[word], - hard_regno + n_objects - num - 1); - else - SET_HARD_REG_BIT (conflicting_regs[word], - hard_regno + num); + int start_hard_regno + = REG_WORDS_BIG_ENDIAN + ? hard_regno + ALLOCNO_NREGS (conflict_a) + - OBJECT_START (conflict_obj) + : hard_regno + OBJECT_START (conflict_obj); + for (int i = 0; i < OBJECT_NREGS (conflict_obj); + i += 1) + SET_HARD_REG_BIT (start_conflicting_regs, + start_hard_regno + i); } else - conflicting_regs[word] + start_conflicting_regs |= ira_reg_mode_hard_regset[hard_regno][mode]; if (hard_reg_set_subset_p (profitable_hard_regs, - conflicting_regs[word])) + start_conflicting_regs)) goto fail; } } @@ -2160,8 +2239,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) && FIRST_STACK_REG <= hard_regno && hard_regno <= LAST_STACK_REG) continue; #endif - if (! 
check_hard_reg_p (a, hard_regno, - conflicting_regs, profitable_hard_regs)) + if (!check_hard_reg_p (a, hard_regno, start_conflicting_regs, + profitable_hard_regs)) continue; cost = costs[i]; full_cost = full_costs[i]; @@ -3154,7 +3233,7 @@ improve_allocation (void) machine_mode mode; int *allocno_costs; int costs[FIRST_PSEUDO_REGISTER]; - HARD_REG_SET conflicting_regs[2], profitable_hard_regs; + HARD_REG_SET start_conflicting_regs, profitable_hard_regs; ira_allocno_t a; bitmap_iterator bi; int saved_nregs; @@ -3193,7 +3272,7 @@ improve_allocation (void) - allocno_copy_cost_saving (a, hregno)); try_p = false; get_conflict_and_start_profitable_regs (a, false, - conflicting_regs, + &start_conflicting_regs, &profitable_hard_regs); class_size = ira_class_hard_regs_num[aclass]; mode = ALLOCNO_MODE (a); @@ -3202,8 +3281,8 @@ improve_allocation (void) for (j = 0; j < class_size; j++) { hregno = ira_class_hard_regs[aclass][j]; - if (! check_hard_reg_p (a, hregno, - conflicting_regs, profitable_hard_regs)) + if (!check_hard_reg_p (a, hregno, start_conflicting_regs, + profitable_hard_regs)) continue; ira_assert (ira_class_hard_reg_index[aclass][hregno] == j); k = allocno_costs == NULL ? 
0 : j; @@ -3287,16 +3366,15 @@ improve_allocation (void) } conflict_nregs = hard_regno_nregs (conflict_hregno, ALLOCNO_MODE (conflict_a)); - auto note_conflict = [&](int r) - { - if (check_hard_reg_p (a, r, - conflicting_regs, profitable_hard_regs)) - { - if (spill_a) - SET_HARD_REG_BIT (soft_conflict_regs, r); - costs[r] += spill_cost; - } - }; + auto note_conflict = [&] (int r) { + if (check_hard_reg_p (a, r, start_conflicting_regs, + profitable_hard_regs)) + { + if (spill_a) + SET_HARD_REG_BIT (soft_conflict_regs, r); + costs[r] += spill_cost; + } + }; for (r = conflict_hregno; r >= 0 && (int) end_hard_regno (mode, r) > conflict_hregno; r--) @@ -3314,8 +3392,8 @@ improve_allocation (void) for (j = 0; j < class_size; j++) { hregno = ira_class_hard_regs[aclass][j]; - if (check_hard_reg_p (a, hregno, - conflicting_regs, profitable_hard_regs) + if (check_hard_reg_p (a, hregno, start_conflicting_regs, + profitable_hard_regs) && min_cost > costs[hregno]) { best = hregno; diff --git a/gcc/ira-int.h b/gcc/ira-int.h index 0685e1f4e8d..b6281d3df6d 100644 --- a/gcc/ira-int.h +++ b/gcc/ira-int.h @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3. If not see #include "recog.h" #include "function-abi.h" +#include /* To provide consistency in naming, all IRA external variables, functions, common typedefs start with prefix ira_. */ @@ -240,6 +241,13 @@ struct ira_object Zero means the lowest-order subword (or the entire allocno in case it is not being tracked in subwords). */ int subword; + /* Reprensent OBJECT occupied [start, start + nregs) registers of it's + ALLOCNO. */ + int start, nregs; + /* Reprensent the size and offset of current object, use to track subreg + range, For full reg, the size is GET_MODE_SIZE (ALLOCNO_MODE (allocno)), + offset is 0. */ + poly_int64 size, offset; /* Allocated size of the conflicts array. 
*/ unsigned int conflicts_array_size; /* A unique number for every instance of this structure, which is used @@ -295,6 +303,11 @@ struct ira_allocno reload (at this point pseudo-register has only one allocno) which did not get stack slot yet. */ signed int hard_regno : 16; + /* Unit size of one register that allocate for the allocno. Only use to + compute the start and nregs of subreg which be tracked. */ + poly_int64 unit_size; + /* Flag means need track subreg live range for the allocno. */ + bool track_subreg_p; /* A bitmask of the ABIs used by calls that occur while the allocno is live. */ unsigned int crossed_calls_abis : NUM_ABI_IDS; @@ -353,8 +366,6 @@ struct ira_allocno register class living at the point than number of hard-registers of the class available for the allocation. */ int excess_pressure_points_num; - /* The number of objects tracked in the following array. */ - int num_objects; /* Accumulated frequency of calls which given allocno intersects. */ int call_freq; @@ -387,8 +398,8 @@ struct ira_allocno /* An array of structures describing conflict information and live ranges for each object associated with the allocno. There may be more than one such object in cases where the allocno represents a - multi-word register. */ - ira_object_t objects[2]; + multi-hardreg pesudo. */ + std::vector objects; /* Registers clobbered by intersected calls. 
*/ HARD_REG_SET crossed_calls_clobbered_regs; /* Array of usage costs (accumulated and the one updated during @@ -468,8 +479,12 @@ struct ira_allocno #define ALLOCNO_EXCESS_PRESSURE_POINTS_NUM(A) \ ((A)->excess_pressure_points_num) #define ALLOCNO_OBJECT(A,N) ((A)->objects[N]) -#define ALLOCNO_NUM_OBJECTS(A) ((A)->num_objects) +#define ALLOCNO_NUM_OBJECTS(A) ((int) (A)->objects.size ()) #define ALLOCNO_ADD_DATA(A) ((A)->add_data) +#define ALLOCNO_UNIT_SIZE(A) ((A)->unit_size) +#define ALLOCNO_TRACK_SUBREG_P(A) ((A)->track_subreg_p) +#define ALLOCNO_NREGS(A) \ + (ira_reg_class_max_nregs[ALLOCNO_CLASS (A)][ALLOCNO_MODE (A)]) /* Typedef for pointer to the subsequent structure. */ typedef struct ira_emit_data *ira_emit_data_t; @@ -511,6 +526,8 @@ allocno_emit_reg (ira_allocno_t a) } #define OBJECT_ALLOCNO(O) ((O)->allocno) +#define OBJECT_SIZE(O) ((O)->size) +#define OBJECT_OFFSET(O) ((O)->offset) #define OBJECT_SUBWORD(O) ((O)->subword) #define OBJECT_CONFLICT_ARRAY(O) ((O)->conflicts_array) #define OBJECT_CONFLICT_VEC(O) ((ira_object_t *)(O)->conflicts_array) @@ -524,6 +541,8 @@ allocno_emit_reg (ira_allocno_t a) #define OBJECT_MAX(O) ((O)->max) #define OBJECT_CONFLICT_ID(O) ((O)->id) #define OBJECT_LIVE_RANGES(O) ((O)->live_ranges) +#define OBJECT_START(O) ((O)->start) +#define OBJECT_NREGS(O) ((O)->nregs) /* Map regno -> allocnos with given regno (see comments for allocno member `next_regno_allocno'). */ @@ -1041,6 +1060,8 @@ extern void ira_free_cost_vector (int *, reg_class_t); extern void ira_flattening (int, int); extern bool ira_build (void); extern void ira_destroy (void); +extern ira_object_t +find_object (ira_allocno_t, int, int); /* ira-costs.cc */ extern void ira_init_costs_once (void); @@ -1708,4 +1729,18 @@ ira_caller_save_loop_spill_p (ira_allocno_t a, ira_allocno_t subloop_a, return call_cost && call_cost >= spill_cost; } +/* Return true if allocno A has subreg object. 
*/ +inline bool +has_subreg_object_p (ira_allocno_t a) +{ + return ALLOCNO_NUM_OBJECTS (a) > 1; +} + +/* Return the full object of allocno A. */ +inline ira_object_t +get_full_object (ira_allocno_t a) +{ + return find_object (a, 0, ALLOCNO_NREGS (a)); +} + #endif /* GCC_IRA_INT_H */ diff --git a/gcc/ira.cc b/gcc/ira.cc index d7530f01380..2fa6e0e5c94 100644 --- a/gcc/ira.cc +++ b/gcc/ira.cc @@ -2623,7 +2623,7 @@ static void check_allocation (void) { ira_allocno_t a; - int hard_regno, nregs, conflict_nregs; + int hard_regno; ira_allocno_iterator ai; FOR_EACH_ALLOCNO (a, ai) @@ -2634,28 +2634,18 @@ check_allocation (void) if (ALLOCNO_CAP_MEMBER (a) != NULL || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0) continue; - nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a)); - if (nregs == 1) - /* We allocated a single hard register. */ - n = 1; - else if (n > 1) - /* We allocated multiple hard registers, and we will test - conflicts in a granularity of single hard regs. */ - nregs = 1; for (i = 0; i < n; i++) { ira_object_t obj = ALLOCNO_OBJECT (a, i); ira_object_t conflict_obj; ira_object_conflict_iterator oci; - int this_regno = hard_regno; - if (n > 1) - { - if (REG_WORDS_BIG_ENDIAN) - this_regno += n - i - 1; - else - this_regno += i; - } + int this_regno; + if (REG_WORDS_BIG_ENDIAN) + this_regno = hard_regno + ALLOCNO_NREGS (a) - 1 - OBJECT_START (obj) + - OBJECT_NREGS (obj) + 1; + else + this_regno = hard_regno + OBJECT_START (obj); FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); @@ -2665,24 +2655,18 @@ check_allocation (void) if (ira_soft_conflict (a, conflict_a)) continue; - conflict_nregs = hard_regno_nregs (conflict_hard_regno, - ALLOCNO_MODE (conflict_a)); - - if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1 - && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a)) - { - if (REG_WORDS_BIG_ENDIAN) - conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a) - - OBJECT_SUBWORD (conflict_obj) - 1); - else - 
conflict_hard_regno += OBJECT_SUBWORD (conflict_obj);
-                      conflict_nregs = 1;
-                    }
+                  if (REG_WORDS_BIG_ENDIAN)
+                    conflict_hard_regno = conflict_hard_regno
+                                          + ALLOCNO_NREGS (conflict_a) - 1
+                                          - OBJECT_START (conflict_obj)
+                                          - OBJECT_NREGS (conflict_obj) + 1;
+                  else
+                    conflict_hard_regno
+                      = conflict_hard_regno + OBJECT_START (conflict_obj);

-                  if ((conflict_hard_regno <= this_regno
-                       && this_regno < conflict_hard_regno + conflict_nregs)
-                      || (this_regno <= conflict_hard_regno
-                          && conflict_hard_regno < this_regno + nregs))
+                  if (!(this_regno + OBJECT_NREGS (obj) <= conflict_hard_regno
+                        || conflict_hard_regno + OBJECT_NREGS (conflict_obj)
+                             <= this_regno))
                     {
                       fprintf (stderr, "bad allocation for %d and %d\n",
                                ALLOCNO_REGNO (a), ALLOCNO_REGNO (conflict_a));

From patchwork Wed Nov 8 03:47:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 162877
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH 3/7] ira: Support subreg live range track
Date: Wed, 8 Nov 2023 11:47:36 +0800
Message-Id: <20231108034740.834590-4-lehua.ding@rivai.ai>
In-Reply-To: <20231108034740.834590-1-lehua.ding@rivai.ai>
References: <20231108034740.834590-1-lehua.ding@rivai.ai>

This patch extends live range tracking in IRA to cover subregs, so that
the live range and conflicts of part of a pseudo register can be tracked
at a finer granularity.  Allocnos are now divided into two categories:
those with a single object, and those that contain subreg objects.

gcc/ChangeLog:

	* ira-build.cc (init_object_start_and_nregs): Removed.
	(ira_create_object): Adjust.
	(find_object): New.
	(find_object_anyway): New.
	(ira_create_allocno): Removed regs_with_subreg.
	(ira_set_allocno_class): Adjust.
	(get_range): New.
	(ira_copy_allocno_objects): New.
	(merge_hard_reg_conflicts): Adjust.
	(create_cap_allocno): Adjust.
	(find_subreg_p): New.
	(add_subregs): New.
	(create_insn_allocnos): Adjust.
	(create_bb_allocnos): Adjust.
	(move_allocno_live_ranges): Adjust.
	(copy_allocno_live_ranges): Adjust.
	(setup_min_max_allocno_live_range_point): Adjust.
	(init_regs_with_subreg): Removed.
	(ira_build): Adjust.
	(ira_destroy): Adjust.
	* ira-color.cc (INCLUDE_MAP): Use std::map.
	(setup_left_conflict_sizes_p): Adjust.
	(push_allocno_to_stack): Adjust.
	* ira-conflicts.cc (record_object_conflict): Adjust.
	(build_object_conflicts): Adjust.
	(build_conflicts): Adjust.
	(print_allocno_conflicts): Adjust.
	* ira-emit.cc (modify_move_list): Adjust.
	* ira-int.h (struct ira_object): Adjust.
	(struct ira_allocno): Adjust.
	(OBJECT_SIZE): Removed.
	(OBJECT_OFFSET): Removed.
	(OBJECT_SUBWORD): Removed.
	(find_object): New.
	(find_object_anyway): New.
	(ira_copy_allocno_objects): New.
	* ira-lives.cc (INCLUDE_VECTOR): Use std::vector.
	(set_subreg_conflict_hard_regs): New.
	(make_hard_regno_dead): Adjust.
	(make_object_live): Adjust.
	(update_allocno_pressure_excess_length): Adjust.
	(make_object_dead): Adjust.
	(mark_pseudo_regno_live): New.
	(add_subreg_point): New.
	(mark_pseudo_object_live): New.
	(mark_pseudo_regno_subword_live): Removed.
	(mark_pseudo_regno_subreg_live): New.
	(mark_pseudo_regno_subregs_live): New.
	(mark_pseudo_reg_live): New.
	(mark_pseudo_regno_dead): Removed.
	(mark_pseudo_object_dead): New.
	(mark_pseudo_regno_subword_dead): Removed.
	(mark_pseudo_regno_subreg_dead): New.
	(mark_pseudo_reg_dead): Adjust.
	(process_single_reg_class_operands): Adjust.
	(process_out_of_region_eh_regs): Adjust.
	(process_bb_node_lives): Adjust.
	(class subreg_live_item): New.
	(create_subregs_live_ranges): New.
	(ira_create_allocno_live_ranges): Adjust.
	* subreg-live-range.h: New fields.
---
 gcc/ira-build.cc        | 275 +++++++++++++--------
 gcc/ira-color.cc        |  68 ++++--
 gcc/ira-conflicts.cc    |  48 ++--
 gcc/ira-emit.cc         |   2 +-
 gcc/ira-int.h           |  21 +-
 gcc/ira-lives.cc        | 522 +++++++++++++++++++++++++++++-----------
 gcc/subreg-live-range.h |  16 ++
 7 files changed, 653 insertions(+), 299 deletions(-)

diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
index 7df98164503..5fb7a9f800f 100644
--- a/gcc/ira-build.cc
+++ b/gcc/ira-build.cc
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.
If not see #include "insn-config.h" #include "regs.h" #include "memmodel.h" +#include "tm_p.h" #include "ira.h" #include "ira-int.h" #include "sparseset.h" #include "cfgloop.h" +#include "subreg-live-range.h" static ira_copy_t find_allocno_copy (ira_allocno_t, ira_allocno_t, rtx_insn *, ira_loop_tree_node_t); @@ -440,49 +442,14 @@ initiate_allocnos (void) memset (ira_regno_allocno_map, 0, max_reg_num () * sizeof (ira_allocno_t)); } -/* Update OBJ's start and nregs field according A and OBJ info. */ -static void -init_object_start_and_nregs (ira_allocno_t a, ira_object_t obj) -{ - enum reg_class aclass = ALLOCNO_CLASS (a); - gcc_assert (aclass != NO_REGS); - - machine_mode mode = ALLOCNO_MODE (a); - int nregs = ira_reg_class_max_nregs[aclass][mode]; - if (ALLOCNO_TRACK_SUBREG_P (a)) - { - poly_int64 end = OBJECT_OFFSET (obj) + OBJECT_SIZE (obj); - for (int i = 0; i < nregs; i += 1) - { - poly_int64 right = ALLOCNO_UNIT_SIZE (a) * (i + 1); - if (OBJECT_START (obj) < 0 && maybe_lt (OBJECT_OFFSET (obj), right)) - { - OBJECT_START (obj) = i; - } - if (OBJECT_NREGS (obj) < 0 && maybe_le (end, right)) - { - OBJECT_NREGS (obj) = i + 1 - OBJECT_START (obj); - break; - } - } - gcc_assert (OBJECT_START (obj) >= 0 && OBJECT_NREGS (obj) > 0); - } - else - { - OBJECT_START (obj) = 0; - OBJECT_NREGS (obj) = nregs; - } -} - /* Create and return an object corresponding to a new allocno A. 
*/ static ira_object_t -ira_create_object (ira_allocno_t a, int subword) +ira_create_object (ira_allocno_t a, int start, int nregs) { enum reg_class aclass = ALLOCNO_CLASS (a); ira_object_t obj = object_pool.allocate (); OBJECT_ALLOCNO (obj) = a; - OBJECT_SUBWORD (obj) = subword; OBJECT_CONFLICT_ID (obj) = ira_objects_num; OBJECT_CONFLICT_VEC_P (obj) = false; OBJECT_CONFLICT_ARRAY (obj) = NULL; @@ -494,19 +461,14 @@ ira_create_object (ira_allocno_t a, int subword) OBJECT_MIN (obj) = INT_MAX; OBJECT_MAX (obj) = -1; OBJECT_LIVE_RANGES (obj) = NULL; - OBJECT_SIZE (obj) = UNITS_PER_WORD; - OBJECT_OFFSET (obj) = subword * UNITS_PER_WORD; - OBJECT_START (obj) = -1; - OBJECT_NREGS (obj) = -1; + OBJECT_START (obj) = start; + OBJECT_NREGS (obj) = nregs; ira_object_id_map_vec.safe_push (obj); ira_object_id_map = ira_object_id_map_vec.address (); ira_objects_num = ira_object_id_map_vec.length (); - if (aclass != NO_REGS) - init_object_start_and_nregs (a, obj); - a->objects.push_back (obj); return obj; @@ -524,6 +486,52 @@ find_object (ira_allocno_t a, int start, int nregs) return NULL; } +ira_object_t +find_object (ira_allocno_t a, poly_int64 offset, poly_int64 size) +{ + enum reg_class aclass = ALLOCNO_CLASS (a); + machine_mode mode = ALLOCNO_MODE (a); + int nregs = ira_reg_class_max_nregs[aclass][mode]; + + if (!has_subreg_object_p (a) + || maybe_eq (GET_MODE_SIZE (ALLOCNO_MODE (a)), size)) + return find_object (a, 0, nregs); + + gcc_assert (maybe_lt (size, GET_MODE_SIZE (ALLOCNO_MODE (a))) + && maybe_le (offset + size, GET_MODE_SIZE (ALLOCNO_MODE (a)))); + + int subreg_start = -1; + int subreg_nregs = -1; + for (int i = 0; i < nregs; i += 1) + { + poly_int64 right = ALLOCNO_UNIT_SIZE (a) * (i + 1); + if (subreg_start < 0 && maybe_lt (offset, right)) + { + subreg_start = i; + } + if (subreg_nregs < 0 && maybe_le (offset + size, right)) + { + subreg_nregs = i + 1 - subreg_start; + break; + } + } + gcc_assert (subreg_start >= 0 && subreg_nregs > 0); + return find_object (a, 
subreg_start, subreg_nregs);
+}
+
+/* Return the object in allocno A which match START & NREGS.  Create when not
+   found.  */
+ira_object_t
+find_object_anyway (ira_allocno_t a, int start, int nregs)
+{
+  ira_object_t obj = find_object (a, start, nregs);
+  if (obj == NULL && ALLOCNO_TRACK_SUBREG_P (a))
+    obj = ira_create_object (a, start, nregs);
+
+  gcc_assert (obj != NULL);
+  return obj;
+}
+
 /* Create and return the allocno corresponding to REGNO in
    LOOP_TREE_NODE.  Add the allocno to the list of allocnos with the
    same regno if CAP_P is FALSE.  */
@@ -591,9 +599,6 @@ ira_create_allocno (int regno, bool cap_p,
   return a;
 }

-/* Record the regs referenced by subreg.  */
-static bitmap_head regs_with_subreg;
-
 /* Set up register class for A and update its conflict hard
    registers.  */
 void
@@ -614,8 +619,7 @@ ira_set_allocno_class (ira_allocno_t a, enum reg_class aclass)
   /* SET the unit_size of one register.  */
   machine_mode mode = ALLOCNO_MODE (a);
   int nregs = ira_reg_class_max_nregs[aclass][mode];
-  if (nregs == 2 && maybe_eq (GET_MODE_SIZE (mode), nregs * UNITS_PER_WORD)
-      && bitmap_bit_p (&regs_with_subreg, ALLOCNO_REGNO (a)))
+  if (nregs == 2 && maybe_eq (GET_MODE_SIZE (mode), nregs * UNITS_PER_WORD))
     {
       ALLOCNO_UNIT_SIZE (a) = UNITS_PER_WORD;
       ALLOCNO_TRACK_SUBREG_P (a) = true;
@@ -623,6 +627,39 @@ ira_set_allocno_class (ira_allocno_t a, enum reg_class aclass)
     }
 }

+/* Return the subreg range of rtx SUBREG.
*/
+static subreg_range
+get_range (rtx subreg)
+{
+  gcc_assert (read_modify_subreg_p (subreg));
+  rtx reg = SUBREG_REG (subreg);
+  machine_mode reg_mode = GET_MODE (reg);
+
+  machine_mode subreg_mode = GET_MODE (subreg);
+  int nblocks = get_nblocks (reg_mode);
+  poly_int64 unit_size = REGMODE_NATURAL_SIZE (reg_mode);
+
+  poly_int64 offset = SUBREG_BYTE (subreg);
+  poly_int64 left = offset + GET_MODE_SIZE (subreg_mode);
+
+  int subreg_start = -1;
+  int subreg_nblocks = -1;
+  for (int i = 0; i < nblocks; i += 1)
+    {
+      poly_int64 right = unit_size * (i + 1);
+      if (subreg_start < 0 && maybe_lt (offset, right))
+        subreg_start = i;
+      if (subreg_nblocks < 0 && maybe_le (left, right))
+        {
+          subreg_nblocks = i + 1 - subreg_start;
+          break;
+        }
+    }
+  gcc_assert (subreg_start >= 0 && subreg_nblocks > 0);
+
+  return subreg_range (subreg_start, subreg_start + subreg_nblocks);
+}
+
 /* Determine the number of objects we should associate with allocno A
    and allocate them.  */
 void
@@ -630,15 +667,37 @@ ira_create_allocno_objects (ira_allocno_t a)
 {
   machine_mode mode = ALLOCNO_MODE (a);
   enum reg_class aclass = ALLOCNO_CLASS (a);
-  int n = ira_reg_class_max_nregs[aclass][mode];
-  int i;
+  int nregs = ira_reg_class_max_nregs[aclass][mode];

-  if (n != 2 || maybe_ne (GET_MODE_SIZE (mode), n * UNITS_PER_WORD)
-      || !bitmap_bit_p (&regs_with_subreg, ALLOCNO_REGNO (a)))
-    n = 1;
+  ira_create_object (a, 0, nregs);

-  for (i = 0; i < n; i++)
-    ira_create_object (a, i);
+  if (aclass == NO_REGS || !ALLOCNO_TRACK_SUBREG_P (a) || a->subregs.empty ())
+    return;
+
+  int nblocks = get_nblocks (ALLOCNO_MODE (a));
+  int times = nblocks / ALLOCNO_NREGS (a);
+  gcc_assert (times >= 1 && nblocks % ALLOCNO_NREGS (a) == 0);
+
+  for (const auto &range : a->subregs)
+    {
+      int start = range.start / times;
+      int end = CEIL (range.end, times);
+      if (find_object (a, start, end - start) != NULL)
+        continue;
+      ira_create_object (a, start, end - start);
+    }
+
+  a->subregs.clear ();
+}
+
+/* Copy the objects from FROM to
TO. */ +void +ira_copy_allocno_objects (ira_allocno_t to, ira_allocno_t from) +{ + ira_allocno_object_iterator oi; + ira_object_t obj; + FOR_EACH_ALLOCNO_OBJECT (from, obj, oi) + ira_create_object (to, OBJECT_START (obj), OBJECT_NREGS (obj)); } /* For each allocno, set ALLOCNO_NUM_OBJECTS and create the @@ -662,11 +721,11 @@ merge_hard_reg_conflicts (ira_allocno_t from, ira_allocno_t to, bool total_only) { int i; - gcc_assert (ALLOCNO_NUM_OBJECTS (to) == ALLOCNO_NUM_OBJECTS (from)); - for (i = 0; i < ALLOCNO_NUM_OBJECTS (to); i++) + for (i = 0; i < ALLOCNO_NUM_OBJECTS (from); i++) { ira_object_t from_obj = ALLOCNO_OBJECT (from, i); - ira_object_t to_obj = ALLOCNO_OBJECT (to, i); + ira_object_t to_obj = find_object_anyway (to, OBJECT_START (from_obj), + OBJECT_NREGS (from_obj)); if (!total_only) OBJECT_CONFLICT_HARD_REGS (to_obj) @@ -960,7 +1019,7 @@ create_cap_allocno (ira_allocno_t a) ALLOCNO_WMODE (cap) = ALLOCNO_WMODE (a); aclass = ALLOCNO_CLASS (a); ira_set_allocno_class (cap, aclass); - ira_create_allocno_objects (cap); + ira_copy_allocno_objects (cap, a); ALLOCNO_CAP_MEMBER (cap) = a; ALLOCNO_CAP (a) = cap; ALLOCNO_CLASS_COST (cap) = ALLOCNO_CLASS_COST (a); @@ -1902,6 +1961,26 @@ ira_traverse_loop_tree (bool bb_p, ira_loop_tree_node_t loop_node, /* The basic block currently being processed. */ static basic_block curr_bb; +/* Return true if A's subregs has a subreg with same SIZE and OFFSET. */ +static bool +find_subreg_p (ira_allocno_t a, const subreg_range &r) +{ + for (const auto &item : a->subregs) + if (item.start == r.start && item.end == r.end) + return true; + return false; +} + +/* Return start and nregs subregs from DF_LIVE_SUBREG. 
*/ +static void +add_subregs (ira_allocno_t a, const subreg_ranges &sr) +{ + gcc_assert (get_nblocks (ALLOCNO_MODE (a)) == sr.max); + for (const subreg_range &r : sr.ranges) + if (!find_subreg_p (a, r)) + a->subregs.push_back (r); +} + /* This recursive function creates allocnos corresponding to pseudo-registers containing in X. True OUTPUT_P means that X is an lvalue. OUTER corresponds to the parent expression of X. */ @@ -1931,6 +2010,14 @@ create_insn_allocnos (rtx x, rtx outer, bool output_p) } } + /* Collect subreg reference. */ + if (outer != NULL && read_modify_subreg_p (outer)) + { + const subreg_range r = get_range (outer); + if (!find_subreg_p (a, r)) + a->subregs.push_back (r); + } + ALLOCNO_NREFS (a)++; ALLOCNO_FREQ (a) += REG_FREQ_FROM_BB (curr_bb); if (output_p) @@ -1998,8 +2085,21 @@ create_bb_allocnos (ira_loop_tree_node_t bb_node) EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_PARTIAL_IN (bb), FIRST_PSEUDO_REGISTER, i, bi) - if (ira_curr_regno_allocno_map[i] == NULL) - ira_create_allocno (i, false, ira_curr_loop_tree_node); + { + if (ira_curr_regno_allocno_map[i] == NULL) + ira_create_allocno (i, false, ira_curr_loop_tree_node); + add_subregs (ira_curr_regno_allocno_map[i], + DF_LIVE_SUBREG_RANGE_IN (bb)->lives.at (i)); + } + + EXECUTE_IF_SET_IN_REG_SET (DF_LIVE_SUBREG_PARTIAL_OUT (bb), + FIRST_PSEUDO_REGISTER, i, bi) + { + if (ira_curr_regno_allocno_map[i] == NULL) + ira_create_allocno (i, false, ira_curr_loop_tree_node); + add_subregs (ira_curr_regno_allocno_map[i], + DF_LIVE_SUBREG_RANGE_OUT (bb)->lives.at (i)); + } } /* Create allocnos corresponding to pseudo-registers living on edge E @@ -2214,20 +2314,20 @@ move_allocno_live_ranges (ira_allocno_t from, ira_allocno_t to) int i; int n = ALLOCNO_NUM_OBJECTS (from); - gcc_assert (n == ALLOCNO_NUM_OBJECTS (to)); - for (i = 0; i < n; i++) { ira_object_t from_obj = ALLOCNO_OBJECT (from, i); - ira_object_t to_obj = ALLOCNO_OBJECT (to, i); + ira_object_t to_obj = find_object_anyway (to, OBJECT_START 
(from_obj), + OBJECT_NREGS (from_obj)); live_range_t lr = OBJECT_LIVE_RANGES (from_obj); if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) { fprintf (ira_dump_file, - " Moving ranges of a%dr%d to a%dr%d: ", + " Moving ranges of a%dr%d_obj%d to a%dr%d_obj%d: ", ALLOCNO_NUM (from), ALLOCNO_REGNO (from), - ALLOCNO_NUM (to), ALLOCNO_REGNO (to)); + OBJECT_INDEX (from_obj), ALLOCNO_NUM (to), + ALLOCNO_REGNO (to), OBJECT_INDEX (to_obj)); ira_print_live_range_list (ira_dump_file, lr); } change_object_in_range_list (lr, to_obj); @@ -2243,12 +2343,11 @@ copy_allocno_live_ranges (ira_allocno_t from, ira_allocno_t to) int i; int n = ALLOCNO_NUM_OBJECTS (from); - gcc_assert (n == ALLOCNO_NUM_OBJECTS (to)); - for (i = 0; i < n; i++) { ira_object_t from_obj = ALLOCNO_OBJECT (from, i); - ira_object_t to_obj = ALLOCNO_OBJECT (to, i); + ira_object_t to_obj = find_object_anyway (to, OBJECT_START (from_obj), + OBJECT_NREGS (from_obj)); live_range_t lr = OBJECT_LIVE_RANGES (from_obj); if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) @@ -2860,15 +2959,17 @@ setup_min_max_allocno_live_range_point (void) ira_assert (OBJECT_LIVE_RANGES (obj) == NULL); OBJECT_MAX (obj) = 0; OBJECT_MIN (obj) = 1; - continue; } ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL); /* Accumulation of range info. 
*/
           if (ALLOCNO_CAP (a) != NULL)
             {
-              for (cap = ALLOCNO_CAP (a); cap != NULL; cap = ALLOCNO_CAP (cap))
+              for (cap = ALLOCNO_CAP (a); cap != NULL;
+                   cap = ALLOCNO_CAP (cap))
                 {
-                  ira_object_t cap_obj = ALLOCNO_OBJECT (cap, j);
+                  ira_object_t cap_obj = find_object (cap, OBJECT_START (obj),
+                                                      OBJECT_NREGS (obj));
+                  gcc_assert (cap_obj != NULL);
                   if (OBJECT_MAX (cap_obj) < OBJECT_MAX (obj))
                     OBJECT_MAX (cap_obj) = OBJECT_MAX (obj);
                   if (OBJECT_MIN (cap_obj) > OBJECT_MIN (obj))
@@ -2879,7 +2980,9 @@ setup_min_max_allocno_live_range_point (void)
           if ((parent = ALLOCNO_LOOP_TREE_NODE (a)->parent) == NULL)
             continue;
           parent_a = parent->regno_allocno_map[i];
-          parent_obj = ALLOCNO_OBJECT (parent_a, j);
+          parent_obj
+            = find_object (parent_a, OBJECT_START (obj), OBJECT_NREGS (obj));
+          gcc_assert (parent_obj != NULL);
           if (OBJECT_MAX (parent_obj) < OBJECT_MAX (obj))
             OBJECT_MAX (parent_obj) = OBJECT_MAX (obj);
           if (OBJECT_MIN (parent_obj) > OBJECT_MIN (obj))
@@ -3538,30 +3641,6 @@ update_conflict_hard_reg_costs (void)
     }
 }

-/* Traverse all instructions to determine which ones have access through subreg.
- */
-static void
-init_regs_with_subreg ()
-{
-  bitmap_initialize (&regs_with_subreg, &reg_obstack);
-  basic_block bb;
-  rtx_insn *insn;
-  df_ref def, use;
-  FOR_ALL_BB_FN (bb, cfun)
-    FOR_BB_INSNS (bb, insn)
-      {
-        if (!NONDEBUG_INSN_P (insn))
-          continue;
-        df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
-        FOR_EACH_INSN_INFO_DEF (def, insn_info)
-          if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_SUBREG))
-            bitmap_set_bit (&regs_with_subreg, DF_REF_REGNO (def));
-        FOR_EACH_INSN_INFO_USE (use, insn_info)
-          if (DF_REF_FLAGS (use) & (DF_REF_PARTIAL | DF_REF_SUBREG))
-            bitmap_set_bit (&regs_with_subreg, DF_REF_REGNO (use));
-      }
-}
-
 /* Create a internal representation (IR) for IRA (allocnos, copies,
    loop tree nodes).
The function returns TRUE if we generate loop
   structure (besides nodes representing all function and the basic
@@ -3577,7 +3656,6 @@ ira_build (void)
   initiate_allocnos ();
   initiate_prefs ();
   initiate_copies ();
-  init_regs_with_subreg ();
   create_loop_tree_nodes ();
   form_loop_tree ();
   create_allocnos ();
@@ -3668,5 +3746,4 @@ ira_destroy (void)
   finish_allocnos ();
   finish_cost_vectors ();
   ira_finish_allocno_live_ranges ();
-  bitmap_clear (&regs_with_subreg);
 }
diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
index f1b96d1aee6..8aed25144b9 100644
--- a/gcc/ira-color.cc
+++ b/gcc/ira-color.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */

 #include "config.h"
+#define INCLUDE_MAP
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
@@ -852,18 +853,17 @@ setup_left_conflict_sizes_p (ira_allocno_t a)
   node_preorder_num = node->preorder_num;
   node_set = node->hard_regs->set;
   node_check_tick++;
+  /* Collect conflict objects.  */
+  std::map<int, bitmap> allocno_conflict_regs;
   for (k = 0; k < nobj; k++)
     {
       ira_object_t obj = ALLOCNO_OBJECT (a, k);
       ira_object_t conflict_obj;
       ira_object_conflict_iterator oci;
-
+
       FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
         {
-          int size;
-          ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
-          allocno_hard_regs_node_t conflict_node, temp_node;
-          HARD_REG_SET conflict_node_set;
+          ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
           allocno_color_data_t conflict_data;

           conflict_data = ALLOCNO_COLOR_DATA (conflict_a);
@@ -872,6 +872,24 @@ setup_left_conflict_sizes_p (ira_allocno_t a)
                                              conflict_data
                                              ->profitable_hard_regs))
             continue;
+          int num = ALLOCNO_NUM (conflict_a);
+          if (allocno_conflict_regs.count (num) == 0)
+            allocno_conflict_regs.insert ({num, ira_allocate_bitmap ()});
+          bitmap_head temp;
+          bitmap_initialize (&temp, &reg_obstack);
+          bitmap_set_range (&temp, OBJECT_START (conflict_obj),
+                            OBJECT_NREGS (conflict_obj));
+          bitmap_and_compl_into (&temp, allocno_conflict_regs.at (num));
+          int size = bitmap_count_bits
(&temp); + bitmap_clear (&temp); + if (size == 0) + continue; + + bitmap_set_range (allocno_conflict_regs.at (num), + OBJECT_START (conflict_obj), + OBJECT_NREGS (conflict_obj)); + allocno_hard_regs_node_t conflict_node, temp_node; + HARD_REG_SET conflict_node_set; conflict_node = conflict_data->hard_regs_node; conflict_node_set = conflict_node->hard_regs->set; if (hard_reg_set_subset_p (node_set, conflict_node_set)) @@ -886,14 +904,13 @@ setup_left_conflict_sizes_p (ira_allocno_t a) temp_node->check = node_check_tick; temp_node->conflict_size = 0; } - size = (ira_reg_class_max_nregs - [ALLOCNO_CLASS (conflict_a)][ALLOCNO_MODE (conflict_a)]); - if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1) - /* We will deal with the subwords individually. */ - size = 1; temp_node->conflict_size += size; } } + /* Setup conflict nregs of ALLOCNO. */ + for (auto &kv : allocno_conflict_regs) + ira_free_bitmap (kv.second); + for (i = 0; i < data->hard_regs_subnodes_num; i++) { allocno_hard_regs_node_t temp_node; @@ -2746,21 +2763,16 @@ push_allocno_to_stack (ira_allocno_t a) { enum reg_class aclass; allocno_color_data_t data, conflict_data; - int size, i, n = ALLOCNO_NUM_OBJECTS (a); - + int i, n = ALLOCNO_NUM_OBJECTS (a); + data = ALLOCNO_COLOR_DATA (a); data->in_graph_p = false; allocno_stack_vec.safe_push (a); aclass = ALLOCNO_CLASS (a); if (aclass == NO_REGS) return; - size = ira_reg_class_max_nregs[aclass][ALLOCNO_MODE (a)]; - if (n > 1) - { - /* We will deal with the subwords individually. */ - gcc_assert (size == ALLOCNO_NUM_OBJECTS (a)); - size = 1; - } + /* Already collect conflict objects. 
*/
+  std::map<int, bitmap> allocno_conflict_regs;
   for (i = 0; i < n; i++)
     {
       ira_object_t obj = ALLOCNO_OBJECT (a, i);
@@ -2785,6 +2797,21 @@ push_allocno_to_stack (ira_allocno_t a)
                 continue;
               ira_assert (bitmap_bit_p (coloring_allocno_bitmap,
                                         ALLOCNO_NUM (conflict_a)));
+
+              int num = ALLOCNO_NUM (conflict_a);
+              if (allocno_conflict_regs.count (num) == 0)
+                allocno_conflict_regs.insert ({num, ira_allocate_bitmap ()});
+              bitmap_head temp;
+              bitmap_initialize (&temp, &reg_obstack);
+              bitmap_set_range (&temp, OBJECT_START (obj), OBJECT_NREGS (obj));
+              bitmap_and_compl_into (&temp, allocno_conflict_regs.at (num));
+              int size = bitmap_count_bits (&temp);
+              bitmap_clear (&temp);
+              if (size == 0)
+                continue;
+
+              bitmap_set_range (allocno_conflict_regs.at (num), OBJECT_START (obj),
+                                OBJECT_NREGS (obj));
               if (update_left_conflict_sizes_p (conflict_a, a, size))
                 {
                   delete_allocno_from_bucket
@@ -2800,6 +2827,9 @@ push_allocno_to_stack (ira_allocno_t a)

             }
         }
+
+  for (auto &kv : allocno_conflict_regs)
+    ira_free_bitmap (kv.second);
 }

 /* Put ALLOCNO onto the coloring stack and remove it from its bucket.
diff --git a/gcc/ira-conflicts.cc b/gcc/ira-conflicts.cc
index a4d93c8d734..0585ad10043 100644
--- a/gcc/ira-conflicts.cc
+++ b/gcc/ira-conflicts.cc
@@ -60,23 +60,8 @@ static IRA_INT_TYPE **conflicts;
 static void
 record_object_conflict (ira_object_t obj1, ira_object_t obj2)
 {
-  ira_allocno_t a1 = OBJECT_ALLOCNO (obj1);
-  ira_allocno_t a2 = OBJECT_ALLOCNO (obj2);
-  int w1 = OBJECT_SUBWORD (obj1);
-  int w2 = OBJECT_SUBWORD (obj2);
-  int id1, id2;
-
-  /* Canonicalize the conflict.  If two identically-numbered words
-     conflict, always record this as a conflict between words 0.  That
-     is the only information we need, and it is easier to test for if
-     it is collected in each allocno's lowest-order object.
*/ - if (w1 == w2 && w1 > 0) - { - obj1 = ALLOCNO_OBJECT (a1, 0); - obj2 = ALLOCNO_OBJECT (a2, 0); - } - id1 = OBJECT_CONFLICT_ID (obj1); - id2 = OBJECT_CONFLICT_ID (obj2); + int id1 = OBJECT_CONFLICT_ID (obj1); + int id2 = OBJECT_CONFLICT_ID (obj2); SET_MINMAX_SET_BIT (conflicts[id1], id2, OBJECT_MIN (obj1), OBJECT_MAX (obj1)); @@ -606,8 +591,8 @@ build_object_conflicts (ira_object_t obj) if (parent_a == NULL) return; ira_assert (ALLOCNO_CLASS (a) == ALLOCNO_CLASS (parent_a)); - ira_assert (ALLOCNO_NUM_OBJECTS (a) == ALLOCNO_NUM_OBJECTS (parent_a)); - parent_obj = ALLOCNO_OBJECT (parent_a, OBJECT_SUBWORD (obj)); + parent_obj + = find_object_anyway (parent_a, OBJECT_START (obj), OBJECT_NREGS (obj)); parent_num = OBJECT_CONFLICT_ID (parent_obj); parent_min = OBJECT_MIN (parent_obj); parent_max = OBJECT_MAX (parent_obj); @@ -616,7 +601,6 @@ build_object_conflicts (ira_object_t obj) { ira_object_t another_obj = ira_object_id_map[i]; ira_allocno_t another_a = OBJECT_ALLOCNO (another_obj); - int another_word = OBJECT_SUBWORD (another_obj); ira_assert (ira_reg_classes_intersect_p [ALLOCNO_CLASS (a)][ALLOCNO_CLASS (another_a)]); @@ -627,11 +611,11 @@ build_object_conflicts (ira_object_t obj) ira_assert (ALLOCNO_NUM (another_parent_a) >= 0); ira_assert (ALLOCNO_CLASS (another_a) == ALLOCNO_CLASS (another_parent_a)); - ira_assert (ALLOCNO_NUM_OBJECTS (another_a) - == ALLOCNO_NUM_OBJECTS (another_parent_a)); SET_MINMAX_SET_BIT (conflicts[parent_num], - OBJECT_CONFLICT_ID (ALLOCNO_OBJECT (another_parent_a, - another_word)), + OBJECT_CONFLICT_ID ( + find_object_anyway (another_parent_a, + OBJECT_START (another_obj), + OBJECT_NREGS (another_obj))), parent_min, parent_max); } } @@ -659,9 +643,10 @@ build_conflicts (void) build_object_conflicts (obj); for (cap = ALLOCNO_CAP (a); cap != NULL; cap = ALLOCNO_CAP (cap)) { - ira_object_t cap_obj = ALLOCNO_OBJECT (cap, j); - gcc_assert (ALLOCNO_NUM_OBJECTS (cap) == ALLOCNO_NUM_OBJECTS (a)); - build_object_conflicts (cap_obj); + 
ira_object_t cap_obj + = find_object_anyway (cap, OBJECT_START (obj), + OBJECT_NREGS (obj)); + build_object_conflicts (cap_obj); } } } @@ -736,7 +721,8 @@ print_allocno_conflicts (FILE * file, bool reg_p, ira_allocno_t a) } if (n > 1) - fprintf (file, "\n;; subobject %d:", i); + fprintf (file, "\n;; subobject s%d,n%d,f%d:", OBJECT_START (obj), + OBJECT_NREGS (obj), ALLOCNO_NREGS (a)); FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); @@ -746,8 +732,10 @@ print_allocno_conflicts (FILE * file, bool reg_p, ira_allocno_t a) { fprintf (file, " a%d(r%d", ALLOCNO_NUM (conflict_a), ALLOCNO_REGNO (conflict_a)); - if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1) - fprintf (file, ",w%d", OBJECT_SUBWORD (conflict_obj)); + if (has_subreg_object_p (conflict_a)) + fprintf (file, ",s%d,n%d,f%d", OBJECT_START (conflict_obj), + OBJECT_NREGS (conflict_obj), + ALLOCNO_NREGS (conflict_a)); if ((bb = ALLOCNO_LOOP_TREE_NODE (conflict_a)->bb) != NULL) fprintf (file, ",b%d", bb->index); else diff --git a/gcc/ira-emit.cc b/gcc/ira-emit.cc index 84ed482e568..9dc7f3c655e 100644 --- a/gcc/ira-emit.cc +++ b/gcc/ira-emit.cc @@ -854,7 +854,7 @@ modify_move_list (move_t list) ALLOCNO_MODE (new_allocno) = ALLOCNO_MODE (set_move->to); ira_set_allocno_class (new_allocno, ALLOCNO_CLASS (set_move->to)); - ira_create_allocno_objects (new_allocno); + ira_copy_allocno_objects (new_allocno, set_move->to); ALLOCNO_ASSIGNED_P (new_allocno) = true; ALLOCNO_HARD_REGNO (new_allocno) = -1; ALLOCNO_EMIT_DATA (new_allocno)->reg diff --git a/gcc/ira-int.h b/gcc/ira-int.h index b6281d3df6d..b9e24328867 100644 --- a/gcc/ira-int.h +++ b/gcc/ira-int.h @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see #include "recog.h" #include "function-abi.h" #include +#include "subreg-live-range.h" /* To provide consistency in naming, all IRA external variables, functions, common typedefs start with prefix ira_. 
*/
@@ -223,7 +224,7 @@ extern int ira_max_point;
 extern live_range_t *ira_start_point_ranges, *ira_finish_point_ranges;

 /* A structure representing conflict information for an allocno
-   (or one of its subwords).  */
+   (or one of its subregs).  */
 struct ira_object
 {
   /* The allocno associated with this record.  */
@@ -237,17 +238,9 @@ struct ira_object
      ranges in the list are not intersected and ordered by decreasing
      their program points*.  */
   live_range_t live_ranges;
-  /* The subword within ALLOCNO which is represented by this object.
-     Zero means the lowest-order subword (or the entire allocno in case
-     it is not being tracked in subwords).  */
-  int subword;
   /* Reprensent OBJECT occupied [start, start + nregs) registers of it's
      ALLOCNO.  */
   int start, nregs;
-  /* Reprensent the size and offset of current object, use to track subreg
-     range, For full reg, the size is GET_MODE_SIZE (ALLOCNO_MODE (allocno)),
-     offset is 0.  */
-  poly_int64 size, offset;
   /* Allocated size of the conflicts array.  */
   unsigned int conflicts_array_size;
   /* A unique number for every instance of this structure, which is used
@@ -400,6 +393,9 @@ struct ira_allocno
      more than one such object in cases where the allocno represents a
      multi-hardreg pesudo.  */
   std::vector<ira_object_t> objects;
+  /* An array of structures decribing the subreg mode start and subreg end for
+     this allocno.  */
+  std::vector<subreg_range> subregs;
   /* Registers clobbered by intersected calls.
*/ HARD_REG_SET crossed_calls_clobbered_regs; /* Array of usage costs (accumulated and the one updated during @@ -526,9 +522,6 @@ allocno_emit_reg (ira_allocno_t a) } #define OBJECT_ALLOCNO(O) ((O)->allocno) -#define OBJECT_SIZE(O) ((O)->size) -#define OBJECT_OFFSET(O) ((O)->offset) -#define OBJECT_SUBWORD(O) ((O)->subword) #define OBJECT_CONFLICT_ARRAY(O) ((O)->conflicts_array) #define OBJECT_CONFLICT_VEC(O) ((ira_object_t *)(O)->conflicts_array) #define OBJECT_CONFLICT_BITVEC(O) ((IRA_INT_TYPE *)(O)->conflicts_array) @@ -1062,6 +1055,10 @@ extern bool ira_build (void); extern void ira_destroy (void); extern ira_object_t find_object (ira_allocno_t, int, int); +extern ira_object_t find_object (ira_allocno_t, poly_int64, poly_int64); +ira_object_t +find_object_anyway (ira_allocno_t a, int start, int nregs); +extern void ira_copy_allocno_objects (ira_allocno_t, ira_allocno_t); /* ira-costs.cc */ extern void ira_init_costs_once (void); diff --git a/gcc/ira-lives.cc b/gcc/ira-lives.cc index 60e6be0b0ae..e00898c0ccd 100644 --- a/gcc/ira-lives.cc +++ b/gcc/ira-lives.cc @@ -19,6 +19,7 @@ along with GCC; see the file COPYING3. If not see . */ #include "config.h" +#define INCLUDE_VECTOR #include "system.h" #include "coretypes.h" #include "backend.h" @@ -35,6 +36,7 @@ along with GCC; see the file COPYING3. If not see #include "sparseset.h" #include "function-abi.h" #include "except.h" +#include "subreg-live-range.h" /* The code in this file is similar to one in global but the code works on the allocno basis and creates live ranges instead of @@ -91,6 +93,9 @@ static alternative_mask preferred_alternatives; we should not add a conflict with the copy's destination operand. */ static rtx ignore_reg_for_conflicts; +/* Store def/use point of has_subreg_object_p register. */ +static class subregs_live_points *subreg_live_points; + /* Record hard register REGNO as now being live. 
*/ static void make_hard_regno_live (int regno) @@ -98,6 +103,33 @@ make_hard_regno_live (int regno) SET_HARD_REG_BIT (hard_regs_live, regno); } +/* Update conflict hard regs of ALLOCNO a for current live part. */ +static void +set_subreg_conflict_hard_regs (ira_allocno_t a, HARD_REG_SET regs) +{ + gcc_assert (has_subreg_object_p (a)); + + if (subreg_live_points->subreg_live_ranges.count (ALLOCNO_NUM (a)) == 0) + return; + + for (const subreg_range &r : + subreg_live_points->subreg_live_ranges.at (ALLOCNO_NUM (a)).ranges) + { + ira_object_t obj = find_object_anyway (a, r.start, r.end - r.start); + OBJECT_CONFLICT_HARD_REGS (obj) |= regs; + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= regs; + } +} + +static void +set_subreg_conflict_hard_regs (ira_allocno_t a, unsigned int regno) +{ + HARD_REG_SET set; + CLEAR_HARD_REG_SET (set); + SET_HARD_REG_BIT (set, regno); + set_subreg_conflict_hard_regs (a, set); +} + /* Process the definition of hard register REGNO. This updates hard_regs_live and hard reg conflict information for living allocnos. 
*/ static void @@ -113,8 +145,13 @@ make_hard_regno_dead (int regno) == (unsigned int) ALLOCNO_REGNO (OBJECT_ALLOCNO (obj))) continue; - SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); - SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); + if (has_subreg_object_p (OBJECT_ALLOCNO (obj))) + set_subreg_conflict_hard_regs (OBJECT_ALLOCNO (obj), regno); + else + { + SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); + SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); + } } CLEAR_HARD_REG_BIT (hard_regs_live, regno); } @@ -127,9 +164,29 @@ make_object_live (ira_object_t obj) sparseset_set_bit (objects_live, OBJECT_CONFLICT_ID (obj)); live_range_t lr = OBJECT_LIVE_RANGES (obj); - if (lr == NULL - || (lr->finish != curr_point && lr->finish + 1 != curr_point)) - ira_add_live_range_to_object (obj, curr_point, -1); + if (lr == NULL || (lr->finish != curr_point && lr->finish + 1 != curr_point)) + { + ira_add_live_range_to_object (obj, curr_point, -1); + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) + { + fprintf (ira_dump_file, + " add new live_range for a%d(r%d): [%d...-1]\n", + ALLOCNO_NUM (OBJECT_ALLOCNO (obj)), + ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)), curr_point); + } + } + else + { + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) + { + fprintf ( + ira_dump_file, + " use old live_range for a%d(r%d): [%d...%d], curr: %d\n", + ALLOCNO_NUM (OBJECT_ALLOCNO (obj)), + ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)), lr->start, lr->finish, + curr_point); + } + } } /* Update ALLOCNO_EXCESS_PRESSURE_POINTS_NUM for the allocno @@ -140,7 +197,6 @@ update_allocno_pressure_excess_length (ira_object_t obj) ira_allocno_t a = OBJECT_ALLOCNO (obj); int start, i; enum reg_class aclass, pclass, cl; - live_range_t p; aclass = ALLOCNO_CLASS (a); pclass = ira_pressure_class_translate[aclass]; @@ -152,10 +208,18 @@ update_allocno_pressure_excess_length (ira_object_t obj) continue; if (high_pressure_start_point[cl] < 0) continue; - p 
= OBJECT_LIVE_RANGES (obj); - ira_assert (p != NULL); - start = (high_pressure_start_point[cl] > p->start - ? high_pressure_start_point[cl] : p->start); + int start_point; + if (has_subreg_object_p (a)) + start_point = subreg_live_points->get_start_point (ALLOCNO_NUM (a)); + else + { + live_range_t p = OBJECT_LIVE_RANGES (obj); + ira_assert (p != NULL); + start_point = p->start; + } + start = (high_pressure_start_point[cl] > start_point + ? high_pressure_start_point[cl] + : start_point); ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a) += curr_point - start + 1; } } @@ -201,6 +265,14 @@ make_object_dead (ira_object_t obj) CLEAR_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); lr = OBJECT_LIVE_RANGES (obj); + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) + { + fprintf (ira_dump_file, + " finish a live_range a%d(r%d): [%d...%d] => [%d...%d]\n", + ALLOCNO_NUM (OBJECT_ALLOCNO (obj)), + ALLOCNO_REGNO (OBJECT_ALLOCNO (obj)), lr->start, lr->finish, + lr->start, curr_point); + } ira_assert (lr != NULL); lr->finish = curr_point; update_allocno_pressure_excess_length (obj); @@ -295,77 +367,144 @@ pseudo_regno_single_word_and_live_p (int regno) return sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj)); } -/* Mark the pseudo register REGNO as live. Update all information about - live ranges and register pressure. */ +/* Collect the point which the OBJ be def/use. 
*/ static void -mark_pseudo_regno_live (int regno) +add_subreg_point (ira_object_t obj, bool is_def, bool is_dec = true) { - ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - enum reg_class pclass; - int i, n, nregs; - - if (a == NULL) - return; + ira_allocno_t a = OBJECT_ALLOCNO (obj); + if (is_def) + { + OBJECT_CONFLICT_HARD_REGS (obj) |= hard_regs_live; + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= hard_regs_live; + if (is_dec) + { + enum reg_class pclass + = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + dec_register_pressure (pclass, ALLOCNO_NREGS (a)); + } + update_allocno_pressure_excess_length (obj); + } + else + { + enum reg_class pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + inc_register_pressure (pclass, ALLOCNO_NREGS (a)); + } - /* Invalidate because it is referenced. */ - allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + subreg_range r = subreg_range ( + {OBJECT_START (obj), OBJECT_START (obj) + OBJECT_NREGS (obj)}); + subreg_live_points->add_point (ALLOCNO_NUM (a), ALLOCNO_NREGS (a), r, is_def, + curr_point); - n = ALLOCNO_NUM_OBJECTS (a); - pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; - if (n > 1) + if (internal_flag_ira_verbose > 8 && ira_dump_file != NULL) { - /* We track every subobject separately. */ - gcc_assert (nregs == n); - nregs = 1; + fprintf (ira_dump_file, " %s a%d(r%d", is_def ? 
"def" : "use", + ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); + if (ALLOCNO_CLASS (a) != NO_REGS + && ALLOCNO_NREGS (a) != OBJECT_NREGS (obj)) + fprintf (ira_dump_file, " [subreg: start %d, nregs %d]", + OBJECT_START (obj), OBJECT_NREGS (obj)); + else + fprintf (ira_dump_file, " [full: nregs %d]", OBJECT_NREGS (obj)); + fprintf (ira_dump_file, ") at point %d\n", curr_point); } - for (i = 0; i < n; i++) - { - ira_object_t obj = ALLOCNO_OBJECT (a, i); + gcc_assert (has_subreg_object_p (a)); + gcc_assert (subreg_live_points->subreg_live_ranges.count (ALLOCNO_NUM (a)) + != 0); + + const subreg_ranges &sr + = subreg_live_points->subreg_live_ranges.at (ALLOCNO_NUM (a)); + ira_object_t main_obj = find_object (a, 0, ALLOCNO_NREGS (a)); + gcc_assert (main_obj != NULL); + if (sr.empty_p () + && sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (main_obj))) + sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (main_obj)); + else if (!sr.empty_p () + && !sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (main_obj))) + sparseset_set_bit (objects_live, OBJECT_CONFLICT_ID (main_obj)); +} +/* Mark the object OBJ as live. */ +static void +mark_pseudo_object_live (ira_allocno_t a, ira_object_t obj) +{ + /* Invalidate because it is referenced. */ + allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + + if (has_subreg_object_p (a)) + add_subreg_point (obj, false); + else + { if (sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) - continue; + return; - inc_register_pressure (pclass, nregs); + enum reg_class pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + inc_register_pressure (pclass, ALLOCNO_NREGS (a)); make_object_live (obj); } } +/* Mark the pseudo register REGNO as live. Update all information about + live ranges and register pressure. 
*/ +static void +mark_pseudo_regno_live (int regno) +{ + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; + + if (a == NULL) + return; + + int nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; + ira_object_t obj = find_object (a, 0, nregs); + gcc_assert (obj != NULL); + + mark_pseudo_object_live (a, obj); +} + /* Like mark_pseudo_regno_live, but try to only mark one subword of the pseudo as live. SUBWORD indicates which; a value of 0 indicates the low part. */ static void -mark_pseudo_regno_subword_live (int regno, int subword) +mark_pseudo_regno_subreg_live (int regno, rtx subreg) { ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - int n; - enum reg_class pclass; - ira_object_t obj; if (a == NULL) return; - /* Invalidate because it is referenced. */ - allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + ira_object_t obj + = find_object (a, SUBREG_BYTE (subreg), GET_MODE_SIZE (GET_MODE (subreg))); + gcc_assert (obj != NULL); + + mark_pseudo_object_live (a, obj); +} - n = ALLOCNO_NUM_OBJECTS (a); - if (n == 1) +/* Mark objects in subreg ranges SR as live. Update all information about + live ranges and register pressure. 
*/ +static void +mark_pseudo_regno_subregs_live (int regno, const subreg_ranges &sr) +{ + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; + if (a == NULL) + return; + + if (!ALLOCNO_TRACK_SUBREG_P (a)) { mark_pseudo_regno_live (regno); return; } - pclass = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - gcc_assert - (n == ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]); - obj = ALLOCNO_OBJECT (a, subword); - - if (sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) - return; - - inc_register_pressure (pclass, 1); - make_object_live (obj); + int times = sr.max / ALLOCNO_NREGS (a); + gcc_assert (sr.max >= ALLOCNO_NREGS (a) + && times * ALLOCNO_NREGS (a) == sr.max); + for (const subreg_range &range : sr.ranges) + { + int start = range.start / times; + int end = CEIL (range.end, times); + ira_object_t obj = find_object (a, start, end - start); + gcc_assert (obj != NULL); + mark_pseudo_object_live (a, obj); + } } /* Mark the register REG as live. Store a 1 in hard_regs_live for @@ -403,10 +542,7 @@ static void mark_pseudo_reg_live (rtx orig_reg, unsigned regno) { if (read_modify_subreg_p (orig_reg)) - { - mark_pseudo_regno_subword_live (regno, - subreg_lowpart_p (orig_reg) ? 0 : 1); - } + mark_pseudo_regno_subreg_live (regno, orig_reg); else mark_pseudo_regno_live (regno); } @@ -427,72 +563,59 @@ mark_ref_live (df_ref ref) mark_hard_reg_live (reg); } -/* Mark the pseudo register REGNO as dead. Update all information about - live ranges and register pressure. */ +/* Mark object as dead. */ static void -mark_pseudo_regno_dead (int regno) +mark_pseudo_object_dead (ira_allocno_t a, ira_object_t obj) { - ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - int n, i, nregs; - enum reg_class cl; - - if (a == NULL) - return; - /* Invalidate because it is referenced. 
*/ allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; - n = ALLOCNO_NUM_OBJECTS (a); - cl = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; - if (n > 1) - { - /* We track every subobject separately. */ - gcc_assert (nregs == n); - nregs = 1; - } - for (i = 0; i < n; i++) + if (has_subreg_object_p (a)) + add_subreg_point (obj, true); + else { - ira_object_t obj = ALLOCNO_OBJECT (a, i); if (!sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) - continue; + return; - dec_register_pressure (cl, nregs); + enum reg_class cl = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; + dec_register_pressure (cl, ALLOCNO_NREGS (a)); make_object_dead (obj); } } -/* Like mark_pseudo_regno_dead, but called when we know that only part of the - register dies. SUBWORD indicates which; a value of 0 indicates the low part. */ +/* Mark the pseudo register REGNO as dead. Update all information about + live ranges and register pressure. */ static void -mark_pseudo_regno_subword_dead (int regno, int subword) +mark_pseudo_regno_dead (int regno) { ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - int n; - enum reg_class cl; - ira_object_t obj; if (a == NULL) return; - /* Invalidate because it is referenced. */ - allocno_saved_at_call[ALLOCNO_NUM (a)] = 0; + int nregs = ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]; + ira_object_t obj = find_object (a, 0, nregs); + gcc_assert (obj != NULL); - n = ALLOCNO_NUM_OBJECTS (a); - if (n == 1) - /* The allocno as a whole doesn't die in this case. */ - return; + mark_pseudo_object_dead (a, obj); +} - cl = ira_pressure_class_translate[ALLOCNO_CLASS (a)]; - gcc_assert - (n == ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]); +/* Like mark_pseudo_regno_dead, but called when we know that only part of the + register dies. SUBWORD indicates which; a value of 0 indicates the low part. 
+ */ +static void +mark_pseudo_regno_subreg_dead (int regno, rtx subreg) +{ + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; - obj = ALLOCNO_OBJECT (a, subword); - if (!sparseset_bit_p (objects_live, OBJECT_CONFLICT_ID (obj))) + if (a == NULL) return; - dec_register_pressure (cl, 1); - make_object_dead (obj); + ira_object_t obj + = find_object (a, SUBREG_BYTE (subreg), GET_MODE_SIZE (GET_MODE (subreg))); + gcc_assert (obj != NULL); + + mark_pseudo_object_dead (a, obj); } /* Process the definition of hard register REG. This updates hard_regs_live @@ -528,10 +651,7 @@ static void mark_pseudo_reg_dead (rtx orig_reg, unsigned regno) { if (read_modify_subreg_p (orig_reg)) - { - mark_pseudo_regno_subword_dead (regno, - subreg_lowpart_p (orig_reg) ? 0 : 1); - } + mark_pseudo_regno_subreg_dead (regno, orig_reg); else mark_pseudo_regno_dead (regno); } @@ -1059,8 +1179,14 @@ process_single_reg_class_operands (bool in_p, int freq) /* We could increase costs of A instead of making it conflicting with the hard register. But it works worse because it will be spilled in reload in anyway. 
*/ - OBJECT_CONFLICT_HARD_REGS (obj) |= reg_class_contents[cl]; - OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= reg_class_contents[cl]; + if (has_subreg_object_p (a)) + set_subreg_conflict_hard_regs (OBJECT_ALLOCNO (obj), + reg_class_contents[cl]); + else + { + OBJECT_CONFLICT_HARD_REGS (obj) |= reg_class_contents[cl]; + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= reg_class_contents[cl]; + } } } } @@ -1198,17 +1324,15 @@ process_out_of_region_eh_regs (basic_block bb) bi) { ira_allocno_t a = ira_curr_regno_allocno_map[i]; - for (int n = ALLOCNO_NUM_OBJECTS (a) - 1; n >= 0; n--) + ira_object_t obj = find_object (a, 0, ALLOCNO_NREGS (a)); + for (int k = 0;; k++) { - ira_object_t obj = ALLOCNO_OBJECT (a, n); - for (int k = 0;; k++) - { - unsigned int regno = EH_RETURN_DATA_REGNO (k); - if (regno == INVALID_REGNUM) - break; - SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); - SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); - } + unsigned int regno = EH_RETURN_DATA_REGNO (k); + if (regno == INVALID_REGNUM) + break; + + SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno); + SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno); } } } @@ -1234,6 +1358,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) bb = loop_tree_node->bb; if (bb != NULL) { + if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) + fprintf (ira_dump_file, "\n BB exit(l%d): point = %d\n", + loop_tree_node->parent->loop_num, curr_point); + for (i = 0; i < ira_pressure_classes_num; i++) { curr_reg_pressure[ira_pressure_classes[i]] = 0; @@ -1242,6 +1370,7 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) curr_bb_node = loop_tree_node; reg_live_out = DF_LIVE_SUBREG_OUT (bb); sparseset_clear (objects_live); + subreg_live_points->clear_live_ranges (); REG_SET_TO_HARD_REG_SET (hard_regs_live, reg_live_out); hard_regs_live &= ~(eliminable_regset | ira_no_alloc_regs); for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) @@ -1265,9 +1394,17 @@ 
process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) <= ira_class_hard_regs_num[cl]); } } - EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi) + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_FULL_OUT (bb), + FIRST_PSEUDO_REGISTER, j, bi) mark_pseudo_regno_live (j); + EXECUTE_IF_SET_IN_BITMAP (DF_LIVE_SUBREG_PARTIAL_OUT (bb), + FIRST_PSEUDO_REGISTER, j, bi) + { + mark_pseudo_regno_subregs_live ( + j, DF_LIVE_SUBREG_RANGE_OUT (bb)->lives.at (j)); + } + #ifdef EH_RETURN_DATA_REGNO process_out_of_region_eh_regs (bb); #endif @@ -1381,27 +1518,33 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) || (!targetm.setjmp_preserves_nonvolatile_regs_p () && (find_reg_note (insn, REG_SETJMP, NULL_RTX) != NULL_RTX))) + { + if (has_subreg_object_p (a)) + { + HARD_REG_SET regs; + SET_HARD_REG_SET (regs); + set_subreg_conflict_hard_regs (a, regs); + } + else + { + SET_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj)); + SET_HARD_REG_SET ( + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)); + } + } + if (can_throw_internal (insn)) { - SET_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj)); - SET_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)); - } - eh_region r; - eh_landing_pad lp; - rtx_code_label *landing_label; - basic_block landing_bb; - if (can_throw_internal (insn) - && (r = get_eh_region_from_rtx (insn)) != NULL - && (lp = gen_eh_landing_pad (r)) != NULL - && (landing_label = lp->landing_pad) != NULL - && (landing_bb = BLOCK_FOR_INSN (landing_label)) != NULL - && (r->type != ERT_CLEANUP - || bitmap_bit_p (df_get_live_in (landing_bb), - ALLOCNO_REGNO (a)))) - { - HARD_REG_SET new_conflict_regs - = callee_abi.mode_clobbers (ALLOCNO_MODE (a)); - OBJECT_CONFLICT_HARD_REGS (obj) |= new_conflict_regs; - OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) |= new_conflict_regs; + if (has_subreg_object_p (a)) + set_subreg_conflict_hard_regs (a, + callee_abi.mode_clobbers ( + ALLOCNO_MODE (a))); + else + { + OBJECT_CONFLICT_HARD_REGS (obj) + |= callee_abi.mode_clobbers 
(ALLOCNO_MODE (a)); + OBJECT_TOTAL_CONFLICT_HARD_REGS (obj) + |= callee_abi.mode_clobbers (ALLOCNO_MODE (a)); + } } if (sparseset_bit_p (allocnos_processed, num)) continue; @@ -1443,7 +1586,14 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) /* Mark each used value as live. */ FOR_EACH_INSN_USE (use, insn) - mark_ref_live (use); + { + unsigned regno = DF_REF_REGNO (use); + ira_allocno_t a = ira_curr_regno_allocno_map[regno]; + if (a && has_subreg_object_p (a) + && DF_REF_FLAGS (use) & (DF_REF_READ_WRITE | DF_REF_SUBREG)) + continue; + mark_ref_live (use); + } process_single_reg_class_operands (true, freq); @@ -1473,6 +1623,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) } ignore_reg_for_conflicts = NULL_RTX; + if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) + fprintf (ira_dump_file, "\n BB head(l%d): point = %d\n", + loop_tree_node->parent->loop_num, curr_point); + if (bb_has_eh_pred (bb)) for (j = 0; ; ++j) { @@ -1526,10 +1680,15 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) } EXECUTE_IF_SET_IN_SPARSESET (objects_live, i) - make_object_dead (ira_object_id_map[i]); + { + ira_object_t obj = ira_object_id_map[i]; + if (has_subreg_object_p (OBJECT_ALLOCNO (obj))) + add_subreg_point (obj, true, false); + else + make_object_dead (obj); + } curr_point++; - } /* Propagate register pressure to upper loop tree nodes. */ if (loop_tree_node != ira_loop_tree_root) @@ -1730,6 +1889,86 @@ ira_debug_live_ranges (void) print_live_ranges (stderr); } +class subreg_live_item +{ +public: + subreg_ranges subreg; + int start, finish; +}; + +/* Create subreg live ranges from objects def/use point info. 
*/
+static void
+create_subregs_live_ranges ()
+{
+  for (const auto &subreg_point_it : subreg_live_points->subreg_points)
+    {
+      unsigned int allocno_num = subreg_point_it.first;
+      const class live_points &points = subreg_point_it.second;
+      ira_allocno_t a = ira_allocnos[allocno_num];
+      std::vector<subreg_live_item> temps;
+      gcc_assert (has_subreg_object_p (a));
+      for (const auto &point_it : points.points)
+        {
+          int point = point_it.first;
+          const live_point &regs = point_it.second;
+          gcc_assert (temps.empty () || temps.back ().finish <= point);
+          if (!regs.use_reg.empty_p ())
+            {
+              if (temps.empty ())
+                temps.push_back ({regs.use_reg, point, -1});
+              else if (temps.back ().finish == -1)
+                {
+                  if (!temps.back ().subreg.same_p (regs.use_reg))
+                    {
+                      if (temps.back ().start == point)
+                        temps.back ().subreg.add_ranges (regs.use_reg);
+                      else
+                        {
+                          temps.back ().finish = point - 1;
+
+                          subreg_ranges temp = regs.use_reg;
+                          temp.add_ranges (temps.back ().subreg);
+                          temps.push_back ({temp, point, -1});
+                        }
+                    }
+                }
+              else if (temps.back ().subreg.same_p (regs.use_reg)
+                       && (temps.back ().finish == point
+                           || temps.back ().finish + 1 == point))
+                temps.back ().finish = -1;
+              else
+                temps.push_back ({regs.use_reg, point, -1});
+            }
+          if (!regs.def_reg.empty_p ())
+            {
+              gcc_assert (!temps.empty ());
+              if (regs.def_reg.include_ranges_p (temps.back ().subreg))
+                temps.back ().finish = point;
+              else if (temps.back ().subreg.include_ranges_p (regs.def_reg))
+                {
+                  temps.back ().finish = point;
+
+                  subreg_ranges diff = temps.back ().subreg;
+                  diff.remove_ranges (regs.def_reg);
+                  temps.push_back ({diff, point + 1, -1});
+                }
+              else
+                gcc_unreachable ();
+            }
+        }
+      for (const subreg_live_item &item : temps)
+        for (const subreg_range &r : item.subreg.ranges)
+          {
+            ira_object_t obj = find_object_anyway (a, r.start, r.end - r.start);
+            live_range_t lr = OBJECT_LIVE_RANGES (obj);
+            if (lr != NULL && lr->finish + 1 == item.start)
+              lr->finish = item.finish;
+            else
+              ira_add_live_range_to_object (obj, item.start,
item.finish); + } + } +} + /* The main entry function creates live ranges, set up CONFLICT_HARD_REGS and TOTAL_CONFLICT_HARD_REGS for objects, and calculate register pressure info. */ @@ -1743,13 +1982,20 @@ ira_create_allocno_live_ranges (void) allocno_saved_at_call = (int *) ira_allocate (ira_allocnos_num * sizeof (int)); memset (allocno_saved_at_call, 0, ira_allocnos_num * sizeof (int)); + subreg_live_points = new subregs_live_points (); ira_traverse_loop_tree (true, ira_loop_tree_root, NULL, process_bb_node_lives); ira_max_point = curr_point; + create_subregs_live_ranges (); create_start_finish_chains (); if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) - print_live_ranges (ira_dump_file); + { + fprintf (ira_dump_file, ";; subreg live points:\n"); + subreg_live_points->dump (ira_dump_file); + print_live_ranges (ira_dump_file); + } /* Clean up. */ + delete subreg_live_points; ira_free (allocno_saved_at_call); sparseset_free (objects_live); sparseset_free (allocnos_processed); diff --git a/gcc/subreg-live-range.h b/gcc/subreg-live-range.h index 56931b53550..bee97708a52 100644 --- a/gcc/subreg-live-range.h +++ b/gcc/subreg-live-range.h @@ -275,11 +275,20 @@ class subregs_live_points { public: std::map subreg_points; + std::map last_start_points; std::map subreg_live_ranges; void add_point (int id, int max, const subreg_range &range, bool is_def, int point) { + if (!is_def && empty_live_p (id)) + { + if (last_start_points.count (id) == 0) + last_start_points.insert ({id, point}); + else + last_start_points.at (id) = point; + } + if (subreg_points.count (id) == 0) subreg_points.insert ({id, live_points (id, max)}); @@ -317,6 +326,13 @@ public: || subreg_live_ranges.at (id).empty_p (); } + int get_start_point (int id) + { + int start_point = last_start_points.at (id); + gcc_assert (start_point != -1); + return start_point; + } + void clear_live_ranges () { subreg_live_ranges.clear (); } /* Debug methods. 
*/

From patchwork Wed Nov 8 03:47:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 162878
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai
Subject: [PATCH 4/7] ira: Support subreg copy
Date: Wed, 8 Nov 2023 11:47:37 +0800
Message-Id: <20231108034740.834590-5-lehua.ding@rivai.ai>
In-Reply-To: <20231108034740.834590-1-lehua.ding@rivai.ai>
References: <20231108034740.834590-1-lehua.ding@rivai.ai>
List-Id: Gcc-patches mailing list

This patch changes copies between allocnos into copies between objects,
that is, it allows partial copies between pseudo registers.

gcc/ChangeLog:

	* ira-build.cc (find_allocno_copy): Removed.
	(ira_create_object): Adjust.
	(find_object): New.
	(ira_create_copy): Adjust.
	(add_allocno_copy_to_list): Adjust.
	(swap_allocno_copy_ends_if_necessary): Adjust.
	(ira_add_allocno_copy): Adjust.
	(print_copy): Adjust.
	(print_allocno_copies): Adjust.
	(ira_flattening): Adjust.
	* ira-color.cc (INCLUDE_VECTOR): Use std::vector.
	(struct allocno_color_data): New fields.
	(struct allocno_hard_regs_subnode): More comments.
	(form_allocno_hard_regs_nodes_forest): More comments.
	(update_left_conflict_sizes_p): More comments.
	(struct update_cost_queue_elem): New field.
	(queue_update_cost): Adjust.
	(get_next_update_cost): Adjust.
	(update_costs_from_allocno): Adjust.
	(update_conflict_hard_regno_costs): Adjust.
	(assign_hard_reg): Adjust.
	(objects_conflict_by_live_ranges_p): New.
	(allocno_thread_conflict_p): Removed.
	(object_thread_conflict_p): New.
	(merge_threads): Adjust.
	(form_threads_from_copies): Adjust.
	(form_threads_from_bucket): Adjust.
	(form_threads_from_colorable_allocno): Adjust.
	(init_allocno_threads): Adjust.
	(add_allocno_to_bucket): Adjust.
	(delete_allocno_from_bucket): Adjust.
	(allocno_copy_cost_saving): Adjust.
	(color_allocnos): Adjust.
	(color_pass): Adjust.
	(update_curr_costs): Adjust.
	(coalesce_allocnos): Adjust.
	(ira_reuse_stack_slot): Adjust.
	(ira_initiate_assign): Adjust.
	(ira_finish_assign): Adjust.
	* ira-conflicts.cc (allocnos_conflict_for_copy_p): Removed.
	(REG_SUBREG_P): Adjust.
	(subreg_move_p): New.
	(regs_non_conflict_for_copy_p): New.
	(subreg_reg_align_and_times_p): New.
	(process_regs_for_copy): Adjust.
	(add_insn_allocno_copies): Adjust.
	(propagate_copies): Adjust.
	* ira-emit.cc (add_range_and_copies_from_move_list): Adjust.
	* ira-int.h (struct ira_object): New field.
	(OBJECT_INDEX): New macro.
	(struct ira_allocno_copy): Adjust fields.
	(ira_add_allocno_copy): Exported.
	(find_object): Exported.
	(subreg_move_p): Exported.
	* ira.cc (print_redundant_copies): Adjust.
---
 gcc/ira-build.cc     | 150 +++++++-----
 gcc/ira-color.cc     | 541 +++++++++++++++++++++++++++++++------------
 gcc/ira-conflicts.cc | 173 +++++++++++---
 gcc/ira-emit.cc      |  10 +-
 gcc/ira-int.h        |  13 +-
 gcc/ira.cc           |   5 +-
 6 files changed, 645 insertions(+), 247 deletions(-)

diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
index 5fb7a9f800f..1c47f81ce9d 100644
--- a/gcc/ira-build.cc
+++ b/gcc/ira-build.cc
@@ -36,9 +36,6 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "subreg-live-range.h" -static ira_copy_t find_allocno_copy (ira_allocno_t, ira_allocno_t, rtx_insn *, - ira_loop_tree_node_t); - /* The root of the loop tree corresponding to the all function. */ ira_loop_tree_node_t ira_loop_tree_root; @@ -463,6 +460,7 @@ ira_create_object (ira_allocno_t a, int start, int nregs) OBJECT_LIVE_RANGES (obj) = NULL; OBJECT_START (obj) = start; OBJECT_NREGS (obj) = nregs; + OBJECT_INDEX (obj) = ALLOCNO_NUM_OBJECTS (a); ira_object_id_map_vec.safe_push (obj); ira_object_id_map @@ -519,6 +517,16 @@ find_object (ira_allocno_t a, poly_int64 offset, poly_int64 size) return find_object (a, subreg_start, subreg_nregs); } +/* Return object in allocno A for REG. */ +ira_object_t +find_object (ira_allocno_t a, rtx reg) +{ + if (has_subreg_object_p (a) && read_modify_subreg_p (reg)) + return find_object (a, SUBREG_BYTE (reg), GET_MODE_SIZE (GET_MODE (reg))); + else + return find_object (a, 0, ALLOCNO_NREGS (a)); +} + /* Return the object in allocno A which match START & NREGS. Create when not found.
*/ ira_object_t @@ -1502,27 +1510,36 @@ initiate_copies (void) /* Return copy connecting A1 and A2 and originated from INSN of LOOP_TREE_NODE if any. */ static ira_copy_t -find_allocno_copy (ira_allocno_t a1, ira_allocno_t a2, rtx_insn *insn, +find_allocno_copy (ira_object_t obj1, ira_object_t obj2, rtx_insn *insn, ira_loop_tree_node_t loop_tree_node) { ira_copy_t cp, next_cp; - ira_allocno_t another_a; + ira_object_t another_obj; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); for (cp = ALLOCNO_COPIES (a1); cp != NULL; cp = next_cp) { - if (cp->first == a1) + ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second); + if (first_a == a1) { next_cp = cp->next_first_allocno_copy; - another_a = cp->second; + if (cp->first == obj1) + another_obj = cp->second; + else + continue; } - else if (cp->second == a1) + else if (second_a == a1) { next_cp = cp->next_second_allocno_copy; - another_a = cp->first; + if (cp->second == obj1) + another_obj = cp->first; + else + continue; } else gcc_unreachable (); - if (another_a == a2 && cp->insn == insn + if (another_obj == obj2 && cp->insn == insn && cp->loop_tree_node == loop_tree_node) return cp; } @@ -1532,7 +1549,7 @@ find_allocno_copy (ira_allocno_t a1, ira_allocno_t a2, rtx_insn *insn, /* Create and return copy with given attributes LOOP_TREE_NODE, FIRST, SECOND, FREQ, CONSTRAINT_P, and INSN. 
*/ ira_copy_t -ira_create_copy (ira_allocno_t first, ira_allocno_t second, int freq, +ira_create_copy (ira_object_t first, ira_object_t second, int freq, bool constraint_p, rtx_insn *insn, ira_loop_tree_node_t loop_tree_node) { @@ -1556,28 +1573,29 @@ ira_create_copy (ira_allocno_t first, ira_allocno_t second, int freq, static void add_allocno_copy_to_list (ira_copy_t cp) { - ira_allocno_t first = cp->first, second = cp->second; + ira_object_t first = cp->first, second = cp->second; + ira_allocno_t a1 = OBJECT_ALLOCNO (first), a2 = OBJECT_ALLOCNO (second); cp->prev_first_allocno_copy = NULL; cp->prev_second_allocno_copy = NULL; - cp->next_first_allocno_copy = ALLOCNO_COPIES (first); + cp->next_first_allocno_copy = ALLOCNO_COPIES (a1); if (cp->next_first_allocno_copy != NULL) { - if (cp->next_first_allocno_copy->first == first) + if (OBJECT_ALLOCNO (cp->next_first_allocno_copy->first) == a1) cp->next_first_allocno_copy->prev_first_allocno_copy = cp; else cp->next_first_allocno_copy->prev_second_allocno_copy = cp; } - cp->next_second_allocno_copy = ALLOCNO_COPIES (second); + cp->next_second_allocno_copy = ALLOCNO_COPIES (a2); if (cp->next_second_allocno_copy != NULL) { - if (cp->next_second_allocno_copy->second == second) + if (OBJECT_ALLOCNO (cp->next_second_allocno_copy->second) == a2) cp->next_second_allocno_copy->prev_second_allocno_copy = cp; else cp->next_second_allocno_copy->prev_first_allocno_copy = cp; } - ALLOCNO_COPIES (first) = cp; - ALLOCNO_COPIES (second) = cp; + ALLOCNO_COPIES (a1) = cp; + ALLOCNO_COPIES (a2) = cp; } /* Make a copy CP a canonical copy where number of the @@ -1585,7 +1603,8 @@ add_allocno_copy_to_list (ira_copy_t cp) static void swap_allocno_copy_ends_if_necessary (ira_copy_t cp) { - if (ALLOCNO_NUM (cp->first) <= ALLOCNO_NUM (cp->second)) + if (ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first)) + <= ALLOCNO_NUM (OBJECT_ALLOCNO (cp->second))) return; std::swap (cp->first, cp->second); @@ -1594,11 +1613,10 @@ swap_allocno_copy_ends_if_necessary 
(ira_copy_t cp) } /* Create (or update frequency if the copy already exists) and return - the copy of allocnos FIRST and SECOND with frequency FREQ - corresponding to move insn INSN (if any) and originated from - LOOP_TREE_NODE. */ + the copy of objects FIRST and SECOND with frequency FREQ corresponding to + move insn INSN (if any) and originated from LOOP_TREE_NODE. */ ira_copy_t -ira_add_allocno_copy (ira_allocno_t first, ira_allocno_t second, int freq, +ira_add_allocno_copy (ira_object_t first, ira_object_t second, int freq, bool constraint_p, rtx_insn *insn, ira_loop_tree_node_t loop_tree_node) { @@ -1617,15 +1635,33 @@ ira_add_allocno_copy (ira_allocno_t first, ira_allocno_t second, int freq, return cp; } +/* Create (or update frequency if the copy already exists) and return + the copy of allocnos FIRST and SECOND with frequency FREQ + corresponding to move insn INSN (if any) and originated from + LOOP_TREE_NODE. */ +ira_copy_t +ira_add_allocno_copy (ira_allocno_t first, ira_allocno_t second, int freq, + bool constraint_p, rtx_insn *insn, + ira_loop_tree_node_t loop_tree_node) +{ + ira_object_t obj1 = get_full_object (first); + ira_object_t obj2 = get_full_object (second); + gcc_assert (obj1 != NULL && obj2 != NULL); + return ira_add_allocno_copy (obj1, obj2, freq, constraint_p, insn, + loop_tree_node); +} + /* Print info about copy CP into file F. */ static void print_copy (FILE *f, ira_copy_t cp) { - fprintf (f, " cp%d:a%d(r%d)<->a%d(r%d)@%d:%s\n", cp->num, - ALLOCNO_NUM (cp->first), ALLOCNO_REGNO (cp->first), - ALLOCNO_NUM (cp->second), ALLOCNO_REGNO (cp->second), cp->freq, - cp->insn != NULL - ? "move" : cp->constraint_p ? "constraint" : "shuffle"); + ira_allocno_t a1 = OBJECT_ALLOCNO (cp->first); + ira_allocno_t a2 = OBJECT_ALLOCNO (cp->second); + fprintf (f, " cp%d:a%d(r%d)<->a%d(r%d)@%d:%s\n", cp->num, ALLOCNO_NUM (a1), + ALLOCNO_REGNO (a1), ALLOCNO_NUM (a2), ALLOCNO_REGNO (a2), cp->freq, + cp->insn != NULL ? "move" + : cp->constraint_p ? 
"constraint" + : "shuffle"); } DEBUG_FUNCTION void @@ -1672,24 +1708,25 @@ ira_debug_copies (void) static void print_allocno_copies (FILE *f, ira_allocno_t a) { - ira_allocno_t another_a; + ira_object_t another_obj; ira_copy_t cp, next_cp; fprintf (f, " a%d(r%d):", ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { - if (cp->first == a) + if (OBJECT_ALLOCNO (cp->first) == a) { next_cp = cp->next_first_allocno_copy; - another_a = cp->second; + another_obj = cp->second; } - else if (cp->second == a) + else if (OBJECT_ALLOCNO (cp->second) == a) { next_cp = cp->next_second_allocno_copy; - another_a = cp->first; + another_obj = cp->first; } else gcc_unreachable (); + ira_allocno_t another_a = OBJECT_ALLOCNO (another_obj); fprintf (f, " cp%d:a%d(r%d)@%d", cp->num, ALLOCNO_NUM (another_a), ALLOCNO_REGNO (another_a), cp->freq); } @@ -3479,25 +3516,21 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) copies. */ FOR_EACH_COPY (cp, ci) { - if (ALLOCNO_CAP_MEMBER (cp->first) != NULL - || ALLOCNO_CAP_MEMBER (cp->second) != NULL) + ira_allocno_t a1 = OBJECT_ALLOCNO (cp->first); + ira_allocno_t a2 = OBJECT_ALLOCNO (cp->second); + if (ALLOCNO_CAP_MEMBER (a1) != NULL || ALLOCNO_CAP_MEMBER (a2) != NULL) { if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) - fprintf - (ira_dump_file, " Remove cp%d:%c%dr%d-%c%dr%d\n", - cp->num, ALLOCNO_CAP_MEMBER (cp->first) != NULL ? 'c' : 'a', - ALLOCNO_NUM (cp->first), - REGNO (allocno_emit_reg (cp->first)), - ALLOCNO_CAP_MEMBER (cp->second) != NULL ? 'c' : 'a', - ALLOCNO_NUM (cp->second), - REGNO (allocno_emit_reg (cp->second))); + fprintf (ira_dump_file, " Remove cp%d:%c%dr%d-%c%dr%d\n", + cp->num, ALLOCNO_CAP_MEMBER (a1) != NULL ? 'c' : 'a', + ALLOCNO_NUM (a1), REGNO (allocno_emit_reg (a1)), + ALLOCNO_CAP_MEMBER (a2) != NULL ? 
'c' : 'a', + ALLOCNO_NUM (a2), REGNO (allocno_emit_reg (a2))); cp->loop_tree_node = NULL; continue; } - first - = regno_top_level_allocno_map[REGNO (allocno_emit_reg (cp->first))]; - second - = regno_top_level_allocno_map[REGNO (allocno_emit_reg (cp->second))]; + first = regno_top_level_allocno_map[REGNO (allocno_emit_reg (a1))]; + second = regno_top_level_allocno_map[REGNO (allocno_emit_reg (a2))]; node = cp->loop_tree_node; if (node == NULL) keep_p = true; /* It copy generated in ira-emit.cc. */ @@ -3505,8 +3538,8 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) { /* Check that the copy was not propagated from level on which we will have different pseudos. */ - node_first = node->regno_allocno_map[ALLOCNO_REGNO (cp->first)]; - node_second = node->regno_allocno_map[ALLOCNO_REGNO (cp->second)]; + node_first = node->regno_allocno_map[ALLOCNO_REGNO (a1)]; + node_second = node->regno_allocno_map[ALLOCNO_REGNO (a2)]; keep_p = ((REGNO (allocno_emit_reg (first)) == REGNO (allocno_emit_reg (node_first))) && (REGNO (allocno_emit_reg (second)) @@ -3515,18 +3548,18 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) if (keep_p) { cp->loop_tree_node = ira_loop_tree_root; - cp->first = first; - cp->second = second; + cp->first = find_object_anyway (first, OBJECT_START (cp->first), + OBJECT_NREGS (cp->first)); + cp->second = find_object_anyway (second, OBJECT_START (cp->second), + OBJECT_NREGS (cp->second)); } else { cp->loop_tree_node = NULL; if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL) fprintf (ira_dump_file, " Remove cp%d:a%dr%d-a%dr%d\n", - cp->num, ALLOCNO_NUM (cp->first), - REGNO (allocno_emit_reg (cp->first)), - ALLOCNO_NUM (cp->second), - REGNO (allocno_emit_reg (cp->second))); + cp->num, ALLOCNO_NUM (a1), REGNO (allocno_emit_reg (a1)), + ALLOCNO_NUM (a2), REGNO (allocno_emit_reg (a2))); } } /* Remove unnecessary allocnos on lower levels of the loop tree. 
*/ @@ -3562,9 +3595,10 @@ ira_flattening (int max_regno_before_emit, int ira_max_point_before_emit) finish_copy (cp); continue; } - ira_assert - (ALLOCNO_LOOP_TREE_NODE (cp->first) == ira_loop_tree_root - && ALLOCNO_LOOP_TREE_NODE (cp->second) == ira_loop_tree_root); + ira_assert (ALLOCNO_LOOP_TREE_NODE (OBJECT_ALLOCNO (cp->first)) + == ira_loop_tree_root + && ALLOCNO_LOOP_TREE_NODE (OBJECT_ALLOCNO (cp->second)) + == ira_loop_tree_root); add_allocno_copy_to_list (cp); swap_allocno_copy_ends_if_necessary (cp); } diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc index 8aed25144b9..099312bcdb3 100644 --- a/gcc/ira-color.cc +++ b/gcc/ira-color.cc @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3. If not see #include "config.h" #define INCLUDE_MAP +#define INCLUDE_VECTOR #include "system.h" #include "coretypes.h" #include "backend.h" @@ -150,11 +151,18 @@ struct allocno_color_data struct update_cost_record *update_cost_records; /* Threads. We collect allocnos connected by copies into threads and try to assign hard regs to allocnos by threads. */ - /* Allocno representing all thread. */ - ira_allocno_t first_thread_allocno; + /* The head objects for all thread. */ + ira_object_t *first_thread_objects; /* Allocnos in thread forms a cycle list through the following member. */ - ira_allocno_t next_thread_allocno; + ira_object_t *next_thread_objects; + /* The allocno all thread shared. */ + ira_allocno_t first_thread_allocno; + /* The offset start relative to the first_thread_allocno. */ + int first_thread_offset; + /* All allocnos belong to the thread. */ + bitmap thread_allocnos; + /* The freq sum of all thread allocno. */ /* All thread frequency. Defined only for first thread allocno. */ int thread_freq; /* Sum of frequencies of hard register preferences of the allocno. */ @@ -188,6 +196,9 @@ static bitmap coloring_allocno_bitmap; allocnos. */ static bitmap consideration_allocno_bitmap; +/* Bitmap of allocnos which is not trivially colorable. 
*/ +static bitmap uncolorable_allocno_set; + /* All allocnos sorted according their priorities. */ static ira_allocno_t *sorted_allocnos; @@ -647,9 +658,13 @@ struct allocno_hard_regs_subnode Overall conflict size is left_conflict_subnodes_size + MIN (max_node_impact - left_conflict_subnodes_size, - left_conflict_size) + left_conflict_size) + Use MIN here to ensure that the total conflict does not exceed + max_node_impact. */ + /* The total conflict size of subnodes. */ short left_conflict_subnodes_size; + /* The maximum number of registers that the current node can use. */ short max_node_impact; }; @@ -758,6 +773,8 @@ form_allocno_hard_regs_nodes_forest (void) collect_allocno_hard_regs_cover (hard_regs_roots, allocno_data->profitable_hard_regs); allocno_hard_regs_node = NULL; + /* Find the ancestor node in forest which cover all nodes. The ancestor is + a smallest superset of profitable_hard_regs. */ for (j = 0; hard_regs_node_vec.iterate (j, &node); j++) allocno_hard_regs_node = (j == 0 @@ -990,6 +1007,8 @@ update_left_conflict_sizes_p (ira_allocno_t a, removed_node->hard_regs->set)); start = node_preorder_num * allocno_hard_regs_nodes_num; i = allocno_hard_regs_subnode_index[start + removed_node->preorder_num]; + /* i < 0 means removed_node is parent of node instead of node is the parent of + removed_node. */ if (i < 0) i = 0; subnodes = allocno_hard_regs_subnodes + data->hard_regs_subnodes_start; @@ -999,6 +1018,7 @@ update_left_conflict_sizes_p (ira_allocno_t a, - subnodes[i].left_conflict_subnodes_size, subnodes[i].left_conflict_size)); subnodes[i].left_conflict_size -= size; + /* Update all ancestors for subnode i. */ for (;;) { conflict_size @@ -1242,6 +1262,9 @@ struct update_cost_queue_elem connecting this allocno to the one being allocated. */ int divisor; + /* Hard register regno assigned to current ALLOCNO. */ + int hard_regno; + /* Allocno from which we started chaining costs of connected allocnos. 
*/ ira_allocno_t start; @@ -1308,7 +1331,7 @@ start_update_cost (void) /* Add (ALLOCNO, START, FROM, DIVISOR) to the end of update_cost_queue, unless ALLOCNO is already in the queue, or has NO_REGS class. */ static inline void -queue_update_cost (ira_allocno_t allocno, ira_allocno_t start, +queue_update_cost (ira_allocno_t allocno, int hard_regno, ira_allocno_t start, ira_allocno_t from, int divisor) { struct update_cost_queue_elem *elem; @@ -1317,6 +1340,7 @@ queue_update_cost (ira_allocno_t allocno, ira_allocno_t start, if (elem->check != update_cost_check && ALLOCNO_CLASS (allocno) != NO_REGS) { + elem->hard_regno = hard_regno; elem->check = update_cost_check; elem->start = start; elem->from = from; @@ -1334,8 +1358,8 @@ queue_update_cost (ira_allocno_t allocno, ira_allocno_t start, false if the queue was empty, otherwise make (*ALLOCNO, *START, *FROM, *DIVISOR) describe the removed element. */ static inline bool -get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *start, - ira_allocno_t *from, int *divisor) +get_next_update_cost (ira_allocno_t *allocno, int *hard_regno, + ira_allocno_t *start, ira_allocno_t *from, int *divisor) { struct update_cost_queue_elem *elem; @@ -1348,6 +1372,8 @@ get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *start, *from = elem->from; *divisor = elem->divisor; update_cost_queue = elem->next; + if (hard_regno != NULL) + *hard_regno = elem->hard_regno; return true; } @@ -1449,31 +1475,41 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, enum reg_class rclass, aclass; ira_allocno_t another_allocno, start = allocno, from = NULL; ira_copy_t cp, next_cp; + ira_object_t another_obj; + unsigned int obj_index1, obj_index2; rclass = REGNO_REG_CLASS (hard_regno); do { + gcc_assert (hard_regno >= 0); mode = ALLOCNO_MODE (allocno); ira_init_register_move_cost_if_necessary (mode); for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp) { - if (cp->first == allocno) + if (OBJECT_ALLOCNO (cp->first) == 
allocno) { + obj_index1 = OBJECT_INDEX (cp->first); + obj_index2 = OBJECT_INDEX (cp->second); next_cp = cp->next_first_allocno_copy; - another_allocno = cp->second; + another_obj = cp->second; } - else if (cp->second == allocno) + else if (OBJECT_ALLOCNO (cp->second) == allocno) { + obj_index1 = OBJECT_INDEX (cp->second); + obj_index2 = OBJECT_INDEX (cp->first); next_cp = cp->next_second_allocno_copy; - another_allocno = cp->first; + another_obj = cp->first; } else gcc_unreachable (); + another_allocno = OBJECT_ALLOCNO (another_obj); if (another_allocno == from || (ALLOCNO_COLOR_DATA (another_allocno) != NULL - && (ALLOCNO_COLOR_DATA (allocno)->first_thread_allocno - != ALLOCNO_COLOR_DATA (another_allocno)->first_thread_allocno))) + && (ALLOCNO_COLOR_DATA (allocno) + ->first_thread_objects[obj_index1] + != ALLOCNO_COLOR_DATA (another_allocno) + ->first_thread_objects[obj_index2]))) continue; aclass = ALLOCNO_CLASS (another_allocno); @@ -1482,6 +1518,8 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, || ALLOCNO_ASSIGNED_P (another_allocno)) continue; + ira_allocno_t first_allocno = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_allocno = OBJECT_ALLOCNO (cp->second); /* If we have different modes use the smallest one. It is a sub-register move. It is hard to predict what LRA will reload (the pseudo or its sub-register) but LRA @@ -1489,14 +1527,21 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, register classes bigger modes might be invalid, e.g. DImode for AREG on x86. For such cases the register move cost will be maximal. */ - mode = narrower_subreg_mode (ALLOCNO_MODE (cp->first), - ALLOCNO_MODE (cp->second)); + mode = narrower_subreg_mode (ALLOCNO_MODE (first_allocno), + ALLOCNO_MODE (second_allocno)); ira_init_register_move_cost_if_necessary (mode); - cost = (cp->second == allocno - ? ira_register_move_cost[mode][rclass][aclass] - : ira_register_move_cost[mode][aclass][rclass]); + cost = (second_allocno == allocno + ? 
ira_register_move_cost[mode][rclass][aclass] + : ira_register_move_cost[mode][aclass][rclass]); + /* Adjust the hard regno for another_allocno for subreg copy. */ + int start_regno = hard_regno; + if (cp->insn && subreg_move_p (cp->first, cp->second)) + { + int diff = OBJECT_START (cp->first) - OBJECT_START (cp->second); + start_regno += (first_allocno == allocno ? diff : -diff); + } if (decr_p) cost = -cost; @@ -1505,25 +1550,30 @@ update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, if (internal_flag_ira_verbose > 5 && ira_dump_file != NULL) fprintf (ira_dump_file, - " a%dr%d (hr%d): update cost by %d, conflict cost by %d\n", - ALLOCNO_NUM (another_allocno), ALLOCNO_REGNO (another_allocno), - hard_regno, update_cost, update_conflict_cost); + " a%dr%d (hr%d): update cost by %d, conflict " + "cost by %d\n", + ALLOCNO_NUM (another_allocno), + ALLOCNO_REGNO (another_allocno), start_regno, update_cost, + update_conflict_cost); if (update_cost == 0) continue; - if (! update_allocno_cost (another_allocno, hard_regno, - update_cost, update_conflict_cost)) + if (start_regno < 0 + || (start_regno + ALLOCNO_NREGS (another_allocno)) + > FIRST_PSEUDO_REGISTER + || !update_allocno_cost (another_allocno, start_regno, + update_cost, update_conflict_cost)) continue; - queue_update_cost (another_allocno, start, allocno, + queue_update_cost (another_allocno, start_regno, start, allocno, divisor * COST_HOP_DIVISOR); if (record_p && ALLOCNO_COLOR_DATA (another_allocno) != NULL) ALLOCNO_COLOR_DATA (another_allocno)->update_cost_records - = get_update_cost_record (hard_regno, divisor, - ALLOCNO_COLOR_DATA (another_allocno) - ->update_cost_records); + = get_update_cost_record ( + start_regno, divisor, + ALLOCNO_COLOR_DATA (another_allocno)->update_cost_records); } - } - while (get_next_update_cost (&allocno, &start, &from, &divisor)); + } while ( + get_next_update_cost (&allocno, &hard_regno, &start, &from, &divisor)); } /* Decrease preferred ALLOCNO hard register costs and 
costs of @@ -1632,23 +1682,25 @@ update_conflict_hard_regno_costs (int *costs, enum reg_class aclass, enum reg_class another_aclass; ira_allocno_t allocno, another_allocno, start, from; ira_copy_t cp, next_cp; + ira_object_t another_obj; - while (get_next_update_cost (&allocno, &start, &from, &divisor)) + while (get_next_update_cost (&allocno, NULL, &start, &from, &divisor)) for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp) { - if (cp->first == allocno) + if (OBJECT_ALLOCNO (cp->first) == allocno) { next_cp = cp->next_first_allocno_copy; - another_allocno = cp->second; + another_obj = cp->second; } - else if (cp->second == allocno) + else if (OBJECT_ALLOCNO (cp->second) == allocno) { next_cp = cp->next_second_allocno_copy; - another_allocno = cp->first; + another_obj = cp->first; } else gcc_unreachable (); + another_allocno = OBJECT_ALLOCNO (another_obj); another_aclass = ALLOCNO_CLASS (another_allocno); if (another_allocno == from || ALLOCNO_ASSIGNED_P (another_allocno) @@ -1696,7 +1748,8 @@ update_conflict_hard_regno_costs (int *costs, enum reg_class aclass, * COST_HOP_DIVISOR * COST_HOP_DIVISOR * COST_HOP_DIVISOR)) - queue_update_cost (another_allocno, start, from, divisor * COST_HOP_DIVISOR); + queue_update_cost (another_allocno, -1, start, from, + divisor * COST_HOP_DIVISOR); } } @@ -2034,6 +2087,11 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci) { ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj); + + if (ALLOCNO_COLOR_DATA (a)->first_thread_allocno + == ALLOCNO_COLOR_DATA (conflict_a)->first_thread_allocno) + continue; + enum reg_class conflict_aclass; allocno_color_data_t data = ALLOCNO_COLOR_DATA (conflict_a); @@ -2225,7 +2283,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) continue; full_costs[j] -= conflict_costs[k]; } - queue_update_cost (conflict_a, conflict_a, NULL, COST_HOP_DIVISOR); + queue_update_cost (conflict_a, -1, conflict_a, NULL, + COST_HOP_DIVISOR); } } } @@ -2239,7 
+2298,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) if (! retry_p) { start_update_cost (); - queue_update_cost (a, a, NULL, COST_HOP_DIVISOR); + queue_update_cost (a, -1, a, NULL, COST_HOP_DIVISOR); update_conflict_hard_regno_costs (full_costs, aclass, false); } min_cost = min_full_cost = INT_MAX; @@ -2264,17 +2323,17 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) if (!HONOR_REG_ALLOC_ORDER) { if ((saved_nregs = calculate_saved_nregs (hard_regno, mode)) != 0) - /* We need to save/restore the hard register in - epilogue/prologue. Therefore we increase the cost. */ - { - rclass = REGNO_REG_CLASS (hard_regno); - add_cost = ((ira_memory_move_cost[mode][rclass][0] - + ira_memory_move_cost[mode][rclass][1]) + /* We need to save/restore the hard register in + epilogue/prologue. Therefore we increase the cost. */ + { + rclass = REGNO_REG_CLASS (hard_regno); + add_cost = ((ira_memory_move_cost[mode][rclass][0] + + ira_memory_move_cost[mode][rclass][1]) * saved_nregs / hard_regno_nregs (hard_regno, mode) - 1); - cost += add_cost; - full_cost += add_cost; - } + cost += add_cost; + full_cost += add_cost; + } } if (min_cost > cost) min_cost = cost; @@ -2393,54 +2452,173 @@ copy_freq_compare_func (const void *v1p, const void *v2p) return cp1->num - cp2->num; } - +/* Return true if object OBJ1 conflict with OBJ2. */ +static bool +objects_conflict_by_live_ranges_p (ira_object_t obj1, ira_object_t obj2) +{ + rtx reg1, reg2; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + if (a1 == a2) + return false; + reg1 = regno_reg_rtx[ALLOCNO_REGNO (a1)]; + reg2 = regno_reg_rtx[ALLOCNO_REGNO (a2)]; + if (reg1 != NULL && reg2 != NULL + && ORIGINAL_REGNO (reg1) == ORIGINAL_REGNO (reg2)) + return false; + + /* We don't keep live ranges for caps because they can be quite big. + Use ranges of non-cap allocno from which caps are created. 
*/ + a1 = get_cap_member (a1); + a2 = get_cap_member (a2); + + obj1 = find_object (a1, OBJECT_START (obj1), OBJECT_NREGS (obj1)); + obj2 = find_object (a2, OBJECT_START (obj2), OBJECT_NREGS (obj2)); + return ira_live_ranges_intersect_p (OBJECT_LIVE_RANGES (obj1), + OBJECT_LIVE_RANGES (obj2)); +} -/* Return true if any allocno from thread of A1 conflicts with any - allocno from thread A2. */ +/* Return true if any object from thread of OBJ1 conflicts with any + object from thread OBJ2. */ static bool -allocno_thread_conflict_p (ira_allocno_t a1, ira_allocno_t a2) +object_thread_conflict_p (ira_object_t obj1, ira_object_t obj2) { - ira_allocno_t a, conflict_a; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + + gcc_assert ( + obj1 != obj2 + && ALLOCNO_COLOR_DATA (a1)->first_thread_objects[OBJECT_INDEX (obj1)] + == obj1 + && ALLOCNO_COLOR_DATA (a2)->first_thread_objects[OBJECT_INDEX (obj2)] + == obj2); + + ira_allocno_t first_thread_allocno1 + = ALLOCNO_COLOR_DATA (a1)->first_thread_allocno; + ira_allocno_t first_thread_allocno2 + = ALLOCNO_COLOR_DATA (a2)->first_thread_allocno; + + int offset + = (ALLOCNO_COLOR_DATA (a1)->first_thread_offset + OBJECT_START (obj1)) + - (ALLOCNO_COLOR_DATA (a2)->first_thread_offset + OBJECT_START (obj2)); + + /* Update first_thread_allocno and thread_allocnos info. 
*/ + bitmap thread_allocnos1 + = ALLOCNO_COLOR_DATA (first_thread_allocno1)->thread_allocnos; + bitmap thread_allocnos2 + = ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_allocnos; + gcc_assert (!bitmap_empty_p (thread_allocnos1) + && !bitmap_empty_p (thread_allocnos2)); + std::vector<ira_object_t> thread_objects_2; - for (a = ALLOCNO_COLOR_DATA (a2)->next_thread_allocno;; - a = ALLOCNO_COLOR_DATA (a)->next_thread_allocno) + unsigned int i; + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP (thread_allocnos2, 0, i, bi) { - for (conflict_a = ALLOCNO_COLOR_DATA (a1)->next_thread_allocno;; - conflict_a = ALLOCNO_COLOR_DATA (conflict_a)->next_thread_allocno) - { - if (allocnos_conflict_by_live_ranges_p (a, conflict_a)) - return true; - if (conflict_a == a1) - break; - } - if (a == a2) - break; + ira_allocno_object_iterator oi; + ira_object_t obj; + FOR_EACH_ALLOCNO_OBJECT (ira_allocnos[i], obj, oi) + thread_objects_2.push_back (obj); + } + + EXECUTE_IF_SET_IN_BITMAP (thread_allocnos1, 0, i, bi) + { + ira_allocno_object_iterator oi; + ira_object_t obj; + ira_allocno_t a = ira_allocnos[i]; + FOR_EACH_ALLOCNO_OBJECT (ira_allocnos[i], obj, oi) + for (ira_object_t other_obj : thread_objects_2) + { + int thread_start1 = ALLOCNO_COLOR_DATA (a)->first_thread_offset + + OBJECT_START (obj); + int thread_start2 = ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (other_obj)) + ->first_thread_offset + + offset + OBJECT_START (other_obj); + if (!(thread_start1 + OBJECT_NREGS (obj) <= thread_start2 + || thread_start2 + OBJECT_NREGS (other_obj) <= thread_start1) + && objects_conflict_by_live_ranges_p (obj, other_obj)) + return true; + } } + return false; } -/* Merge two threads given correspondingly by their first allocnos T1 - and T2 (more accurately merging T2 into T1). */ +/* Merge two threads given correspondingly by their first objects OBJ1 + and OBJ2 (more accurately merging OBJ2 into OBJ1).
*/ static void -merge_threads (ira_allocno_t t1, ira_allocno_t t2) +merge_threads (ira_object_t obj1, ira_object_t obj2) { - ira_allocno_t a, next, last; + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + + gcc_assert ( + obj1 != obj2 + && ALLOCNO_COLOR_DATA (a1)->first_thread_objects[OBJECT_INDEX (obj1)] + == obj1 + && ALLOCNO_COLOR_DATA (a2)->first_thread_objects[OBJECT_INDEX (obj2)] + == obj2); + + ira_allocno_t first_thread_allocno1 + = ALLOCNO_COLOR_DATA (a1)->first_thread_allocno; + ira_allocno_t first_thread_allocno2 + = ALLOCNO_COLOR_DATA (a2)->first_thread_allocno; + + gcc_assert (first_thread_allocno1 != first_thread_allocno2); - gcc_assert (t1 != t2 - && ALLOCNO_COLOR_DATA (t1)->first_thread_allocno == t1 - && ALLOCNO_COLOR_DATA (t2)->first_thread_allocno == t2); - for (last = t2, a = ALLOCNO_COLOR_DATA (t2)->next_thread_allocno;; - a = ALLOCNO_COLOR_DATA (a)->next_thread_allocno) + int offset + = (ALLOCNO_COLOR_DATA (a1)->first_thread_offset + OBJECT_START (obj1)) + - (ALLOCNO_COLOR_DATA (a2)->first_thread_offset + OBJECT_START (obj2)); + + /* Update first_thread_allocno and thread_allocnos info. */ + unsigned int i; + bitmap_iterator bi; + bitmap thread_allocnos2 + = ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_allocnos; + bitmap thread_allocnos1 + = ALLOCNO_COLOR_DATA (first_thread_allocno1)->thread_allocnos; + gcc_assert (!bitmap_empty_p (thread_allocnos1) + && !bitmap_empty_p (thread_allocnos2)); + EXECUTE_IF_SET_IN_BITMAP (thread_allocnos2, 0, i, bi) + { + ira_allocno_t a = ira_allocnos[i]; + gcc_assert (ALLOCNO_COLOR_DATA (a)->first_thread_allocno + == first_thread_allocno2); + /* Update first_thread_allocno and first_thread_offset fields.
*/ + ALLOCNO_COLOR_DATA (a)->first_thread_allocno = first_thread_allocno1; + ALLOCNO_COLOR_DATA (a)->first_thread_offset += offset; + bitmap_set_bit (thread_allocnos1, i); + } + bitmap_clear (thread_allocnos2); + ira_free_bitmap (thread_allocnos2); + ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_allocnos = NULL; + + ira_object_t last_obj = obj2; + for (ira_object_t next_obj + = ALLOCNO_COLOR_DATA (a2)->next_thread_objects[OBJECT_INDEX (obj2)]; + ; next_obj = ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (next_obj)) + ->next_thread_objects[OBJECT_INDEX (next_obj)]) { - ALLOCNO_COLOR_DATA (a)->first_thread_allocno = t1; - if (a == t2) + ira_allocno_t next_a = OBJECT_ALLOCNO (next_obj); + ALLOCNO_COLOR_DATA (next_a)->first_thread_objects[OBJECT_INDEX (next_obj)] + = obj1; + gcc_assert (ALLOCNO_COLOR_DATA (next_a)->first_thread_allocno + == first_thread_allocno1); + gcc_assert (bitmap_bit_p (thread_allocnos1, ALLOCNO_NUM (next_a))); + if (next_obj == obj2) break; - last = a; + last_obj = next_obj; } - next = ALLOCNO_COLOR_DATA (t1)->next_thread_allocno; - ALLOCNO_COLOR_DATA (t1)->next_thread_allocno = t2; - ALLOCNO_COLOR_DATA (last)->next_thread_allocno = next; - ALLOCNO_COLOR_DATA (t1)->thread_freq += ALLOCNO_COLOR_DATA (t2)->thread_freq; + /* Add OBJ2's threads chain to OBJ1. */ + ira_object_t temp_obj + = ALLOCNO_COLOR_DATA (a1)->next_thread_objects[OBJECT_INDEX (obj1)]; + ALLOCNO_COLOR_DATA (a1)->next_thread_objects[OBJECT_INDEX (obj1)] = obj2; + ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (last_obj)) + ->next_thread_objects[OBJECT_INDEX (last_obj)] + = temp_obj; + + ALLOCNO_COLOR_DATA (first_thread_allocno1)->thread_freq + += ALLOCNO_COLOR_DATA (first_thread_allocno2)->thread_freq; } /* Create threads by processing CP_NUM copies from sorted copies. 
We @@ -2448,7 +2626,6 @@ merge_threads (ira_allocno_t t1, ira_allocno_t t2) static void form_threads_from_copies (int cp_num) { - ira_allocno_t a, thread1, thread2; ira_copy_t cp; qsort (sorted_copies, cp_num, sizeof (ira_copy_t), copy_freq_compare_func); @@ -2457,33 +2634,43 @@ form_threads_from_copies (int cp_num) for (int i = 0; i < cp_num; i++) { cp = sorted_copies[i]; - thread1 = ALLOCNO_COLOR_DATA (cp->first)->first_thread_allocno; - thread2 = ALLOCNO_COLOR_DATA (cp->second)->first_thread_allocno; - if (thread1 == thread2) + ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second); + ira_object_t thread1 = ALLOCNO_COLOR_DATA (first_a) + ->first_thread_objects[OBJECT_INDEX (cp->first)]; + ira_object_t thread2 + = ALLOCNO_COLOR_DATA (second_a) + ->first_thread_objects[OBJECT_INDEX (cp->second)]; + if (thread1 == thread2 + || ALLOCNO_COLOR_DATA (first_a)->first_thread_allocno + == ALLOCNO_COLOR_DATA (second_a)->first_thread_allocno) continue; - if (! 
allocno_thread_conflict_p (thread1, thread2)) + if (!object_thread_conflict_p (thread1, thread2)) { if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) - fprintf - (ira_dump_file, - " Forming thread by copy %d:a%dr%d-a%dr%d (freq=%d):\n", - cp->num, ALLOCNO_NUM (cp->first), ALLOCNO_REGNO (cp->first), - ALLOCNO_NUM (cp->second), ALLOCNO_REGNO (cp->second), - cp->freq); + fprintf ( + ira_dump_file, + " Forming thread by copy %d:a%dr%d-a%dr%d (freq=%d):\n", + cp->num, ALLOCNO_NUM (first_a), ALLOCNO_REGNO (first_a), + ALLOCNO_NUM (second_a), ALLOCNO_REGNO (second_a), cp->freq); merge_threads (thread1, thread2); if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) { - thread1 = ALLOCNO_COLOR_DATA (thread1)->first_thread_allocno; - fprintf (ira_dump_file, " Result (freq=%d): a%dr%d(%d)", - ALLOCNO_COLOR_DATA (thread1)->thread_freq, - ALLOCNO_NUM (thread1), ALLOCNO_REGNO (thread1), - ALLOCNO_FREQ (thread1)); - for (a = ALLOCNO_COLOR_DATA (thread1)->next_thread_allocno; - a != thread1; - a = ALLOCNO_COLOR_DATA (a)->next_thread_allocno) - fprintf (ira_dump_file, " a%dr%d(%d)", - ALLOCNO_NUM (a), ALLOCNO_REGNO (a), - ALLOCNO_FREQ (a)); + ira_allocno_t a1 = OBJECT_ALLOCNO (thread1); + ira_allocno_t first_thread_allocno + = ALLOCNO_COLOR_DATA (a1)->first_thread_allocno; + fprintf (ira_dump_file, " Result (freq=%d):", + ALLOCNO_COLOR_DATA (first_thread_allocno)->thread_freq); + unsigned int i; + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP ( + ALLOCNO_COLOR_DATA (first_thread_allocno)->thread_allocnos, 0, + i, bi) + { + ira_allocno_t a = ira_allocnos[i]; + fprintf (ira_dump_file, " a%dr%d(%d)", ALLOCNO_NUM (a), + ALLOCNO_REGNO (a), ALLOCNO_FREQ (a)); + } fprintf (ira_dump_file, "\n"); } } @@ -2503,13 +2690,27 @@ form_threads_from_bucket (ira_allocno_t bucket) { for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { - if (cp->first == a) + bool intersect_p = hard_reg_set_intersect_p ( + ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (cp->first)) + ->profitable_hard_regs, 
+ ALLOCNO_COLOR_DATA (OBJECT_ALLOCNO (cp->second)) + ->profitable_hard_regs); + if (OBJECT_ALLOCNO (cp->first) == a) { next_cp = cp->next_first_allocno_copy; + if (!intersect_p) + continue; + sorted_copies[cp_num++] = cp; + } + else if (OBJECT_ALLOCNO (cp->second) == a) + { + next_cp = cp->next_second_allocno_copy; + if (!intersect_p + || !bitmap_bit_p (uncolorable_allocno_set, + ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first)))) + continue; sorted_copies[cp_num++] = cp; } - else if (cp->second == a) - next_cp = cp->next_second_allocno_copy; else gcc_unreachable (); } @@ -2531,15 +2732,15 @@ form_threads_from_colorable_allocno (ira_allocno_t a) ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { - if (cp->first == a) + if (OBJECT_ALLOCNO (cp->first) == a) { next_cp = cp->next_first_allocno_copy; - another_a = cp->second; + another_a = OBJECT_ALLOCNO (cp->second); } - else if (cp->second == a) + else if (OBJECT_ALLOCNO (cp->second) == a) { next_cp = cp->next_second_allocno_copy; - another_a = cp->first; + another_a = OBJECT_ALLOCNO (cp->first); } else gcc_unreachable (); @@ -2564,8 +2765,16 @@ init_allocno_threads (void) { a = ira_allocnos[j]; /* Set up initial thread data: */ - ALLOCNO_COLOR_DATA (a)->first_thread_allocno - = ALLOCNO_COLOR_DATA (a)->next_thread_allocno = a; + for (int i = 0; i < ALLOCNO_NUM_OBJECTS (a); i += 1) + { + ira_object_t obj = ALLOCNO_OBJECT (a, i); + ALLOCNO_COLOR_DATA (a)->first_thread_objects[i] + = ALLOCNO_COLOR_DATA (a)->next_thread_objects[i] = obj; + } + ALLOCNO_COLOR_DATA (a)->first_thread_allocno = a; + ALLOCNO_COLOR_DATA (a)->first_thread_offset = 0; + ALLOCNO_COLOR_DATA (a)->thread_allocnos = ira_allocate_bitmap (); + bitmap_set_bit (ALLOCNO_COLOR_DATA (a)->thread_allocnos, ALLOCNO_NUM (a)); ALLOCNO_COLOR_DATA (a)->thread_freq = ALLOCNO_FREQ (a); ALLOCNO_COLOR_DATA (a)->hard_reg_prefs = 0; for (pref = ALLOCNO_PREFS (a); pref != NULL; pref = pref->next_pref) @@ -2608,6 +2817,9 @@ 
add_allocno_to_bucket (ira_allocno_t a, ira_allocno_t *bucket_ptr) ira_allocno_t first_a; allocno_color_data_t data; + if (bucket_ptr == &uncolorable_allocno_bucket) + bitmap_set_bit (uncolorable_allocno_set, ALLOCNO_NUM (a)); + if (bucket_ptr == &uncolorable_allocno_bucket && ALLOCNO_CLASS (a) != NO_REGS) { @@ -2734,6 +2946,9 @@ delete_allocno_from_bucket (ira_allocno_t allocno, ira_allocno_t *bucket_ptr) { ira_allocno_t prev_allocno, next_allocno; + if (bucket_ptr == &uncolorable_allocno_bucket) + bitmap_clear_bit (uncolorable_allocno_set, ALLOCNO_NUM (allocno)); + if (bucket_ptr == &uncolorable_allocno_bucket && ALLOCNO_CLASS (allocno) != NO_REGS) { @@ -3227,16 +3442,23 @@ allocno_copy_cost_saving (ira_allocno_t allocno, int hard_regno) rclass = ALLOCNO_CLASS (allocno); for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp) { - if (cp->first == allocno) + if (OBJECT_ALLOCNO (cp->first) == allocno) { next_cp = cp->next_first_allocno_copy; - if (ALLOCNO_HARD_REGNO (cp->second) != hard_regno) + ira_allocno_t another_a = OBJECT_ALLOCNO (cp->second); + if (ALLOCNO_HARD_REGNO (another_a) > -1 + && hard_regno + OBJECT_START (cp->first) + != ALLOCNO_HARD_REGNO (another_a) + + OBJECT_START (cp->second)) continue; } - else if (cp->second == allocno) + else if (OBJECT_ALLOCNO (cp->second) == allocno) { next_cp = cp->next_second_allocno_copy; - if (ALLOCNO_HARD_REGNO (cp->first) != hard_regno) + ira_allocno_t another_a = OBJECT_ALLOCNO (cp->first); + if (ALLOCNO_HARD_REGNO (another_a) > -1 + && hard_regno + OBJECT_START (cp->second) + != ALLOCNO_HARD_REGNO (another_a) + OBJECT_START (cp->first)) continue; } else @@ -3643,6 +3865,7 @@ color_allocnos (void) /* Put the allocnos into the corresponding buckets. 
*/ colorable_allocno_bucket = NULL; uncolorable_allocno_bucket = NULL; + bitmap_clear (uncolorable_allocno_set); EXECUTE_IF_SET_IN_BITMAP (coloring_allocno_bitmap, 0, i, bi) { a = ira_allocnos[i]; @@ -3740,10 +3963,12 @@ color_pass (ira_loop_tree_node_t loop_tree_node) bitmap_copy (coloring_allocno_bitmap, loop_tree_node->all_allocnos); bitmap_copy (consideration_allocno_bitmap, coloring_allocno_bitmap); n = 0; + size_t obj_n = 0; EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi) { a = ira_allocnos[j]; n++; + obj_n += ALLOCNO_NUM_OBJECTS (a); if (! ALLOCNO_ASSIGNED_P (a)) continue; bitmap_clear_bit (coloring_allocno_bitmap, ALLOCNO_NUM (a)); @@ -3752,20 +3977,29 @@ color_pass (ira_loop_tree_node_t loop_tree_node) = (allocno_color_data_t) ira_allocate (sizeof (struct allocno_color_data) * n); memset (allocno_color_data, 0, sizeof (struct allocno_color_data) * n); + ira_object_t *thread_objects + = (ira_object_t *) ira_allocate (sizeof (ira_object_t *) * obj_n * 2); + memset (thread_objects, 0, sizeof (ira_object_t *) * obj_n * 2); curr_allocno_process = 0; n = 0; + size_t obj_offset = 0; EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi) { a = ira_allocnos[j]; ALLOCNO_ADD_DATA (a) = allocno_color_data + n; + ALLOCNO_COLOR_DATA (a)->first_thread_objects + = thread_objects + obj_offset; + obj_offset += ALLOCNO_NUM_OBJECTS (a); + ALLOCNO_COLOR_DATA (a)->next_thread_objects = thread_objects + obj_offset; + obj_offset += ALLOCNO_NUM_OBJECTS (a); n++; } + gcc_assert (obj_n * 2 == obj_offset); init_allocno_threads (); /* Color all mentioned allocnos including transparent ones. */ color_allocnos (); /* Process caps. They are processed just once. 
*/ - if (flag_ira_region == IRA_REGION_MIXED - || flag_ira_region == IRA_REGION_ALL) + if (flag_ira_region == IRA_REGION_MIXED || flag_ira_region == IRA_REGION_ALL) EXECUTE_IF_SET_IN_BITMAP (loop_tree_node->all_allocnos, 0, j, bi) { a = ira_allocnos[j]; @@ -3881,12 +4115,22 @@ color_pass (ira_loop_tree_node_t loop_tree_node) } } } - ira_free (allocno_color_data); EXECUTE_IF_SET_IN_BITMAP (consideration_allocno_bitmap, 0, j, bi) { a = ira_allocnos[j]; + gcc_assert (a != NULL); + ALLOCNO_COLOR_DATA (a)->first_thread_objects = NULL; + ALLOCNO_COLOR_DATA (a)->next_thread_objects = NULL; + if (ALLOCNO_COLOR_DATA (a)->thread_allocnos != NULL) + { + bitmap_clear (ALLOCNO_COLOR_DATA (a)->thread_allocnos); + ira_free_bitmap (ALLOCNO_COLOR_DATA (a)->thread_allocnos); + ALLOCNO_COLOR_DATA (a)->thread_allocnos = NULL; + } ALLOCNO_ADD_DATA (a) = NULL; } + ira_free (allocno_color_data); + ira_free (thread_objects); } /* Initialize the common data for coloring and calls functions to do @@ -4080,15 +4324,17 @@ update_curr_costs (ira_allocno_t a) ira_init_register_move_cost_if_necessary (mode); for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { - if (cp->first == a) + ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second); + if (first_a == a) { next_cp = cp->next_first_allocno_copy; - another_a = cp->second; + another_a = second_a; } - else if (cp->second == a) + else if (second_a == a) { next_cp = cp->next_second_allocno_copy; - another_a = cp->first; + another_a = first_a; } else gcc_unreachable (); @@ -4100,9 +4346,8 @@ update_curr_costs (ira_allocno_t a) i = ira_class_hard_reg_index[aclass][hard_regno]; if (i < 0) continue; - cost = (cp->first == a - ? ira_register_move_cost[mode][rclass][aclass] - : ira_register_move_cost[mode][aclass][rclass]); + cost = (first_a == a ? 
ira_register_move_cost[mode][rclass][aclass] + : ira_register_move_cost[mode][aclass][rclass]); ira_allocate_and_set_or_copy_costs (&ALLOCNO_UPDATED_HARD_REG_COSTS (a), aclass, ALLOCNO_CLASS_COST (a), ALLOCNO_HARD_REG_COSTS (a)); @@ -4349,21 +4594,23 @@ coalesce_allocnos (void) continue; for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { - if (cp->first == a) + ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second); + if (first_a == a) { next_cp = cp->next_first_allocno_copy; - regno = ALLOCNO_REGNO (cp->second); + regno = ALLOCNO_REGNO (second_a); /* For priority coloring we coalesce allocnos only with the same allocno class not with intersected allocno classes as it were possible. It is done for simplicity. */ if ((cp->insn != NULL || cp->constraint_p) - && ALLOCNO_ASSIGNED_P (cp->second) - && ALLOCNO_HARD_REGNO (cp->second) < 0 - && ! ira_equiv_no_lvalue_p (regno)) + && ALLOCNO_ASSIGNED_P (second_a) + && ALLOCNO_HARD_REGNO (second_a) < 0 + && !ira_equiv_no_lvalue_p (regno)) sorted_copies[cp_num++] = cp; } - else if (cp->second == a) + else if (second_a == a) next_cp = cp->next_second_allocno_copy; else gcc_unreachable (); @@ -4376,17 +4623,18 @@ coalesce_allocnos (void) for (i = 0; i < cp_num; i++) { cp = sorted_copies[i]; - if (! 
coalesced_allocno_conflict_p (cp->first, cp->second)) + ira_allocno_t first_a = OBJECT_ALLOCNO (cp->first); + ira_allocno_t second_a = OBJECT_ALLOCNO (cp->second); + if (!coalesced_allocno_conflict_p (first_a, second_a)) { allocno_coalesced_p = true; if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) - fprintf - (ira_dump_file, - " Coalescing copy %d:a%dr%d-a%dr%d (freq=%d)\n", - cp->num, ALLOCNO_NUM (cp->first), ALLOCNO_REGNO (cp->first), - ALLOCNO_NUM (cp->second), ALLOCNO_REGNO (cp->second), - cp->freq); - merge_allocnos (cp->first, cp->second); + fprintf (ira_dump_file, + " Coalescing copy %d:a%dr%d-a%dr%d (freq=%d)\n", + cp->num, ALLOCNO_NUM (first_a), + ALLOCNO_REGNO (first_a), ALLOCNO_NUM (second_a), + ALLOCNO_REGNO (second_a), cp->freq); + merge_allocnos (first_a, second_a); i++; break; } @@ -4395,8 +4643,11 @@ coalesce_allocnos (void) for (n = 0; i < cp_num; i++) { cp = sorted_copies[i]; - if (allocno_coalesce_data[ALLOCNO_NUM (cp->first)].first - != allocno_coalesce_data[ALLOCNO_NUM (cp->second)].first) + if (allocno_coalesce_data[ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first))] + .first + != allocno_coalesce_data[ALLOCNO_NUM ( + OBJECT_ALLOCNO (cp->second))] + .first) sorted_copies[n++] = cp; } cp_num = n; @@ -5070,15 +5321,15 @@ ira_reuse_stack_slot (int regno, poly_uint64 inherent_size, cp != NULL; cp = next_cp) { - if (cp->first == allocno) + if (OBJECT_ALLOCNO (cp->first) == allocno) { next_cp = cp->next_first_allocno_copy; - another_allocno = cp->second; + another_allocno = OBJECT_ALLOCNO (cp->second); } - else if (cp->second == allocno) + else if (OBJECT_ALLOCNO (cp->second) == allocno) { next_cp = cp->next_second_allocno_copy; - another_allocno = cp->first; + another_allocno = OBJECT_ALLOCNO (cp->first); } else gcc_unreachable (); @@ -5274,6 +5525,7 @@ ira_initiate_assign (void) = (ira_allocno_t *) ira_allocate (sizeof (ira_allocno_t) * ira_allocnos_num); consideration_allocno_bitmap = ira_allocate_bitmap (); + uncolorable_allocno_set = 
ira_allocate_bitmap (); initiate_cost_update (); allocno_priorities = (int *) ira_allocate (sizeof (int) * ira_allocnos_num); sorted_copies = (ira_copy_t *) ira_allocate (ira_copies_num @@ -5286,6 +5538,7 @@ ira_finish_assign (void) { ira_free (sorted_allocnos); ira_free_bitmap (consideration_allocno_bitmap); + ira_free_bitmap (uncolorable_allocno_set); finish_cost_update (); ira_free (allocno_priorities); ira_free (sorted_copies); diff --git a/gcc/ira-conflicts.cc b/gcc/ira-conflicts.cc index 0585ad10043..7aeed7202ce 100644 --- a/gcc/ira-conflicts.cc +++ b/gcc/ira-conflicts.cc @@ -173,25 +173,115 @@ build_conflict_bit_table (void) sparseset_free (objects_live); return true; } - -/* Return true iff allocnos A1 and A2 cannot be allocated to the same - register due to conflicts. */ -static bool -allocnos_conflict_for_copy_p (ira_allocno_t a1, ira_allocno_t a2) +/* Check that X is REG or SUBREG of REG. */ +#define REG_SUBREG_P(x) \ + (REG_P (x) || (GET_CODE (x) == SUBREG && REG_P (SUBREG_REG (x)))) + +/* Return true if OBJ1 and OBJ2 can be a move INSN. */ +bool +subreg_move_p (ira_object_t obj1, ira_object_t obj2) { - /* Due to the fact that we canonicalize conflicts (see - record_object_conflict), we only need to test for conflicts of - the lowest order words. */ - ira_object_t obj1 = ALLOCNO_OBJECT (a1, 0); - ira_object_t obj2 = ALLOCNO_OBJECT (a2, 0); + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + return ALLOCNO_CLASS (a1) != NO_REGS && ALLOCNO_CLASS (a2) != NO_REGS + && (ALLOCNO_TRACK_SUBREG_P (a1) || ALLOCNO_TRACK_SUBREG_P (a2)) + && OBJECT_NREGS (obj1) == OBJECT_NREGS (obj2) + && (OBJECT_NREGS (obj1) != ALLOCNO_NREGS (a1) + || OBJECT_NREGS (obj2) != ALLOCNO_NREGS (a2)); +} - return OBJECTS_CONFLICT_P (obj1, obj2); +/* Return true if ORIG_DEST_REG and ORIG_SRC_REG can be a move INSN. 
*/ +bool +subreg_move_p (rtx orig_dest_reg, rtx orig_src_reg) +{ + gcc_assert (REG_SUBREG_P (orig_dest_reg) && REG_SUBREG_P (orig_src_reg)); + rtx reg1 + = SUBREG_P (orig_dest_reg) ? SUBREG_REG (orig_dest_reg) : orig_dest_reg; + rtx reg2 = SUBREG_P (orig_src_reg) ? SUBREG_REG (orig_src_reg) : orig_src_reg; + if (HARD_REGISTER_P (reg1) || HARD_REGISTER_P (reg2)) + return false; + ira_allocno_t a1 = ira_curr_regno_allocno_map[REGNO (reg1)]; + ira_allocno_t a2 = ira_curr_regno_allocno_map[REGNO (reg2)]; + ira_object_t obj1 = find_object (a1, orig_dest_reg); + ira_object_t obj2 = find_object (a2, orig_src_reg); + return subreg_move_p (obj1, obj2); } -/* Check that X is REG or SUBREG of REG. */ -#define REG_SUBREG_P(x) \ - (REG_P (x) || (GET_CODE (x) == SUBREG && REG_P (SUBREG_REG (x)))) +/* Return true if OBJ1 and OBJ2 can be allocated to the same register. */ +static bool +regs_non_conflict_for_copy_p (ira_object_t obj1, ira_object_t obj2, + bool is_move, bool offset_equal) +{ + ira_allocno_t a1 = OBJECT_ALLOCNO (obj1); + ira_allocno_t a2 = OBJECT_ALLOCNO (obj2); + if (is_move && subreg_move_p (obj1, obj2)) + { + if (OBJECTS_CONFLICT_P (obj1, obj2)) + return false; + /* Assume a1 is allocated to hard register `OBJECT_START (obj2)` and a2 to + hard register `OBJECT_START (obj1)`, so that both objects end up in the + same hard register `OBJECT_START (obj1) + OBJECT_START (obj2)`. */ + int start_regno1 = OBJECT_START (obj2); + int start_regno2 = OBJECT_START (obj1); + + ira_object_t obj_a, obj_b; + ira_allocno_object_iterator oi_a, oi_b; + FOR_EACH_ALLOCNO_OBJECT (a1, obj_a, oi_a) + FOR_EACH_ALLOCNO_OBJECT (a2, obj_b, oi_b) + /* If a conflict between a1 and a2 rules out this + allocation, then obj1 and obj2 cannot form a copy.
 */ + if (OBJECTS_CONFLICT_P (obj_a, obj_b) + && !(start_regno1 + OBJECT_START (obj_a) + OBJECT_NREGS (obj_a) + <= (start_regno2 + OBJECT_START (obj_b)) + || start_regno2 + OBJECT_START (obj_b) + OBJECT_NREGS (obj_b) + <= (start_regno1 + OBJECT_START (obj_a)))) + return false; + + return true; + } + else + { + /* In the normal case, make sure full_obj1 and full_obj2 can be allocated to + the same register. */ + ira_object_t full_obj1 = find_object (a1, 0, ALLOCNO_NREGS (a1)); + ira_object_t full_obj2 = find_object (a2, 0, ALLOCNO_NREGS (a2)); + return !OBJECTS_CONFLICT_P (full_obj1, full_obj2) && offset_equal; + } +} + +/* Return true if the offset and the size of ORIG_REG are both multiples of + ALLOCNO_UNIT_SIZE (A). This forbids creating a copy from a subreg move such + as the RTL below (from testsuite/gcc.dg/vect/vect-simd-20.c on AArch64). + Suppose they are all allocated to the fourth register, that is, pseudo 127 + is allocated to w4 and pseudo 149 is allocated to x4 and x5. Then the third + instruction can be deleted without affecting the result of pseudo 149. + But when the second instruction is executed, the upper 32 bits of x4 are + set to 0 (the behavior of the add instruction), that is to say, the result + of pseudo 149 is modified and its bits 32~63 are set to 0, which is not the + desired result. + + (set (reg:SI 127) + (subreg:SI (reg:TI 149) 0)) + ... + (set (reg:SI 127) + (plus:SI (reg:SI 127) + (reg:SI 180))) + ... + (set (zero_extract:DI (subreg:DI (reg:TI 149) 0) + (const_int 32 [0x20]) + (const_int 0 [0])) + (subreg:DI (reg:SI 127) 0)) */ +static bool +subreg_reg_align_and_times_p (ira_allocno_t a, rtx orig_reg) +{ + if (!has_subreg_object_p (a) || !SUBREG_P (orig_reg)) + return true; + + return multiple_p (SUBREG_BYTE (orig_reg), ALLOCNO_UNIT_SIZE (a)) + && multiple_p (GET_MODE_SIZE (GET_MODE (orig_reg)), + ALLOCNO_UNIT_SIZE (a)); +} /* Return X if X is a REG, otherwise it should be SUBREG of REG and the function returns the reg in this case. 
*OFFSET will be set to @@ -237,8 +327,9 @@ get_freq_for_shuffle_copy (int freq) SINGLE_INPUT_OP_HAS_CSTR_P is only meaningful when constraint_p is true, see function ira_get_dup_out_num for its meaning. */ static bool -process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p, rtx_insn *insn, - int freq, bool single_input_op_has_cstr_p = true) +process_regs_for_copy (rtx orig_reg1, rtx orig_reg2, bool constraint_p, + rtx_insn *insn, int freq, + bool single_input_op_has_cstr_p = true) { int allocno_preferenced_hard_regno, index, offset1, offset2; int cost, conflict_cost, move_cost; @@ -248,10 +339,10 @@ process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p, rtx_insn *insn, machine_mode mode; ira_copy_t cp; - gcc_assert (REG_SUBREG_P (reg1) && REG_SUBREG_P (reg2)); - only_regs_p = REG_P (reg1) && REG_P (reg2); - reg1 = go_through_subreg (reg1, &offset1); - reg2 = go_through_subreg (reg2, &offset2); + gcc_assert (REG_SUBREG_P (orig_reg1) && REG_SUBREG_P (orig_reg2)); + only_regs_p = REG_P (orig_reg1) && REG_P (orig_reg2); + rtx reg1 = go_through_subreg (orig_reg1, &offset1); + rtx reg2 = go_through_subreg (orig_reg2, &offset2); /* Set up hard regno preferenced by allocno. If allocno gets the hard regno the copy (or potential move) insn will be removed. 
*/ if (HARD_REGISTER_P (reg1)) @@ -270,13 +361,17 @@ process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p, rtx_insn *insn, { ira_allocno_t a1 = ira_curr_regno_allocno_map[REGNO (reg1)]; ira_allocno_t a2 = ira_curr_regno_allocno_map[REGNO (reg2)]; + ira_object_t obj1 = find_object (a1, orig_reg1); + ira_object_t obj2 = find_object (a2, orig_reg2); - if (!allocnos_conflict_for_copy_p (a1, a2) - && offset1 == offset2 + if (subreg_reg_align_and_times_p (a1, orig_reg1) + && subreg_reg_align_and_times_p (a2, orig_reg2) + && regs_non_conflict_for_copy_p (obj1, obj2, insn != NULL, + offset1 == offset2) && ordered_p (GET_MODE_PRECISION (ALLOCNO_MODE (a1)), GET_MODE_PRECISION (ALLOCNO_MODE (a2)))) { - cp = ira_add_allocno_copy (a1, a2, freq, constraint_p, insn, + cp = ira_add_allocno_copy (obj1, obj2, freq, constraint_p, insn, ira_curr_loop_tree_node); bitmap_set_bit (ira_curr_loop_tree_node->local_copies, cp->num); return true; @@ -438,16 +533,15 @@ add_insn_allocno_copies (rtx_insn *insn) freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn)); if (freq == 0) freq = 1; - if ((set = single_set (insn)) != NULL_RTX - && REG_SUBREG_P (SET_DEST (set)) && REG_SUBREG_P (SET_SRC (set)) - && ! side_effects_p (set) - && find_reg_note (insn, REG_DEAD, - REG_P (SET_SRC (set)) - ? SET_SRC (set) - : SUBREG_REG (SET_SRC (set))) != NULL_RTX) + if ((set = single_set (insn)) != NULL_RTX && REG_SUBREG_P (SET_DEST (set)) + && REG_SUBREG_P (SET_SRC (set)) && !side_effects_p (set) + && (find_reg_note (insn, REG_DEAD, + REG_P (SET_SRC (set)) ? SET_SRC (set) + : SUBREG_REG (SET_SRC (set))) + != NULL_RTX + || subreg_move_p (SET_DEST (set), SET_SRC (set)))) { - process_regs_for_copy (SET_SRC (set), SET_DEST (set), - false, insn, freq); + process_regs_for_copy (SET_SRC (set), SET_DEST (set), false, insn, freq); return; } /* Fast check of possibility of constraint or shuffle copies. 
If @@ -521,16 +615,23 @@ propagate_copies (void) FOR_EACH_COPY (cp, ci) { - a1 = cp->first; - a2 = cp->second; + a1 = OBJECT_ALLOCNO (cp->first); + a2 = OBJECT_ALLOCNO (cp->second); if (ALLOCNO_LOOP_TREE_NODE (a1) == ira_loop_tree_root) continue; ira_assert ((ALLOCNO_LOOP_TREE_NODE (a2) != ira_loop_tree_root)); parent_a1 = ira_parent_or_cap_allocno (a1); parent_a2 = ira_parent_or_cap_allocno (a2); + ira_object_t parent_obj1 + = find_object_anyway (parent_a1, OBJECT_START (cp->first), + OBJECT_NREGS (cp->first)); + ira_object_t parent_obj2 + = find_object_anyway (parent_a2, OBJECT_START (cp->second), + OBJECT_NREGS (cp->second)); ira_assert (parent_a1 != NULL && parent_a2 != NULL); - if (! allocnos_conflict_for_copy_p (parent_a1, parent_a2)) - ira_add_allocno_copy (parent_a1, parent_a2, cp->freq, + if (regs_non_conflict_for_copy_p (parent_obj1, parent_obj2, + cp->insn != NULL, true)) + ira_add_allocno_copy (parent_obj1, parent_obj2, cp->freq, cp->constraint_p, cp->insn, cp->loop_tree_node); } } diff --git a/gcc/ira-emit.cc b/gcc/ira-emit.cc index 9dc7f3c655e..30ff46980f5 100644 --- a/gcc/ira-emit.cc +++ b/gcc/ira-emit.cc @@ -1129,11 +1129,11 @@ add_range_and_copies_from_move_list (move_t list, ira_loop_tree_node_t node, update_costs (to, false, freq); cp = ira_add_allocno_copy (from, to, freq, false, move->insn, NULL); if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL) - fprintf (ira_dump_file, " Adding cp%d:a%dr%d-a%dr%d\n", - cp->num, ALLOCNO_NUM (cp->first), - REGNO (allocno_emit_reg (cp->first)), - ALLOCNO_NUM (cp->second), - REGNO (allocno_emit_reg (cp->second))); + fprintf (ira_dump_file, " Adding cp%d:a%dr%d-a%dr%d\n", cp->num, + ALLOCNO_NUM (OBJECT_ALLOCNO (cp->first)), + REGNO (allocno_emit_reg (OBJECT_ALLOCNO (cp->first))), + ALLOCNO_NUM (OBJECT_ALLOCNO (cp->second)), + REGNO (allocno_emit_reg (OBJECT_ALLOCNO (cp->second)))); nr = ALLOCNO_NUM_OBJECTS (from); for (i = 0; i < nr; i++) diff --git a/gcc/ira-int.h b/gcc/ira-int.h index 
b9e24328867..963e533e448 100644 --- a/gcc/ira-int.h +++ b/gcc/ira-int.h @@ -229,6 +229,8 @@ struct ira_object { /* The allocno associated with this record. */ ira_allocno_t allocno; + /* Index in allocno->objects array */ + unsigned int index; /* Vector of accumulated conflicting conflict_redords with NULL end marker (if OBJECT_CONFLICT_VEC_P is true) or conflict bit vector otherwise. */ @@ -522,6 +524,7 @@ allocno_emit_reg (ira_allocno_t a) } #define OBJECT_ALLOCNO(O) ((O)->allocno) +#define OBJECT_INDEX(O) ((O)->index) #define OBJECT_CONFLICT_ARRAY(O) ((O)->conflicts_array) #define OBJECT_CONFLICT_VEC(O) ((ira_object_t *)(O)->conflicts_array) #define OBJECT_CONFLICT_BITVEC(O) ((IRA_INT_TYPE *)(O)->conflicts_array) @@ -591,9 +594,9 @@ struct ira_allocno_copy { /* The unique order number of the copy node starting with 0. */ int num; - /* Allocnos connected by the copy. The first allocno should have + /* Objects connected by the copy. The first allocno should have smaller order number than the second one. */ - ira_allocno_t first, second; + ira_object_t first, second; /* Execution frequency of the copy. 
*/ int freq; bool constraint_p; @@ -1043,6 +1046,9 @@ extern void ira_remove_allocno_prefs (ira_allocno_t); extern ira_copy_t ira_create_copy (ira_allocno_t, ira_allocno_t, int, bool, rtx_insn *, ira_loop_tree_node_t); +extern ira_copy_t +ira_add_allocno_copy (ira_object_t, ira_object_t, int, bool, rtx_insn *, + ira_loop_tree_node_t); extern ira_copy_t ira_add_allocno_copy (ira_allocno_t, ira_allocno_t, int, bool, rtx_insn *, ira_loop_tree_node_t); @@ -1056,6 +1062,7 @@ extern void ira_destroy (void); extern ira_object_t find_object (ira_allocno_t, int, int); extern ira_object_t find_object (ira_allocno_t, poly_int64, poly_int64); +extern ira_object_t find_object (ira_allocno_t, rtx); ira_object_t find_object_anyway (ira_allocno_t a, int start, int nregs); extern void ira_copy_allocno_objects (ira_allocno_t, ira_allocno_t); @@ -1084,6 +1091,8 @@ extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *, /* ira-conflicts.cc */ extern void ira_debug_conflicts (bool); extern void ira_build_conflicts (void); +extern bool subreg_move_p (ira_object_t, ira_object_t); +extern bool subreg_move_p (rtx, rtx); /* ira-color.cc */ extern ira_allocno_t ira_soft_conflict (ira_allocno_t, ira_allocno_t); diff --git a/gcc/ira.cc b/gcc/ira.cc index b9159d089c3..739ef28af6e 100644 --- a/gcc/ira.cc +++ b/gcc/ira.cc @@ -2853,14 +2853,15 @@ print_redundant_copies (void) if (hard_regno >= 0) continue; for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) - if (cp->first == a) + if (OBJECT_ALLOCNO (cp->first) == a) next_cp = cp->next_first_allocno_copy; else { next_cp = cp->next_second_allocno_copy; if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL && cp->insn != NULL_RTX - && ALLOCNO_HARD_REGNO (cp->first) == hard_regno) + && ALLOCNO_HARD_REGNO (OBJECT_ALLOCNO (cp->first)) + == hard_regno) fprintf (ira_dump_file, " Redundant move from %d(freq %d):%d\n", INSN_UID (cp->insn), cp->freq, hard_regno); From patchwork Wed Nov 8 03:47:38 2023 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lehua Ding X-Patchwork-Id: 162876 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:aa0b:0:b0:403:3b70:6f57 with SMTP id k11csp676469vqo; Tue, 7 Nov 2023 19:48:56 -0800 (PST) X-Google-Smtp-Source: AGHT+IEv+tKHv/m26fgoUvasGFLALGp/e6YgD2G2lHAaGMwYnXeNRaaD8NknhRFSa0xJxtzAZehK X-Received: by 2002:a05:6214:4101:b0:672:1d32:9d37 with SMTP id kc1-20020a056214410100b006721d329d37mr854808qvb.26.1699415336000; Tue, 07 Nov 2023 19:48:56 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1699415335; cv=pass; d=google.com; s=arc-20160816; b=sD1IzOhGFm75SRA+d5fWpKRrB6f9/0jzM+zLL9G/o3XmCv15Zdp3bRXB7lQ1vexvqj Jfv/MviMwDFwmMQSIv4mFaodn6J3MxnD5aX0ifwRn9/yjYXfijqYV9Ua233+MXDMeebb Aca0e7Fftu/lHPq9ZPcwAAdqEGF9EA3vDDf4qFhWiNG3kw9YtUlo99rA+xqn3MGH4J55 7foDRpYBsPAgCDmh6W411lqVtfWHng5ybxXZ5YYctDpTKaDoz0K/ga/bTqBJiNIUdJw+ qdfmOWLKqy2Uscw7AhwNT9iF2INBYAR2Wv2GRV10i8pep5HTxYITcqaVm39x5ZrDkdkJ t4IA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:arc-filter:dmarc-filter :delivered-to; bh=XY/jeCLo1z8aKx7qQfUYXlKN9hRwYZZBKkwqWiWBiSM=; fh=9Ok8HNl3eD0lUFF4nhUPZJmQfyAUbHnIPw/rSVNIfK0=; b=yM+A49LtBjAJ5R9lUn3HN3AAVzDeHQhpe1EbF5E2ajrweTFWTllXg167wq7w0rASea 6w2JjZd7h5nRU2KUMZCs0hrEmWBQQzPPDdqiDpHOV2JKQ0AOTP1gfomzNzzIIPlTDvqs uk7Aheq1z88Sf7DuLV8Pv3WAMWTmDkuEVqKGJkjceQmSPlVo1mS4OM3JFYV9fjLIpsGn L1JU8O9y+PTinahka+96OwkxO9eVCnhqs4xJAOXHqDre7fOmSsu/vBtVgGbgWAAzW87w trFLdplI+ZrYtsd4p7rvRIUwMOhOATNYHV+ga/YQVPbRi73caZqK3xBObjYIkFcFUByl AC7w== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) 
smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id h2-20020a05620a244200b0076d512665d7si825039qkn.627.2023.11.07.19.48.55 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 07 Nov 2023 19:48:55 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C087E385773C for ; Wed, 8 Nov 2023 03:48:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgbr2.qq.com (smtpbgbr2.qq.com [54.207.22.56]) by sourceware.org (Postfix) with ESMTPS id 57C5D3858035 for ; Wed, 8 Nov 2023 03:48:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 57C5D3858035 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 57C5D3858035 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=54.207.22.56 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699415288; cv=none; b=rJpqgsq9w9P7dufnu8X6SzhlyceoiW/RDKtXO8+K15F7auPWKYCAz5KCxwa9VTHENkAycYd0/651XtuYtQ/vTpo52c56+iEbpEvFMfsLBTXLoOSx5Pkfj3eUbc76oe8d4DtIS7cu87BN43HY/S/fY/dtDtbfgs2cg3SGoiHWda0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699415288; c=relaxed/simple; bh=pOd5h7ZatzTojX2ZYpnCxGxvgPrGaE+n3qRZiQ8sX+g=; h=From:To:Subject:Date:Message-Id:MIME-Version; 
b=iR8kI37Ch/45uEJ1tgV5XQPx7KzhwaELsHpBUFD8MpMncz0BRdgb7viB60Ls+HFuQM3K2hgQEiI/egH7gwkA7ZvEEYe6oGGksmZ3q37Z7kAH0Nhby6tGIpmwTE1WRREZVIZCworaVwlCZ+krGFl8Qh6VqOH9pKd5Sw44Vd+xEDQ= ARC-Authentication-Results: i=1; server2.sourceware.org X-QQ-mid: bizesmtp81t1699415278tgak60kd Received: from rios-cad121.hadoop.rioslab.org ( [58.60.1.9]) by bizesmtp.qq.com (ESMTP) with id ; Wed, 08 Nov 2023 11:47:58 +0800 (CST) X-QQ-SSF: 01400000000000C0F000000A0000000 X-QQ-FEAT: lm7sZZPcOdbWk/opJUtPOpa091AFp/LA9CWE/YqN+qUf6aYtNAlqYGK9LixjX ftMAD6W4R5vUwG0SCjj5nAPVX0VDdtlTzYX9cZdVKKMucToum03VB1s+rgRDPeGVJz4T57w y5Rjcw4Q/ZNeAUV7Bl0T4LmrqNgEwv49fx8bHuQNDRb9nl4BkDBmFyhRBt5J4cWLqzP9a5B hBiML+0trnwSy0/xVVEJQBzuPcf/ryzLTZvVsdNbl8ZANoc5kHYE89mm8g9V5UMhyokTNWE GpJV9oltG+CVNd2RL2oeyuQmu9aQkwDtsS7PLeKE3ENt8bZAstT2N1qnv8pOgcZ8lWaiXVo gtWNYF77TuAHYlaECnZj5SbW8vVeXXjz8rtECcxKwG3j2YSMXo6UAxRsTHavtXGTc3cVrgy X-QQ-GoodBg: 2 X-BIZMAIL-ID: 2057229749792789861 From: Lehua Ding To: gcc-patches@gcc.gnu.org Cc: vmakarov@redhat.com, richard.sandiford@arm.com, juzhe.zhong@rivai.ai, lehua.ding@rivai.ai Subject: [PATCH 5/7] ira: Add all nregs >= 2 pseudos to tracke subreg list Date: Wed, 8 Nov 2023 11:47:38 +0800 Message-Id: <20231108034740.834590-6-lehua.ding@rivai.ai> X-Mailer: git-send-email 2.36.3 In-Reply-To: <20231108034740.834590-1-lehua.ding@rivai.ai> References: <20231108034740.834590-1-lehua.ding@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz6a-0 X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1781966135668688062
X-GMAIL-MSGID: 1781966135668688062

This patch fully relaxes the condition so that all eligible subregs are
tracked.

gcc/ChangeLog:

	* ira-build.cc (get_reg_unit_size): New.
	(has_same_nregs): New.
	(ira_set_allocno_class): Relax.

---
 gcc/ira-build.cc | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
index 1c47f81ce9d..379f877ca67 100644
--- a/gcc/ira-build.cc
+++ b/gcc/ira-build.cc
@@ -607,6 +607,37 @@ ira_create_allocno (int regno, bool cap_p,
   return a;
 }
 
+/* Return the size of a single register of allocno A.  */
+static poly_int64
+get_reg_unit_size (ira_allocno_t a)
+{
+  enum reg_class aclass = ALLOCNO_CLASS (a);
+  gcc_assert (aclass != NO_REGS);
+  machine_mode mode = ALLOCNO_MODE (a);
+  int nregs = ALLOCNO_NREGS (a);
+  poly_int64 block_size = REGMODE_NATURAL_SIZE (mode);
+  int nblocks = get_nblocks (mode);
+  gcc_assert (nblocks % nregs == 0);
+  return block_size * (nblocks / nregs);
+}
+
+/* Return true if TARGET_CLASS_MAX_NREGS and TARGET_HARD_REGNO_NREGS give the
+   same result for A.  Note that some targets do not implement these two hooks
+   consistently, and such cases need to be debugged one by one.  For example,
+   for V3x1DI mode on AArch64, TARGET_CLASS_MAX_NREGS returns 2 but
+   TARGET_HARD_REGNO_NREGS returns 3.  The two conflict, and this needs to be
+   fixed in the AArch64 hooks.  */
+static bool
+has_same_nregs (ira_allocno_t a)
+{
+  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (REGNO_REG_CLASS (i) != NO_REGS
+        && reg_class_subset_p (REGNO_REG_CLASS (i), ALLOCNO_CLASS (a))
+        && ALLOCNO_NREGS (a) != hard_regno_nregs (i, ALLOCNO_MODE (a)))
+      return false;
+  return true;
+}
+
 /* Set up register class for A and update its conflict hard
    registers.  */
 void
@@ -624,12 +655,12 @@ ira_set_allocno_class (ira_allocno_t a, enum reg_class aclass)
   if (aclass == NO_REGS)
     return;
 
-  /* SET the unit_size of one register.  */
-  machine_mode mode = ALLOCNO_MODE (a);
-  int nregs = ira_reg_class_max_nregs[aclass][mode];
-  if (nregs == 2 && maybe_eq (GET_MODE_SIZE (mode), nregs * UNITS_PER_WORD))
+  gcc_assert (!ALLOCNO_TRACK_SUBREG_P (a));
+  /* Set the unit size and the track_subreg_p flag for pseudos that occupy
+     multiple hard regs.  */
+  if (ALLOCNO_NREGS (a) > 1 && has_same_nregs (a))
     {
-      ALLOCNO_UNIT_SIZE (a) = UNITS_PER_WORD;
+      ALLOCNO_UNIT_SIZE (a) = get_reg_unit_size (a);
       ALLOCNO_TRACK_SUBREG_P (a) = true;
       return;
     }