From patchwork Thu Sep 22 07:06:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 1356 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5044:0:0:0:0:0 with SMTP id h4csp50684wrt; Thu, 22 Sep 2022 00:07:09 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7Yyaf9LbuI1TrzVwwf1VOraxMxTG+QnwZmUT0k03yqSQO1n0tnKl6I8U6uNnqxI1bQOkJq X-Received: by 2002:a05:6402:1f89:b0:453:8093:c4e5 with SMTP id c9-20020a0564021f8900b004538093c4e5mr1853989edc.182.1663830429522; Thu, 22 Sep 2022 00:07:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663830429; cv=none; d=google.com; s=arc-20160816; b=eDhd1LqiI3ufgiQ9ue6tO3nVAjG+zecFDhvNf/ks+XIJHEvsuk0b8ZQ2yjZ6sj67OW k6cBODZqJklzZfwQ/+LC8UXp1ShIqHZSlD+WahKT66eXSReSSAmQgEBWjeynGHwimIZQ OfcRhxhOo4gFcTEsearknCrjL3Hf1Do+LQ7yYhi9tJIqkF07ov7m0LkXQpNivOz3WyAZ FA5bXl+9FOHYJCWha1+LkQDKgPI7/rcI8GlMOpQASEBx8E5ge/abEKY+/GFy3GNY65FS vK7dVWlbdnlz/71UUQqLjL3tNvx+hKAu7GAZdTpGornRN3RXR8M/H7pDKie0NABevGzx rqvw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:message-id:date:subject:to :from:dmarc-filter:delivered-to; bh=hyYuHUqZnyj4qV0zGwE8WCG4g76xh2vk7pUCCQ9uBk4=; b=vXcidac1qJ4As5WIcTLPsO4I56JG/uv1q3FOgv+B/htYMobfFnra6QxdM7GNkVXIiM TJV798S7Hx8xV94zBoTgWItmClD1odckGDXuFdupUZYUCGTJT8YIb8N3+zExRoJ2mEeR 2hJZjO/udyoChqXW2COSDbtYaOOxL9TPbLtz4NtNfG6eZ/Bc4GeJylj8o3I69gPK/mNa Ba5N9EcsZ5PTehWqmjJkE4iWJgcxYRV/uGATaiFNxiwFg1oUGtF4NPoyoba0hgnJ/fAE 2Kq962Wz79Od5GRz4OrQax90y+Dxh2Hm0aEl8Kd44m+ot9Nxf2xn4vbpVE2ITkQnYpmq e7OQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id nd22-20020a170907629600b00741550f828bsi4844959ejc.919.2022.09.22.00.07.09 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Sep 2022 00:07:09 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 113E03857B92 for ; Thu, 22 Sep 2022 07:07:01 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgau1.qq.com (smtpbgau1.qq.com [54.206.16.166]) by sourceware.org (Postfix) with ESMTPS id 82C233858028 for ; Thu, 22 Sep 2022 07:06:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 82C233858028 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp70t1663830389tm58gz62 Received: from rios-cad5.localdomain ( [42.247.22.66]) by bizesmtp.qq.com (ESMTP) with id ; Thu, 22 Sep 2022 15:06:27 +0800 (CST) X-QQ-SSF: 01400000000000D0J000000A0000000 X-QQ-FEAT: 5mtg6EGDOv4cMeK/QbGKQVZlUlwlKlXV4lnTkIc8UTBV83NneCZjJpB1+uVLE 3iIDnGZD+EhbtK4wEMkm0dILHWvVyGdFo270zmyaKPivm9SEInVksC/0FptxigVFe0blZ29 EDBmxc6xFrQEMQf4z5Ory5giT2bzabF+48BGnHdDNGtHO3XitTkpWOuTuFT1aOhBzXbhtN9 8cGir9UoYKcABiTnE7zTQT3s2wXJSK2+mjG5xwqZFcfzJ9Ey1/YvVTCENqQvbm6D1omswHd FoSzYNmfRLPHoNMC+jtfo+YZqj7SNWKYmj2Yf++RhZKiyloyom4zU65w2PPcO8H2XZ3s7I9 rGwpZ/gfNC+FfUAxc50T96eJWyI6jV6pGJejPJ4C7NpdEhdwHpPiR3mNzFRZGJ5D1nRR7tC 3hPPIelmKTg= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH] DSE: Enhance dse with def-ref analysis Date: Thu, 22 Sep 2022 15:06:25 +0800 Message-Id: <20220922070625.7197-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvr:qybglogicsvr7 X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: richard.sandiford@arm.com, rguenther@suse.de, Ju-Zhe Zhong Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1744652173198973780?= X-GMAIL-MSGID: =?utf-8?q?1744652656419984415?= From: Ju-Zhe Zhong This patch fix issue: PR 99407 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99407 The enhancement implementation is simple: 1.Search gimple statement in program reverse order. 2.Queue the store statement which may be possible kill the def of previous store statement. 3.Perform dse_def_ref_analysis to remove stores will not kill any def. For example: a[i_18] = _5; ... foo (&a); a[i_18] = _7; a[i_18] = _7 is queued at the begining and will be removed in dse_def_ref_analysis. 4.Remove the store if the def is confirmed to be killed. I have fully tested it in RISC-V foundation downstream port (RVV): https://github.com/riscv-collab/riscv-gcc/tree/riscv-gcc-rvv-next Are you willing to review this patch and test it in ARM/x86? gcc/ChangeLog: * tree-ssa-dse.cc (dse_search_def_stores): New function. (dse_can_def_ref_p): Ditto. (dse_def_ref_analysis): Add a new argument. (dse_optimize_stmt): Pass through stores_queue. (pass_dse::execute): Add dse_def_ref_analysis and stores_queue. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr99407.c: New test. --- gcc/testsuite/gcc.dg/tree-ssa/pr99407.c | 30 ++++ gcc/tree-ssa-dse.cc | 209 +++++++++++++++++++++++- 2 files changed, 236 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr99407.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr99407.c b/gcc/testsuite/gcc.dg/tree-ssa/pr99407.c new file mode 100644 index 00000000000..57cea77da7c --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr99407.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dse1-details" } */ +typedef float real_t; + +#define iterations 100000 +#define LEN_1D 32000 +#define LEN_2D 256 +real_t flat_2d_array[LEN_2D*LEN_2D]; + +real_t x[LEN_1D]; + +real_t a[LEN_1D],b[LEN_1D],c[LEN_1D],d[LEN_1D],e[LEN_1D], +bb[LEN_2D][LEN_2D],cc[LEN_2D][LEN_2D],tt[LEN_2D][LEN_2D]; + +int indx[LEN_1D]; + +real_t* __restrict__ xx; +real_t* yy; +real_t s243(void) +{ + for (int nl = 0; nl < iterations; nl++) { + for (int i = 0; i < LEN_1D-1; i++) { + a[i] = b[i] + c[i ] * d[i]; + b[i] = a[i] + d[i ] * e[i]; + a[i] = b[i] + a[i+1] * d[i]; + } + } +} + +/* { dg-final { scan-tree-dump "Deleted dead store" "dse1" } } */ \ No newline at end of file diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc index 34cfd1a8802..a8ca3672da2 100644 --- a/gcc/tree-ssa-dse.cc +++ b/gcc/tree-ssa-dse.cc @@ -1332,6 +1332,186 @@ dse_optimize_call (gimple_stmt_iterator *gsi, sbitmap live_bytes) return true; } +/* Search the stores_queue to see whether there is a store has a same vdef + as the stmt. */ + +static bool +dse_search_def_stores (function *fun, auto_vec &stores_queue, + gimple *stmt) +{ + /* Consider the following sequcence: + a[i_18] = _5; + _8 = e[i_18]; + _9 = _3 * _8; + _10 = _5 + _9; + b[i_18] = _10; + _12 = i_18 + 1; + _13 = a[_12]; + _15 = _3 * _13; + _16 = _10 + _15; + a[i_18] = _16 + + We should be able to remove a[i_18] = _5. */ + for (unsigned int i = 0; i < stores_queue.length (); ++i) + { + if (!stores_queue[i]) + continue; + tree lhs1 = gimple_assign_lhs (stores_queue[i]); + tree lhs2 = gimple_assign_lhs (stmt); + + if (TREE_CODE (lhs1) != TREE_CODE (lhs2)) + continue; + if (operand_equal_p (gimple_assign_lhs (stores_queue[i]), + gimple_assign_lhs (stmt), OEP_ADDRESS_OF)) + { + /* No matter it can be eliminated or not, remove it + in the worklist. */ + stores_queue[i] = NULL; + if (gimple_assign_single_p (stmt) && !gimple_has_side_effects (stmt) + && !is_ctrl_altering_stmt (stmt) + && (!stmt_could_throw_p (fun, stmt) + || fun->can_delete_dead_exceptions)) + return true; + } + } + + return false; +} + +/* Return true if the TREE_CODE of the mem op is allowed to do dse + according to def-ref analysis. */ + +static bool +dse_can_def_ref_p (gimple *stmt) +{ + /*TODO: For now, we only support dse according to + def-ref analysis for ARRAY_REF. */ + return TREE_CODE (gimple_assign_lhs (stmt)) == ARRAY_REF; +} + +/* Perform def-ref analysis on all the stores of stores_queue worklist. + Since dse is running on reverse program order walk, the stores in + stores_queue are always after stmt, clear the store in the stores_queue + if the address of store lhs is changed or the lhs of store is used + in stmt. */ + +static void +dse_def_ref_analysis (gimple *stmt, auto_vec &stores_queue) +{ + for (unsigned int i = 0; i < stores_queue.length (); ++i) + { + if (!stores_queue[i]) + continue; + + /* If we meet non-call or non-assign statement, we disable the possible + * dse. */ + if (gimple_code (stmt) != GIMPLE_CALL + && gimple_code (stmt) != GIMPLE_ASSIGN) + { + stores_queue[i] = NULL; + continue; + } + + tree lhs = gimple_get_lhs (stores_queue[i]); + if (!lhs) + continue; + switch (TREE_CODE (lhs)) + { + /* Handle the array like a[i_18]: + 1.if i_18 is changed in stmt which means lhs of stmt == i_18, + we remove the store from the stores_queue. + + 2.a or a[i_18] is used in stmt, we remove the store from the + stores_queue. */ + case ARRAY_REF: { + if (gimple_get_lhs (stmt) + && operand_equal_p (gimple_get_lhs (stmt), + TREE_OPERAND (lhs, 1), 0)) + { + stores_queue[i] = NULL; + continue; + } + + for (unsigned int j = 0; j < gimple_num_ops (stmt); ++j) + { + tree op = gimple_op (stmt, j); + if (!op) + continue; + + /* We only check the use op. */ + if (op == gimple_get_lhs (stmt)) + continue; + + if (TREE_OPERAND_LENGTH (op) == 0) + { + if (operand_equal_p (op, lhs, 0) + || operand_equal_p (op, TREE_OPERAND (lhs, 0), 0)) + stores_queue[i] = NULL; + } + else + { + for (int k = 0; k < TREE_OPERAND_LENGTH (op); ++k) + { + if (!TREE_OPERAND (op, k)) + continue; + + if (TREE_CODE (op) == ARRAY_REF) + { + /* + Disable optimization like this: + a[i_18] = _5; + ... + foo (a[i_18]). + + Don't disable optimization like this: + a[i_18] = _5; + ... + foo (a[i_12]). + */ + if (operand_equal_p (op, lhs, OEP_ADDRESS_OF)) + { + stores_queue[i] = NULL; + break; + } + } + else + { + /* To be conservative, if TREE_OPERAND (op, k) + length > 1. Disable the possible dse + optimization. + Disable optimization like this: + a[i_18] = _5; + ... + foo (&a). + + Or + a[i_18] = _5; + ... + foo (test (&a)). + */ + if (TREE_OPERAND_LENGTH (TREE_OPERAND (op, k)) > 0 + || operand_equal_p (TREE_OPERAND (op, k), lhs, + 0) + || operand_equal_p (TREE_OPERAND (op, k), + TREE_OPERAND (lhs, 0), 0)) + { + stores_queue[i] = NULL; + break; + } + } + } + } + if (!stores_queue[i]) + break; + } + } + break; + default: + gcc_unreachable (); + } + } +} + /* Attempt to eliminate dead stores in the statement referenced by BSI. A dead store is a store into a memory location which will later be @@ -1344,7 +1524,8 @@ dse_optimize_call (gimple_stmt_iterator *gsi, sbitmap live_bytes) post dominates the first store, then the first store is dead. */ static void -dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes) +dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes, + auto_vec &stores_queue) { gimple *stmt = gsi_stmt (*gsi); @@ -1469,7 +1650,25 @@ dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes) byte_tracking_enabled, live_bytes, &by_clobber_p); if (store_status == DSE_STORE_LIVE) - return; + { + if (gimple_assign_single_p (stmt) && !gimple_has_side_effects (stmt) + && !is_ctrl_altering_stmt (stmt) + && (!stmt_could_throw_p (fun, stmt) + || fun->can_delete_dead_exceptions)) + { + if (dse_search_def_stores (fun, stores_queue, stmt)) + ; + else if (dse_can_def_ref_p (stmt)) + { + stores_queue.safe_push (stmt); + return; + } + else + return; + } + else + return; + } if (store_status == DSE_STORE_MAYBE_PARTIAL_DEAD) { @@ -1566,13 +1765,17 @@ pass_dse::execute (function *fun) { basic_block bb = BASIC_BLOCK_FOR_FN (fun, rpo[i-1]); gimple_stmt_iterator gsi; + /* Queue the stores which potentially updates mem of a previous + redundant store. We only do the optimization within a basic block. */ + auto_vec stores_queue; for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);) { gimple *stmt = gsi_stmt (gsi); + dse_def_ref_analysis (stmt, stores_queue); if (gimple_vdef (stmt)) - dse_optimize_stmt (fun, &gsi, live_bytes); + dse_optimize_stmt (fun, &gsi, live_bytes, stores_queue); else if (def_operand_p def_p = single_ssa_def_operand (stmt, SSA_OP_DEF)) {