From patchwork Thu Mar  2 13:28:27 2023
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 63421
Date: Thu, 2 Mar 2023 13:28:27 +0000 (UTC)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: Jan Hubicka, ubizjak@gmail.com
Subject: [PATCH] target/108738 - limit STV chain discovery
Message-Id: <20230302132917.2668B3858425@sourceware.org>

The following puts a hard limit on the inherently quadratic STV chain
discovery.  Without a limit for the compiler.i testcase in PR26854
we see at -O2

 machine dep reorg                  : 574.45 ( 53%)

with release checking while with the proposed limit it's

 machine dep reorg                  :   2.86 (  1%)

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

	PR target/108738
	* config/i386/i386.opt (--param x86-stv-max-visits): New param.
	* doc/invoke.texi (--param x86-stv-max-visits): Document it.
	* config/i386/i386-features.h (scalar_chain::max_visits): New.
	(scalar_chain::build): Add bitmap parameter, return boolean.
	(scalar_chain::add_insn): Likewise.
	(scalar_chain::analyze_register_chain): Likewise.
	* config/i386/i386-features.cc (scalar_chain::scalar_chain):
	Initialize max_visits.
	(scalar_chain::analyze_register_chain): When we exhaust
	max_visits, abort.  Also abort when running into any
	disallowed insn.
	(scalar_chain::add_insn): Propagate abort.
	(scalar_chain::build): Likewise.  When aborting amend
	the set of disallowed insn with the insns set.
	(convert_scalars_to_vector): Adjust.  Do not convert aborted
	chains.
---
 gcc/config/i386/i386-features.cc | 77 +++++++++++++++++++++++---------
 gcc/config/i386/i386-features.h  | 10 +++--
 gcc/config/i386/i386.opt         |  4 ++
 gcc/doc/invoke.texi              |  4 ++
 4 files changed, 70 insertions(+), 25 deletions(-)

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index eff91301009..c09abf8fc20 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -296,6 +296,8 @@ scalar_chain::scalar_chain (enum machine_mode smode_, enum machine_mode vmode_)
 
   n_sse_to_integer = 0;
   n_integer_to_sse = 0;
+
+  max_visits = x86_stv_max_visits;
 }
 
 /* Free chain's data.  */
@@ -354,10 +356,12 @@ scalar_chain::mark_dual_mode_def (df_ref def)
 }
 
 /* Check REF's chain to add new insns into a queue
-   and find registers requiring conversion.  */
+   and find registers requiring conversion.  Return true if OK, false
+   if the analysis was aborted.  */
 
-void
-scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
+bool
+scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref,
+				      bitmap disallowed)
 {
   df_link *chain;
   bool mark_def = false;
@@ -371,6 +375,9 @@ scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
       if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref)))
 	continue;
 
+      if (--max_visits == 0)
+	return false;
+
       if (!DF_REF_REG_MEM_P (chain->ref))
 	{
 	  if (bitmap_bit_p (insns, uid))
@@ -381,6 +388,10 @@ scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
 	      add_to_queue (uid);
 	      continue;
 	    }
+
+	  /* If we run into parts of an aborted chain discovery abort.  */
+	  if (bitmap_bit_p (disallowed, uid))
+	    return false;
 	}
 
       if (DF_REF_REG_DEF_P (chain->ref))
@@ -401,15 +412,19 @@ scalar_chain::analyze_register_chain (bitmap candidates, df_ref ref)
 
   if (mark_def)
     mark_dual_mode_def (ref);
+
+  return true;
 }
 
-/* Add instruction into a chain.  */
+/* Add instruction into a chain.  Return true if OK, false if the search
+   was aborted.  */
 
-void
-scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid)
+bool
+scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid,
+			bitmap disallowed)
 {
   if (!bitmap_set_bit (insns, insn_uid))
-    return;
+    return true;
 
   if (dump_file)
     fprintf (dump_file, "  Adding insn %d to chain #%d\n", insn_uid, chain_id);
@@ -426,22 +441,27 @@ scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid)
   df_ref ref;
   for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
     if (!HARD_REGISTER_P (DF_REF_REG (ref)))
-      analyze_register_chain (candidates, ref);
+      if (!analyze_register_chain (candidates, ref, disallowed))
+	return false;
 
   /* The operand(s) of VEC_SELECT don't need to be converted/convertible.  */
   if (def_set && GET_CODE (SET_SRC (def_set)) == VEC_SELECT)
-    return;
+    return true;
 
   for (ref = DF_INSN_UID_USES (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
     if (!DF_REF_REG_MEM_P (ref))
-      analyze_register_chain (candidates, ref);
+      if (!analyze_register_chain (candidates, ref, disallowed))
+	return false;
+
+  return true;
 }
 
 /* Build new chain starting from insn INSN_UID recursively
-   adding all dependent uses and definitions.  */
+   adding all dependent uses and definitions.  Return true if OK, false
+   if the chain discovery was aborted.  */
 
-void
-scalar_chain::build (bitmap candidates, unsigned insn_uid)
+bool
+scalar_chain::build (bitmap candidates, unsigned insn_uid, bitmap disallowed)
 {
   queue = BITMAP_ALLOC (NULL);
   bitmap_set_bit (queue, insn_uid);
@@ -454,7 +474,17 @@ scalar_chain::build (bitmap candidates, unsigned insn_uid)
       insn_uid = bitmap_first_set_bit (queue);
       bitmap_clear_bit (queue, insn_uid);
       bitmap_clear_bit (candidates, insn_uid);
-      add_insn (candidates, insn_uid);
+      if (!add_insn (candidates, insn_uid, disallowed))
+	{
+	  /* If we aborted the search put sofar found insn on the set of
+	     disallowed insns so that further searches reaching them also
+	     abort and thus we abort the whole but yet undiscovered chain.  */
+	  bitmap_ior_into (disallowed, insns);
+	  if (dump_file)
+	    fprintf (dump_file, "Aborted chain #%d discovery\n", chain_id);
+	  BITMAP_FREE (queue);
+	  return false;
+	}
     }
 
   if (dump_file)
@@ -478,6 +508,8 @@ scalar_chain::build (bitmap candidates, unsigned insn_uid)
     }
 
   BITMAP_FREE (queue);
+
+  return true;
 }
 
 /* Return a cost of building a vector costant
@@ -2282,6 +2314,7 @@ convert_scalars_to_vector (bool timode_p)
 
   for (unsigned i = 0; i <= 2; ++i)
     {
+      auto_bitmap disallowed;
       bitmap_tree_view (&candidates[i]);
       while (!bitmap_empty_p (&candidates[i]))
 	{
@@ -2296,14 +2329,14 @@ convert_scalars_to_vector (bool timode_p)
 	  /* Find instructions chain we want to convert to vector mode.
 	     Check all uses and definitions to estimate all required
 	     conversions.  */
-	  chain->build (&candidates[i], uid);
-
-	  if (chain->compute_convert_gain () > 0)
-	    converted_insns += chain->convert ();
-	  else
-	    if (dump_file)
-	      fprintf (dump_file, "Chain #%d conversion is not profitable\n",
-		       chain->chain_id);
+	  if (chain->build (&candidates[i], uid, disallowed))
+	    {
+	      if (chain->compute_convert_gain () > 0)
+		converted_insns += chain->convert ();
+	      else if (dump_file)
+		fprintf (dump_file, "Chain #%d conversion is not profitable\n",
+			 chain->chain_id);
+	    }
 
 	  delete chain;
 	}
diff --git a/gcc/config/i386/i386-features.h b/gcc/config/i386/i386-features.h
index 00c2c5e8c2d..72a9f54b4e2 100644
--- a/gcc/config/i386/i386-features.h
+++ b/gcc/config/i386/i386-features.h
@@ -148,12 +148,15 @@ class scalar_chain
   /* Registers used in both vector and sclar modes.  */
   bitmap defs_conv;
 
+  /* Limit on chain discovery.  */
+  unsigned max_visits;
+
   bitmap insns_conv;
   hash_map<rtx, rtx> defs_map;
   unsigned n_sse_to_integer;
   unsigned n_integer_to_sse;
 
-  void build (bitmap candidates, unsigned insn_uid);
+  bool build (bitmap candidates, unsigned insn_uid, bitmap disallowed);
   virtual int compute_convert_gain () = 0;
   int convert ();
 
@@ -168,8 +171,9 @@ class scalar_chain
   void convert_registers ();
 
  private:
-  void add_insn (bitmap candidates, unsigned insn_uid);
-  void analyze_register_chain (bitmap candidates, df_ref ref);
+  bool add_insn (bitmap candidates, unsigned insn_uid, bitmap disallowed);
+  bool analyze_register_chain (bitmap candidates, df_ref ref,
+			       bitmap disallowed);
   virtual void convert_insn (rtx_insn *insn) = 0;
   virtual void convert_op (rtx *op, rtx_insn *insn) = 0;
 };
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 7d57f617d65..94fdd639ff1 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -599,6 +599,10 @@ Target Mask(STV) Save
 Disable Scalar to Vector optimization pass transforming 64-bit integer
 computations into a vector ones.
 
+-param=x86-stv-max-visits=
+Target Joined UInteger Var(x86_stv_max_visits) Init(10000) IntegerRange(1, 1000000) Param
+The maximum number of use and def visits when discovering a STV chain before the discovery is aborted.
+
 mdispatch-scheduler
 Target RejectNegative Var(flag_dispatch_scheduler)
 Do dispatch scheduling if processor is bdver1, bdver2, bdver3, bdver4
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0045661cc5d..2da68802356 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16229,6 +16229,10 @@ The following choices of @var{name} are available on i386 and x86_64 targets:
 @item x86-stlf-window-ninsns
 Instructions number above which STFL stall penalty can be compensated.
 
+@item x86-stv-max-visits
+The maximum number of use and def visits when discovering a STV chain before
+the discovery is aborted.
+
 @end table
 
 @end table
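
For readers following the approach rather than the diff itself, the
abort-and-poison scheme can be sketched outside of GCC roughly as follows.
This is a simplified model and not the code in the patch: Graph, build_chain
and the use of std::set are stand-ins invented for illustration, whereas the
patch works on DF use/def chains and GCC bitmaps.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the use/def links: links[u] lists the insns
// reachable from insn u through one use or def link.
using Graph = std::vector<std::vector<unsigned>>;

// Collect the chain containing SEED into CHAIN.  Returns false when the
// search is abandoned, either because the visit budget ran out or because
// it reached an insn recorded by an earlier abandoned search.
bool build_chain(const Graph &links, unsigned seed, unsigned max_visits,
                 std::set<unsigned> &disallowed, std::set<unsigned> &chain)
{
  // The new --param has a lower bound of 1; mirror that here.
  assert(max_visits > 0);

  std::vector<unsigned> queue{seed};
  while (!queue.empty())
    {
      unsigned uid = queue.back();
      queue.pop_back();
      if (!chain.insert(uid).second)
        continue;  // already part of this chain

      for (unsigned next : links[uid])
        {
          // Every link visit consumes budget, as in the patch.
          bool stop = --max_visits == 0 || disallowed.count(next) != 0;
          if (stop)
            {
              // Record everything found so far so that later searches
              // reaching these insns give up immediately.
              disallowed.insert(chain.begin(), chain.end());
              return false;
            }
          if (chain.count(next) == 0)
            queue.push_back(next);
        }
    }
  return true;
}
```

The point of sharing the disallowed set across searches is that a chain which
is too large is abandoned as a whole: once one search gives up, any later seed
belonging to the same chain hits a recorded insn after only a few visits
instead of spending the full budget again.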