From patchwork Fri Aug 12 10:00:45 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 485
Message-ID: <4166b06c-7713-2d4a-3c86-54e99f4a9f53@linux.ibm.com>
Date: Fri, 12 Aug 2022 12:00:45 +0200
Subject: [PATCH] s390: Add -munroll-only-small-loops.
To: GCC Patches
List-Id: Gcc-patches mailing list
From: Robin Dapp
Reply-To: Robin Dapp

Hi,

inspired by Power we also introduce -munroll-only-small-loops.  This
implies activating -funroll-loops and -munroll-only-small-loops at -O2
and above.

Bootstrapped and regtested.  This introduces one regression in
gcc.dg/sms-compare-debug-1.c but currently dumps for sms are broken as
well.  The difference is in the location of some INSN_DELETED notes so I
would consider this a minor issue.

Is it OK?

Regards
 Robin

gcc/ChangeLog:

	* common/config/s390/s390-common.cc: Enable -funroll-loops and
	-munroll-only-small-loops for OPT_LEVELS_2_PLUS_SPEED_ONLY.
	* config/s390/s390.cc (s390_loop_unroll_adjust): Do not unroll
	loops larger than 12 instructions.
	(s390_override_options_after_change): Set unroll options.
	(s390_option_override_internal): Likewise.
	* config/s390/s390.opt: Document munroll-only-small-loops.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/vector/vec-copysign.c: Do not unroll.
	* gcc.target/s390/zvector/autovec-double-quiet-uneq.c: Ditto.
	* gcc.target/s390/zvector/autovec-double-signaling-ltgt.c: Ditto.
	* gcc.target/s390/zvector/autovec-float-quiet-uneq.c: Ditto.
	* gcc.target/s390/zvector/autovec-float-signaling-ltgt.c: Ditto.
---
 gcc/common/config/s390/s390-common.cc         |  5 +++
 gcc/config/s390/s390.cc                       | 31 +++++++++++++++++++
 gcc/config/s390/s390.opt                      |  4 +++
 .../gcc.target/s390/vector/vec-copysign.c     |  2 +-
 .../s390/zvector/autovec-double-quiet-uneq.c  |  2 +-
 .../zvector/autovec-double-signaling-ltgt.c   |  2 +-
 .../s390/zvector/autovec-float-quiet-uneq.c   |  2 +-
 .../zvector/autovec-float-signaling-ltgt.c    |  2 +-
 8 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/s390/s390-common.cc b/gcc/common/config/s390/s390-common.cc
index 72a5ef47eaac..be3e6f201429 100644
--- a/gcc/common/config/s390/s390-common.cc
+++ b/gcc/common/config/s390/s390-common.cc
@@ -64,6 +64,11 @@ static const struct default_options s390_option_optimization_table[] =
   /* Enable -fsched-pressure by default when optimizing.  */
   { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
 
+  /* Enable -munroll-only-small-loops with -funroll-loops to unroll small
+     loops at -O2 and above by default.  */
+  { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
+  { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_munroll_only_small_loops, NULL, 1 },
+
   /* ??? There are apparently still problems with -fcaller-saves.  */
   { OPT_LEVELS_ALL, OPT_fcaller_saves, NULL, 0 },
 
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 5644600edf3d..ef38fbe68c84 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -15457,6 +15457,21 @@ s390_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
   if (s390_tune < PROCESSOR_2097_Z10)
     return nunroll;
 
+  if (unroll_only_small_loops)
+    {
+      /* Only unroll loops smaller than or equal to 12 insns.  */
+      const unsigned int small_threshold = 12;
+
+      if (loop->ninsns > small_threshold)
+	return 0;
+
+      /* ???: Make this dependent on the type of registers in
+	 the loop.  Increase the limit for vector registers.  */
+      const unsigned int max_insns = optimize >= 3 ? 36 : 24;
+
+      nunroll = MIN (nunroll, max_insns / loop->ninsns);
+    }
+
   /* Count the number of memory references within the loop body.  */
   bbs = get_loop_body (loop);
   subrtx_iterator::array_type array;
@@ -15531,6 +15546,19 @@ static void
 s390_override_options_after_change (void)
 {
   s390_default_align (&global_options);
+
+  /* Explicit -funroll-loops turns -munroll-only-small-loops off.  */
+  if ((OPTION_SET_P (flag_unroll_loops) && flag_unroll_loops)
+      || (OPTION_SET_P (flag_unroll_all_loops)
+	  && flag_unroll_all_loops))
+    {
+      if (!OPTION_SET_P (unroll_only_small_loops))
+	unroll_only_small_loops = 0;
+      if (!OPTION_SET_P (flag_cunroll_grow_size))
+	flag_cunroll_grow_size = 1;
+    }
+  else if (!OPTION_SET_P (flag_cunroll_grow_size))
+    flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
 }
 
 static void
@@ -15740,6 +15768,9 @@ s390_option_override_internal (struct gcc_options *opts,
   /* Set the default alignment.  */
   s390_default_align (opts);
 
+  /* Set unroll options.  */
+  s390_override_options_after_change ();
+
   /* Call target specific restore function to do post-init work.  At the moment,
      this just sets opts->x_s390_cost_pointer.  */
   s390_function_specific_restore (opts, opts_set, NULL);
diff --git a/gcc/config/s390/s390.opt b/gcc/config/s390/s390.opt
index 9e8d3bfd404c..c375b9c5f729 100644
--- a/gcc/config/s390/s390.opt
+++ b/gcc/config/s390/s390.opt
@@ -321,3 +321,7 @@ and the default behavior is to emit separate multiplication and addition
 instructions for long doubles in vector registers, because measurements show
 that this improves performance.  This option allows overriding it for testing
 purposes.
+
+munroll-only-small-loops
+Target Undocumented Var(unroll_only_small_loops) Init(0) Save
+; Use conservative small loop unrolling.
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-copysign.c b/gcc/testsuite/gcc.target/s390/vector/vec-copysign.c
index 64c6970c23e2..b723ceb13be9 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-copysign.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-copysign.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { s390*-*-* } } } */
-/* { dg-options "-O2 -ftree-vectorize -mzarch" } */
+/* { dg-options "-O2 -ftree-vectorize -mzarch -fno-unroll-loops" } */
 /* { dg-final { scan-assembler-times "vgmg" 1 } } */
 /* { dg-final { scan-assembler-times "vgmf" 1 } } */
 /* { dg-final { scan-assembler-times "vsel" 2 } } */
diff --git a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-uneq.c b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-uneq.c
index 7c9b20fd2e0f..8948be28ed5d 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-uneq.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-uneq.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -march=z13 -mzvector -mzarch" } */
+/* { dg-options "-O3 -march=z13 -mzvector -mzarch -fno-unroll-loops" } */
 
 #include "autovec.h"
 
diff --git a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-ltgt.c b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-ltgt.c
index 9dfae8f2f7e7..9417b0c4838f 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-ltgt.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-ltgt.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -march=z14 -mzvector -mzarch" } */
+/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fno-unroll-loops" } */
 
 #include "autovec.h"
 
diff --git a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-quiet-uneq.c b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-quiet-uneq.c
index 5ab9337880d0..0a2aca0d5dd3 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-quiet-uneq.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-quiet-uneq.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -march=z14 -mzvector -mzarch" } */
+/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fno-unroll-loops" } */
 
 #include "autovec.h"
 
diff --git a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-ltgt.c b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-ltgt.c
index c34cf0916087..15e61b70b0bd 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-ltgt.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-ltgt.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -march=z14 -mzvector -mzarch" } */
+/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fno-unroll-loops" } */
 
 #include "autovec.h"
 
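
For readers who want to see what the small-loop limit in s390_loop_unroll_adjust works out to, here is a minimal stand-alone sketch of just that arithmetic. It is not GCC code: the function name small_loop_unroll_factor and its plain integer parameters are hypothetical stand-ins for loop->ninsns and the global optimize level, and the assert on a non-empty loop body is an assumption of the sketch that the patch itself does not make. The rest of the hook (the s390_tune check and the memory-reference counting) is left out.

```c
#include <assert.h>

/* Same helper macro name as in the patch; defined locally for the sketch.  */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical model of the -munroll-only-small-loops limit.
   NUNROLL is the unroll factor proposed by the generic unroller,
   NINSNS models loop->ninsns, OPT_LEVEL models the global `optimize'.  */
static unsigned int
small_loop_unroll_factor (unsigned int nunroll, unsigned int ninsns,
			  int opt_level)
{
  /* Sketch-only assumption: the loop body is not empty.  */
  assert (ninsns > 0);

  /* Only unroll loops smaller than or equal to 12 insns.  */
  const unsigned int small_threshold = 12;
  if (ninsns > small_threshold)
    return 0;

  /* Cap the unrolled body at 24 insns, or 36 insns at -O3 and above.  */
  const unsigned int max_insns = opt_level >= 3 ? 36 : 24;

  return MIN (nunroll, max_insns / ninsns);
}
```

Under these assumptions a 4-insn loop with a proposed factor of 8 would be limited to 6 at -O2 (24 / 4) and keep its factor of 8 at -O3 (36 / 4 = 9), while a 13-insn loop would get a factor of 0 at either level.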