From patchwork Fri May 5 23:16:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hans-Peter Nilsson
X-Patchwork-Id: 90624
X-Original-To: gcc-patches@gcc.gnu.org
Subject: [committed] CRIS: peephole2 a lsrq into a lslq+lsrq pair
Message-ID: <20230505231617.1D56B2043B@pchp3.se.axis.com>
Date: Sat, 6 May 2023 01:16:17 +0200
List-Id: Gcc-patches mailing list
From: Hans-Peter Nilsson
Reply-To: Hans-Peter Nilsson

Observed after opsplit1 with AND in libgcc floating-point functions,
like the first spottings of opsplit1/AND opportunities.

Two patterns are nominally needed, as the peephole2 optimizer
continues from the *first replacement* insn, not from a minimum
context for general matching: a "free-standing" one matching the
three shifts, and one that includes, as its last part, the match of
the previous define_peephole2 that may generate the sequence (i.e.
opsplit1/AND).  But the free-standing opportunity didn't match by
itself in a gcc build of libraries plus a run of the test-suite, and
was thus deemed uninteresting and left out.  (As expected; if it had
matched, that would have indicated a previously missed optimization
or other problem elsewhere.)  Only the one that includes the
opsplit1/AND match matches easily.

Coremark results aren't impressive though: 0.003% improvement in
speed and slightly less than 0.1% in size.

One testcase is added for the new pattern, and another one to check
that movulsr is used in a case where both would match; movulsr is
preferable to lsrandsplit1 there.

gcc:
	* config/cris/cris.md (lsrandsplit1): New define_peephole2.

gcc/testsuite:
	* gcc.target/cris/peep2-lsrandsplit1.c,
	gcc.target/cris/peep2-movulsr2.c: New tests.
---
 gcc/config/cris/cris.md                       | 53 +++++++++++++++++++
 .../gcc.target/cris/peep2-lsrandsplit1.c      | 19 +++++++
 .../gcc.target/cris/peep2-movulsr2.c          | 19 +++++++
 3 files changed, 91 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c
 create mode 100644 gcc/testsuite/gcc.target/cris/peep2-movulsr2.c

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index e72943b942e5..d5aadf752e86 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -2690,6 +2690,59 @@ (define_peephole2 ; movulsr
     = INTVAL (operands[2]) <= 0xff ? GEN_INT (0xff) : GEN_INT (0xffff);
 })
 
+;; Avoid, after opsplit1 with AND (below), sequences of:
+;;  lsrq N,R
+;;  lslq M,R
+;;  lsrq M,R
+;; (N < M), where we can fold the first lsrq into the lslq-lsrq, like:
+;;  lslq M-N,R
+;;  lsrq M,R
+;; We have to match this before opsplit1 below and before other peephole2s of
+;; lesser value, since peephole2 matching resumes at the first generated insn,
+;; and thus wouldn't match a pattern of the three shifts after opsplit1/AND.
+;; Note that this lsrandsplit1 is in turn of lesser value than movulsr, since
+;; that one doesn't require the same operand for source and destination, but
+;; they happen to be the same hard-register at peephole2 time even if
+;; naturally separated like in peep2-movulsr2.c, thus this placement.  (Source
+;; and destination will be re-separated and the move optimized out in
+;; cprop_hardreg at time of this writing.)
+;; Testcase: gcc.target/cris/peep2-lsrandsplit1.c
+(define_peephole2 ; lsrandsplit1
+  [(parallel
+    [(set (match_operand:SI 0 "register_operand")
+          (lshiftrt:SI
+           (match_operand:SI 1 "register_operand")
+           (match_operand:SI 2 "const_int_operand")))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+    [(set (match_operand 3 "register_operand")
+          (and
+           (match_operand 4 "register_operand")
+           (match_operand 5 "const_int_operand")))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])]
+  "REGNO (operands[0]) == REGNO (operands[1])
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && REGNO (operands[0]) == REGNO (operands[4])
+   && (INTVAL (operands[2])
+       < (clz_hwi (INTVAL (operands[5])) - (HOST_BITS_PER_WIDE_INT - 32)))
+   && cris_splittable_constant_p (INTVAL (operands[5]), AND, SImode,
+                                  optimize_function_for_speed_p (cfun)) == 2"
+  ;; We're guaranteed by the above clz_hwi test (certainly non-zero) and the
+  ;; test for a two-insn return-value from cris_splittable_constant_p, that
+  ;; the cris_splittable_constant_p AND-replacement would be lslq-lsrq.
+  [(parallel
+    [(set (match_dup 0) (ashift:SI (match_dup 0) (match_dup 9)))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+    [(set (match_dup 0) (lshiftrt:SI (match_dup 0) (match_dup 10)))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])]
+{
+  HOST_WIDE_INT shiftval
+    = clz_hwi (INTVAL (operands[5])) - (HOST_BITS_PER_WIDE_INT - 32);
+  operands[9] = GEN_INT (shiftval - INTVAL (operands[2]));
+  operands[10] = GEN_INT (shiftval);
+})
+
 ;; Testcase for the following four peepholes: gcc.target/cris/peep2-xsrand.c
 
 (define_peephole2 ; asrandb
diff --git a/gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c b/gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c
new file mode 100644
index 000000000000..0da645358771
--- /dev/null
+++ b/gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-final { scan-assembler-not " and" } } */
+/* { dg-final { scan-assembler-times "lsrq " 2 } } */
+/* { dg-final { scan-assembler-times "lslq " 2 } } */
+/* { dg-options "-O2" } */
+
+/* Test the "lsrandsplit1" peephole2 trivially.  */
+
+unsigned int
+andwlsr (unsigned int x)
+{
+  return (x >> 17) & 0x7ff;
+}
+
+int
+andwasr (int x)
+{
+  return (x >> 17) & 0x7ff;
+}
diff --git a/gcc/testsuite/gcc.target/cris/peep2-movulsr2.c b/gcc/testsuite/gcc.target/cris/peep2-movulsr2.c
new file mode 100644
index 000000000000..4696e71138cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/cris/peep2-movulsr2.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-final { scan-assembler "movu.w " } } */
+/* { dg-final { scan-assembler "movu.b " } } */
+/* { dg-final { scan-assembler-not "and.. " } } */
+/* { dg-options "-O2" } */
+
+/* Test the "movulsrb", "movulsrw" peephole2:s trivially.  */
+
+unsigned int
+movulsrb (unsigned y, unsigned int x)
+{
+  return (x & 255) >> 1;
+}
+
+unsigned int
+movulsrw (unsigned y, unsigned int x)
+{
+  return (x & 65535) >> 4;
+}
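
For illustration only, and not part of the patch: a minimal host-side
sketch in C of the arithmetic behind lsrandsplit1.  It assumes the AND
mask is a non-zero run of low bits, as a stand-in for the
cris_splittable_constant_p (...) == 2 condition, which is not modelled
here.  The names clz32, lsr_and and lsl_lsr are hypothetical; clz32
plays the role of clz_hwi (mask) - (HOST_BITS_PER_WIDE_INT - 32).

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zero bits of a non-zero 32-bit value.  Hypothetical
   stand-in for the clz_hwi expression used in the peephole2.  */
static unsigned
clz32 (uint32_t v)
{
  unsigned n = 0;

  assert (v != 0);
  while ((v & 0x80000000u) == 0)
    {
      v <<= 1;
      n++;
    }
  return n;
}

/* The sequence as written in the source: "lsrq N" followed by "and".  */
static uint32_t
lsr_and (uint32_t x, unsigned n, uint32_t mask)
{
  return (x >> n) & mask;
}

/* The replacement: "lslq M-N" followed by "lsrq M", where M is the
   number of leading zeros in the mask.  Assumes the mask is a run of
   low bits and N < M, mirroring the peephole2 condition.  */
static uint32_t
lsl_lsr (uint32_t x, unsigned n, uint32_t mask)
{
  unsigned m;

  assert (mask != 0 && (mask & (mask + 1)) == 0);
  m = clz32 (mask);
  assert (n < m);
  return (x << (m - n)) >> m;
}
```

With the values from peep2-lsrandsplit1.c (N = 17, mask 0x7ff), clz32
gives M = 21, so the sketch shifts left by 4 and then right by 21, and
lsl_lsr (x, 17, 0x7ff) should agree with lsr_and (x, 17, 0x7ff) for
any 32-bit x.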