From patchwork Wed Jun 7 16:58:53 2023
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 104630
Date: Wed, 7 Jun 2023 18:58:53 +0200
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] optabs: Implement double-word ctz and ffs expansion
From: Jakub Jelinek

Hi!

We have had expand_doubleword_clz for a couple of years; it emits
double-word CLZ as
  if (high_word == 0)
    return CLZ (low_word) + word_size;
  else
    return CLZ (high_word);
We can do something similar for CTZ and FFS IMHO, just with the 2 words
swapped.  So
  if (low_word == 0)
    return CTZ (high_word) + word_size;
  else
    return CTZ (low_word);
for CTZ and
  if (low_word == 0)
    return high_word ? FFS (high_word) + word_size : 0;
  else
    return FFS (low_word);
for FFS.

The following patch implements that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, on some targets which implement both word_mode ctz and ffs
patterns, it might be better to incrementally implement the double-word
ffs expansion patterns in the md files, because the generic expansion
can't be optimized well: nothing can detect that we have just made sure
the argument is not 0 and so don't need to bother with handling that
case.  So, on ia32 just using the CTZ patterns would be better, but I
think we can do even better there and, instead of comparing the operands
against 0, do the CTZ expansion followed by testing of the flags.

2023-06-07  Jakub Jelinek

	* optabs.cc (expand_ffs): Add forward declaration.
	(expand_doubleword_clz): Rename to ...
	(expand_doubleword_clz_ctz_ffs): ... this.  Add UNOPTAB argument,
	handle also doubleword CTZ and FFS in addition to CLZ.
	(expand_unop): Adjust caller.  Also call it for doubleword
	ctz_optab and ffs_optab.

	* gcc.target/i386/ctzll-1.c: New test.
	* gcc.target/i386/ffsll-1.c: New test.
	Jakub

--- gcc/optabs.cc.jj	2023-06-07 09:42:14.701130305 +0200
+++ gcc/optabs.cc	2023-06-07 14:35:04.909879272 +0200
@@ -2697,10 +2697,14 @@ expand_clrsb_using_clz (scalar_int_mode
   return temp;
 }
 
-/* Try calculating clz of a double-word quantity as two clz's of word-sized
-   quantities, choosing which based on whether the high word is nonzero.  */
+static rtx expand_ffs (scalar_int_mode, rtx, rtx);
+
+/* Try calculating clz, ctz or ffs of a double-word quantity as two clz, ctz or
+   ffs operations on word-sized quantities, choosing which based on whether the
+   high (for clz) or low (for ctz and ffs) word is nonzero.  */
 static rtx
-expand_doubleword_clz (scalar_int_mode mode, rtx op0, rtx target)
+expand_doubleword_clz_ctz_ffs (scalar_int_mode mode, rtx op0, rtx target,
+                               optab unoptab)
 {
   rtx xop0 = force_reg (mode, op0);
   rtx subhi = gen_highpart (word_mode, xop0);
@@ -2709,6 +2713,7 @@ expand_doubleword_clz (scalar_int_mode m
   rtx_code_label *after_label = gen_label_rtx ();
   rtx_insn *seq;
   rtx temp, result;
+  int addend = 0;
 
   /* If we were not given a target, use a word_mode register, not a
      'mode' register.  The result will fit, and nobody is expecting
@@ -2721,6 +2726,9 @@ expand_doubleword_clz (scalar_int_mode m
      'target' to tag a REG_EQUAL note on.  */
   result = gen_reg_rtx (word_mode);
 
+  if (unoptab != clz_optab)
+    std::swap (subhi, sublo);
+
   start_sequence ();
 
   /* If the high word is not equal to zero,
@@ -2728,7 +2736,13 @@ expand_doubleword_clz (scalar_int_mode m
   emit_cmp_and_jump_insns (subhi, CONST0_RTX (word_mode), EQ, 0,
                            word_mode, true, hi0_label);
 
-  temp = expand_unop_direct (word_mode, clz_optab, subhi, result, true);
+  if (optab_handler (unoptab, word_mode) != CODE_FOR_nothing)
+    temp = expand_unop_direct (word_mode, unoptab, subhi, result, true);
+  else
+    {
+      gcc_assert (unoptab == ffs_optab);
+      temp = expand_ffs (word_mode, subhi, result);
+    }
   if (!temp)
     goto fail;
 
@@ -2739,14 +2753,32 @@ expand_doubleword_clz (scalar_int_mode m
   emit_barrier ();
 
   /* Else clz of the full value is clz of the low word plus the number
-     of bits in the high word.  */
+     of bits in the high word.  Similarly for ctz/ffs of the high word,
+     except that ffs should be 0 when both words are zero.  */
   emit_label (hi0_label);
 
-  temp = expand_unop_direct (word_mode, clz_optab, sublo, 0, true);
+  if (unoptab == ffs_optab)
+    {
+      convert_move (result, const0_rtx, true);
+      emit_cmp_and_jump_insns (sublo, CONST0_RTX (word_mode), EQ, 0,
+                               word_mode, true, after_label);
+    }
+
+  if (optab_handler (unoptab, word_mode) != CODE_FOR_nothing)
+    temp = expand_unop_direct (word_mode, unoptab, sublo, NULL_RTX, true);
+  else
+    {
+      gcc_assert (unoptab == ffs_optab);
+      temp = expand_unop_direct (word_mode, ctz_optab, sublo, NULL_RTX, true);
+      addend = 1;
+    }
+
   if (!temp)
     goto fail;
+
   temp = expand_binop (word_mode, add_optab, temp,
-                       gen_int_mode (GET_MODE_BITSIZE (word_mode), word_mode),
+                       gen_int_mode (GET_MODE_BITSIZE (word_mode) + addend,
+                                     word_mode),
                        result, true, OPTAB_DIRECT);
   if (!temp)
     goto fail;
@@ -2759,7 +2791,7 @@ expand_doubleword_clz (scalar_int_mode m
   seq = get_insns ();
   end_sequence ();
 
-  add_equal_note (seq, target, CLZ, xop0, NULL_RTX, mode);
+  add_equal_note (seq, target, optab_to_code (unoptab), xop0, NULL_RTX, mode);
   emit_insn (seq);
   return target;
 
@@ -3252,7 +3284,8 @@ expand_unop (machine_mode mode, optab un
       if (GET_MODE_SIZE (int_mode) == 2 * UNITS_PER_WORD
           && optab_handler (unoptab, word_mode) != CODE_FOR_nothing)
         {
-          temp = expand_doubleword_clz (int_mode, op0, target);
+          temp = expand_doubleword_clz_ctz_ffs (int_mode, op0, target,
+                                                unoptab);
           if (temp)
             return temp;
         }
@@ -3499,6 +3532,18 @@ expand_unop (machine_mode mode, optab un
           if (temp)
             return temp;
         }
+
+      if ((unoptab == ctz_optab || unoptab == ffs_optab)
+          && optimize_insn_for_speed_p ()
+          && is_a <scalar_int_mode> (mode, &int_mode)
+          && GET_MODE_SIZE (int_mode) == 2 * UNITS_PER_WORD
+          && (optab_handler (unoptab, word_mode) != CODE_FOR_nothing
+              || optab_handler (ctz_optab, word_mode) != CODE_FOR_nothing))
+        {
+          temp = expand_doubleword_clz_ctz_ffs (int_mode, op0, target, unoptab);
+          if (temp)
+            return temp;
+        }
 
     try_libcall:
       /* Now try a library call in this mode.  */
--- gcc/testsuite/gcc.target/i386/ctzll-1.c.jj	2023-06-07 14:38:58.749648164 +0200
+++ gcc/testsuite/gcc.target/i386/ctzll-1.c	2023-06-07 14:41:22.676659439 +0200
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "__ctzdi2" } } */
+
+int
+foo (unsigned long long x)
+{
+  return __builtin_ctzll (x);
+}
--- gcc/testsuite/gcc.target/i386/ffsll-1.c.jj	2023-06-07 14:40:00.859789953 +0200
+++ gcc/testsuite/gcc.target/i386/ffsll-1.c	2023-06-07 14:41:15.104764068 +0200
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "__ffsdi2" } } */
+
+int
+foo (unsigned long long x)
+{
+  return __builtin_ffsll (x);
+}
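
For anyone who wants to sanity-check the decomposition outside the
compiler, below is a minimal C model of it.  It is only a sketch, not
part of the patch: it assumes 32-bit words making up a 64-bit value and
the GCC/Clang builtins, and the names WORD_BITS, dw_clz, dw_ctz, dw_ffs
and count_mismatches are made up for illustration.  dw_ffs models the
fallback where only a word-sized ctz is available, so the high-word case
adds word_size + 1 (the "addend" in the patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this model: a double word is 64 bits, a word 32 bits.  */
#define WORD_BITS 32

/* clz: look at the high word first.  Undefined for x == 0, like the
   builtin.  */
static int
dw_clz (uint64_t x)
{
  uint32_t hi = (uint32_t) (x >> WORD_BITS);
  uint32_t lo = (uint32_t) x;
  if (hi == 0)
    return __builtin_clz (lo) + WORD_BITS;
  return __builtin_clz (hi);
}

/* ctz: same shape with the two words swapped.  Undefined for x == 0.  */
static int
dw_ctz (uint64_t x)
{
  uint32_t hi = (uint32_t) (x >> WORD_BITS);
  uint32_t lo = (uint32_t) x;
  if (lo == 0)
    return __builtin_ctz (hi) + WORD_BITS;
  return __builtin_ctz (lo);
}

/* ffs built only from word-sized ctz: ffs (w) == ctz (w) + 1 for
   nonzero w, and the result is 0 when both words are zero.  */
static int
dw_ffs (uint64_t x)
{
  uint32_t hi = (uint32_t) (x >> WORD_BITS);
  uint32_t lo = (uint32_t) x;
  if (lo != 0)
    return __builtin_ctz (lo) + 1;
  if (hi == 0)
    return 0;
  return __builtin_ctz (hi) + WORD_BITS + 1;
}

/* Compare the model against the 64-bit builtins on a few sample values
   and return the number of disagreements.  */
static int
count_mismatches (void)
{
  static const uint64_t samples[] = {
    1, 2, 0x80000000ULL, 0x100000000ULL, 0x0000123400000000ULL,
    0x8000000000000000ULL, 0xffffffffffffffffULL
  };
  int bad = 0;
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint64_t x = samples[i];
      bad += dw_clz (x) != __builtin_clzll (x);
      bad += dw_ctz (x) != __builtin_ctzll (x);
      bad += dw_ffs (x) != __builtin_ffsll ((long long) x);
    }
  /* Zero is only well defined for ffs.  */
  bad += dw_ffs (0) != 0;
  return bad;
}
```

Calling count_mismatches () from a small main and checking that it
returns 0 is enough to exercise all three helpers; zero is deliberately
left out of the clz/ctz samples because the result is undefined there.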