From patchwork Tue Jul 18 11:25:45 2023
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 121949
Date: Tue, 18 Jul 2023 13:25:45 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: Jakub Jelinek
Subject: [PATCH] middle-end/61747 - conditional move expansion and constants
Message-Id: <20230718112545.BC5D413494@imap2.suse-dmz.suse.de>
List-Id: Gcc-patches mailing list

When expanding a COND_EXPR or a VEC_COND_EXPR the x86 backend, for
example, tries to match FP min/max instructions.  That only works when
it can see that the comparison operands and the selected operands are
equal.  Both prepare_cmp_insn and vector_compare_rtx break this: the
former forces expensive constants into a register and the latter
legitimizes its operands, so the comparison ends up referring to
different RTXes than the value operands.  The patch below fixes this
in the callers, preserving equalities that held before expansion.

Bootstrap and regtest in progress.

OK if that succeeds?

Thanks,
Richard.

	PR middle-end/61747
	* internal-fn.cc (expand_vec_cond_optab_fn): When the
	value operands are equal to the original comparison operands
	preserve that equality by re-using the comparison expansion.
	* optabs.cc (emit_conditional_move): When the value operands
	are equal to the comparison operands and would be forced to
	a register by prepare_cmp_insn do so earlier, preserving the
	equality.

	* g++.target/i386/pr61747.C: New testcase.
---
 gcc/internal-fn.cc                      | 17 ++++++++--
 gcc/optabs.cc                           | 32 ++++++++++++++++++-
 gcc/testsuite/g++.target/i386/pr61747.C | 42 +++++++++++++++++++++++++
 3 files changed, 88 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr61747.C

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index e698f0bffc7..c83c3921792 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3019,8 +3019,21 @@ expand_vec_cond_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   icode = convert_optab_handler (optab, mode, cmp_op_mode);
   rtx comparison
     = vector_compare_rtx (VOIDmode, tcode, op0a, op0b, unsignedp, icode, 4);
-  rtx rtx_op1 = expand_normal (op1);
-  rtx rtx_op2 = expand_normal (op2);
+  /* vector_compare_rtx legitimizes operands, preserve equality when
+     expanding op1/op2.  */
+  rtx rtx_op1, rtx_op2;
+  if (operand_equal_p (op1, op0a))
+    rtx_op1 = XEXP (comparison, 0);
+  else if (operand_equal_p (op1, op0b))
+    rtx_op1 = XEXP (comparison, 1);
+  else
+    rtx_op1 = expand_normal (op1);
+  if (operand_equal_p (op2, op0a))
+    rtx_op2 = XEXP (comparison, 0);
+  else if (operand_equal_p (op2, op0b))
+    rtx_op2 = XEXP (comparison, 1);
+  else
+    rtx_op2 = expand_normal (op2);
 
   rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   create_output_operand (&ops[0], target, mode);
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 4e9f58f8060..a9ba3267666 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -5119,13 +5119,43 @@ emit_conditional_move (rtx target, struct rtx_comparison comp,
       last = get_last_insn ();
       do_pending_stack_adjust ();
       machine_mode cmpmode = comp.mode;
+      rtx orig_op0 = XEXP (comparison, 0);
+      rtx orig_op1 = XEXP (comparison, 1);
+      rtx op2p = op2;
+      rtx op3p = op3;
+      /* If we are optimizing, force expensive constants into a register
+         but preserve an eventual equality with op2/op3.  */
+      if (CONSTANT_P (orig_op0) && optimize
+          && (rtx_cost (orig_op0, mode, COMPARE, 0,
+                        optimize_insn_for_speed_p ())
+              > COSTS_N_INSNS (1))
+          && can_create_pseudo_p ())
+        {
+          XEXP (comparison, 0) = force_reg (cmpmode, orig_op0);
+          if (rtx_equal_p (orig_op0, op2))
+            op2p = XEXP (comparison, 0);
+          if (rtx_equal_p (orig_op0, op3))
+            op3p = XEXP (comparison, 0);
+        }
+      if (CONSTANT_P (orig_op1) && optimize
+          && (rtx_cost (orig_op1, mode, COMPARE, 0,
+                        optimize_insn_for_speed_p ())
+              > COSTS_N_INSNS (1))
+          && can_create_pseudo_p ())
+        {
+          XEXP (comparison, 1) = force_reg (cmpmode, orig_op1);
+          if (rtx_equal_p (orig_op1, op2))
+            op2p = XEXP (comparison, 1);
+          if (rtx_equal_p (orig_op1, op3))
+            op3p = XEXP (comparison, 1);
+        }
       prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
                         GET_CODE (comparison), NULL_RTX, unsignedp,
                         OPTAB_WIDEN, &comparison, &cmpmode);
       if (comparison)
         {
           rtx res = emit_conditional_move_1 (target, comparison,
-                                             op2, op3, mode);
+                                             op2p, op3p, mode);
           if (res != NULL_RTX)
             return res;
         }
diff --git a/gcc/testsuite/g++.target/i386/pr61747.C b/gcc/testsuite/g++.target/i386/pr61747.C
new file mode 100644
index 00000000000..024ef400052
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr61747.C
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c++11 } */
+/* { dg-options "-O2 -msse4.1 -mfpmath=sse" } */
+
+typedef float __attribute__( ( vector_size( 16 ) ) ) float32x4_t;
+
+template<typename V1>
+V1 vmax(V1 a, V1 b) {
+  return (a>b) ? a : b;
+}
+
+template<typename V1>
+V1 vmin(V1 a, V1 b) {
+  return (a
+Float bart(Float a) {
+  constexpr Float zero{0.f};
+  constexpr Float it = zero+4.f;
+  constexpr Float zt = zero-3.f;
+  return vmin(vmax(a,zt),it);
+}
+
+float bar(float a) {
+  return bart(a);
+}
+
+float32x4_t bar(float32x4_t a) {
+  return bart(a);
+}
+
+/* { dg-final { scan-assembler-times "min" 4 } } */
+/* { dg-final { scan-assembler-times "max" 4 } } */