From patchwork Thu Jul 13 09:53:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 119734
Date: Thu, 13 Jul 2023 09:53:14 +0000 (UTC)
To: gcc-patches@gcc.gnu.org
Subject: [PATCH][RFC] tree-optimization/88540 - FP x > y ? x : y if-conversion without -ffast-math
User-Agent: Alpine 2.22 (LSU 394 2020-01-19)
List-Id: Gcc-patches mailing list
From: Richard Biener
Reply-To: Richard Biener
Message-Id: <20230713095402.E26EA385770D@sourceware.org>

The following makes sure that FP x > y ? x : y style max/min
operations are if-converted at the GIMPLE level.  While we can match
it neither to MAX_EXPR nor to .FMAX, as both have IEEE semantics
different from the ternary ?: operation, we can make sure to maintain
this form as a COND_EXPR so backends have the chance to match it to
instructions their ISA offers.
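
To illustrate why the ternary form is neither a commutative max nor C's
fmax (which returns the non-NaN operand when exactly one argument is a
NaN), here is a minimal sketch.  The helper names ternary_max,
ternary_min and ternary_minmax_examples are hypothetical and not part of
the patch; the assertions assume IEEE 754 semantics, i.e. compilation
without -ffast-math.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helpers (not part of the patch): the source-level forms
   that are meant to stay a COND_EXPR when NaNs or signed zeros must be
   honored.  */
double ternary_max (double x, double y) { return x > y ? x : y; }
double ternary_min (double x, double y) { return x < y ? x : y; }

/* Checks that hold under IEEE 754 semantics (no -ffast-math).  */
void ternary_minmax_examples (void)
{
  /* An ordered comparison with a NaN is false, so the second operand
     is always the one selected.  */
  assert (ternary_max (NAN, 1.0) == 1.0);
  assert (isnan (ternary_max (1.0, NAN)));
  assert (ternary_min (NAN, 1.0) == 1.0);
  assert (isnan (ternary_min (1.0, NAN)));

  /* +0.0 and -0.0 compare equal, so again the second operand wins and
     the result depends on operand order.  */
  assert (signbit (ternary_max (0.0, -0.0)) != 0);
  assert (signbit (ternary_max (-0.0, 0.0)) == 0);
}
```

Since the result depends on operand order in both the NaN and the
signed-zero case, the operation cannot be folded into a commutative
min/max without changing behavior.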

The patch does this in phiopt, where we recognize min/max: instead of
giving up when we have to honor NaNs, we alter the generated code to a
COND_EXPR.

This resolves PR88540, and we can then SLP vectorize the min operation
for its testcase.  It also resolves part of the regressions observed
with the change matching bit-inserts of bit-field-refs to vec_perm.

Expansion from a COND_EXPR rather than from compare-and-branch
regresses gcc.target/i386/pr54855-13.c and gcc.target/i386/pr54855-9.c
by producing extra moves: while the corresponding min/max operations
are now already synthesized by RTL expansion, register selection isn't
optimal.  This can also be provoked without this change by altering
the operand order in the source.

It regresses gcc.target/i386/pr110170.c, where we end up CSEing the
condition, which makes RTL expansion no longer produce the min/max
directly, and code generation is obfuscated enough to confuse RTL
if-conversion.

It also regresses gcc.target/i386/ssefp-[12].c, where oddly one variant
isn't if-converted and ix86_expand_fp_movcc doesn't match directly
(the FP constants get expanded twice).  A fix could be in
emit_conditional_move, where both prepare_cmp_insn and
emit_conditional_move_1 force the constants to (different) registers.

Otherwise bootstrapped and tested on x86_64-unknown-linux-gnu.

	PR tree-optimization/88540
	* tree-ssa-phiopt.cc (minmax_replacement): Do not give up
	with NaNs but handle the simple case by if-converting to a
	COND_EXPR.

	* gcc.target/i386/pr88540.c: New testcase.
	* gcc.target/i386/pr54855-12.c: Adjust.
	* gcc.target/i386/pr54855-13.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr54855-12.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr54855-13.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr88540.c    | 10 ++++++++++
 gcc/tree-ssa-phiopt.cc                     | 21 ++++++++++++++++-----
 4 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88540.c

diff --git a/gcc/testsuite/gcc.target/i386/pr54855-12.c b/gcc/testsuite/gcc.target/i386/pr54855-12.c
index 2f8af392c83..09e8ab8ae39 100644
--- a/gcc/testsuite/gcc.target/i386/pr54855-12.c
+++ b/gcc/testsuite/gcc.target/i386/pr54855-12.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -mavx512fp16" } */
-/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
+/* { dg-final { scan-assembler-times "vm\[ai\]\[nx\]sh\[ \\t\]" 1 } } */
 /* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
 /* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-13.c b/gcc/testsuite/gcc.target/i386/pr54855-13.c
index 87b4f459a5a..a4f25066f81 100644
--- a/gcc/testsuite/gcc.target/i386/pr54855-13.c
+++ b/gcc/testsuite/gcc.target/i386/pr54855-13.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -mavx512fp16" } */
-/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
+/* { dg-final { scan-assembler-times "vm\[ai\]\[nx\]sh\[ \\t\]" 1 } } */
 /* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
 /* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr88540.c b/gcc/testsuite/gcc.target/i386/pr88540.c
new file mode 100644
index 00000000000..b927d0c57d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88540.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2" } */
+
+void test(double* __restrict d1, double* __restrict d2, double* __restrict d3)
+{
+  for (int n = 0; n < 2; ++n)
+    d3[n] = d1[n] < d2[n] ? d1[n] : d2[n];
+}
+
+/* { dg-final { scan-assembler "minpd" } } */
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 467c9fd108a..13ee486831d 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -1580,10 +1580,6 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
 
   tree type = TREE_TYPE (PHI_RESULT (phi));
 
-  /* The optimization may be unsafe due to NaNs.  */
-  if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
-    return false;
-
   gcond *cond = as_a <gcond *> (*gsi_last_bb (cond_bb));
   enum tree_code cmp = gimple_cond_code (cond);
   tree rhs = gimple_cond_rhs (cond);
@@ -1770,6 +1766,9 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
           else
             return false;
         }
+      else if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
+        /* The optimization may be unsafe due to NaNs.  */
+        return false;
       else if (middle_bb != alt_middle_bb && threeway_p)
         {
           /* Recognize the following case:
@@ -2103,7 +2102,19 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb, basic_block alt_
   /* Emit the statement to compute min/max.  */
   gimple_seq stmts = NULL;
   tree phi_result = PHI_RESULT (phi);
-  result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, arg1);
+
+  /* When we can't use a MIN/MAX_EXPR still make sure the expression
+     stays in a form to be recognized by ISA that map to IEEE x > y ? x : y
+     semantics (that's not IEEE max semantics).  */
+  if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
+    {
+      result = gimple_build (&stmts, cmp, boolean_type_node,
+                             gimple_cond_lhs (cond), rhs);
+      result = gimple_build (&stmts, COND_EXPR, TREE_TYPE (phi_result),
+                             result, arg_true, arg_false);
+    }
+  else
+    result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, arg1);
 
   gsi = gsi_last_bb (cond_bb);
   gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);