From patchwork Thu Jul 6 01:18:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 116468
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com
Subject: [PATCH 1/2] [x86] Add pre_reload splitter to detect fp min/max pattern.
Date: Thu, 6 Jul 2023 09:18:15 +0800
Message-Id: <20230706011816.3543708-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.39.1.388.g2fc9e9ca3c
List-Id: Gcc-patches mailing list

We have ix86_expand_sse_fp_minmax to detect min/max semantics, but it
requires rtx_equal_p for
cmp_op0/cmp_op1 and if_true/if_false. For the testcase in the PR,
there is an extra move from cmp_op0 to if_true, so
ix86_expand_sse_fp_minmax fails to match.

This patch adds a pre_reload splitter to detect the min/max pattern.

Operand order in MINSS matters for signed zeros and NaNs, since the
instruction always returns the second operand when either operand is
a NaN or both operands are zero.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR target/110170
	* config/i386/i386.md (*ieee_minmax<mode>3_1): New pre_reload
	splitter to detect fp min/max pattern.

gcc/testsuite/ChangeLog:

	* g++.target/i386/pr110170.C: New test.
	* gcc.target/i386/pr110170.c: New test.
---
 gcc/config/i386/i386.md                  | 30 +++++++++
 gcc/testsuite/g++.target/i386/pr110170.C | 78 ++++++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr110170.c | 18 ++++++
 3 files changed, 126 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/pr110170.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr110170.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e6ebc461e52..353bb21993d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -22483,6 +22483,36 @@ (define_insn "*ieee_s<ieee_maxmin><mode>3"
    (set_attr "type" "sseadd")
    (set_attr "mode" "<MODE>")])
 
+;; Operands order in min/max instruction matters for signed zero and NANs.
+(define_insn_and_split "*ieee_minmax<mode>3_1"
+  [(set (match_operand:MODEF 0 "register_operand")
+        (unspec:MODEF
+          [(match_operand:MODEF 1 "register_operand")
+           (match_operand:MODEF 2 "register_operand")
+           (lt:MODEF
+             (match_operand:MODEF 3 "register_operand")
+             (match_operand:MODEF 4 "register_operand"))]
+          UNSPEC_BLENDV))]
+  "SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
+  && ((rtx_equal_p (operands[1], operands[3])
+       && rtx_equal_p (operands[2], operands[4]))
+      || (rtx_equal_p (operands[1], operands[4])
+          && rtx_equal_p (operands[2], operands[3])))
+  && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  int u = (rtx_equal_p (operands[1], operands[3])
+           && rtx_equal_p (operands[2], operands[4]))
+           ? UNSPEC_IEEE_MAX : UNSPEC_IEEE_MIN;
+  emit_move_insn (operands[0],
+                  gen_rtx_UNSPEC (<MODE>mode,
+                                  gen_rtvec (2, operands[2], operands[1]),
+                                  u));
+  DONE;
+})
+
 ;; Make two stack loads independent:
 ;; fld aa              fld aa
 ;; fld %st(0)     ->   fld bb
diff --git a/gcc/testsuite/g++.target/i386/pr110170.C b/gcc/testsuite/g++.target/i386/pr110170.C
new file mode 100644
index 00000000000..1e9a781ca74
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr110170.C
@@ -0,0 +1,78 @@
+/* { dg-do run } */
+/* { dg-options " -O2 -march=x86-64 -mfpmath=sse -std=gnu++20" } */
+#include <math.h>
+
+void
+__attribute__((noinline))
+__cond_swap(double* __x, double* __y) {
+  bool __r = (*__x < *__y);
+  auto __tmp = __r ? *__x : *__y;
+  *__y = __r ? *__y : *__x;
+  *__x = __tmp;
+}
+
+auto test1() {
+  double nan = -0.0;
+  double x = 0.0;
+  __cond_swap(&nan, &x);
+  return x == -0.0 && nan == 0.0;
+}
+
+auto test1r() {
+  double nan = NAN;
+  double x = 1.0;
+  __cond_swap(&x, &nan);
+  return isnan(x) && signbit(x) == 0 && nan == 1.0;
+}
+
+auto test2() {
+  double nan = NAN;
+  double x = -1.0;
+  __cond_swap(&nan, &x);
+  return isnan(x) && signbit(x) == 0 && nan == -1.0;
+}
+
+auto test2r() {
+  double nan = NAN;
+  double x = -1.0;
+  __cond_swap(&x, &nan);
+  return isnan(x) && signbit(x) == 0 && nan == -1.0;
+}
+
+auto test3() {
+  double nan = -NAN;
+  double x = 1.0;
+  __cond_swap(&nan, &x);
+  return isnan(x) && signbit(x) == 1 && nan == 1.0;
+}
+
+auto test3r() {
+  double nan = -NAN;
+  double x = 1.0;
+  __cond_swap(&x, &nan);
+  return isnan(x) && signbit(x) == 1 && nan == 1.0;
+}
+
+auto test4() {
+  double nan = -NAN;
+  double x = -1.0;
+  __cond_swap(&nan, &x);
+  return isnan(x) && signbit(x) == 1 && nan == -1.0;
+}
+
+auto test4r() {
+  double nan = -NAN;
+  double x = -1.0;
+  __cond_swap(&x, &nan);
+  return isnan(x) && signbit(x) == 1 && nan == -1.0;
+}
+
+
+int main() {
+  if (
+      !test1() || !test1r()
+      || !test2() || !test2r()
+      || !test3() || !test3r()
+      || !test4() || !test4r()
+      ) __builtin_abort();
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr110170.c b/gcc/testsuite/gcc.target/i386/pr110170.c
new file mode 100644
index 00000000000..0f98545cce3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110170.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options " -O2 -march=x86-64-v2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times {(?n)mins[sd]} 2 } } */
+/* { dg-final { scan-assembler-times {(?n)maxs[sd]} 2 } } */
+
+void __cond_swap_df(double* __x, double* __y) {
+  _Bool __r = (*__x < *__y);
+  double __tmp = __r ? *__x : *__y;
+  *__y = __r ? *__y : *__x;
+  *__x = __tmp;
+}
+
+void __cond_swap_sf(float* __x, float* __y) {
+  _Bool __r = (*__x < *__y);
+  float __tmp = __r ? *__x : *__y;
+  *__y = __r ? *__y : *__x;
+  *__x = __tmp;
+}
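
Note on operand order: the rule from the commit message (MINSS/MINSD return the second operand when either operand is a NaN or both are zero) can be sketched in plain C. This is only an illustrative software model, not part of the patch; the names model_minsd, model_maxsd and model_cond_swap are hypothetical and do not exist in GCC. Under this model the conditional swap from the testcases reduces to one min and one max with the operands in opposite order, which is presumably why the splitter has to track which comparison operand feeds which arm.

```c
#include <math.h>

/* Hypothetical model of scalar MINSD: the first operand is returned
   only when it compares strictly less than the second.  If either
   operand is a NaN, or both are zeros, the comparison is false and
   the second operand is returned. */
static double model_minsd(double a, double b) { return a < b ? a : b; }

/* Hypothetical model of scalar MAXSD, same fallback to the second
   operand when the comparison is false. */
static double model_maxsd(double a, double b) { return a > b ? a : b; }

/* The conditional swap from the testcases, rewritten with the models:
     new *x = (*x < *y) ? *x : *y   ==  min(*x, *y)
     new *y = (*x < *y) ? *y : *x   ==  max(*y, *x)
   Note the reversed operand order in the max. */
static void model_cond_swap(double *x, double *y) {
  double lo = model_minsd(*x, *y);
  double hi = model_maxsd(*y, *x);
  *x = lo;
  *y = hi;
}
```

For instance, model_minsd(NAN, 1.0) gives 1.0 while model_minsd(1.0, NAN) gives a NaN, and model_minsd(0.0, -0.0) gives -0.0, so swapping the operands changes the result exactly in the cases the run test exercises.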