From patchwork Mon Oct 30 10:37:30 2023
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 159682
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH] Fix wrong code due to incorrect define_split
Date: Mon, 30 Oct 2023 18:37:30 +0800
Message-Id: <20231030103730.168701-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1
MIME-Version: 1.0

-(define_split
-  [(set (match_operand:V2HI 0 "register_operand")
-        (eq:V2HI
-          (eq:V2HI
-            (us_minus:V2HI
-              (match_operand:V2HI 1 "register_operand")
-              (match_operand:V2HI 2 "register_operand"))
-            (match_operand:V2HI 3 "const0_operand"))
-          (match_operand:V2HI 4 "const0_operand")))]
-  "TARGET_SSE4_1"
-  [(set (match_dup 0)
-        (umin:V2HI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (eq:V2HI (match_dup 0) (match_dup 2)))])

The splitter is wrong when op1 == op2: the original pattern returns 0,
but after the split it returns 1, because umin (op1, op2) == op2 holds
when the two operands are equal.  So remove the splitter.

Also extend another define_split to define_insn_and_split to handle
the pattern below:

(set (reg:V4QI 112)
    (unspec:V4QI [
            (subreg:V4QI (reg:V2HF 111 [ bf ]) 0)
            (subreg:V4QI (reg:V2HF 110 [ af ]) 0)
            (subreg:V4QI (eq:V2HI (eq:V2HI (reg:V2HI 105)
                        (const_vector:V2HI [
                                (const_int 0 [0]) repeated x2
                            ]))
                    (const_vector:V2HI [
                            (const_int 0 [0]) repeated x2
                        ])) 0)
        ] UNSPEC_BLENDV))

define_split doesn't work here since pass_combine assumes a split
produces at most 2 insns, but here it produces 3 since we need to move
const0_rtx (V2HImode) to a reg.  The move insn can be eliminated later.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

        PR target/112276
        * config/i386/mmx.md (*mmx_pblendvb_v8qi_1): Change
        define_split to define_insn_and_split to handle
        immediate_operand for comparison.
        (*mmx_pblendvb_v8qi_2): Ditto.
        (*mmx_pblendvb__1): Ditto.
        (*mmx_pblendvb_v4qi_2): Ditto.
        (3): Remove define_split after it.
        (v8qi3): Ditto.
        (3): Ditto.
        (v2hi3): Ditto.

gcc/testsuite/ChangeLog:

        * g++.target/i386/part-vect-vcondhf.C: Adjust testcase.
        * gcc.target/i386/pr112276.c: New test.
---
 gcc/config/i386/mmx.md                        | 112 ++++++------------
 .../g++.target/i386/part-vect-vcondhf.C       |   1 -
 gcc/testsuite/gcc.target/i386/pr112276.c      |  36 ++++++
 3 files changed, 70 insertions(+), 79 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr112276.c

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index e3d0fb5b107..2b97bb8fa98 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -3360,21 +3360,6 @@ (define_insn "3"
    (set_attr "prefix" "orig,orig,vex")
    (set_attr "mode" "TI")])
 
-(define_split
-  [(set (match_operand:V4HI 0 "register_operand")
-        (eq:V4HI
-          (eq:V4HI
-            (us_minus:V4HI
-              (match_operand:V4HI 1 "register_operand")
-              (match_operand:V4HI 2 "register_operand"))
-            (match_operand:V4HI 3 "const0_operand"))
-          (match_operand:V4HI 4 "const0_operand")))]
-  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
-  [(set (match_dup 0)
-        (umin:V4HI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (eq:V4HI (match_dup 0) (match_dup 2)))])
-
 (define_expand "mmx_v8qi3"
   [(set (match_operand:V8QI 0 "register_operand")
         (umaxmin:V8QI
@@ -3408,21 +3393,6 @@ (define_expand "v8qi3"
           (match_operand:V8QI 2 "register_operand")))]
   "TARGET_MMX_WITH_SSE")
 
-(define_split
-  [(set (match_operand:V8QI 0 "register_operand")
-        (eq:V8QI
-          (eq:V8QI
-            (us_minus:V8QI
-              (match_operand:V8QI 1 "register_operand")
-              (match_operand:V8QI 2 "register_operand"))
-            (match_operand:V8QI 3 "const0_operand"))
-          (match_operand:V8QI 4 "const0_operand")))]
-  "TARGET_MMX_WITH_SSE"
-  [(set (match_dup 0)
-        (umin:V8QI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (eq:V8QI (match_dup 0) (match_dup 2)))])
-
 (define_insn "3"
   [(set (match_operand:VI1_16_32 0 "register_operand" "=x,Yw")
         (umaxmin:VI1_16_32
@@ -3436,21 +3406,6 @@ (define_insn "3"
   [(set_attr "type" "sseiadd")
    (set_attr "mode" "TI")])
 
-(define_split
-  [(set (match_operand:V4QI 0 "register_operand")
-        (eq:V4QI
-          (eq:V4QI
-            (us_minus:V4QI
-              (match_operand:V4QI 1 "register_operand")
-              (match_operand:V4QI 2 "register_operand"))
-            (match_operand:V4QI 3 "const0_operand"))
-          (match_operand:V4QI 4 "const0_operand")))]
-  "TARGET_SSE2"
-  [(set (match_dup 0)
-        (umin:V4QI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (eq:V4QI (match_dup 0) (match_dup 2)))])
-
 (define_insn "v2hi3"
   [(set (match_operand:V2HI 0 "register_operand" "=Yr,*x,Yv")
         (umaxmin:V2HI
@@ -3467,21 +3422,6 @@ (define_insn "v2hi3"
    (set_attr "prefix" "orig,orig,vex")
    (set_attr "mode" "TI")])
 
-(define_split
-  [(set (match_operand:V2HI 0 "register_operand")
-        (eq:V2HI
-          (eq:V2HI
-            (us_minus:V2HI
-              (match_operand:V2HI 1 "register_operand")
-              (match_operand:V2HI 2 "register_operand"))
-            (match_operand:V2HI 3 "const0_operand"))
-          (match_operand:V2HI 4 "const0_operand")))]
-  "TARGET_SSE4_1"
-  [(set (match_dup 0)
-        (umin:V2HI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (eq:V2HI (match_dup 0) (match_dup 2)))])
-
 (define_insn "ssse3_abs2"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yv")
         (abs:MMXMODEI
@@ -3954,7 +3894,7 @@ (define_insn "mmx_pblendvb_v8qi"
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "TI")])
 
-(define_split
+(define_insn_and_split "*mmx_pblendvb_v8qi_1"
   [(set (match_operand:V8QI 0 "register_operand")
         (unspec:V8QI
           [(match_operand:V8QI 1 "register_operand")
@@ -3962,21 +3902,26 @@ (define_split
            (eq:V8QI
              (eq:V8QI
                (match_operand:V8QI 3 "register_operand")
-               (match_operand:V8QI 4 "register_operand"))
+               (match_operand:V8QI 4 "nonmemory_operand"))
              (match_operand:V8QI 5 "const0_operand"))]
           UNSPEC_BLENDV))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMX_WITH_SSE && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
   [(set (match_dup 6)
-        (eq:V8QI (match_dup 3) (match_dup 4)))
+        (eq:V8QI (match_dup 3) (match_dup 7)))
    (set (match_dup 0)
         (unspec:V8QI
           [(match_dup 2)
            (match_dup 1)
            (match_dup 6)]
           UNSPEC_BLENDV))]
-  "operands[6] = gen_reg_rtx (V8QImode);")
+{
+  operands[6] = gen_reg_rtx (V8QImode);
+  operands[7] = force_reg (V8QImode, operands[4]);
+})
 
-(define_split
+(define_insn_and_split "*mmx_pblendvb_v8qi_2"
   [(set (match_operand:V8QI 0 "register_operand")
         (unspec:V8QI
           [(match_operand:V8QI 1 "register_operand")
@@ -3985,12 +3930,14 @@ (define_split
              (eq:MMXMODE24
                (eq:MMXMODE24
                  (match_operand:MMXMODE24 3 "register_operand")
-                 (match_operand:MMXMODE24 4 "register_operand"))
+                 (match_operand:MMXMODE24 4 "nonmemory_operand"))
                (match_operand:MMXMODE24 5 "const0_operand")) 0)]
           UNSPEC_BLENDV))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMX_WITH_SSE && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
   [(set (match_dup 6)
-        (eq:MMXMODE24 (match_dup 3) (match_dup 4)))
+        (eq:MMXMODE24 (match_dup 3) (match_dup 8)))
    (set (match_dup 0)
         (unspec:V8QI
           [(match_dup 2)
@@ -4000,6 +3947,7 @@ (define_split
 {
   operands[6] = gen_reg_rtx (mode);
   operands[7] = lowpart_subreg (V8QImode, operands[6], mode);
+  operands[8] = force_reg (mode, operands[4]);
 })
 
 (define_insn "mmx_pblendvb_"
@@ -4022,7 +3970,7 @@ (define_insn "mmx_pblendvb_"
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "TI")])
 
-(define_split
+(define_insn_and_split "*mmx_pblendvb__1"
   [(set (match_operand:VI_16_32 0 "register_operand")
         (unspec:VI_16_32
           [(match_operand:VI_16_32 1 "register_operand")
@@ -4030,21 +3978,26 @@ (define_split
            (eq:VI_16_32
              (eq:VI_16_32
                (match_operand:VI_16_32 3 "register_operand")
-               (match_operand:VI_16_32 4 "register_operand"))
+               (match_operand:VI_16_32 4 "nonmemory_operand"))
              (match_operand:VI_16_32 5 "const0_operand"))]
           UNSPEC_BLENDV))]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
   [(set (match_dup 6)
-        (eq:VI_16_32 (match_dup 3) (match_dup 4)))
+        (eq:VI_16_32 (match_dup 3) (match_dup 7)))
    (set (match_dup 0)
         (unspec:VI_16_32
           [(match_dup 2)
            (match_dup 1)
            (match_dup 6)]
           UNSPEC_BLENDV))]
-  "operands[6] = gen_reg_rtx (mode);")
+{
+  operands[6] = gen_reg_rtx (mode);
+  operands[7] = force_reg (mode, operands[4]);
+})
 
-(define_split
+(define_insn_and_split "*mmx_pblendvb_v4qi_2"
   [(set (match_operand:V4QI 0 "register_operand")
         (unspec:V4QI
           [(match_operand:V4QI 1 "register_operand")
@@ -4053,12 +4006,14 @@ (define_split
              (eq:V2HI
                (eq:V2HI
                  (match_operand:V2HI 3 "register_operand")
-                 (match_operand:V2HI 4 "register_operand"))
+                 (match_operand:V2HI 4 "nonmemory_operand"))
                (match_operand:V2HI 5 "const0_operand")) 0)]
           UNSPEC_BLENDV))]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
   [(set (match_dup 6)
-        (eq:V2HI (match_dup 3) (match_dup 4)))
+        (eq:V2HI (match_dup 3) (match_dup 8)))
    (set (match_dup 0)
         (unspec:V4QI
           [(match_dup 2)
@@ -4068,6 +4023,7 @@ (define_split
 {
   operands[6] = gen_reg_rtx (V2HImode);
   operands[7] = lowpart_subreg (V4QImode, operands[6], V2HImode);
+  operands[8] = force_reg (V2HImode, operands[4]);
 })
 
 ;; XOP parallel XMM conditional moves
diff --git a/gcc/testsuite/g++.target/i386/part-vect-vcondhf.C b/gcc/testsuite/g++.target/i386/part-vect-vcondhf.C
index f19727816cf..e623e6cde79 100644
--- a/gcc/testsuite/g++.target/i386/part-vect-vcondhf.C
+++ b/gcc/testsuite/g++.target/i386/part-vect-vcondhf.C
@@ -3,7 +3,6 @@
 /* { dg-options "-O2 -mavx512fp16 -mavx512vl" } */
 /* { dg-final { scan-assembler-times "vpcmpeqw" 6 } } */
 /* { dg-final { scan-assembler-times "vpcmpgtw" 2 } } */
-/* { dg-final { scan-assembler-times "vpminuw" 2 } } */
 /* { dg-final { scan-assembler-times "vcmpph" 8 } } */
 /* { dg-final { scan-assembler-times "vpblendvb" 8 } } */
 typedef unsigned short __attribute__((__vector_size__ (4))) __v2hu;
diff --git a/gcc/testsuite/gcc.target/i386/pr112276.c b/gcc/testsuite/gcc.target/i386/pr112276.c
new file mode 100644
index 00000000000..5365313f4c2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr112276.c
@@ -0,0 +1,36 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-O2 -msse4.1" } */
+/* { dg-require-effective-target sse4 } */
+
+#include "sse4_1-check.h"
+
+typedef unsigned short __attribute__((__vector_size__ (8))) U4;
+typedef unsigned short __attribute__((__vector_size__ (4))) U2;
+
+U4
+__attribute__((noipa))
+foo4 (U4 a, U4 b)
+{
+  return a > b;
+}
+
+U2
+__attribute__((noipa))
+foo2 (U2 a, U2 b)
+{
+  return a > b;
+}
+
+static void
+sse4_1_test ()
+{
+  U4 a = __extension__(U4) {1, 1, 1, 1};
+  U4 b = foo4 (a, a);
+  if (b[0] || b[1] || b[2] || b[3]) __builtin_abort();
+
+  U2 c = __extension__(U2) {1, 1};
+  U2 d = foo2 (c, c);
+  if (d[0] || d[1]) __builtin_abort();
+
+  return;
+}