From patchwork Sat Jun 10 22:54:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roger Sayle X-Patchwork-Id: 106023 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp1768644vqr; Sat, 10 Jun 2023 15:55:14 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4o72OB2086NQfq8YnJc83RWTJ64pw9UrBHWRGMgtMNwgf9H4T7dc8LZpNDz8G4G1rIO/1s X-Received: by 2002:a17:906:fe0c:b0:977:e310:1ce7 with SMTP id wy12-20020a170906fe0c00b00977e3101ce7mr5181077ejb.38.1686437713630; Sat, 10 Jun 2023 15:55:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686437713; cv=none; d=google.com; s=arc-20160816; b=CxYzqQN0BhQQe4+KHyPGKEglUUYszlbV1tNIzH8oi0J08ZBKQnVkPR3qSeXVPawuhJ HW/2MQDkcvmtD7Kwl0Tq3oLnRcAKpiZUmY5IuL9O8MIyNXC8ZU47XzpnONLUJoOcqzgF cfViX2kGa+DjHKfZ0vRcbA9QsuVnHpOrx8nAXEz9lX5B0fgt8RE5aYl3KGGaqt68SWiJ d+GP+TlZ6pZWAsrkJntIWOQkVPVsGz32ATz8TrNwD2VhY5Cpd+KKJ8Q3HigNrcVsO9jk 5kh+Uao2FAptotkgF8VVRk3J1u1RCmKP+1AfxHTTlbm2VZHJGrS7XlVLZ30WZ41kIjwE hU6g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-language:thread-index :mime-version:message-id:date:subject:to:from:dkim-signature :dmarc-filter:delivered-to; bh=JKFrjU0R/NFsnj5DDUhwP3MMVmjzgMh4JgtiHTAWbsc=; b=zrg6bkL2kheOSV1a1F6Dqd9kCrPD/nJUBX/KxaMcKvlr7sunG/ZsITQE4yJkx/F23A ZPzP2DcWnnXfxNlBp9dAOKQqgTIw/zxzYnmW244DcQA1s+Gea05H1a/DvIddy5u9SaaU uy08SU25j2sN11mgERiKqkhfk2OLlzYNkI2OqMJzQZ3UD78C9802CKfNtGn3rOH+TD5V lLVKI+b4TbuxqdlXhhJPpPNzaFz0dXb+//4z4G1nTnp3M65Rb4KkV3t8sz/+/oqdhnqP fS5Apu4h8zvOdiYfeFRnFk2+BrIhRFtwDCq9xV+SdgSAf4Shcic1fw8Eshy93K2Z1lpz AHLg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@nextmovesoftware.com header.s=default header.b=U8utdHCv; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id op13-20020a170906bced00b0096b247ef9aesi3144085ejb.509.2023.06.10.15.55.13 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 10 Jun 2023 15:55:13 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=fail header.i=@nextmovesoftware.com header.s=default header.b=U8utdHCv; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A9B4B3856622 for ; Sat, 10 Jun 2023 22:55:06 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from server.nextmovesoftware.com (server.nextmovesoftware.com [162.254.253.69]) by sourceware.org (Postfix) with ESMTPS id E06EA3858D28 for ; Sat, 10 Jun 2023 22:54:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E06EA3858D28 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=nextmovesoftware.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=nextmovesoftware.com DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=nextmovesoftware.com; s=default; h=Content-Type:MIME-Version:Message-ID: Date:Subject:To:From:Sender:Reply-To:Cc:Content-Transfer-Encoding:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=JKFrjU0R/NFsnj5DDUhwP3MMVmjzgMh4JgtiHTAWbsc=; b=U8utdHCv8vBRY6ya0KWTLSGwiO 1vtCTVzshsTUD3seXfpP2FNN2ne8+bJlAfCBisk2/8akpcpPD6Oq/40U/UEDGae2HgmXHL4p476dR vqmGVNp1PVgm94ERhVHpLjyjVtumB+7dDl3kuPP+QgRFwRISWpc3wpyxCmkjarlTVIIZhMX/7zpcK sZod7a2XgHT2KJgATYstJlwzC6mpnxbOP1DdSPq5+DTmShFwi1ca571KFVKSIhnecw67lnIoWf+/S z2EEJrwXm18vViwoRZ+BwchEv03IJSAfD4bzEuAVnuI3o+tCkprmeXfVHk6MOfuQusVG7BmpDxYFH qSH8qLeQ==; Received: from host86-169-41-81.range86-169.btcentralplus.com ([86.169.41.81]:49409 helo=Dell) by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1q87Ti-0007NQ-02 for gcc-patches@gcc.gnu.org; Sat, 10 Jun 2023 18:54:38 -0400 From: "Roger Sayle" To: Subject: [GCC 13 PATCH] PR target/109973: CCZmode and CCCmode variants of [v]ptest. Date: Sat, 10 Jun 2023 23:54:36 +0100 Message-ID: <03bd01d99bee$888f3a70$99adaf50$@nextmovesoftware.com> MIME-Version: 1.0 X-Mailer: Microsoft Outlook 16.0 Thread-Index: Admb65Gck+Vb7+srTwaOAnWwt1UtGA== Content-Language: en-gb X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com X-AntiAbuse: Original Domain - gcc.gnu.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - nextmovesoftware.com X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id: roger@nextmovesoftware.com X-Authenticated-Sender: server.nextmovesoftware.com: roger@nextmovesoftware.com X-Source: X-Source-Args: X-Source-Dir: X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768358111919373061?= X-GMAIL-MSGID: =?utf-8?q?1768358111919373061?= This is a backport of the fixes for PR target/109973 and PR target/110083. This backport to the releases/gcc-13 branch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for gcc-13, or should we just close PR 109973 in Bugzilla? 2023-06-10 Roger Sayle Uros Bizjak gcc/ChangeLog PR target/109973 PR target/110083 * config/i386/i386-builtin.def (__builtin_ia32_ptestz128): Use new CODE_for_sse4_1_ptestzv2di. (__builtin_ia32_ptestc128): Use new CODE_for_sse4_1_ptestcv2di. (__builtin_ia32_ptestz256): Use new CODE_for_avx_ptestzv4di. (__builtin_ia32_ptestc256): Use new CODE_for_avx_ptestcv4di. * config/i386/i386-expand.cc (ix86_expand_branch): Use CCZmode when expanding UNSPEC_PTEST to compare against zero. * config/i386/i386-features.cc (scalar_chain::convert_compare): Likewise generate CCZmode UNSPEC_PTESTs when converting comparisons. Update or delete REG_EQUAL notes, converting CONST_INT and CONST_WIDE_INT immediate operands to a suitable CONST_VECTOR. (general_scalar_chain::convert_insn): Use CCZmode for COMPARE result. (timode_scalar_chain::convert_insn): Use CCZmode for COMPARE result. * config/i386/i386-protos.h (ix86_match_ptest_ccmode): Prototype. * config/i386/i386.cc (ix86_match_ptest_ccmode): New predicate to check for suitable matching modes for the UNSPEC_PTEST pattern. * config/i386/sse.md (define_split): When splitting UNSPEC_MOVMSK to UNSPEC_PTEST, preserve the FLAG_REG mode as CCZ. (*_ptest): Add asterisk to hide define_insn. Remove ":CC" mode of FLAGS_REG, instead use ix86_match_ptest_ccmode. (_ptestz): New define_expand to specify CCZ. (_ptestc): New define_expand to specify CCC. (_ptest): A define_expand using CC to preserve the current behavior. (*ptest_and): Specify CCZ to only perform this optimization when only the Z flag is required. gcc/testsuite/ChangeLog PR target/109973 PR target/110083 * gcc.target/i386/pr109973-1.c: New test case. * gcc.target/i386/pr109973-2.c: Likewise. * gcc.target/i386/pr110083.c: Likewise. Thanks, Roger diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index 6dae697..37df018 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -1004,8 +1004,8 @@ BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_roundps_sfix, "__builtin_ia32_ BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_roundv4sf2, "__builtin_ia32_roundps_az", IX86_BUILTIN_ROUNDPS_AZ, UNKNOWN, (int) V4SF_FTYPE_V4SF) BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_roundv4sf2_sfix, "__builtin_ia32_roundps_az_sfix", IX86_BUILTIN_ROUNDPS_AZ_SFIX, UNKNOWN, (int) V4SI_FTYPE_V4SF) -BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_ptestz128", IX86_BUILTIN_PTESTZ, EQ, (int) INT_FTYPE_V2DI_V2DI_PTEST) -BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_ptestc128", IX86_BUILTIN_PTESTC, LTU, (int) INT_FTYPE_V2DI_V2DI_PTEST) +BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestzv2di, "__builtin_ia32_ptestz128", IX86_BUILTIN_PTESTZ, EQ, (int) INT_FTYPE_V2DI_V2DI_PTEST) +BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestcv2di, "__builtin_ia32_ptestc128", IX86_BUILTIN_PTESTC, LTU, (int) INT_FTYPE_V2DI_V2DI_PTEST) BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_ptestnzc128", IX86_BUILTIN_PTESTNZC, GTU, (int) INT_FTYPE_V2DI_V2DI_PTEST) /* SSE4.2 */ @@ -1164,8 +1164,8 @@ BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestpd256, "__builtin_ia32_vtestnzc BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestps256, "__builtin_ia32_vtestzps256", IX86_BUILTIN_VTESTZPS256, EQ, (int) INT_FTYPE_V8SF_V8SF_PTEST) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestps256, "__builtin_ia32_vtestcps256", IX86_BUILTIN_VTESTCPS256, LTU, (int) INT_FTYPE_V8SF_V8SF_PTEST) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestps256, "__builtin_ia32_vtestnzcps256", IX86_BUILTIN_VTESTNZCPS256, GTU, (int) INT_FTYPE_V8SF_V8SF_PTEST) -BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestv4di, "__builtin_ia32_ptestz256", IX86_BUILTIN_PTESTZ256, EQ, (int) INT_FTYPE_V4DI_V4DI_PTEST) -BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestv4di, "__builtin_ia32_ptestc256", IX86_BUILTIN_PTESTC256, LTU, (int) INT_FTYPE_V4DI_V4DI_PTEST) +BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestzv4di, "__builtin_ia32_ptestz256", IX86_BUILTIN_PTESTZ256, EQ, (int) INT_FTYPE_V4DI_V4DI_PTEST) +BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestcv4di, "__builtin_ia32_ptestc256", IX86_BUILTIN_PTESTC256, LTU, (int) INT_FTYPE_V4DI_V4DI_PTEST) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestv4di, "__builtin_ia32_ptestnzc256", IX86_BUILTIN_PTESTNZC256, GTU, (int) INT_FTYPE_V4DI_V4DI_PTEST) BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_movmskpd256, "__builtin_ia32_movmskpd256", IX86_BUILTIN_MOVMSKPD256, UNKNOWN, (int) INT_FTYPE_V4DF ) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 0d817fc..7719449 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -2370,8 +2370,8 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx op1, rtx label) tmp = gen_reg_rtx (mode); emit_insn (gen_rtx_SET (tmp, gen_rtx_XOR (mode, op0, op1))); tmp = gen_lowpart (p_mode, tmp); - emit_insn (gen_rtx_SET (gen_rtx_REG (CCmode, FLAGS_REG), - gen_rtx_UNSPEC (CCmode, + emit_insn (gen_rtx_SET (gen_rtx_REG (CCZmode, FLAGS_REG), + gen_rtx_UNSPEC (CCZmode, gen_rtvec (2, tmp, tmp), UNSPEC_PTEST))); tmp = gen_rtx_fmt_ee (code, VOIDmode, flag, const0_rtx); diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc index a0a7348..4a3b07a 100644 --- a/gcc/config/i386/i386-features.cc +++ b/gcc/config/i386/i386-features.cc @@ -974,12 +974,45 @@ general_scalar_chain::convert_op (rtx *op, rtx_insn *insn) } } -/* Convert COMPARE to vector mode. */ +/* Convert CCZmode COMPARE to vector mode. */ rtx scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn) { rtx src, tmp; + + /* Handle any REG_EQUAL notes. */ + tmp = find_reg_equal_equiv_note (insn); + if (tmp) + { + if (GET_CODE (XEXP (tmp, 0)) == COMPARE + && GET_MODE (XEXP (tmp, 0)) == CCZmode + && REG_P (XEXP (XEXP (tmp, 0), 0))) + { + rtx *op = &XEXP (XEXP (tmp, 0), 1); + if (CONST_SCALAR_INT_P (*op)) + { + if (constm1_operand (*op, GET_MODE (*op))) + *op = CONSTM1_RTX (vmode); + else + { + unsigned n = GET_MODE_NUNITS (vmode); + rtx *v = XALLOCAVEC (rtx, n); + v[0] = *op; + for (unsigned i = 1; i < n; ++i) + v[i] = const0_rtx; + *op = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (n, v)); + } + tmp = NULL_RTX; + } + else if (REG_P (*op)) + tmp = NULL_RTX; + } + + if (tmp) + remove_note (insn, tmp); + } + /* Comparison against anything other than zero, requires an XOR. */ if (op2 != const0_rtx) { @@ -1023,7 +1056,7 @@ scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn) emit_insn_before (gen_rtx_SET (tmp, op11), insn); op11 = tmp; } - return gen_rtx_UNSPEC (CCmode, gen_rtvec (2, op11, op12), + return gen_rtx_UNSPEC (CCZmode, gen_rtvec (2, op11, op12), UNSPEC_PTEST); } else @@ -1052,7 +1085,7 @@ scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn) src = tmp; } - return gen_rtx_UNSPEC (CCmode, gen_rtvec (2, src, src), UNSPEC_PTEST); + return gen_rtx_UNSPEC (CCZmode, gen_rtvec (2, src, src), UNSPEC_PTEST); } /* Helper function for converting INSN to vector mode. */ @@ -1219,7 +1252,7 @@ general_scalar_chain::convert_insn (rtx_insn *insn) break; case COMPARE: - dst = gen_rtx_REG (CCmode, FLAGS_REG); + dst = gen_rtx_REG (CCZmode, FLAGS_REG); src = convert_compare (XEXP (src, 0), XEXP (src, 1), insn); break; @@ -1726,7 +1759,7 @@ timode_scalar_chain::convert_insn (rtx_insn *insn) break; case COMPARE: - dst = gen_rtx_REG (CCmode, FLAGS_REG); + dst = gen_rtx_REG (CCZmode, FLAGS_REG); src = convert_compare (XEXP (src, 0), XEXP (src, 1), insn); break; diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 71ae95f..b00756b 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -140,6 +140,7 @@ extern void ix86_expand_copysign (rtx []); extern void ix86_expand_xorsign (rtx []); extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2]); extern bool ix86_match_ccmode (rtx, machine_mode); +extern bool ix86_match_ptest_ccmode (rtx); extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx); extern void ix86_expand_setcc (rtx, enum rtx_code, rtx, rtx); extern bool ix86_expand_int_movcc (rtx[]); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index fbd33a6..30fc552 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -15985,6 +15985,29 @@ ix86_cc_mode (enum rtx_code code, rtx op0, rtx op1) } } +/* Return TRUE or FALSE depending on whether the ptest instruction + INSN has source and destination with suitable matching CC modes. */ + +bool +ix86_match_ptest_ccmode (rtx insn) +{ + rtx set, src; + machine_mode set_mode; + + set = PATTERN (insn); + gcc_assert (GET_CODE (set) == SET); + src = SET_SRC (set); + gcc_assert (GET_CODE (src) == UNSPEC + && XINT (src, 1) == UNSPEC_PTEST); + + set_mode = GET_MODE (src); + if (set_mode != CCZmode + && set_mode != CCCmode + && set_mode != CCmode) + return false; + return GET_MODE (SET_DEST (set)) == set_mode; +} + /* Return the fixed registers used for condition codes. */ static bool diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 513960e..e8d50a1 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -20441,10 +20441,10 @@ UNSPEC_MOVMSK) (match_operand 2 "const_int_operand")))] "TARGET_SSE4_1 && (INTVAL (operands[2]) == (int) ())" - [(set (reg:CC FLAGS_REG) - (unspec:CC [(match_dup 0) - (match_dup 0)] - UNSPEC_PTEST))]) + [(set (reg:CCZ FLAGS_REG) + (unspec:CCZ [(match_dup 0) + (match_dup 0)] + UNSPEC_PTEST))]) (define_expand "sse2_maskmovdqu" [(set (match_operand:V16QI 0 "memory_operand") @@ -23096,13 +23096,13 @@ (set_attr "mode" "")]) ;; ptest is very similar to comiss and ucomiss when setting FLAGS_REG. -;; But it is not a really compare instruction. -(define_insn "_ptest" - [(set (reg:CC FLAGS_REG) - (unspec:CC [(match_operand:V_AVX 0 "register_operand" "Yr, *x, x") - (match_operand:V_AVX 1 "vector_operand" "YrBm, *xBm, xm")] - UNSPEC_PTEST))] - "TARGET_SSE4_1" +;; But it is not really a compare instruction. +(define_insn "*_ptest" + [(set (reg FLAGS_REG) + (unspec [(match_operand:V_AVX 0 "register_operand" "Yr, *x, x") + (match_operand:V_AVX 1 "vector_operand" "YrBm, *xBm, xm")] + UNSPEC_PTEST))] + "TARGET_SSE4_1 && ix86_match_ptest_ccmode (insn)" "%vptest\t{%1, %0|%0, %1}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecomi") @@ -23115,6 +23115,30 @@ (const_string "*"))) (set_attr "mode" "")]) +;; Expand a ptest to set the Z flag. +(define_expand "_ptestz" + [(set (reg:CCZ FLAGS_REG) + (unspec:CCZ [(match_operand:V_AVX 0 "register_operand") + (match_operand:V_AVX 1 "vector_operand")] + UNSPEC_PTEST))] + "TARGET_SSE4_1") + +;; Expand a ptest to set the C flag +(define_expand "_ptestc" + [(set (reg:CCC FLAGS_REG) + (unspec:CCC [(match_operand:V_AVX 0 "register_operand") + (match_operand:V_AVX 1 "vector_operand")] + UNSPEC_PTEST))] + "TARGET_SSE4_1") + +;; Expand a ptest to set both the Z and C flags +(define_expand "_ptest" + [(set (reg:CC FLAGS_REG) + (unspec:CC [(match_operand:V_AVX 0 "register_operand") + (match_operand:V_AVX 1 "vector_operand")] + UNSPEC_PTEST))] + "TARGET_SSE4_1") + (define_insn "ptesttf2" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:TF 0 "register_operand" "Yr, *x, x") @@ -23129,17 +23153,17 @@ (set_attr "mode" "TI")]) (define_insn_and_split "*ptest_and" - [(set (reg:CC FLAGS_REG) - (unspec:CC [(and:V_AVX (match_operand:V_AVX 0 "register_operand") - (match_operand:V_AVX 1 "vector_operand")) - (and:V_AVX (match_dup 0) (match_dup 1))] + [(set (reg:CCZ FLAGS_REG) + (unspec:CCZ [(and:V_AVX (match_operand:V_AVX 0 "register_operand") + (match_operand:V_AVX 1 "vector_operand")) + (and:V_AVX (match_dup 0) (match_dup 1))] UNSPEC_PTEST))] "TARGET_SSE4_1 && ix86_pre_reload_split ()" "#" "&& 1" - [(set (reg:CC FLAGS_REG) - (unspec:CC [(match_dup 0) (match_dup 1)] UNSPEC_PTEST))]) + [(set (reg:CCZ FLAGS_REG) + (unspec:CCZ [(match_dup 0) (match_dup 1)] UNSPEC_PTEST))]) (define_expand "nearbyint2" [(set (match_operand:VFH 0 "register_operand") diff --git a/gcc/testsuite/gcc.target/i386/pr109973-1.c b/gcc/testsuite/gcc.target/i386/pr109973-1.c new file mode 100644 index 0000000..a1b6136b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr109973-1.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx2" } */ + +typedef long long __m256i __attribute__ ((__vector_size__ (32))); + +int +foo (__m256i x, __m256i y) +{ + __m256i a = x & y; + return __builtin_ia32_ptestc256 (a, a); +} + +/* { dg-final { scan-assembler "vpand" } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr109973-2.c b/gcc/testsuite/gcc.target/i386/pr109973-2.c new file mode 100644 index 0000000..167f6ee --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr109973-2.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4.1" } */ + +typedef long long __m128i __attribute__ ((__vector_size__ (16))); + +int +foo (__m128i x, __m128i y) +{ + __m128i a = x & y; + return __builtin_ia32_ptestc128 (a, a); +} + +/* { dg-final { scan-assembler "pand" } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr110083.c b/gcc/testsuite/gcc.target/i386/pr110083.c new file mode 100644 index 0000000..4b38ca8 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr110083.c @@ -0,0 +1,26 @@ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-O2 -msse4 -mstv -mno-stackrealign" } */ +typedef int TItype __attribute__ ((mode (TI))); +typedef unsigned int UTItype __attribute__ ((mode (TI))); + +void foo (void) +{ + static volatile TItype ivin, ivout; + static volatile float fv1, fv2; + ivin = ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1)); + fv1 = ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1)); + fv2 = ivin; + ivout = fv2; + if (ivin != ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1)) + || ((((128) > sizeof (TItype) * 8 - 1)) && ivout != ivin) + || ((((128) > sizeof (TItype) * 8 - 1)) + && ivout != + ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1))) + || fv1 != + (float) ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1)) + || fv2 != + (float) ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1)) + || fv1 != fv2) + __builtin_abort (); +} +