From patchwork Wed Dec 6 08:06:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 174392
From: 
Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 09/16] [APX NDD] Support APX NDD for and insn
Date: Wed, 6 Dec 2023 16:06:29 +0800
Message-Id: <20231206080636.178863-10-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com>
References: <20231206080636.178863-1-hongyu.wang@intel.com>

From: Kong Lingling

For the NDD form of the AND insn, three splitters need fixes after the
legacy patterns are extended.

1. APX NDD does not support the high QImode registers (ah, bh, ch, dh),
so optimization splitters that generate a highpart zero_extract for
QImode must be prohibited for NDD patterns.

2. The legacy AND insn uses the r/qm/L constraint, and a post-reload
splitter transforms it into a zero_extend move. For the NDD form of AND
this splitter is not strict enough: it assumes that the const_int operand
of such an AND matches constraint "L", but the NDD form of AND allows a
const_int with any QI value. Restrict the splitter condition so that the
operand must match the "L" constraint, which strictly corresponds to
zero-extend semantics.

3. The legacy AND insn uses the r/0/Z constraint, and a splitter tries to
optimize this form into a strict_lowpart QImode AND when the 7th bit is
not set. But the splitter wrongly converts the non-zext form of an NDD
AND with a memory source: the strict_lowpart transform then matches
alternative 1 of *_slp_1 and generates *movstrict_1, so the zext
semantics are lost. This can leave the highpart of the destination
uncleared and generate wrong code. Disable the splitter when NDD is used
and operands[0] and operands[1] are not equal.

gcc/ChangeLog:

	* config/i386/i386.md (and3): Add NDD alternatives and adjust
	output template.
	(*anddi_1): Likewise.
	(*and_1): Likewise.
	(*andqi_1): Likewise.
	(*andsi_1_zext): Likewise.
	(*anddi_2): Likewise.
	(*andsi_2_zext): Likewise.
	(*andqi_2_maybe_si): Likewise.
	(*and_2): Likewise.
	(*and3_doubleword): Add NDD alternative, adopt '&' to NDD dest and
	emit move for optimized case if operands[0] not equal to
	operands[1].
	(define_split for QI highpart AND): Prohibit splitter to split NDD
	form AND insn to qi_ext_3.
	(define_split for QI strict_lowpart optimization): Prohibit
	splitter to split NDD form AND insn to *3_1_slp.
	(define_split for zero_extend and optimization): Prohibit splitter
	to split NDD form AND insn to zero_extend insn.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add and test.
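
For illustration only, the arithmetic behind fixes 2 and 3 can be
sketched in plain C. This is a toy model under stated assumptions, not
GCC code: the helper names (zext_width_for_mask, and_full,
and_strict_low_part, check_and_splitter_model) are hypothetical,
registers are modeled as plain integers, and the mask values simply
follow the splitter conditions quoted in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not GCC code.  All names here are hypothetical.  */

/* Fix 2: an AND with a constant is a plain zero extension only when the
   constant is the mode mask of QImode, HImode or SImode.  Return the
   source width in bits, or 0 when the AND is not a zero extension.  */
static unsigned
zext_width_for_mask (uint64_t mask)
{
  if (mask == UINT64_C (0xff))
    return 8;
  if (mask == UINT64_C (0xffff))
    return 16;
  if (mask == UINT64_C (0xffffffff))
    return 32;
  return 0;
}

/* Full-width AND: the value the insn is supposed to compute.  */
static uint32_t
and_full (uint32_t src, uint32_t mask)
{
  return src & mask;
}

/* Fix 3: a strict_low_part QImode AND writes only the low byte of DEST;
   bits 8..31 of DEST keep whatever they held before.  */
static uint32_t
and_strict_low_part (uint32_t dest, uint32_t src, uint32_t mask)
{
  return (dest & ~UINT32_C (0xff)) | (src & mask & UINT32_C (0xff));
}

/* Checks for the two cases discussed above.  */
static void
check_and_splitter_model (void)
{
  /* 0xff is a zero extension from 8 bits, 0x7f is not.  */
  assert (zext_width_for_mask (0xff) == 8);
  assert (zext_width_for_mask (0x7f) == 0);

  /* A mask with all upper bits set and bit 7 clear, as the lowpart
     splitter condition requires.  */
  uint32_t mask = UINT32_C (0xffffff0f);
  uint32_t src = UINT32_C (0x12345678);

  /* Legacy form: destination == source, so the narrow AND is
     equivalent to the full one.  */
  assert (and_strict_low_part (src, src, mask) == and_full (src, mask));

  /* NDD form: the destination holds unrelated bits, so the upper 24
     bits of the narrow result are stale.  */
  uint32_t stale = UINT32_C (0xdeadbeef);
  assert (and_strict_low_part (stale, src, mask) != and_full (src, mask));
}
```

Calling check_and_splitter_model () from a main function exercises both
cases. The second pair of asserts shows why the lowpart splitter is only
safe when operands[0] and operands[1] are the same register.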
---
 gcc/config/i386/i386.md                 | 175 +++++++++++++++---------
 gcc/testsuite/gcc.target/i386/apx-ndd.c |  13 ++
 2 files changed, 127 insertions(+), 61 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 61b7b79543b..d2528e0dcf6 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11710,18 +11710,19 @@ (define_expand "and3"
 	   (operands[0], gen_lowpart (mode, operands[1]),
 	    mode, mode, 1));
   else
-    ix86_expand_binary_operator (AND, mode, operands);
+    ix86_expand_binary_operator (AND, mode, operands,
+				 TARGET_APX_NDD);
 
   DONE;
 })
 
 (define_insn_and_split "*and3_doubleword"
-  [(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
+  [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r")
 	(and:
-	 (match_operand: 1 "nonimmediate_operand" "%0,0")
-	 (match_operand: 2 "x86_64_hilo_general_operand" "r,o")))
+	 (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r")
+	 (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (AND, mode, operands)"
+  "ix86_binary_operator_ok (AND, mode, operands, TARGET_APX_NDD)"
   "#"
   "&& reload_completed"
   [(const_int:DWIH 0)]
@@ -11733,39 +11734,53 @@ (define_insn_and_split "*and3_doubleword"
   if (operands[2] == const0_rtx)
     emit_move_insn (operands[0], const0_rtx);
   else if (operands[2] == constm1_rtx)
-    emit_insn_deleted_note_p = true;
+    {
+      if (!rtx_equal_p (operands[0], operands[1]))
+	emit_move_insn (operands[0], operands[1]);
+      else
+	emit_insn_deleted_note_p = true;
+    }
   else
-    ix86_expand_binary_operator (AND, mode, &operands[0]);
+    ix86_expand_binary_operator (AND, mode, &operands[0],
+				 TARGET_APX_NDD);
 
   if (operands[5] == const0_rtx)
     emit_move_insn (operands[3], const0_rtx);
   else if (operands[5] == constm1_rtx)
     {
-      if (emit_insn_deleted_note_p)
+      if (!rtx_equal_p (operands[3], operands[4]))
+	emit_move_insn (operands[3], operands[4]);
+      else if (emit_insn_deleted_note_p)
 	emit_note (NOTE_INSN_DELETED);
     }
   else
-    ix86_expand_binary_operator (AND, mode, &operands[3]);
+    ix86_expand_binary_operator (AND, mode, &operands[3],
+				 TARGET_APX_NDD);
 
   DONE;
-})
+}
+[(set_attr "isa" "*,*,apx_ndd,apx_ndd")])
 
 (define_insn "*anddi_1"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,?k")
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,rm,r,r,r,r,?k")
 	(and:DI
-	 (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm,k")
-	 (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,L,k")))
+	 (match_operand:DI 1 "nonimmediate_operand" "%0,r,0,0,rm,r,qm,k")
+	 (match_operand:DI 2 "x86_64_szext_general_operand" "Z,Z,re,m,re,m,L,k")))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)"
+  "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands,
+					    TARGET_APX_NDD)"
   "@
    and{l}\t{%k2, %k0|%k0, %k2}
+   and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2}
    and{q}\t{%2, %0|%0, %2}
    and{q}\t{%2, %0|%0, %2}
+   and{q}\t{%2, %1, %0|%0, %1, %2}
+   and{q}\t{%2, %1, %0|%0, %1, %2}
    #
    #"
-  [(set_attr "isa" "x64,x64,x64,x64,avx512bw_512")
-   (set_attr "type" "alu,alu,alu,imovx,msklog")
-   (set_attr "length_immediate" "*,*,*,0,*")
+  [(set_attr "isa" "x64,apx_ndd,x64,x64,apx_ndd,apx_ndd,x64,avx512bw_512")
+   (set_attr "type" "alu,alu,alu,alu,alu,alu,imovx,msklog")
+   (set_attr "length_immediate" "*,*,*,*,*,*,0,*")
    (set (attr "prefix_rex")
      (if_then_else
        (and (eq_attr "type" "imovx")
@@ -11773,7 +11788,7 @@ (define_insn "*anddi_1"
 		 (match_operand 1 "ext_QIreg_operand")))
        (const_string "1")
        (const_string "*")))
-   (set_attr "mode" "SI,DI,DI,SI,DI")])
+   (set_attr "mode" "SI,SI,DI,DI,DI,DI,SI,DI")])
 
 (define_insn_and_split "*anddi_1_btr"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
@@ -11828,36 +11843,45 @@ (define_split
 
 ;; See comment for addsi_1_zext why we do use nonimmediate_operand
 (define_insn "*andsi_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
-	  (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-		  (match_operand:SI 2 "x86_64_general_operand" "rBMe"))))
+	  (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r")
+		  (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM"))))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands)"
-  "and{l}\t{%2, %k0|%k0, %2}"
+  "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands,
+					    TARGET_APX_NDD)"
+  "@
+  and{l}\t{%2, %k0|%k0, %2}
+  and{l}\t{%2, %1, %k0|%k0, %1, %2}
+  and{l}\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "alu")
+   (set_attr "isa" "*,apx_ndd,apx_ndd")
    (set_attr "mode" "SI")])
 
 (define_insn "*and_1"
-  [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,Ya,?k")
-	(and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,qm,k")
-		   (match_operand:SWI24 2 "" "r,,L,k")))
+  [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,r,r,Ya,?k")
+	(and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,rm,r,qm,k")
+		   (match_operand:SWI24 2 "" "r,,r,,L,k")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (AND, mode, operands)"
+  "ix86_binary_operator_ok (AND, mode, operands, TARGET_APX_NDD)"
   "@
    and{}\t{%2, %0|%0, %2}
    and{}\t{%2, %0|%0, %2}
+   and{}\t{%2, %1, %0|%0, %1, %2}
+   and{}\t{%2, %1, %0|%0, %1, %2}
    #
    #"
   [(set (attr "isa")
-	(cond [(eq_attr "alternative" "3")
+	(cond [(eq_attr "alternative" "2,3")
+		 (const_string "apx_ndd")
+	       (eq_attr "alternative" "5")
 		 (if_then_else (eq_attr "mode" "SI")
 		   (const_string "avx512bw")
 		   (const_string "avx512f"))
 	      ]
 	      (const_string "*")))
-   (set_attr "type" "alu,alu,imovx,msklog")
-   (set_attr "length_immediate" "*,*,0,*")
+   (set_attr "type" "alu,alu,alu,alu,imovx,msklog")
+   (set_attr "length_immediate" "*,*,*,*,0,*")
    (set (attr "prefix_rex")
      (if_then_else
        (and (eq_attr "type" "imovx")
@@ -11865,24 +11889,27 @@ (define_insn "*and_1"
 		 (match_operand 1 "ext_QIreg_operand")))
        (const_string "1")
        (const_string "*")))
-   (set_attr "mode" ",,SI,")])
+   (set_attr "mode" ",,,,SI,")])
 
 (define_insn "*andqi_1"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k")
-	(and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k")
-		(match_operand:QI 2 "general_operand" "qn,m,rn,k")))
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k")
+	(and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k")
+		(match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (AND, QImode, operands)"
+  "ix86_binary_operator_ok (AND, QImode, operands, TARGET_APX_NDD)"
   "@
    and{b}\t{%2, %0|%0, %2}
    and{b}\t{%2, %0|%0, %2}
    and{l}\t{%k2, %k0|%k0, %k2}
+   and{b}\t{%2, %1, %0|%0, %1, %2}
+   and{b}\t{%2, %1, %0|%0, %1, %2}
    #"
-  [(set_attr "type" "alu,alu,alu,msklog")
+  [(set_attr "type" "alu,alu,alu,alu,alu,msklog")
+   (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,*")
    (set (attr "mode")
 	(cond [(eq_attr "alternative" "2")
 		 (const_string "SI")
-		(and (eq_attr "alternative" "3")
+		(and (eq_attr "alternative" "5")
 		     (match_test "!TARGET_AVX512DQ"))
 		 (const_string "HI")
 	       ]
@@ -11985,7 +12012,10 @@ (define_split
    (clobber (reg:CC FLAGS_REG))]
   "reload_completed
    && (!REG_P (operands[1])
-       || REGNO (operands[0]) != REGNO (operands[1]))"
+       || REGNO (operands[0]) != REGNO (operands[1]))
+   && (UINTVAL (operands[2]) == GET_MODE_MASK (SImode)
+       || UINTVAL (operands[2]) == GET_MODE_MASK (HImode)
+       || UINTVAL (operands[2]) == GET_MODE_MASK (QImode))"
   [(const_int 0)]
 {
   unsigned HOST_WIDE_INT ival = UINTVAL (operands[2]);
@@ -12058,10 +12088,10 @@ (define_insn "*anddi_2"
   [(set (reg FLAGS_REG)
 	(compare
 	 (and:DI
-	  (match_operand:DI 1 "nonimmediate_operand" "%0,0,0")
-	  (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m"))
+	  (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,r,rm,r")
+	  (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,Z,re,m"))
 	 (const_int 0)))
-   (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r")
+   (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,r,r")
 	(and:DI (match_dup 1) (match_dup 2)))]
   "TARGET_64BIT
    && ix86_match_ccmode
@@ -12075,38 +12105,46 @@ (define_insn "*anddi_2"
 	    && (!CONST_INT_P (operands[2])
 		|| val_signbit_known_set_p (SImode, INTVAL (operands[2]))))
 	 ? CCZmode : CCNOmode)
-   && ix86_binary_operator_ok (AND, DImode, operands)"
+   && ix86_binary_operator_ok (AND, DImode, operands, TARGET_APX_NDD)"
   "@
    and{l}\t{%k2, %k0|%k0, %k2}
    and{q}\t{%2, %0|%0, %2}
-   and{q}\t{%2, %0|%0, %2}"
+   and{q}\t{%2, %0|%0, %2}
+   and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2}
+   and{q}\t{%2, %1, %0|%0, %1, %2}
+   and{q}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "alu")
-   (set_attr "mode" "SI,DI,DI")])
+   (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,apx_ndd")
+   (set_attr "mode" "SI,DI,DI,SI,DI,DI")])
 
 ;; See comment for addsi_1_zext why we do use nonimmediate_operand
 (define_insn "*andsi_2_zext"
   [(set (reg FLAGS_REG)
 	(compare (and:SI
-		  (match_operand:SI 1 "nonimmediate_operand" "%0")
-		  (match_operand:SI 2 "x86_64_general_operand" "rBMe"))
+		  (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r")
+		  (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM"))
 		 (const_int 0)))
-   (set (match_operand:DI 0 "register_operand" "=r")
+   (set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))]
   "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode)
-   && ix86_binary_operator_ok (AND, SImode, operands)"
-  "and{l}\t{%2, %k0|%k0, %2}"
+   && ix86_binary_operator_ok (AND, SImode, operands, TARGET_APX_NDD)"
+  "@
+   and{l}\t{%2, %k0|%k0, %2}
+   and{l}\t{%2, %1, %k0|%k0, %1, %2}
+   and{l}\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "alu")
+   (set_attr "isa" "*,apx_ndd,apx_ndd")
    (set_attr "mode" "SI")])
 
 (define_insn "*andqi_2_maybe_si"
   [(set (reg FLAGS_REG)
 	(compare (and:QI
-		  (match_operand:QI 1 "nonimmediate_operand" "%0,0,0")
-		  (match_operand:QI 2 "general_operand" "qn,m,n"))
+		  (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r")
+		  (match_operand:QI 2 "general_operand" "qn,m,n,rn,m"))
 		 (const_int 0)))
-   (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r")
+   (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r")
 	(and:QI (match_dup 1) (match_dup 2)))]
-  "ix86_binary_operator_ok (AND, QImode, operands)
+  "ix86_binary_operator_ok (AND, QImode, operands, TARGET_APX_NDD)
    && ix86_match_ccmode (insn,
 			 CONST_INT_P (operands[2])
 			 && INTVAL (operands[2]) >= 0 ? CCNOmode : CCZmode)"
@@ -12117,11 +12155,16 @@ (define_insn "*andqi_2_maybe_si"
       operands[2] = GEN_INT (INTVAL (operands[2]) & 0xff);
       return "and{l}\t{%2, %k0|%k0, %2}";
     }
+  if (which_alternative > 2)
+    return "and{b}\t{%2, %1, %0|%0, %1, %2}";
   return "and{b}\t{%2, %0|%0, %2}";
 }
   [(set_attr "type" "alu")
+   (set_attr "isa" "*,*,*,apx_ndd,apx_ndd")
    (set (attr "mode")
-	(cond [(eq_attr "alternative" "2")
+	(cond [(eq_attr "alternative" "3,4")
+		 (const_string "QI")
+	       (eq_attr "alternative" "2")
 		 (const_string "SI")
 	       (and (match_test "optimize_insn_for_size_p ()")
 		    (and (match_operand 0 "ext_QIreg_operand")
@@ -12138,15 +12181,21 @@ (define_insn "*andqi_2_maybe_si"
 (define_insn "*and_2"
   [(set (reg FLAGS_REG)
 	(compare (and:SWI124
-		  (match_operand:SWI124 1 "nonimmediate_operand" "%0,0")
-		  (match_operand:SWI124 2 "" ","))
+		  (match_operand:SWI124 1 "nonimmediate_operand" "%0,0,rm,r")
+		  (match_operand:SWI124 2 "" ",,r,"))
 		 (const_int 0)))
-   (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,")
+   (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,,r,r")
 	(and:SWI124 (match_dup 1) (match_dup 2)))]
   "ix86_match_ccmode (insn, CCNOmode)
-   && ix86_binary_operator_ok (AND, mode, operands)"
-  "and{}\t{%2, %0|%0, %2}"
+   && ix86_binary_operator_ok (AND, mode, operands,
+			       TARGET_APX_NDD)"
+  "@
+   and{}\t{%2, %0|%0, %2}
+   and{}\t{%2, %0|%0, %2}
+   and{}\t{%2, %1, %0|%0, %1, %2}
+   and{}\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "alu")
+   (set_attr "isa" "*,*,apx_ndd,apx_ndd")
    (set_attr "mode" "")])
 
 (define_insn "*qi_ext_0"
@@ -12392,6 +12441,7 @@ (define_insn_and_split "*qi_ext_3"
 ;; Don't do the splitting with memory operands, since it introduces risk
 ;; of memory mismatch stalls. We may want to do the splitting for optimizing
 ;; for size, but that can (should?) be handled by generic code instead.
+;; Don't do the splitting for APX NDD as NDD does not support *h registers.
 (define_split
   [(set (match_operand:SWI248 0 "QIreg_operand")
 	(and:SWI248 (match_operand:SWI248 1 "register_operand")
@@ -12399,7 +12449,8 @@ (define_split
    (clobber (reg:CC FLAGS_REG))]
   "reload_completed
    && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
-   && !(~INTVAL (operands[2]) & ~(255 << 8))"
+   && !(~INTVAL (operands[2]) & ~(255 << 8))
+   && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))"
   [(parallel
      [(set (zero_extract:HI (match_dup 0)
 			    (const_int 8)
@@ -12428,7 +12479,9 @@ (define_split
   "reload_completed
    && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && !(~INTVAL (operands[2]) & ~255)
-   && !(INTVAL (operands[2]) & 128)"
+   && !(INTVAL (operands[2]) & 128)
+   && !(TARGET_APX_NDD
+	&& !rtx_equal_p (operands[0], operands[1]))"
   [(parallel [(set (strict_low_part (match_dup 0))
 		   (and:QI (match_dup 1)
 			   (match_dup 2)))
diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c
index 2bd551614c4..be436d57bdf 100644
--- a/gcc/testsuite/gcc.target/i386/apx-ndd.c
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c
@@ -85,6 +85,15 @@ F (int, not, ~)
 F1 (int, not, ~)
 F (long, not, ~)
 F1 (long, not, ~)
+
+FOO (char, and, &)
+FOO1 (char, and, &)
+FOO (short, and, &)
+FOO1 (short, and, &)
+FOO (int, and, &)
+FOO1 (int, and, &)
+FOO (long, and, &)
+FOO1 (long, and, &)
 /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
@@ -95,3 +104,7 @@ F1 (long, not, ~)
 /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */
+/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */
+/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */
+/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */
+/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */
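
For readers without the test file at hand, here is a rough sketch of the
kind of functions the new FOO/FOO1 lines are expected to instantiate. The
macro bodies below are an assumption inferred from the scan-assembler
patterns (a memory operand ANDed with the immediate 1, and a
register-register AND); the real definitions are outside this hunk and
may differ, for example in attributes.

```c
#include <assert.h>

/* Assumed shapes of the test macros; the real definitions in
   apx-ndd.c are not shown in this patch and may differ.  */

/* Memory source ANDed with the immediate 1, result in a new register.  */
#define FOO(TYPE, OP_NAME, OP)			\
  TYPE foo_##OP_NAME##_##TYPE (TYPE *a)		\
  {						\
    TYPE b = *a OP 1;				\
    return b;					\
  }

/* Two register sources, result in a third register.  */
#define FOO1(TYPE, OP_NAME, OP)				\
  TYPE foo1_##OP_NAME##_##TYPE (TYPE a, TYPE b)		\
  {							\
    TYPE c = a OP b;					\
    return c;						\
  }

FOO (char, and, &)
FOO1 (char, and, &)
FOO (short, and, &)
FOO1 (short, and, &)
FOO (int, and, &)
FOO1 (int, and, &)
FOO (long, and, &)
FOO1 (long, and, &)
```

Under that assumption, each FOO instance loads from (%rdi) and each FOO1
instance works on the two argument registers, which is where a
three-operand NDD AND can write the result directly to the return
register.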