From patchwork Wed Nov 15 09:46:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165241 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429806vqg; Wed, 15 Nov 2023 01:51:02 -0800 (PST) X-Google-Smtp-Source: AGHT+IHliB6MKAq6U2lds8QTJr/L4jf5MBpGd2UKrvbptYUOC3WMu8Ny+aTx7/DcIut4ja4ZmH7H X-Received: by 2002:a05:620a:4627:b0:77b:cc56:49f2 with SMTP id br39-20020a05620a462700b0077bcc5649f2mr5476489qkb.61.1700041862407; Wed, 15 Nov 2023 01:51:02 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041862; cv=pass; d=google.com; s=arc-20160816; b=QKR4qzJYvGb3lMG92CeNFRV0/RcjRSAYm4ow2ZIJDu6KboiOmkcXYnauGBSmlLhJQE J5YeCfltz4BG9+bsbxXXKrpzKoV89i8x/Ma32iUXMwssqQSUZAegVk51e/MhH79As4Bf l4HZLoBu7V6mvnBJDokvuwZYNR4+DsF+dZ487sX6fLAhxi3u+PU723kxNdtOT+EzXm9y wuDrWBB50yd8+tYWn4Vej9rGtJ2iBuArB+B1tCqSS7nrLLak1bizt5M9RlBF28Mg2ED7 8V+xbMCVxNfBqyOg1wCLRUXrANmVpotvfyz6g72QcqLV304L0y8Ynu6VN9x4s2WIvkS1 /HvQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=M8yJ7Smpk6dxlWwLXZGYma4715kz8502NAgAMW50quw=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=N7wanmuWhgUMxPt1zZn2G//aEgFhYHSZcbFqTJljCT8iPdL7blAXuRE338T6hPgCR2 lGG3DCLqdyk7uUuLzXZsgdftaGwn5u39AxrfSW5/P1T5LqNZyoHkYvh2qHMyOuw6Ccim u33SftYrzu9v7HJxTft1evuhBz+03MD4vANm87rJLT7qAWsfJYrIYPCRAEEna0oVRSWV 4keM7pS6ECsH0LgDxC6snAZ++DzMpQwDW4omiAhCM4Iw0KqJk2mPHx9BGfZK69PYjxEZ GqitV0yH8nyj6HQqbxMDAqWNY41AGcObxz7hdWID6TUofsEIii8ZXNceNopeGvR4vpfG C0Pw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GnLDduyA; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id h24-20020a05620a21d800b0076eeeaaa07asi8470829qka.492.2023.11.15.01.51.02 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:51:02 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GnLDduyA; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2FC743857C60 for ; Wed, 15 Nov 2023 09:51:02 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 81AA3385802F for ; Wed, 15 Nov 2023 09:47:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 81AA3385802F Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 81AA3385802F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041642; cv=none; b=RjTrKJ5KF5NjBtzPXCfF7jRt2xzcTllnKMZAUfZ6j0J171MfNolUU18p3+3IME/3dGKFbm0R1pSkkXLJQHL4d6rBnZtp1aMjjzuPwcV4W0m6lVTdsNOgRnBh46jVGyHiqF4TupphLfuguUgBtwG/BmWdIH17uZuktVguzDiW2/U= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041642; c=relaxed/simple; bh=kpOHflRJOsh0VzDAL/st4FmYIMQtxPitKliiGz0gVSE=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=CgfcCK9A+OxpR97tOYzzakh2tqPVoqYb6CyklGJEQ2p/klJLOfemaIUYJMwTJn3FTm7FkpJORVMHbdMCVWPQPILumV961i78svYz2Sy6zseKHgB5fgbwqHhbfnmiNWl7Ck8lZbdzEPFylF8rA1fRN6UBzR7XOClRh90G8+TzpPQ= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041639; x=1731577639; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kpOHflRJOsh0VzDAL/st4FmYIMQtxPitKliiGz0gVSE=; b=GnLDduyAnT5Ge1tNm3Q/sYLC044yfcgsrwUzydC5PVDk6qEwNLId5uIJ f8JeFIX87mOaRAm5iR417heKXMt1y2o4Q+NWVAQRYOEYO/cWh2QuQpkRD vZDq73VifRSc856zjkrSc9hx9+de4jItkVpdbgvKbAxBYwbUbAq8BA81+ 3YtOfjvQfToZx/+JPIh6LxOE0Q31xDfutNYiFSujw+a01GUPOntKCOq7K A0H8hEpBtkIOV7BNtx+bKS6dAL7/ShYDzL4ZM+CGCNU3Hg5XCv6a9NyOL qGWS+KoEMJiDCS+f0FBXJj2nBTvJklgt5FcjIDT/e10sNHojOyl0icaKx Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342610" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342610" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431710" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431710" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:10 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id BEA2E100567F; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 10/16] [APX NDD] Support APX NDD for and insn Date: Wed, 15 Nov 2023 17:46:59 +0800 Message-Id: <20231115094705.3976553-11-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623096031561697 X-GMAIL-MSGID: 1782623096031561697 From: Kong Lingling For NDD form AND insn, there are three splitter fixes after extending legacy patterns. 1. APX NDD does not support high QImode registers like ah, bh, ch, dh, so for some optimization splitters that generates highpart zero_extract for QImode need to be prohibited under NDD pattern. 2. Legacy AND insn will use r/qm/L constraint, and a post-reload splitter will transform it into zero_extend move. But for NDD form AND, the splitter is not strict enough as the splitter assum such AND will have the const_int operand matching the constraint "L", then NDD form AND allows const_int with any QI values. Restrict the splitter condition to match "L" constraint that strictly matches zero-extend sematic. 3. Legacy AND insn will adopt r/0/Z constraint, a splitter will try to optimize such form into strict_lowpart QImode AND when 7th bit is not set. But the splitter will wronly convert non-zext form of NDD and with memory src, then the strict_lowpart transform matches alternative 1 of *_slp_1 and generates *movstrict_1 so the zext sematic was omitted. This could cause highpart of dest not cleared and generates wrong code. Disable the splitter when NDD adopted and operands[0] and operands[1] are not equal. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add AND support. * config/i386/i386.md (and3): Add NDD alternatives and adjust output template. (*anddi_1): Likewise. (*and_1): Likewise. (*andqi_1): Likewise. (*andsi_1_zext): Likewise. (*anddi_2): Likewise. (*andsi_2_zext): Likewise. (*andqi_2_maybe_si): Likewise. (*and_2): Likewise. (*and3_doubleword): Add NDD constraints, emit move for optimized case if operands[0] not equal to operands[1]. (define_split for QI highpart AND): Prohibit splitter to split NDD form AND insn to qi_ext_3. (define_split for QI strict_lowpart optimization): Prohibit splitter to split NDD form AND insn to *3_1_slp. (define_split for zero_extend and optimization): Prohibit splitter to split NDD form AND insn to zero_extend insn. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add and test. * gcc.target/i386/apx-spill_to_egprs-1.c: Change some check. --- gcc/config/i386/i386-expand.cc | 1 + gcc/config/i386/i386.md | 177 ++++++++++++------ gcc/testsuite/gcc.target/i386/apx-ndd.c | 13 ++ .../gcc.target/i386/apx-spill_to_egprs-1.c | 8 +- 4 files changed, 135 insertions(+), 64 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index be77ba4a476..662f687abc3 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1273,6 +1273,7 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case MINUS: case NEG: case NOT: + case AND: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 9758e4e5144..4bf0c16f401 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -11471,18 +11471,20 @@ (define_expand "and3" (operands[0], gen_lowpart (mode, operands[1]), mode, mode, 1)); else - ix86_expand_binary_operator (AND, mode, operands); + ix86_expand_binary_operator (AND, mode, operands, + ix86_can_use_ndd_p (AND)); DONE; }) (define_insn_and_split "*and3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (and: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, + ix86_can_use_ndd_p (AND))" "#" "&& reload_completed" [(const_int:DWIH 0)] @@ -11494,39 +11496,53 @@ (define_insn_and_split "*and3_doubleword" if (operands[2] == const0_rtx) emit_move_insn (operands[0], const0_rtx); else if (operands[2] == constm1_rtx) - emit_insn_deleted_note_p = true; + { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + else + emit_insn_deleted_note_p = true; + } else - ix86_expand_binary_operator (AND, mode, &operands[0]); + ix86_expand_binary_operator (AND, mode, &operands[0], + ix86_can_use_ndd_p (AND)); if (operands[5] == const0_rtx) emit_move_insn (operands[3], const0_rtx); else if (operands[5] == constm1_rtx) { - if (emit_insn_deleted_note_p) + if (!rtx_equal_p (operands[3], operands[4])) + emit_move_insn (operands[3], operands[4]); + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); } else - ix86_expand_binary_operator (AND, mode, &operands[3]); + ix86_expand_binary_operator (AND, mode, &operands[3], + ix86_can_use_ndd_p (AND)); DONE; -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*anddi_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,?k") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,rm,r,r,r,r,?k") (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm,k") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,L,k"))) + (match_operand:DI 1 "nonimmediate_operand" "%0,r,0,0,rm,r,qm,k") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,Z,re,m,re,m,L,k"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands, + ix86_can_use_ndd_p (AND))" "@ and{l}\t{%k2, %k0|%k0, %k2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} and{q}\t{%2, %0|%0, %2} and{q}\t{%2, %0|%0, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} # #" - [(set_attr "isa" "x64,x64,x64,x64,avx512bw_512") - (set_attr "type" "alu,alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,*,0,*") + [(set_attr "isa" "x64,apx_ndd,x64,x64,apx_ndd,apx_ndd,x64,avx512bw_512") + (set_attr "type" "alu,alu,alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11534,7 +11550,7 @@ (define_insn "*anddi_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" "SI,DI,DI,SI,DI")]) + (set_attr "mode" "SI,SI,DI,DI,DI,DI,SI,DI")]) (define_insn_and_split "*anddi_1_btr" [(set (match_operand:DI 0 "nonimmediate_operand" "=rm") @@ -11589,36 +11605,46 @@ (define_split ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands, + ix86_can_use_ndd_p (AND))" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*and_1" - [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,Ya,?k") - (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,qm,k") - (match_operand:SWI24 2 "" "r,,L,k"))) + [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,r,r,Ya,?k") + (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,rm,r,qm,k") + (match_operand:SWI24 2 "" "r,,r,,L,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, + ix86_can_use_ndd_p (AND))" "@ and{}\t{%2, %0|%0, %2} and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2} # #" [(set (attr "isa") - (cond [(eq_attr "alternative" "3") + (cond [(eq_attr "alternative" "2,3") + (const_string "apx_ndd") + (eq_attr "alternative" "5") (if_then_else (eq_attr "mode" "SI") (const_string "avx512bw") (const_string "avx512f")) ] (const_string "*"))) - (set_attr "type" "alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,0,*") + (set_attr "type" "alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11626,24 +11652,28 @@ (define_insn "*and_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" ",,SI,")]) + (set_attr "mode" ",,,,SI,")]) (define_insn "*andqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") - (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") + (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, QImode, operands)" + "ix86_binary_operator_ok (AND, QImode, operands, + ix86_can_use_ndd_p (AND))" "@ and{b}\t{%2, %0|%0, %2} and{b}\t{%2, %0|%0, %2} and{l}\t{%k2, %k0|%k0, %k2} + and{b}\t{%2, %1, %0|%0, %1, %2} + and{b}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "type" "alu,alu,alu,msklog") + [(set_attr "type" "alu,alu,alu,alu,alu,msklog") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,*") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -11683,7 +11713,10 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!REG_P (operands[1]) - || REGNO (operands[0]) != REGNO (operands[1]))" + || REGNO (operands[0]) != REGNO (operands[1])) + && (UINTVAL (operands[2]) == GET_MODE_MASK (SImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (HImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (QImode))" [(const_int 0)] { unsigned HOST_WIDE_INT ival = UINTVAL (operands[2]); @@ -11756,10 +11789,10 @@ (define_insn "*anddi_2" [(set (reg FLAGS_REG) (compare (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m")) + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,r,rm,r") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,Z,re,m")) (const_int 0))) - (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r") + (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,r,r") (and:DI (match_dup 1) (match_dup 2)))] "TARGET_64BIT && ix86_match_ccmode @@ -11773,38 +11806,49 @@ (define_insn "*anddi_2" && (!CONST_INT_P (operands[2]) || val_signbit_known_set_p (SImode, INTVAL (operands[2])))) ? CCZmode : CCNOmode) - && ix86_binary_operator_ok (AND, DImode, operands)" + && ix86_binary_operator_ok (AND, DImode, operands, + ix86_can_use_ndd_p (AND))" "@ and{l}\t{%k2, %k0|%k0, %k2} and{q}\t{%2, %0|%0, %2} - and{q}\t{%2, %0|%0, %2}" + and{q}\t{%2, %0|%0, %2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") - (set_attr "mode" "SI,DI,DI")]) + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,apx_ndd") + (set_attr "mode" "SI,DI,DI,SI,DI,DI")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_2_zext" [(set (reg FLAGS_REG) (compare (and:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (AND, SImode, operands, + ix86_can_use_ndd_p (AND))" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*andqi_2_maybe_si" [(set (reg FLAGS_REG) (compare (and:QI - (match_operand:QI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:QI 2 "general_operand" "qn,m,n")) + (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r") + (match_operand:QI 2 "general_operand" "qn,m,n,rn,m")) (const_int 0))) - (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r") + (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r") (and:QI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (AND, QImode, operands) + "ix86_binary_operator_ok (AND, QImode, operands, + ix86_can_use_ndd_p (AND)) && ix86_match_ccmode (insn, CONST_INT_P (operands[2]) && INTVAL (operands[2]) >= 0 ? CCNOmode : CCZmode)" @@ -11815,9 +11859,12 @@ (define_insn "*andqi_2_maybe_si" operands[2] = GEN_INT (INTVAL (operands[2]) & 0xff); return "and{l}\t{%2, %k0|%k0, %2}"; } + if (which_alternative > 2) + return "and{b}\t{%2, %1, %0|%0, %1, %2}"; return "and{b}\t{%2, %0|%0, %2}"; } [(set_attr "type" "alu") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") @@ -11836,15 +11883,21 @@ (define_insn "*andqi_2_maybe_si" (define_insn "*and_2" [(set (reg FLAGS_REG) (compare (and:SWI124 - (match_operand:SWI124 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI124 2 "" ",")) + (match_operand:SWI124 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI124 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,,r,r") (and:SWI124 (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, mode, operands)" - "and{}\t{%2, %0|%0, %2}" + && ix86_binary_operator_ok (AND, mode, operands, + ix86_can_use_ndd_p (AND))" + "@ + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,*,apx_ndd,apx_ndd") (set_attr "mode" "")]) (define_insn "*qi_ext_0" @@ -12057,6 +12110,7 @@ (define_insn_and_split "*qi_ext_3" ;; Don't do the splitting with memory operands, since it introduces risk ;; of memory mismatch stalls. We may want to do the splitting for optimizing ;; for size, but that can (should?) be handled by generic code instead. +;; Don't do the splitting for APX NDD as NDD does not support *h registers. (define_split [(set (match_operand:SWI248 0 "QIreg_operand") (and:SWI248 (match_operand:SWI248 1 "register_operand") @@ -12064,7 +12118,8 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(~INTVAL (operands[2]) & ~(255 << 8))" + && !(~INTVAL (operands[2]) & ~(255 << 8)) + && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))" [(parallel [(set (zero_extract:HI (match_dup 0) (const_int 8) @@ -12093,7 +12148,9 @@ (define_split "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) && !(~INTVAL (operands[2]) & ~255) - && !(INTVAL (operands[2]) & 128)" + && !(INTVAL (operands[2]) & 128) + && !(TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]))" [(parallel [(set (strict_low_part (match_dup 0)) (and:QI (match_dup 1) (match_dup 2))) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 9af72d1a46d..b34194b762d 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -85,6 +85,15 @@ F (int, not, ~) F1 (int, not, ~) F (long, not, ~) F1 (long, not, ~) + +FOO (char, and, &) +FOO1 (char, and, &) +FOO (short, and, &) +FOO1 (short, and, &) +FOO (int, and, &) +FOO1 (int, and, &) +FOO (long, and, &) +FOO1 (long, and, &) /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -95,3 +104,7 @@ F1 (long, not, ~) /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c index 290863d63a7..ecddcd6c14c 100644 --- a/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c +++ b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c @@ -3,8 +3,8 @@ #include "spill_to_mask-1.c" -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r16d" } } */ -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r17d" } } */ +/* { dg-final { scan-assembler "(?:movl|rorx)\[ \t]+\[^\\n\\r\]*, %r16d" } } */ +/* { dg-final { scan-assembler "(?:movl|rorx)\[ \t]+\[^\\n\\r\]*, %r17d" } } */ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r18d" } } */ /* { dg-final { scan-assembler "movq\[ \t]+\[^\\n\\r\]*, %r19" } } */ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r20d" } } */ @@ -13,8 +13,8 @@ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r23d" } } */ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r24d" } } */ /* { dg-final { scan-assembler "addl\[ \t]+\[^\\n\\r\]*, %r25d" } } */ -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r26d" } } */ -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r27d" } } */ +/* { dg-final { scan-assembler "(?:movl|movbel)\[ \t]+\[^\\n\\r\]*, %r26d" } } */ +/* { dg-final { scan-assembler "(?:movl|movbel)\[ \t]+\[^\\n\\r\]*, %r27d" } } */ /* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r28d" } } */ /* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r29d" } } */ /* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r30d" } } */