From patchwork Wed Nov 15 09:46:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165232 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2428980vqg; Wed, 15 Nov 2023 01:48:32 -0800 (PST) X-Google-Smtp-Source: AGHT+IEe7Ck9lDVy6oBZDbeV35NPPDdkLkSfBI7K15zIUpV0kl4PYVASRoqTUeTqjnqBghYLDjV6 X-Received: by 2002:a05:620a:c45:b0:778:98bf:1e55 with SMTP id u5-20020a05620a0c4500b0077898bf1e55mr5071750qki.65.1700041712188; Wed, 15 Nov 2023 01:48:32 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041712; cv=pass; d=google.com; s=arc-20160816; b=Dsy5dh+w7W+kWhu11Zfuv8qQhhXdXQDT0bEx25D8LhY1aIg1d+BC+HhNoh1T1WFfWa RHSwAmnZFPDtwF6DHiTPThRK7xe6xHxKMxmAEMHkXFpDoYYbyDZt670mdGR/Qv3aGlUV 01Zk0WaQ73BVqfAkGKi0/E7rLFmweMm8QdZU/NL5S5m+RkpL5+tML40gCX+bGNoET7pP /xqidk74vYw4sBa2comjMCh21LVBbu/eoI73tl1XM299BR9hmFh7A1B6FZu+FMrH/3J+ 89HrL8rPRc7JHfrFClN4BRRGs7PY9r9ErGyQgcCg92G66Nla+qKDN/2HgYP5Ab8LJR75 bIVQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=aO7nIsDi8AjyRbY/jijZrOJLIWhF2akbJaIX337R3ig=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=Q6n6OvXXXB75CnQB+KsDha2N4i+VLo97alzk2JPL35L+y9E83YEl+ZTy+uGzjyE8EW 5/Nw16NfQSLWhdEQIv3GCtSOwIQn3CTncnzMk5zMj8cYhhBYlEYUcrliXWpMj4qUYwKQ GrXukWnGBe8rLOulRlM6+WfixvomZnfBsPWrDKTJi9b3U0lyjVFZ16vTVaajZ1pDJt/o aGN+2zFoaRc8IklMkQo9+muTH1UosxT4TccMAoQJGnnNC4cMzSO2xJZ0Uh4ulb7l250+ FTvcBJWsFGSXkFFQ2KuCjIYT0ZTAOr0jOagZTdi4n6MrvdhViJOq8RN7DI3lD+ZKoThT Q13g== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=AJl9sUu1; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id e14-20020a05620a208e00b00773cb326234si8537538qka.413.2023.11.15.01.48.32 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:48:32 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=AJl9sUu1; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 54504385C33F for ; Wed, 15 Nov 2023 09:48:19 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id E13833858C52 for ; Wed, 15 Nov 2023 09:47:11 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E13833858C52 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E13833858C52 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.31 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041634; cv=none; b=gu0hC5nggEv0l1sbhWqBLcITnIZ+V+FCToNHfWPVqFsUzEE0TIOG15nyf+3qVQD2fwXYedOYz3Vz34knglgt2Uki4iISir3/ByfzcWXDcPXtqSz4zq3KfXGJfKEoAVwYMilYIR3OhzVahQbwENS7eQ7/qQifeXs8ia7o93MPbMI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041634; c=relaxed/simple; bh=UwzQFBwHKjbY0mBjaQ0Xy783wevM8tqwPF0x8fGAb3c=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=iQj+yXjkAhtff7D0zTOsY1FP5n4IVGyZoW5phGJEUQbSbMhxavvRuSuJfNKo0I/KpWDQKgkTdJBjLUmiLHX114GQ1HZFQDvCjABY6sdwULS9VenU2V6QIuxKLnktTioWZ0+BCTEn5Hsdwqcamxf8BMV6rPXshHBCbcsXz9s28gs= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041632; x=1731577632; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UwzQFBwHKjbY0mBjaQ0Xy783wevM8tqwPF0x8fGAb3c=; b=AJl9sUu1cRt4A8aDRiNUTnoRRoul6rSjdEn7B2cqCu0n7X85ENywo84g vb4DEtdTTQoodOtD7NJhW7iGPmVzgZO2DD8zX0uo/b1+xdfs3Gh437uYT s3FMQlZK4POPMsiE66QyiYr10W99k6yg6FHXQubonfj7aLAG4mp+Bj6m+ m+vxl9P7DGtxcALb+JFPaMZ/WvTjhM86r1HwHma+srlpTNyHe9gaH282z QW/EqqCrcuRoJtBgkFnf3cLE1JoO75Sqw19r67OlobWIVJX7hV351g6ah SO6tjVRnvOUQwCJxqIvh+OmUPl7nTPmS6PgCUL97hNyl6uAWbb4CFCSdM Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="455138350" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="455138350" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="6105933" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa002.jf.intel.com with ESMTP; 15 Nov 2023 01:47:07 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id A82491005675; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 01/16] [APX NDD] Support Intel APX NDD for legacy add insn Date: Wed, 15 Nov 2023 17:46:50 +0800 Message-Id: <20231115094705.3976553-2-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622938624182999 X-GMAIL-MSGID: 1782622938624182999 From: Kong Lingling APX NDD provides an extra destination register operand for several gpr related legacy insns, so a new alternative can be adopted to operand1 with "r" constraint. This first patch supports NDD for add instruction, and keeps to use lea when all operands are registers since lea have shorter encoding. For add operations containing mem NDD will be adopted to save an extra move. In legacy x86 binary operation expand it will force operands[0] and operands[1] to be the same so add a helper function to allow NDD form pattern that operands[0] and operands[1] can be different. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): New function. (ix86_fixup_binary_operands): Add new use_ndd flag to check whether ndd can be used for this binop and adjust operand emit. (ix86_binary_operator_ok): Likewise. (ix86_expand_binary_operator): Likewise, and void postreload expand generate lea pattern when use_ndd is explicit parsed. * config/i386/i386-options.cc (ix86_option_override_internal): Prohibit apx subfeatures when not in 64bit mode. * config/i386/i386-protos.h (ix86_binary_operator_ok): Add use_ndd flag. (ix86_fixup_binary_operand): Likewise. (ix86_expand_binary_operand): Likewise. * config/i386/i386.md (*add_1): Extend with new alternatives to support NDD, and adjust output template. (*addhi_1): Likewise. (*addqi_1): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: New test. --- gcc/config/i386/i386-expand.cc | 31 +++++-- gcc/config/i386/i386-options.cc | 3 + gcc/config/i386/i386-protos.h | 7 +- gcc/config/i386/i386.md | 109 ++++++++++++++---------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 21 +++++ 5 files changed, 118 insertions(+), 53 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd.c diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index a8d871d321e..ea0e5881087 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1260,6 +1260,22 @@ ix86_swap_binary_operands_p (enum rtx_code code, machine_mode mode, return false; } +/* APX extends most (but not all) integer instructions with a new form that + has a third register operand called a nondestructive destination (NDD). */ + +bool ix86_can_use_ndd_p (enum rtx_code code) +{ + if (!TARGET_APX_NDD) + return false; + switch (code) + { + case PLUS: + return true; + default: + return false; + } + return false; +} /* Fix up OPERANDS to satisfy ix86_binary_operator_ok. Return the destination to use for the operation. If different from the true @@ -1267,7 +1283,7 @@ ix86_swap_binary_operands_p (enum rtx_code code, machine_mode mode, rtx ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode, - rtx operands[]) + rtx operands[], bool use_ndd) { rtx dst = operands[0]; rtx src1 = operands[1]; @@ -1307,7 +1323,7 @@ ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode, src1 = force_reg (mode, src1); /* Source 1 cannot be a non-matching memory. */ - if (MEM_P (src1) && !rtx_equal_p (dst, src1)) + if (!use_ndd && MEM_P (src1) && !rtx_equal_p (dst, src1)) src1 = force_reg (mode, src1); /* Improve address combine. */ @@ -1338,11 +1354,11 @@ ix86_fixup_binary_operands_no_copy (enum rtx_code code, void ix86_expand_binary_operator (enum rtx_code code, machine_mode mode, - rtx operands[]) + rtx operands[], bool use_ndd) { rtx src1, src2, dst, op, clob; - dst = ix86_fixup_binary_operands (code, mode, operands); + dst = ix86_fixup_binary_operands (code, mode, operands, use_ndd); src1 = operands[1]; src2 = operands[2]; @@ -1352,7 +1368,8 @@ ix86_expand_binary_operator (enum rtx_code code, machine_mode mode, if (reload_completed && code == PLUS - && !rtx_equal_p (dst, src1)) + && !rtx_equal_p (dst, src1) + && !use_ndd) { /* This is going to be an LEA; avoid splitting it later. */ emit_insn (op); @@ -1451,7 +1468,7 @@ ix86_expand_vector_logical_operator (enum rtx_code code, machine_mode mode, bool ix86_binary_operator_ok (enum rtx_code code, machine_mode mode, - rtx operands[3]) + rtx operands[3], bool use_ndd) { rtx dst = operands[0]; rtx src1 = operands[1]; @@ -1475,7 +1492,7 @@ ix86_binary_operator_ok (enum rtx_code code, machine_mode mode, return false; /* Source 1 cannot be a non-matching memory. */ - if (MEM_P (src1) && !rtx_equal_p (dst, src1)) + if (!use_ndd && MEM_P (src1) && !rtx_equal_p (dst, src1)) /* Support "andhi/andsi/anddi" as a zero-extending move. */ return (code == AND && (mode == HImode diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index df7d24352d1..7bef8ed8e8a 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -2099,6 +2099,9 @@ ix86_option_override_internal (bool main_args_p, if (TARGET_APX_F && !TARGET_64BIT) error ("%<-mapxf%> is not supported for 32-bit code"); + if (opts->x_ix86_apx_features != apx_none && !TARGET_64BIT) + error ("%<-mapx-features=%> option is not supported for 32-bit code"); + if (TARGET_UINTR && !TARGET_64BIT) error ("%<-muintr%> not supported for 32-bit code"); diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 28d0eab11d5..3e08eae4e79 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -108,14 +108,15 @@ extern void ix86_expand_move (machine_mode, rtx[]); extern void ix86_expand_vector_move (machine_mode, rtx[]); extern void ix86_expand_vector_move_misalign (machine_mode, rtx[]); extern rtx ix86_fixup_binary_operands (enum rtx_code, - machine_mode, rtx[]); + machine_mode, rtx[], bool = false); extern void ix86_fixup_binary_operands_no_copy (enum rtx_code, machine_mode, rtx[]); extern void ix86_expand_binary_operator (enum rtx_code, - machine_mode, rtx[]); + machine_mode, rtx[], bool = false); extern void ix86_expand_vector_logical_operator (enum rtx_code, machine_mode, rtx[]); -extern bool ix86_binary_operator_ok (enum rtx_code, machine_mode, rtx[3]); +extern bool ix86_binary_operator_ok (enum rtx_code, machine_mode, rtx[3], bool = false); +extern bool ix86_can_use_ndd_p (enum rtx_code); extern bool ix86_avoid_lea_for_add (rtx_insn *, rtx[]); extern bool ix86_use_lea_for_mov (rtx_insn *, rtx[]); extern bool ix86_avoid_lea_for_addr (rtx_insn *, rtx[]); diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 29289f48e9c..daab634fea0 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -559,7 +559,7 @@ (define_attr "unit" "integer,i387,sse,mmx,unknown" ;; Used to control the "enabled" attribute on a per-instruction basis. (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx, - x64_avx,x64_avx512bw,x64_avx512dq,aes, + x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd, sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx, avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512, noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq, @@ -957,6 +957,8 @@ (define_attr "enabled" "" (symbol_ref "TARGET_AVX512BF16 && TARGET_AVX512VL") (eq_attr "isa" "vpclmulqdqvl") (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL") + (eq_attr "isa" "apx_ndd") + (symbol_ref "TARGET_APX_NDD") (eq_attr "mmx_isa" "native") (symbol_ref "!TARGET_MMX_WITH_SSE") @@ -6229,7 +6231,8 @@ (define_expand "add3" (plus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand") (match_operand:SDWIM 2 "")))] "" - "ix86_expand_binary_operator (PLUS, mode, operands); DONE;") + "ix86_expand_binary_operator (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS)); DONE;") (define_insn_and_split "*add3_doubleword" [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") @@ -6356,26 +6359,29 @@ (define_insn_and_split "*add3_doubleword_concat_zext" "split_double_mode (mode, &operands[0], 1, &operands[0], &operands[5]);") (define_insn "*add_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r") (plus:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r") - (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r") + (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,re,BM"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" { + bool use_ndd = (which_alternative == 4 || which_alternative == 5); switch (get_attr_type (insn)) { case TYPE_LEA: return "#"; case TYPE_INCDEC: - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: @@ -6384,14 +6390,16 @@ (define_insn "*add_1" if (which_alternative == 2) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") (match_operand:SWI48 2 "incdec_operand") @@ -6460,25 +6468,26 @@ (define_insn "addsi_1_zext" (set_attr "mode" "SI")]) (define_insn "*addhi_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,r,Yp") - (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,Yp") - (match_operand:HI 2 "general_operand" "rn,m,0,ln"))) + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,r,Yp,r,r") + (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,Yp,rm,r") + (match_operand:HI 2 "general_operand" "rn,m,0,ln,rn,m"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, HImode, operands)" + "ix86_binary_operator_ok (PLUS, HImode, operands, + ix86_can_use_ndd_p (PLUS))" { + bool use_ndd = (which_alternative == 4 || which_alternative == 5); switch (get_attr_type (insn)) { case TYPE_LEA: return "#"; case TYPE_INCDEC: - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (operands[2] == const1_rtx) - return "inc{w}\t%0"; + return use_ndd ? "inc{w}\t{%1, %0|%0, %1}" : "inc{w}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{w}\t%0"; + return use_ndd ? "dec{w}\t{%1, %0|%0, %1}" : "dec{w}\t%0"; } default: @@ -6487,14 +6496,16 @@ (define_insn "*addhi_1" if (which_alternative == 2) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], HImode)) - return "sub{w}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{w}\t{%2, %1, %0|%0, %1, %2}" + : "sub{w}\t{%2, %0|%0, %2}"; - return "add{w}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{w}\t{%2, %1, %0|%0, %1, %2}" + : "add{w}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") (match_operand:HI 2 "incdec_operand") @@ -6506,30 +6517,38 @@ (define_insn "*addhi_1" (and (eq_attr "type" "alu") (match_operand 2 "const128_operand")) (const_string "1") (const_string "*"))) - (set_attr "mode" "HI,HI,HI,SI")]) + (set_attr "mode" "HI,HI,HI,SI,HI,HI")]) (define_insn "*addqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,q,r,r,Yp") - (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,q,0,r,Yp") - (match_operand:QI 2 "general_operand" "qn,m,0,rn,0,ln"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,q,r,r,Yp,r,r") + (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,q,0,r,Yp,rm,r") + (match_operand:QI 2 "general_operand" "qn,m,0,rn,0,ln,rn,m"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, QImode, operands)" + "ix86_binary_operator_ok (PLUS, QImode, operands, + ix86_can_use_ndd_p (PLUS))" { bool widen = (get_attr_mode (insn) != MODE_QI); - + bool use_ndd = (which_alternative == 6 || which_alternative == 7); switch (get_attr_type (insn)) { case TYPE_LEA: return "#"; case TYPE_INCDEC: - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (operands[2] == const1_rtx) - return widen ? "inc{l}\t%k0" : "inc{b}\t%0"; + if (use_ndd) + return widen ? "inc{l}\t{%1, %k0|%k0, %1}" + : "inc{b}\t{%1, %0|%0, %1}"; + else + return widen ? "inc{l}\t%k0" : "inc{b}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return widen ? "dec{l}\t%k0" : "dec{b}\t%0"; + if (use_ndd) + return widen ? "dec{l}\t{%1, %k0|%k0, %1}" + : "dec{b}\t{%1, %0|%0, %1}"; + else + return widen ? "dec{l}\t%k0" : "dec{b}\t%0"; } default: @@ -6538,21 +6557,25 @@ (define_insn "*addqi_1" if (which_alternative == 2 || which_alternative == 4) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], QImode)) { - if (widen) - return "sub{l}\t{%2, %k0|%k0, %2}"; + if (use_ndd) + return widen ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sub{b}\t{%2, %1, %0|%0, %1, %2}"; else - return "sub{b}\t{%2, %0|%0, %2}"; + return widen ? "sub{l}\t{%2, %k0|%k0, %2}" + : "sub{b}\t{%2, %0|%0, %2}"; } - if (widen) - return "add{l}\t{%k2, %k0|%k0, %k2}"; + if (use_ndd) + return widen ? "add{l}\t{%k2, %1, %k0|%k0, %1, %k2}" + : "add{b}\t{%2, %1, %0|%0, %1, %2}"; else - return "add{b}\t{%2, %0|%0, %2}"; + return widen ? "add{l}\t{%k2, %k0|%k0, %k2}" + : "add{b}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "5") (const_string "lea") (match_operand:QI 2 "incdec_operand") @@ -6564,10 +6587,10 @@ (define_insn "*addqi_1" (and (eq_attr "type" "alu") (match_operand 2 "const128_operand")) (const_string "1") (const_string "*"))) - (set_attr "mode" "QI,QI,QI,SI,SI,SI") + (set_attr "mode" "QI,QI,QI,SI,SI,SI,SI,SI") ;; Potential partial reg stall on alternatives 3 and 4. (set (attr "preferred_for_speed") - (cond [(eq_attr "alternative" "3,4") + (cond [(eq_attr "alternative" "3,4,6,7") (symbol_ref "!TARGET_PARTIAL_REG_STALL")] (symbol_ref "true")))]) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c new file mode 100644 index 00000000000..dd3dc78e52f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-mapxf -O2" } */ +/* { dg-final { scan-assembler-not "movl"} } */ + +int foo (int *a) +{ + int b = *a - 1; + return b; +} + +int foo2 (int a, int b) +{ + int c = a + b; + return c; +} + +int foo3 (int *a, int b) +{ + int c = *a + b; + return c; +} From patchwork Wed Nov 15 09:46:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165229 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2428641vqg; Wed, 15 Nov 2023 01:47:38 -0800 (PST) X-Google-Smtp-Source: AGHT+IGy0ZBfbClYEe5JMSsDOrmaIFKUzk+HCwCJpo4Td5HizYs3lMuoWzQR8GvH32WsWqV0qukV X-Received: by 2002:a67:c383:0:b0:45d:852c:3e9f with SMTP id s3-20020a67c383000000b0045d852c3e9fmr11946947vsj.8.1700041658728; Wed, 15 Nov 2023 01:47:38 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041658; cv=pass; d=google.com; s=arc-20160816; b=fGHq1a0WEu1hdhHpBrZS9zFAyIrr1Wr7dE5Bg1+peJ7WPbEmCwtrMP3Co3CtiAtvkK SBkIKNI9HbwyTsakqF/kFccfD7LpUZM9M9jZ3nu271fpyKhl02m/bJ6qUgTZu3aKOfsc Pv+MbY/rz3KryWbUAXGDcBjjMPxOyXeH+PRvzzAhVPIiFeWD7KYCgwJ4fhYgWiCRoWIL 4X+HptJyirtXFImzdZ4zTxAHnN+g6vKrmLhHmm4MjqVuhNp1w7QyyqNwBbK+m/K+t3e/ Wr7XvDtomhq2gSFHcyieUFVSEzFCxtuTtU/cksRclrZ6Yf0Zw7+x6c20KvM2iMrfou/3 oiBg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=By3ftjJwLWQV9Hx3QJlgL8zbb7OnXPDiyWTi3mof3Io=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=stTGGArMCmOjS5aGliPVcxAazkFE2+sV+9dvogM11zmVGEjlBmBNRpMsx/ZNnZVJfC NoASw9f4jPlfbeVGiJFcPZHBxlslKrulbtzx7sisHTXA74Igg1qI47HjHU0XFhMYQUob d+VVEwtEWGP2DfMqAvr+Ql/6iVihvbP7fd9RU3CGkH1y2XZoEcd+SCCUG2TqVZloFnwL 6+UDXjdFkLGlv38nK+IY+bJqx4pkxOmemVPhFgRlFTEveEBtMfY+QtinB+EPykSHisc1 nXmzYzOIpi3oAvZet8anfKdZ9PHm/U3godkpX3IT/2V41JZqm+ZYO2dKzXXYzSrOYBN5 QN2w== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="b/EQrbDk"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id t40-20020a05622a182800b00417a65a82besi9366716qtc.414.2023.11.15.01.47.38 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:47:38 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="b/EQrbDk"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 40FC03857C7C for ; Wed, 15 Nov 2023 09:47:36 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id 4DE343858C98 for ; Wed, 15 Nov 2023 09:47:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4DE343858C98 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 4DE343858C98 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.31 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041631; cv=none; b=MMuvIcXRV91ZrQWn/3iUVw4e0KmlpRHGoXHu541DQ/MYjPb31AakS13aPrpI3hxhmyOPQP8p/RZMbIFo2apF4IuOpU7fQjunlVM7TvDJ3rJd4BnTofAOT7SplL8S36IEsrxurQBs5YOVbpWeskA4HrvkZCaJXSicJ6hFrOlLHwQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041631; c=relaxed/simple; bh=7eydAPM26+JtVSt/KxuCrlJ58nR8vRZQHBCN+NUh20U=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Y+VMLv52N/7Re5+SeowGYJcZ5RdSxh42Z5Z6bTeGZ/RSUVgoTccxcerSs599kDcEMGKqHDHQxUzywMd+FMt591qaxL1ZqE+XoRGm3s7KZDeZssLzRQqwURdhLdbsgz0KUP4JMYoPwj6LqfVKSNiWQqjG+jGNTXhyfU7ROjzu5Kg= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041629; x=1731577629; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7eydAPM26+JtVSt/KxuCrlJ58nR8vRZQHBCN+NUh20U=; b=b/EQrbDkVOmmIq8BTmUBP4XYA77eiPsGbyvjcWGGXpKcxFtlJDdd1zbq dyfSvD8iw6MoSgaQf5c1trRg/Hwk+MjAJRaIokTL2EpUY8WK7DEPH0Vjm MYesjGrIgwJvWOOoKjeedSc7ulCFhuIuYcZmoOj2NLAGmIjs4eltqz0uu GdQVpEELIPFkZr5OyN2eXI42IzCa2XlEVtc4CF/ca1+97MEX3ZcKWbqX7 bW1r7NTNO9ILpFk1Gm3na2zovlP+93K9g2P7l/MP7/NbmhApdwDkJXyKu InGTwRg8Qs/F9uw1HStI7Ml7Q9TfTJ+83j+LCIc/vaFEZd5BKXUVn5psE Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="455138345" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="455138345" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="6105926" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa002.jf.intel.com with ESMTP; 15 Nov 2023 01:47:07 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id AAA2A1005676; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 02/16] [APX NDD] Restrict TImode register usage when NDD enabled Date: Wed, 15 Nov 2023 17:46:51 +0800 Message-Id: <20231115094705.3976553-3-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622882144599072 X-GMAIL-MSGID: 1782622882144599072 Under APX NDD, previous TImode allocation will have issue that it was originally allocated using continuous pair, like rax:rdi, rdi:rdx. This will cause issue for all TImode NDD patterns. For NDD we will not assume the arithmetic operations like add have dependency between dest and src1, then write to 1st highpart rdi will be overrided by the 2nd lowpart rdi if 2nd lowpart rdi have different src as input, then the write to 1st highpart rdi will missed and cause miscompliation. To resolve this, under TARGET_APX_NDD we'd only allow register with even regno to be allocated with TImode, then TImode registers will be allocated with non-overlapping pairs. There could be some error for inline assembly if it forcely allocate __int128 with odd number general register. gcc/ChangeLog: * config/i386/i386.cc (ix86_hard_regno_mode_ok): Restrict even regno for TImode if APX NDD enabled. --- gcc/config/i386/i386.cc | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 683ac643bc8..3779d5b1206 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -20824,6 +20824,16 @@ ix86_hard_regno_mode_ok (unsigned int regno, machine_mode mode) return true; return !can_create_pseudo_p (); } + /* With TImode we previously have assumption that src1/dest will use same + register, so the allocation of highpart/lowpart can be consecutive, and + 2 TImode insn would held their low/highpart in continuous sequence like + rax:rdx, rdx:rcx. This will not work for APX_NDD since NDD allows + different registers as dest/src1, when writes to 2nd lowpart will impact + the writes to 1st highpart, then the insn will be optimized out. So for + TImode pattern if we support NDD form, the allowed register number should + be even to avoid such mixed high/low part override. */ + else if (TARGET_APX_NDD && mode == TImode) + return regno % 2 == 0; /* We handle both integer and floats in the general purpose registers. */ else if (VALID_INT_MODE_P (mode) || VALID_FP_MODE_P (mode)) From patchwork Wed Nov 15 09:46:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165235 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429179vqg; Wed, 15 Nov 2023 01:49:04 -0800 (PST) X-Google-Smtp-Source: AGHT+IGH4WyJjqlWTyxF3ndWErnP8I/hJVlpe3PZg6TmvUA6mCrmYf9QJKLvfjvJDdyd6OOFmFPl X-Received: by 2002:ac8:5e13:0:b0:418:1c96:8ae9 with SMTP id h19-20020ac85e13000000b004181c968ae9mr5368407qtx.11.1700041744414; Wed, 15 Nov 2023 01:49:04 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041744; cv=pass; d=google.com; s=arc-20160816; b=Go1aQ1zIxmNYRDZv12wuW73SMDWqlLphpJdwLBrzvrAHj8AXmyqTdsOY4wZK8C2JKA mBcK8yrr2b0GEZgkgXaQc2yQhakQ4KyUawttx40q13eANuqJnBKrnDv1ArUIORjllkHu 6VuSUITYBtdoIv7kqjrd2L0wgzAKZieojNkEVoDRjC23gy5J+wL/LiH/OKsTJ9QmNQWS eqf2ll7zrp36z/pr1sxweIMAcxXKlbMDF7izR8bcOCE1eoHBJ31rzHjjZszoiwJhxChE ae1Kv9SP+jeHmAFVEQ+KNZJf6o5pDgVIUzRT1XSleNSFAjsCZJF9y7W4OUqTVtyyebwH JARg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=tXvRnqNYldTtY7XEeB19d5ITrYVrzDeRGFjmumb8QwE=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=Pjnqnjv+5o4TIZYL5x+XNfT6xC4c/2bDv4G+fC1ruEwBc5t3ATQS1NshCVmIpCBfrz JM4sE8QqI5z/xOfrQX5tsW7aEEpEDYPUSmA2DADmjLTdXnStfi3LtmyycR+tPWSUdx/z m1njyfo9PUa7cl/EliY7i3zMuJ9oWsdAenvziNXLMQAweAqzYV59wsqnNr7rO2kodvB8 x0wKPJtUOJScwEGJIjiD1h85+JZWXBkQzB7xIw30i2h1fngVfNpUhmarZ4hXvcdmc4ER lgaNF0zWdkvX28b31kIZkYTn0OqGO2ClpwOD2SlrH3pvpePQtLpKQpwZ9Def2di5X0WI FvXw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GHKgDZwQ; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id b14-20020ac87fce000000b004198ca326a2si9130701qtk.507.2023.11.15.01.49.04 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:49:04 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GHKgDZwQ; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 373783861842 for ; Wed, 15 Nov 2023 09:48:44 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id 023F838582A0 for ; Wed, 15 Nov 2023 09:47:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 023F838582A0 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 023F838582A0 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.31 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041638; cv=none; b=OHfTgCG9UJKAAc+6OpBpq8FIKjVTkCCmUsEo+riI52tPkmrTo4OcfKbsyw+T1VBV/cKyaoW0ZG84E8Yl6b20o9Zv0IAZH3dcUK92N/ig0i0I7euhAybAlqDHBepMerCQECJW7okKlM272kuu0Qpha5wbLFiOqOt+m0/eJJdphac= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041638; c=relaxed/simple; bh=LQ+lW8LpgjRpWMMreZ6hnN6e5f8ekCweRUhbY9GCR7U=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=MVKKc9zZt9an4A/0Igkw2j1qSeQhs0BLZ4KYFVXKCOJkh1yhCJp/Z7uzrsedm6wITpo1jEUM1Ym1ve/FT7gfIbT3uGrBJXP9961rK2XQf0wmYQdkrCF7TE9lxQSPYCHicYoVkWf52s9k9H+CkykrOsAdn0Fr8Ah4RkVIFVhHEFo= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041635; x=1731577635; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LQ+lW8LpgjRpWMMreZ6hnN6e5f8ekCweRUhbY9GCR7U=; b=GHKgDZwQyaL5MciD6QKLWAMTIYIOnOrCEjGIzRhsOM0JlGjyKXABAM7A /cSGkb0kQeahFLYptXnYz+x/fEuVCHzAriwhFV8F3x23OZlF5q48QVgsE 6U9jBioosFOA8mCC3h3JJekRFoYROdoeog+SKMa8K0xJYHAQ6F/mQBV/h 7iXo9nP+DbEOik3Yyd6zvr1U7+UPkLcyHCYjnFuqdMfhSHsJfzzP9OhPZ bCUuXe6ww5v7wg2bShDX0yYkRdMkwkPU/pfznhRS+67C2im8S9T9B3fxR edqHLXUZENHY/PB/0VO6LEn426ytaB7yT3xQv+DJX0ghSfYA3a34KqcK4 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="455138359" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="455138359" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="6105948" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa002.jf.intel.com with ESMTP; 15 Nov 2023 01:47:07 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id AC6A91005677; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 03/16] [APX NDD] Support APX NDD for optimization patterns of add Date: Wed, 15 Nov 2023 17:46:52 +0800 Message-Id: <20231115094705.3976553-4-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622972624077462 X-GMAIL-MSGID: 1782622972624077462 From: Kong Lingling gcc/ChangeLog: * config/i386/i386.md: (addsi_1_zext): Add new alternatives for NDD and adjust output templates. (*add_2): Likewise. (*addsi_2_zext): Likewise. (*add_3): Likewise. (*addsi_3_zext): Likewise. (*adddi_4): Likewise. (*add_4): Likewise. (*add_5): Likewise. (*addv4): Likewise. (*addv4_1): Likewise. (*add3_cconly_overflow_1): Likewise. (*add3_cc_overflow_1): Likewise. (*addsi3_zext_cc_overflow_1): Likewise. (*add3_cconly_overflow_2): Likewise. (*add3_cc_overflow_2): Likewise. (*addsi3_zext_cc_overflow_2): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add more test. --- gcc/config/i386/i386.md | 314 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 53 ++-- 2 files changed, 236 insertions(+), 131 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index daab634fea0..7ddb2cb2a71 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6420,12 +6420,13 @@ (define_insn "*add_1" ;; patterns constructed from addsi_1 to match. (define_insn "addsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r") (zero_extend:DI - (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r") - (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,le")))) + (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,le,rBMe")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + ix86_can_use_ndd_p (PLUS))" { switch (get_attr_type (insn)) { @@ -6434,11 +6435,13 @@ (define_insn "addsi_1_zext" case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{l}\t%k0"; + return which_alternative == 3 ? "inc{l}\t{%1, %k0|%k0, %1}" + : "inc{l}\t%k0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{l}\t%k0"; + return which_alternative == 3 ? "dec{l}\t{%1, %k0|%k0, %1}" + : "dec{l}\t%k0"; } default: @@ -6448,12 +6451,15 @@ (define_insn "addsi_1_zext" std::swap (operands[1], operands[2]); if (x86_maybe_negate_const_int (&operands[2], SImode)) - return "sub{l}\t{%2, %k0|%k0, %2}"; + return which_alternative == 3 ? "sub{l}\t{%2 ,%1, %k0|%k0, %1, %2}" + : "sub{l}\t{%2, %k0|%k0, %2}"; - return "add{l}\t{%2, %k0|%k0, %2}"; + return which_alternative == 3 ? "add{l}\t{%2 ,%1, %k0|%k0, %1, %2}" + : "add{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "2") (const_string "lea") (match_operand:SI 2 "incdec_operand") @@ -6697,37 +6703,43 @@ (define_insn "*add_2" [(set (reg FLAGS_REG) (compare (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0,") - (match_operand:SWI 2 "" ",,0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,,rm,r") + (match_operand:SWI 2 "" ",,0,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (PLUS, mode, operands)" + && ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" { + bool use_ndd = (which_alternative == 3 || which_alternative == 4); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (which_alternative == 2) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6742,23 +6754,27 @@ (define_insn "*add_2" (define_insn "*addsi_2_zext" [(set (reg FLAGS_REG) (compare - (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r") - (match_operand:SI 2 "x86_64_general_operand" "rBMe,0")) + (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,rBMe")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r,r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (PLUS, SImode, operands)" + && ix86_binary_operator_ok (PLUS, SImode, operands, + ix86_can_use_ndd_p (PLUS))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{l}\t%k0"; + return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}" + : "inc{l}\t%k0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{l}\t%k0"; + return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}" + : "dec{l}\t%k0"; } default: @@ -6766,12 +6782,15 @@ (define_insn "*addsi_2_zext" std::swap (operands[1], operands[2]); if (x86_maybe_negate_const_int (&operands[2], SImode)) - return "sub{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sub{l}\t{%2, %k0|%k0, %2}"; - return "add{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "add{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "add{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6785,35 +6804,40 @@ (define_insn "*addsi_2_zext" (define_insn "*add_3" [(set (reg FLAGS_REG) (compare - (neg:SWI (match_operand:SWI 2 "" ",0")) - (match_operand:SWI 1 "nonimmediate_operand" "%0,"))) - (clobber (match_scratch:SWI 0 "=,"))] + (neg:SWI (match_operand:SWI 2 "" ",0,")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,,r"))) + (clobber (match_scratch:SWI 0 "=,,r"))] "ix86_match_ccmode (insn, CCZmode) && !(MEM_P (operands[1]) && MEM_P (operands[2]))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (which_alternative == 1) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6828,22 +6852,24 @@ (define_insn "*add_3" (define_insn "*addsi_3_zext" [(set (reg FLAGS_REG) (compare - (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rBMe,0")) - (match_operand:SI 1 "nonimmediate_operand" "%0,r"))) - (set (match_operand:DI 0 "register_operand" "=r,r") + (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,r"))) + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCZmode) - && ix86_binary_operator_ok (PLUS, SImode, operands)" + && ix86_binary_operator_ok (PLUS, SImode, operands, + ix86_can_use_ndd_p (PLUS))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{l}\t%k0"; + return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}" : "inc{l}\t%k0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{l}\t%k0"; + return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}" : "dec{l}\t%k0"; } default: @@ -6851,12 +6877,15 @@ (define_insn "*addsi_3_zext" std::swap (operands[1], operands[2]); if (x86_maybe_negate_const_int (&operands[2], SImode)) - return "sub{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sub{l}\t{%2, %k0|%k0, %2}"; - return "add{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "add{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "add{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6877,31 +6906,35 @@ (define_insn "*addsi_3_zext" (define_insn "*adddi_4" [(set (reg FLAGS_REG) (compare - (match_operand:DI 1 "nonimmediate_operand" "0") - (match_operand:DI 2 "x86_64_immediate_operand" "e"))) - (clobber (match_scratch:DI 0 "=r"))] + (match_operand:DI 1 "nonimmediate_operand" "0,r") + (match_operand:DI 2 "x86_64_immediate_operand" "e,e"))) + (clobber (match_scratch:DI 0 "=r,r"))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGCmode)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == constm1_rtx) - return "inc{q}\t%0"; + return use_ndd ? "inc{q}\t{%1, %0|%0, %1}" : "inc{q}\t%0"; else { gcc_assert (operands[2] == const1_rtx); - return "dec{q}\t%0"; + return use_ndd ? "dec{q}\t{%1, %0|%0, %1}" : "dec{q}\t%0"; } default: if (x86_maybe_negate_const_int (&operands[2], DImode)) - return "add{q}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{q}\t{%2, %1, %0|%0, %1, %2}" + : "add{q}\t{%2, %0|%0, %2}"; - return "sub{q}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{q}\t{%2, %1, %0|%0, %1, %2}" + : "sub{q}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") (if_then_else (match_operand:DI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6922,30 +6955,36 @@ (define_insn "*adddi_4" (define_insn "*add_4" [(set (reg FLAGS_REG) (compare - (match_operand:SWI124 1 "nonimmediate_operand" "0") + (match_operand:SWI124 1 "nonimmediate_operand" "0,r") (match_operand:SWI124 2 "const_int_operand"))) - (clobber (match_scratch:SWI124 0 "="))] + (clobber (match_scratch:SWI124 0 "=,r"))] "ix86_match_ccmode (insn, CCGCmode)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == constm1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == const1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (x86_maybe_negate_const_int (&operands[2], mode)) - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") (if_then_else (match_operand: 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6960,36 +6999,41 @@ (define_insn "*add_5" [(set (reg FLAGS_REG) (compare (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,") - (match_operand:SWI 2 "" ",0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,,r") + (match_operand:SWI 2 "" ",0,")) (const_int 0))) - (clobber (match_scratch:SWI 0 "=,"))] + (clobber (match_scratch:SWI 0 "=,,r"))] "ix86_match_ccmode (insn, CCGOCmode) && !(MEM_P (operands[1]) && MEM_P (operands[2]))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (which_alternative == 1) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -7169,35 +7213,46 @@ (define_insn "*addv4" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) (sign_extend: - (match_operand:SWI 2 "" "We,m"))) + (match_operand:SWI 2 "" "We,m,rWe,m"))) (sign_extend: (plus:SWI (match_dup 1) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" + { + if (which_alternative == 2 || which_alternative == 3) + return "add{}\t{%2, %1, %0|%0, %1, %2}"; + else + return "add{}\t{%2, %0|%0, %2}"; + } + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "addv4_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (match_operand: 3 "const_int_operand")) (sign_extend: (plus:SWI (match_dup 1) - (match_operand:SWI 2 "x86_64_immediate_operand" ""))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (match_operand:SWI 2 "x86_64_immediate_operand" ","))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS)) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[3])" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (cond [(match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8944,27 +8999,37 @@ (define_insn "*add3_cconly_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0") - (match_operand:SWI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,r") + (match_operand:SWI 2 "" ",")) (match_dup 1))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "!(MEM_P (operands[1]) && MEM_P (operands[2]))" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "@add3_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (match_dup 1))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" +{ + if (which_alternative == 2 || which_alternative == 3) + return "add{}\t{%2, %1, %0|%0, %1, %2}"; + else + return "add{}\t{%2, %0|%0, %2}"; +} + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_peephole2 @@ -9009,55 +9074,72 @@ (define_insn "*addsi3_zext_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe")) (match_dup 1))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "add{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + ix86_can_use_ndd_p (PLUS))" + "@ + add{l}\t{%2, %k0|%k0, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn "*add3_cconly_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0") - (match_operand:SWI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,r") + (match_operand:SWI 2 "" ",")) (match_dup 2))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "!(MEM_P (operands[1]) && MEM_P (operands[2]))" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*add3_cc_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (match_dup 2))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*addsi3_zext_cc_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe")) (match_dup 2))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "add{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + ix86_can_use_ndd_p (PLUS))" + "@ + add{l}\t{%2, %k0|%k0, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn_and_split "*add3_doubleword_cc_overflow_1" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index dd3dc78e52f..6c136174d24 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -2,20 +2,43 @@ /* { dg-options "-mapxf -O2" } */ /* { dg-final { scan-assembler-not "movl"} } */ -int foo (int *a) -{ - int b = *a - 1; - return b; -} +#define FOO(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo_##OP_NAME##_##TYPE (TYPE *a) \ +{ \ + TYPE b = *a OP 1; \ + return b; \ +} -int foo2 (int a, int b) -{ - int c = a + b; - return c; -} +#define FOO1(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo1_##OP_NAME##_##TYPE (TYPE a, TYPE b) \ +{ \ + TYPE c = a OP b; \ + return c; \ +} + +#define FOO2(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ +{ \ + TYPE c = *a OP b; \ + return c; \ +} + +FOO (char, add, +) +FOO1 (char, add, +) +FOO2 (char, add, +) +FOO (short, add, +) +FOO1 (short, add, +) +FOO2 (short, add, +) +FOO (int, add, +) +FOO1 (int, add, +) +FOO2 (int, add, +) +FOO (long, add, +) +FOO1 (long, add, +) +FOO2 (long, add, +) -int foo3 (int *a, int b) -{ - int c = *a + b; - return c; -} From patchwork Wed Nov 15 09:46:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165230 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2428684vqg; Wed, 15 Nov 2023 01:47:45 -0800 (PST) X-Google-Smtp-Source: AGHT+IHwzFVOSNyupgmai1vlqte/4hmvWBwiqWRIzU1+3i1DBNZjH7s+LBpVQFzXQRBWKVz77uUS X-Received: by 2002:ac8:5ac7:0:b0:417:953c:ff57 with SMTP id d7-20020ac85ac7000000b00417953cff57mr7287154qtd.14.1700041665177; Wed, 15 Nov 2023 01:47:45 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041665; cv=pass; d=google.com; s=arc-20160816; b=dxQrjMYeNdH7FHO2EQ64XvTPxXTI4ZtSRzg/IqfGvY5A5euVdGrB4PtP5YCwO+wsY0 6vWjLqCxzIhWItxU6Vk3rIQXtlEzxiOPwxlO3yxQbb+/CFIf/knvcWVQ/DIG313yo5C6 ZA4Slpwzpt34pHU3aQe75QQejvQ6DHn1OJd6bdb7TJnlaQJMxubTlk2xnDUELMazrWG7 NXfFlfJDozuqvhazgAXm3HedBX9sLSBR6slbKcP8Y0U9Ko/UFPogjEU8KJ3MuuNpjTJB eEJsIk2K1Ex/OAp3bW0jJGbYb/iFXEio5JmLux3AI2aspa6zZgTGEjMi25C5PEBnnmBC zB2Q== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=Nf4WtB9BegpD4k3qDGFXTslM4NHHDXokWxCS5+uMImU=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=IZphFhZZWqFIBS6mL8kWGwTlK0BDLcYg48vqX2pe1Yj/OlQlO0gXjiklQXu4/5LWC6 YtLEyvJaeAnIA1wGW1AWK23Y3VoqmKPOnKBg9QmnnxriXvHUgck7fFp+c/e3USN2PQOg jM60DFzH8nqEJyxYsXWTZ/qQsVDObWhUvfIOSMTLn9aTDmtNd/HnVse9UO5ckvheJDCc f6WTDvHRqspykWVpUtsaWaJ5mn40qryTCts/E7kqqpvjHF9g+QseIeW0b3LTOGD0ao/8 QeAkevX6ek+878V4L3zx14OVjU1I5qibKXGS3RY7N1MRTBYVvg2DLR1kS0vsFoq9EQ6u 55Vg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=oA1QOrdY; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id y12-20020ac85f4c000000b0041cc00fc0a4si8574438qta.218.2023.11.15.01.47.45 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:47:45 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=oA1QOrdY; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A6C2C3857BA6 for ; Wed, 15 Nov 2023 09:47:41 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id 225643858C42 for ; Wed, 15 Nov 2023 09:47:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 225643858C42 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 225643858C42 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.31 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041632; cv=none; b=PQmqzUtMqMDcwSpTLo330O9Hs0hMlAhFtLI6InZjpup05s3ttRhvlWoglfrJvkUaBO5NrIY2450VCAQ5ZJAYu96cdyOFT+uR/4CWOrX0zjEcKd02aZnWzde6kTQKDlAuEIFdP4MkefMzFIHv7ilkK3siNrCoZKORDxVEIug7lRA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041632; c=relaxed/simple; bh=jAtvMGq4xIVd43JPg8FaU0uVuKTj/ix3xB1b2+mIV0o=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Ln8CQkWOpapasCEhtz+5khHkhyGsa5Db6FFdG5S2PcJWQ0FI1h94+vqXl+Gvz+k22O5/VOZHeMclcN47DWd1us5wCnWnlAhtirmGQ2a79arMYqB7BiLXPe4Epvea5jv98bti0KAZz8b1nx7Sbef25PWVyT1gdSEJm1DxLb1hmqc= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041630; x=1731577630; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jAtvMGq4xIVd43JPg8FaU0uVuKTj/ix3xB1b2+mIV0o=; b=oA1QOrdYnJDFA7PXfMA0cilZpJ8ADxiEDUmNDNOub7uBDvjV15njFkey dPVZtvJG+Tm5qxaz3pYvSq7wXPD+uL09QBWw9NQw3mbdYtIiTi1w3+JZj hGvQ66myixmORCc+fn77zjyA4rbRAABtrVq35WUAGdrS24vPgFl2QxUyR jMcKeLfXZMQT4N6TnAm8ZlLE1fy3kGRQE2MgtlgCWzq267Seyw9iuEcuo SR8cHX5KxU8bL6FhYqC+aJqIx/0B0cWQDeQJ8mJeTsONrvVBOBbqSjWvV Qjx4des+RGFyKmjSaCkDva1OL6I4l+Rn/5x2+p1JaKTnOOQ04XUY8m0+v g==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="455138347" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="455138347" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="6105929" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa002.jf.intel.com with ESMTP; 15 Nov 2023 01:47:07 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id AFAE11005678; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 04/16] [APX NDD] Disable seg_prefixed memory usage for NDD add Date: Wed, 15 Nov 2023 17:46:53 +0800 Message-Id: <20231115094705.3976553-5-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622888961349925 X-GMAIL-MSGID: 1782622888961349925 NDD uses evex prefix, so when segment prefix is also applied, the instruction could excceed its 15byte limit, especially adding immediates. This could happen when "e" constraint accepts any UNSPEC_TPOFF/UNSPEC_NTPOFF constant and it will add the offset to segment register, which will be encoded using segment prefix. Disable those *POFF constant usage in NDD add alternatives with new constraint. gcc/ChangeLog: * config/i386/constraints.md (je): New constraint. * config/i386/i386-protos.h (x86_no_poff_operand_p): New function to check any *POFF constant. * config/i386/i386.cc (x86_no_poff_operand_p): New prototype. * config/i386/i386.md (*add_1): Split out je alternative for add. --- gcc/config/i386/constraints.md | 5 +++++ gcc/config/i386/i386-protos.h | 1 + gcc/config/i386/i386.cc | 25 +++++++++++++++++++++++++ gcc/config/i386/i386.md | 10 +++++----- 4 files changed, 36 insertions(+), 5 deletions(-) diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index cbee31fa40a..c6b51324294 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -433,3 +433,8 @@ (define_address_constraint "jb" (define_register_constraint "jc" "TARGET_APX_EGPR && !TARGET_AVX ? GENERAL_GPR16 : GENERAL_REGS") + +(define_constraint "je" + "@internal constant that do not allow any unspec global offsets" + (and (match_operand 0 "x86_64_immediate_operand") + (match_test "x86_no_poff_operand_p (op)"))) diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 3e08eae4e79..5d902e2925b 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -66,6 +66,7 @@ extern bool x86_extended_QIreg_mentioned_p (rtx_insn *); extern bool x86_extended_reg_mentioned_p (rtx); extern bool x86_extended_rex2reg_mentioned_p (rtx); extern bool x86_evex_reg_mentioned_p (rtx [], int); +extern bool x86_no_poff_operand_p (rtx); extern bool x86_maybe_negate_const_int (rtx *, machine_mode); extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 3779d5b1206..47159b06f7d 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -23292,6 +23292,31 @@ x86_evex_reg_mentioned_p (rtx operands[], int nops) return false; } +/* Return true when rtx operand does not contain any UNSPEC_*POFF related + constant to avoid APX_NDD instructions excceed encoding length limit. */ +bool +x86_no_poff_operand_p (rtx operand) +{ + if (GET_CODE (operand) == CONST) + { + rtx op = XEXP (operand, 0); + if (GET_CODE (op) == PLUS) + op = XEXP (op, 0); + + if (GET_CODE (op) == UNSPEC) + { + int unspec = XINT (op, 1); + return (unspec != UNSPEC_NTPOFF + && unspec != UNSPEC_TPOFF + && unspec != UNSPEC_DTPOFF + && unspec != UNSPEC_GOTTPOFF + && unspec != UNSPEC_GOTNTPOFF + && unspec != UNSPEC_INDNTPOFF); + } + } + return true; +} + /* If profitable, negate (without causing overflow) integer constant of mode MODE at location LOC. Return true in this case. */ bool diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 7ddb2cb2a71..ecd06625a7d 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6359,15 +6359,15 @@ (define_insn_and_split "*add3_doubleword_concat_zext" "split_double_mode (mode, &operands[0], 1, &operands[0], &operands[5]);") (define_insn "*add_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r,r,r") (plus:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r") - (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,re,BM"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r,m,r") + (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,r,e,je,BM"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (PLUS, mode, operands, ix86_can_use_ndd_p (PLUS))" { - bool use_ndd = (which_alternative == 4 || which_alternative == 5); + bool use_ndd = (which_alternative >= 4); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -6398,7 +6398,7 @@ (define_insn "*add_1" : "add{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd") + [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd,apx_ndd,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") From patchwork Wed Nov 15 09:46:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165238 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429374vqg; Wed, 15 Nov 2023 01:49:41 -0800 (PST) X-Google-Smtp-Source: AGHT+IHlJrjmxRgaBhyD7RMkgfP/HvR8BbZD/jx5izpftzym8XHS+yrRFr8grp1Zgg9fFfqF97cE X-Received: by 2002:a05:620a:2910:b0:773:dce0:b2b9 with SMTP id m16-20020a05620a291000b00773dce0b2b9mr5488290qkp.2.1700041780837; Wed, 15 Nov 2023 01:49:40 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041780; cv=pass; d=google.com; s=arc-20160816; b=x8lzKYQXM9An95CIQIU3PQEheVyJctiMORLqQtkBXSob+XqCei0SJAaGwNyJxSi6lY 4i50oRiw7rWInGsQWE/LQIggOtBaevfmuCDIDSnU27ArwQBn0abGY52ncd+26cjpw+Yh 0YuborPSx46XcoiBXotuR7MKE80PsGkvlvy3hKgQ0u+QWSUh3KmTPgni/gJf2jqVmhr9 s8YGDDfTMFaNv9tzLNRpnhLe3M5qU7QCwnGlvK0M6kisTbJEuTGn3Y5c45LKLYb+3lOz uNGY9MHll+y+H5+9zUkY92T7xQoZ9HNOSxBkGLnAzamSWvWPH+RwHd4qLfr0iHjirnPZ Snew== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=xERAoVH0n/4ESmVo4+Lahx/djnmfOIagL8cQ2UQxa2A=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=KIP6pxRbaVoTakj4z6zB1gN+89G2Qvij9uJXwBx74uiag6Nxi0HzPnLOenYIBSX9fP QJymdVy5WiS2+pY01q5fqwodrfX+ZXArNwyMI8bFf+d5vwEPO/4JQCcBYCfq0wzOC8ll p7uMlOOJogCKEN2prYdhVXZO7jpvlSz2EE9izZOsgqlxGz0kuRn/XZEA+BvB6iud2Pq5 Fzb4hpSNVDJdYTLKivXP0yrZvZqbHcaPQ+MZQESaiO2w+y7tAYxAZlmrGjYH0UY5VQDw Ud67HdcW3GEvkl0QABqf/3/sJH20ra/hCKZf8wHrLPdOiz/TeBQzbVjACRpg/Aln0FwN VoBg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="UkzV9n/I"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id bn45-20020a05620a2aed00b00770753a1153si8491558qkb.25.2023.11.15.01.49.40 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:49:40 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="UkzV9n/I"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E7BE9385C318 for ; Wed, 15 Nov 2023 09:49:22 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id 0263838582A1 for ; Wed, 15 Nov 2023 09:47:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0263838582A1 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 0263838582A1 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.31 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041638; cv=none; b=ZrOVXHcCT7mTzjCKJXfXSYqKNy7to7e53VWk5eVYEKuNMXESPic0T+jUh5p3YdbyU4+Ga3hXuTiLIwNQppLRBphoCeMg0dWeN3KqfGYgssCYiSqp8I4qM0N9+b3wYaDotW2xTA926K51A6vAaGQ7e9JurdQRhu4qNqdWfNO6u8I= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041638; c=relaxed/simple; bh=hzVAL/sje/hm1tCU8mqqjMFCXkdGvOlysKCdlzoWkdc=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=OWw5FgNjH8ULlS9ldoHom8W454AuFZ8ni6IQCxMjHRBmUNilYPbdHoT92eEsz6yx3Hqn2Oi1ueUzMk1KxUlvq3ipngGZCuh/Ms/CdXszLRjrNgB0hsnC9YDoPop1cE7jcuP6IimQd12IK8fQxCvHkKEC5jHwkYJ0wyy3EvWE4L4= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041635; x=1731577635; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hzVAL/sje/hm1tCU8mqqjMFCXkdGvOlysKCdlzoWkdc=; b=UkzV9n/IjdT4dQu3Za691AjXPdXl8KDOFZDMAsAiZ4lW8TKX8ZsY1yUU 4lDpyybX2EaG21QFIsVxDRqFf63AUCa0291cIbPuXQmIjUGwQ5z5U3W/q HeqhhHkqxiZ+5UTHjXBPBFuXAoe7gtwoHQSrlzeDMTtOwhQGBfrISwkLf DIjNbYqujRRmnUTkFO2UXPh+jQ2dqkIqv2okE1kjxL1nMrdt5ZGL5MfxD QoX/a6tEXH0sJVcnhzaY6Dt1CAB7/LiAwUuSfbFbsyArSayQoovLFihvf rP5hsKJA+bWe9bSilPmZ/8lF64ZikuIfme9pXKS43q2vG2hEq8f2c31Me g==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="455138361" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="455138361" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="6105954" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa002.jf.intel.com with ESMTP; 15 Nov 2023 01:47:10 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B187C1005679; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 05/16] [APX NDD] Support APX NDD for adc insns Date: Wed, 15 Nov 2023 17:46:54 +0800 Message-Id: <20231115094705.3976553-6-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623010405842268 X-GMAIL-MSGID: 1782623010405842268 From: Kong Lingling Legacy adc patterns are commonly adopted to TImode add, when extending TImode add to NDD version, operands[0] and operands[1] can be different, so extra move should be emitted if those patterns have optimization when adding const0_rtx. gcc/ChangeLog: * config/i386/i386.md (*add3_doubleword): Add ndd constraints, and move operands[1] to operands[0] when they are not equal. (*add3_doubleword_cc_overflow_1): Likewise. (*add3_doubleword_zext): Add ndd constraints. (*addv4_doubleword): Likewise. (*addv4_doubleword_1): Likewise. (addv4_overflow_1): Likewise. (*addv4_overflow_2): Likewise. (@add3_carry): Likewise. (*add3_carry_0): Likewise. (*addsi3_carry_zext): Likewise. (*addsi3_carry_zext_0): Likewise. (addcarry): Likewise. (addcarry_0): Likewise. (*addcarry_1): Likewise. (*add3_eq): Likewise. (*add3_ne): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-adc.c: New test. --- gcc/config/i386/i386.md | 203 +++++++++++++------- gcc/testsuite/gcc.target/i386/apx-ndd-adc.c | 15 ++ 2 files changed, 146 insertions(+), 72 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-adc.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index ecd06625a7d..f23859d1172 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6235,12 +6235,13 @@ (define_expand "add3" ix86_can_use_ndd_p (PLUS)); DONE;") (define_insn_and_split "*add3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (plus: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,r"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -6260,24 +6261,35 @@ (define_insn_and_split "*add3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + /* Under NDD op0 and op1 may not equal, do not delete insn then. */ + + bool emit_insn_deleted_note_p = true; + if (!rtx_equal_p (operands[0], operands[1])) + { + emit_move_insn (operands[0], operands[1]); + emit_insn_deleted_note_p = false; + } if (operands[5] != const0_rtx) - ix86_expand_binary_operator (PLUS, mode, &operands[3]); + ix86_expand_binary_operator (PLUS, mode, &operands[3], + ix86_can_use_ndd_p (PLUS)); else if (!rtx_equal_p (operands[3], operands[4])) emit_move_insn (operands[3], operands[4]); - else + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); DONE; } -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*add3_doubleword_zext" - [(set (match_operand: 0 "nonimmediate_operand" "=r,o") + [(set (match_operand: 0 "nonimmediate_operand" "=r,o,r,r") (plus: (zero_extend: - (match_operand:DWIH 2 "nonimmediate_operand" "rm,r")) - (match_operand: 1 "nonimmediate_operand" "0,0"))) + (match_operand:DWIH 2 "nonimmediate_operand" "rm,r,rm,r")) + (match_operand: 1 "nonimmediate_operand" "0,0,r,m"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (UNKNOWN, mode, operands)" + "ix86_binary_operator_ok (UNKNOWN, mode, operands, + ix86_can_use_ndd_p (PLUS))" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -6293,7 +6305,8 @@ (define_insn_and_split "*add3_doubleword_zext" (match_dup 4)) (const_int 0))) (clobber (reg:CC FLAGS_REG))])] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*add3_doubleword_concat" [(set (match_operand: 0 "register_operand" "=&r") @@ -7269,14 +7282,15 @@ (define_insn_and_split "*addv4_doubleword" (eq:CCO (plus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "%0,0")) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r")) (sign_extend: - (match_operand: 2 "nonimmediate_operand" "r,o"))) + (match_operand: 2 "nonimmediate_operand" "r,o,r,o"))) (sign_extend: (plus: (match_dup 1) (match_dup 2))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -7306,22 +7320,24 @@ (define_insn_and_split "*addv4_doubleword" (match_dup 5)))])] { split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*addv4_doubleword_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "%0")) - (match_operand: 3 "const_scalar_int_operand" "n")) + (match_operand: 1 "nonimmediate_operand" "%0,rm")) + (match_operand: 3 "const_scalar_int_operand" "n,n")) (sign_extend: (plus: (match_dup 1) - (match_operand: 2 "x86_64_hilo_general_operand" ""))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro") + (match_operand: 2 "x86_64_hilo_general_operand" ","))))) + (set (match_operand: 0 "nonimmediate_operand" "=ro,r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS)) && CONST_SCALAR_INT_P (operands[2]) && rtx_equal_p (operands[2], operands[3])" "#" @@ -7359,7 +7375,8 @@ (define_insn_and_split "*addv4_doubleword_1" operands[5])); DONE; } -}) +} +[(set_attr "isa" "*,apx_ndd")]) (define_insn "*addv4_overflow_1" [(set (reg:CCO FLAGS_REG) @@ -7369,9 +7386,9 @@ (define_insn "*addv4_overflow_1" (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0,0"))) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r"))) (sign_extend: - (match_operand:SWI 2 "" "rWe,m"))) + (match_operand:SWI 2 "" "rWe,m,rWe,m"))) (sign_extend: (plus:SWI (plus:SWI @@ -7379,15 +7396,22 @@ (define_insn "*addv4_overflow_1" [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r,r,r") (plus:SWI (plus:SWI (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" +{ + if (which_alternative == 2 || which_alternative == 3) + return "adc{}\t{%2, %1, %0|%0, %1, %2}"; + else + return "adc{}\t{%2, %0|%0, %2}"; +} + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*addv4_overflow_2" @@ -7398,26 +7422,30 @@ (define_insn "*addv4_overflow_2" (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0"))) - (match_operand: 6 "const_int_operand" "n")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,rm"))) + (match_operand: 6 "const_int_operand" "n,n")) (sign_extend: (plus:SWI (plus:SWI (match_operator:SWI 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)]) (match_dup 1)) - (match_operand:SWI 2 "x86_64_immediate_operand" "e"))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm") + (match_operand:SWI 2 "x86_64_immediate_operand" "e,e"))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") (plus:SWI (plus:SWI (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS)) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[6])" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (if_then_else (match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8143,17 +8171,24 @@ (define_insn "*subsi_3_zext" ;; Add with carry and subtract with borrow (define_insn "@add3_carry" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (plus:SWI (match_operator:SWI 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) - (match_operand:SWI 2 "" ","))) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" +{ + if (which_alternative == 2 || which_alternative == 3) + return "adc{}\t{%2, %1, %0|%0, %1, %2}"; + else + return "adc{}\t{%2, %0|%0, %2}"; +} + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8240,31 +8275,38 @@ (define_insn "*add3_carry_0r" (set_attr "mode" "")]) (define_insn "*addsi3_carry_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (plus:SI (match_operator:SI 3 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "%0")) - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (match_operand:SI 1 "register_operand" "%0i,r")) + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "adc{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + ix86_can_use_ndd_p (PLUS))" + "@ + adc{l}\t{%2, %k0|%k0, %2} + adc{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) (define_insn "*addsi3_carry_zext_0" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (match_operator:SI 2 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "0")))) + (match_operand:SI 1 "register_operand" "0,r")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "adc{l}\t{$0, %k0|%k0, 0}" - [(set_attr "type" "alu") + "@ + adc{l}\t{$0, %k0|%k0, 0} + adc{l}\t{$0, %1, %k0|%k0, %1, 0}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) @@ -8293,20 +8335,26 @@ (define_insn "addcarry" (plus:SWI48 (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0")) - (match_operand:SWI48 2 "nonimmediate_operand" "r,rm"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,rm,r")) + (match_operand:SWI48 2 "nonimmediate_operand" "r,rm,r,m"))) (plus: (zero_extend: (match_dup 2)) (match_operator: 4 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") (plus:SWI48 (plus:SWI48 (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8464,7 +8512,8 @@ (define_expand "addcarry_0" (match_dup 1))) (set (match_operand:SWI48 0 "nonimmediate_operand") (plus:SWI48 (match_dup 1) (match_dup 2)))])] - "ix86_binary_operator_ok (PLUS, mode, operands)") + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))") (define_insn "*addcarry_1" [(set (reg:CCC FLAGS_REG) @@ -8474,18 +8523,19 @@ (define_insn "*addcarry_1" (plus:SWI48 (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI48 1 "nonimmediate_operand" "%0")) - (match_operand:SWI48 2 "x86_64_immediate_operand" "e"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,rm")) + (match_operand:SWI48 2 "x86_64_immediate_operand" "e,e"))) (plus: (match_operand: 6 "const_scalar_int_operand") (match_operator: 4 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm") + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") (plus:SWI48 (plus:SWI48 (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS)) && CONST_INT_P (operands[2]) /* Check that operands[6] is operands[2] zero extended from mode to mode. */ @@ -8498,8 +8548,11 @@ (define_insn "*addcarry_1" && ((unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (operands[6], 0) == UINTVAL (operands[2])) && CONST_WIDE_INT_ELT (operands[6], 1) == 0))" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "") @@ -9146,12 +9199,13 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o")) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o")) (match_dup 1))) - (set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS))" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -9180,6 +9234,8 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); emit_insn (gen_addcarry_0 (operands[3], operands[4], operands[5])); DONE; } @@ -9188,7 +9244,8 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1" operands[5], mode); else operands[6] = gen_rtx_ZERO_EXTEND (mode, operands[5]); -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) ;; x == 0 with zero flag test can be done also as x < 1U with carry flag ;; test, where the latter is preferrable if we have some carry consuming @@ -9203,7 +9260,8 @@ (define_insn_and_split "*add3_eq" (match_operand:SWI 1 "nonimmediate_operand")) (match_operand:SWI 2 ""))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS)) && ix86_pre_reload_split ()" "#" "&& 1" @@ -9227,7 +9285,8 @@ (define_insn_and_split "*add3_ne" "CONST_INT_P (operands[2]) && (mode != DImode || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000)) - && ix86_binary_operator_ok (PLUS, mode, operands) + && ix86_binary_operator_ok (PLUS, mode, operands, + ix86_can_use_ndd_p (PLUS)) && ix86_pre_reload_split ()" "#" "&& 1" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c b/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c new file mode 100644 index 00000000000..9d5991457da --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c @@ -0,0 +1,15 @@ +/* { dg-do compile { target { int128 && { ! ia32 } } } } */ +/* { dg-options "-mapxf -O2" } */ + +#include "pr91681-1.c" +// *addti3_doubleword +// *addti3_doubleword_zext +// *adddi3_cc_overflow_1 +// *adddi3_carry + +int foo3 (int *a, int b) +{ + int c = *a + b + (a > b); /* { dg-warning "comparison between pointer and integer" } */ + return c; +} +/* { dg-final { scan-assembler-not "xor" } } */ From patchwork Wed Nov 15 09:46:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165231 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2428845vqg; Wed, 15 Nov 2023 01:48:10 -0800 (PST) X-Google-Smtp-Source: AGHT+IFtv8dOiVtNYP85Wu+0bDrfKkJ6uiauDjDtK7VWqwtW46ahgTIDtf1qFAZwOVD4NupNUsK5 X-Received: by 2002:a05:6102:10dc:b0:45d:ab58:b7a with SMTP id t28-20020a05610210dc00b0045dab580b7amr13025489vsr.16.1700041690272; Wed, 15 Nov 2023 01:48:10 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041690; cv=pass; d=google.com; s=arc-20160816; b=loouuuCi2KirEdwEWY+eWVU+jiSI5McFrHZed3cc6ynDbnmcIrXP98lcCtoYkFx3vi LAsuaVfoZVMvbFRckqYhkJpMN2F02gBcr6SaYLFebAE+C0aadZOf7xx/4e5/xUwFiGdO 6bolxwoCBxLHREgMU6jkJ3PzD84/dPKTUAEu8itNkceZy2fISufdwz8q/qh5p/WGTflI jrOaTmYqNCg2NkETcZrPyRHCnI83jUD0FFbYw7Hg3Zpk4Awn6POBt3XopQxKlTws0fZG CGgD7kQwDzC3JfmZJyWfCSXBWtnMf0oO1WQlPQoGAOwMRJFyFtvdPP5mKyMQcMiBagh/ EqlA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=11LLqA3ijdA6G24IPcIdiQbvf2g3QGSvZMQQNGZ22qY=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=FEzox/J8dcbCWGSjDG3+s1R+JiiQeQ1xQj4B4QAn0STT9vOPosq25/iSBo6L/a4D+u +scG4niqIjBlXaf5i5LEhir9Uf1kqcYJl90AlzguulHY9NpGK//Fz34B+Vg/7mP9ffMV U7TzjLI5m6jOwQxN+VDpGl3Bxt+uEI7Szi07kaVOtfamurvBohoW/auE/el6DDDFAgGA +wyi4nZYNVR3kzhA5yRRPUU9b/hcunezuHGgiDtCIlLzGq2pPlhlmT2q4vzvtG9nDAss JY6OREGy8J/htDmdPd5m1q2ZEFBarmBmVL1/uOPl80q3ugU7i+XANxzEoKvUGIvz7+Ci n8nw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GUpcC6OR; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id z21-20020ac87f95000000b0041cc54b994bsi9019918qtj.749.2023.11.15.01.48.10 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:48:10 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GUpcC6OR; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2C33C385700A for ; Wed, 15 Nov 2023 09:48:02 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 55A263858427 for ; Wed, 15 Nov 2023 09:47:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 55A263858427 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 55A263858427 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041636; cv=none; b=ItbgnZQlTlU1q8z0KLJFu+gdOpORQmFRpxi272E5AB1F0rgLQ575gfQtnQ2IhinNrlO4Go+spNzHpaG0aIwyQWgyv/WRr7OW6i/cUKLAnRdt2wM8Z7ZuW0Jzs9HX6PD9M9Rhmu1I/X0LHuf0L9fK/vjWfxTZWuYHuqYjrvDvWXk= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041636; c=relaxed/simple; bh=z80N3i3EPdIdNxw+lI+HVcQgp9YKX7m9WhHjZJyfYeU=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=OqYzlPEjWG1UlEDlITQM239T7maq252nSWOb1EJuUVscv93R76NI1ukSxPwkMGRCP3aqnaKHBO5zlY4E/iOc8ckgzS4gQd4C2RRIK1rrWxY+VrKG8oXLolHL0n5MpwTIJo8vuY1JwkmTTkGdDDpvCGzcnMW9dB66aGTmoM2J4JA= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041633; x=1731577633; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z80N3i3EPdIdNxw+lI+HVcQgp9YKX7m9WhHjZJyfYeU=; b=GUpcC6ORKvQcc+GE1QETyKILk3J7Y22o22T5JomtD7CaqUIESaYUOUaZ OtPF5UGXsrMIvM7Fk0ObCoKrySZZ6mdCA7NvcLgXnCqLo5sTPTE/OeFoI QntZomhJb/dwlyP4+uBt7AmO+6Bv0+zj4quCn3HXE7kOzamk2hE20Qnv2 M7excp3z0IP+nn4OS4XoBR1C1HqCGHlR7bgynvZhUC8VqF/eQAm5uqWex cQb0XqozY+AKu/NQzMmJyvSmnxZuD1rKzIm+sRsd07BuLJ+6RgDYgzWms 6PwotBEQsTM47xNIqbew9jUbppr/A36sK+ZXX00FJTskxk0VM/eWP20c7 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342593" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342593" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:12 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431665" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431665" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:09 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B461A100567A; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 06/16] [APX NDD] Support APX NDD for sub insns Date: Wed, 15 Nov 2023 17:46:55 +0800 Message-Id: <20231115094705.3976553-7-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622915612142177 X-GMAIL-MSGID: 1782622915612142177 From: Kong Lingling gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_fixup_binary_operands_no_copy): Add use_ndd parameter. (ix86_can_use_ndd_p): ADD MINUS. * config/i386/i386-protos.h (ix86_fixup_binary_operands_no_copy): Change define. * config/i386/i386.md (sub3): Add NDD constraints. (*sub_1): Likewise. (*subsi_1_zext): Likewise. (*sub_2): Likewise. (*subsi_2_zext): Likewise. (subv4): Likewise. (*subv4): Likewise. (subv4_1): Likewise. (usubv4): Likewise. (*sub_3): Likewise. (*subsi_3_zext): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add test for ndd sub. [APX NDD] Support APX NDD for more optimized sub insn gcc/ChangeLog: * config/i386/i386.md --- gcc/config/i386/i386-expand.cc | 6 +- gcc/config/i386/i386-protos.h | 2 +- gcc/config/i386/i386.md | 152 ++++++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 13 ++ 4 files changed, 118 insertions(+), 55 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index ea0e5881087..e5f75875e3b 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1270,6 +1270,7 @@ bool ix86_can_use_ndd_p (enum rtx_code code) switch (code) { case PLUS: + case MINUS: return true; default: return false; @@ -1342,9 +1343,10 @@ ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode, void ix86_fixup_binary_operands_no_copy (enum rtx_code code, - machine_mode mode, rtx operands[]) + machine_mode mode, rtx operands[], + bool use_ndd) { - rtx dst = ix86_fixup_binary_operands (code, mode, operands); + rtx dst = ix86_fixup_binary_operands (code, mode, operands, use_ndd); gcc_assert (dst == operands[0]); } diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 5d902e2925b..ad895fac72d 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -111,7 +111,7 @@ extern void ix86_expand_vector_move_misalign (machine_mode, rtx[]); extern rtx ix86_fixup_binary_operands (enum rtx_code, machine_mode, rtx[], bool = false); extern void ix86_fixup_binary_operands_no_copy (enum rtx_code, - machine_mode, rtx[]); + machine_mode, rtx[], bool = false); extern void ix86_expand_binary_operator (enum rtx_code, machine_mode, rtx[], bool = false); extern void ix86_expand_vector_logical_operator (enum rtx_code, diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index f23859d1172..1aa8469d666 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -7637,7 +7637,8 @@ (define_expand "sub3" (minus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand") (match_operand:SDWIM 2 "")))] "" - "ix86_expand_binary_operator (MINUS, mode, operands); DONE;") + "ix86_expand_binary_operator (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)); DONE;") (define_insn_and_split "*sub3_doubleword" [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") @@ -7663,7 +7664,10 @@ (define_insn_and_split "*sub3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { - ix86_expand_binary_operator (MINUS, mode, &operands[3]); + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + ix86_expand_binary_operator (MINUS, mode, &operands[3], + ix86_can_use_ndd_p (MINUS)); DONE; } }) @@ -7692,25 +7696,35 @@ (define_insn_and_split "*sub3_doubleword_zext" "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") (define_insn "*sub_1" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,i,r,r") (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ","))) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (minus:SI (match_operand:SI 1 "register_operand" "0,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sub{l}\t{%2, %k0|%k0, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) ;; Alternative 1 is needed to work around LRA limitation, see PR82524. @@ -7738,31 +7752,41 @@ (define_insn "*sub_2" [(set (reg FLAGS_REG) (compare (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subsi_2_zext" [(set (reg FLAGS_REG) (compare - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (minus:SI (match_operand:SI 1 "register_operand" "0,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (minus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, SImode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sub{l}\t{%2, %k0|%k0, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn "*subqi_ext_0" @@ -7841,7 +7865,8 @@ (define_expand "subv4" (pc)))] "" { - ix86_fixup_binary_operands_no_copy (MINUS, mode, operands); + ix86_fixup_binary_operands_no_copy (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)); if (CONST_SCALAR_INT_P (operands[2])) operands[4] = operands[2]; else @@ -7852,35 +7877,45 @@ (define_insn "*subv4" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r")) (sign_extend: - (match_operand:SWI 2 "" "We,m"))) + (match_operand:SWI 2 "" "We,m,rWe,m"))) (sign_extend: (minus:SWI (match_dup 1) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "subv4_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (match_operand: 3 "const_int_operand")) (sign_extend: (minus:SWI (match_dup 1) - (match_operand:SWI 2 "x86_64_immediate_operand" ""))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (match_operand:SWI 2 "x86_64_immediate_operand" ","))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (minus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[3])" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (cond [(match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -7976,6 +8011,8 @@ (define_insn_and_split "*subv4_doubleword_1" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); emit_insn (gen_subv4_1 (operands[3], operands[4], operands[5], operands[5])); DONE; @@ -8057,18 +8094,25 @@ (define_expand "usubv4" (label_ref (match_operand 3)) (pc)))] "" - "ix86_fixup_binary_operands_no_copy (MINUS, mode, operands);") + "ix86_fixup_binary_operands_no_copy (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS));") (define_insn "*sub_3" [(set (reg FLAGS_REG) - (compare (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ","))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (compare (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,"))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,i,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCmode) - && ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_peephole2 @@ -8156,16 +8200,20 @@ (define_insn_and_split "*dec_cmov" (define_insn "*subsi_3_zext" [(set (reg FLAGS_REG) - (compare (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe"))) - (set (match_operand:DI 0 "register_operand" "=r") + (compare (match_operand:SI 1 "register_operand" "0,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe"))) + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (minus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCmode) - && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %1|%1, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, SImode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sub{l}\t{%2, %1|%1, %2} + sub{l}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) ;; Add with carry and subtract with borrow diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 6c136174d24..9d444e0d830 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -42,3 +42,16 @@ FOO (long, add, +) FOO1 (long, add, +) FOO2 (long, add, +) +FOO (char, sub, -) +FOO1 (char, sub, -) +FOO (short, sub, -) +FOO1 (short, sub, -) +FOO (int, sub, -) +FOO1 (int, sub, -) +FOO (long, sub, -) +FOO1 (long, sub, -) +/* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "sub(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "sub(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ From patchwork Wed Nov 15 09:46:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165239 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429462vqg; Wed, 15 Nov 2023 01:50:01 -0800 (PST) X-Google-Smtp-Source: AGHT+IEYujkGRXTl2t/tR9zE9B08gjGOSFNSFdPck986a8AOs1SKHlxCDtmU3wrVb1OKofH0EuIA X-Received: by 2002:a05:620a:2f5:b0:778:9542:a765 with SMTP id a21-20020a05620a02f500b007789542a765mr4484588qko.64.1700041801566; Wed, 15 Nov 2023 01:50:01 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041801; cv=pass; d=google.com; s=arc-20160816; b=t7E8tEvfsg3grzFWb30GKRaTrfb+VGK426Z5a4quNBbfuQn3TuSQDu1rY5D9DXgjf9 NWhLN5rv6H7jIa+3i4T1xf+FDjFtLY4KyLuT4cKDxQtdKo0+g4jNIB4EONzskx/NCCSI MpDpa4rfAwqQro5mF/KH3WZ4MOgPWJIgEHHJPn36gJ4/9Hw82iDe7M9wbJ01mRdf9FdJ WMM2ddRxXYo6ULcFe/9nxpXDrGCouuVZrnNHPlLZCgqhj6WEl84L6GgNb2pHYXOOCzLd CyGGjHua41AqgikHNLnbOQ6koEeJwbOjIs6k5bvrkCzqaANdfGqWQb35aBv+2HIbxlTX 9q3g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=fUnHuQ7BvkWOLLWnWfLzMbQ70kHwDwkbDQQYyWu6N0M=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=RtUSMLWdxiXxGTJzMK0kIDxdiAPx/oEkghbCVEcmm8/dMluMDb6Oe6pCHUafHm91gx cYgkIl81PNPJJ5fLntD2WPcb18ymq8YBwnCSJgyRH6M9ucVKmxtWEV1HI89E9Gu32doE RtIf5EEWCi3h1/n0hTKmC1oWXkoI5yENcar+nNIKFxENJrl2JnqXEAEjpsHRTnppE4Us IVlf4W/Co7rQkY2zXq+FB25LisltfCWQAYae8g/vawdfxp1smHv6ybbMfyebGXwuVAMZ 36dEFM6JCgn9R1T2ynru0LynWkKF4ve5PNZgfJs4ZB+FWAvXyiChiheyoATJCydVWwy4 eXmQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Gqwqi6JB; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id 3-20020a05620a048300b0077a62fe9097si8810998qkr.745.2023.11.15.01.50.01 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:50:01 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Gqwqi6JB; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 53A073858C98 for ; Wed, 15 Nov 2023 09:50:01 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 6B99338582BA for ; Wed, 15 Nov 2023 09:47:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6B99338582BA Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6B99338582BA Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041639; cv=none; b=thyUNTRHjRfaAdBNA132xnVZCVMeTi5jHuaJNr86bTWAITpdFvvdthWLHudivgvMu16CL30E9O81/jRataYDFhZzEXEQpH4vHLlgDmjmTILEPg/lEwQTATaEOOUzuk/Rs10i1WQ1p2h/F/L8WEEQH2bXizs4tIq+NB4gVLXCUok= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041639; c=relaxed/simple; bh=SQZrQ+d1VIAbWNh8eLyAfmaxY+Z5Cb7fZuP3JaiX+9E=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=FFck5qE6amNv1p0/nTZ1NzUuQMH3R71puXTd1j69NrWtzfaPq3dKN2SEc681dJluP4yVzWHCsiQcLQ7V910ZJ24soG75lAagK50IVxe7NQxPt56mfHFvvPcTMUvQ7JLqZzBiuoDB2f8vh4sWXvmtga6yIt//jmjVeYhrJWiDG54= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041636; x=1731577636; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SQZrQ+d1VIAbWNh8eLyAfmaxY+Z5Cb7fZuP3JaiX+9E=; b=Gqwqi6JBStwBWX3OwjheqjcfbYAW7C2nEknYc6417wMh/Tn67iDCi9V3 SJML8P3De5Yz/G38juybllee0LOA4HvmXZ9CgzZyynUHnDpmUGvvAHSHO dMYGROZOpcYu37bKUXwAFFoHYT1KSksAipo24Yvp+SQUeS48opSGX7hw/ JpJQaLpm5qhVNog+bjaZmHcr1Btf2Vu7IPFe18ltkzWEtbyWCnIc2pr6S TkN5w8B7alZy/yMmCYSySWMaHFW4CTvXOWGUc2F5F90KtPMa8YXmrsqi9 RtuR9LwpUbXux67EEkpZWPdzX8icfo4ymxEodUXCYa27oe+SpwVvJJZiN g==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342604" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342604" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431684" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431684" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:09 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B6F6A100567B; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 07/16] [APX NDD] Support APX NDD for sbb insn Date: Wed, 15 Nov 2023 17:46:56 +0800 Message-Id: <20231115094705.3976553-8-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623032233685552 X-GMAIL-MSGID: 1782623032233685552 From: Kong Lingling Similar to *add3_doubleword, operands[1] may not equal to operands[0] so extra move is required. gcc/ChangeLog: * config/i386/i386.md (*sub3_doubleword): Add ndd constraints, and emit move when operands[0] not equal to operands[1]. (*sub3_doubleword_zext): Likewise. (*subv4_doubleword): Likewise. (*subv4_doubleword_1): Likewise. (*subv4_overflow_1): Likewise. (*subv4_overflow_2): Likewise. (*addsi3_carry_zext_0r): Likewise. (@sub3_carry): Add NDD alternatives and adjust output templates. (*subsi3_carry_zext): Likewise. (subborrow): Likewise. (subborrow_0): Likewise. (*sub3_eq): Likewise. (*sub3_ne): Likewise. (*sub3_eq_1): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-sbb.c: New test. --- gcc/config/i386/i386.md | 159 ++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c | 6 + 2 files changed, 106 insertions(+), 59 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 1aa8469d666..c3dcfaf52e1 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -7641,12 +7641,13 @@ (define_expand "sub3" ix86_can_use_ndd_p (MINUS)); DONE;") (define_insn_and_split "*sub3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (minus: - (match_operand: 1 "nonimmediate_operand" "0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands)" + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" "#" "&& reload_completed" [(parallel [(set (reg:CC FLAGS_REG) @@ -7670,16 +7671,18 @@ (define_insn_and_split "*sub3_doubleword" ix86_can_use_ndd_p (MINUS)); DONE; } -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*sub3_doubleword_zext" - [(set (match_operand: 0 "nonimmediate_operand" "=r,o") + [(set (match_operand: 0 "nonimmediate_operand" "=r,o,r,r") (minus: - (match_operand: 1 "nonimmediate_operand" "0,0") + (match_operand: 1 "nonimmediate_operand" "0,0,r,o") (zero_extend: - (match_operand:DWIH 2 "nonimmediate_operand" "rm,r")))) + (match_operand:DWIH 2 "nonimmediate_operand" "rm,r,rm,r")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (UNKNOWN, mode, operands)" + "ix86_binary_operator_ok (UNKNOWN, mode, operands, + ix86_can_use_ndd_p (MINUS))" "#" "&& reload_completed" [(parallel [(set (reg:CC FLAGS_REG) @@ -7693,7 +7696,8 @@ (define_insn_and_split "*sub3_doubleword_zext" (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))) (const_int 0))) (clobber (reg:CC FLAGS_REG))])] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);" +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*sub_1" [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,i,r,r") @@ -7929,14 +7933,15 @@ (define_insn_and_split "*subv4_doubleword" (eq:CCO (minus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "0,0")) + (match_operand: 1 "nonimmediate_operand" "0,0,ro,r")) (sign_extend: - (match_operand: 2 "nonimmediate_operand" "r,o"))) + (match_operand: 2 "nonimmediate_operand" "r,o,r,o"))) (sign_extend: (minus: (match_dup 1) (match_dup 2))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (minus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" "#" "&& reload_completed" [(parallel [(set (reg:CC FLAGS_REG) @@ -7964,22 +7969,24 @@ (define_insn_and_split "*subv4_doubleword" (match_dup 5)))])] { split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*subv4_doubleword_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "0")) + (match_operand: 1 "nonimmediate_operand" "0,ro")) (match_operand: 3 "const_scalar_int_operand")) (sign_extend: (minus: (match_dup 1) - (match_operand: 2 "x86_64_hilo_general_operand" ""))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro") + (match_operand: 2 "x86_64_hilo_general_operand" ","))))) + (set (match_operand: 0 "nonimmediate_operand" "=ro,r") (minus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)) && CONST_SCALAR_INT_P (operands[2]) && rtx_equal_p (operands[2], operands[3])" "#" @@ -8017,7 +8024,8 @@ (define_insn_and_split "*subv4_doubleword_1" operands[5])); DONE; } -}) +} +[(set_attr "isa" "*,apx_ndd")]) (define_insn "*subv4_overflow_1" [(set (reg:CCO FLAGS_REG) @@ -8025,11 +8033,11 @@ (define_insn "*subv4_overflow_1" (minus: (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)])) (sign_extend: - (match_operand:SWI 2 "" "rWe,m"))) + (match_operand:SWI 2 "" "rWe,m,rWe,m"))) (sign_extend: (minus:SWI (minus:SWI @@ -8037,15 +8045,21 @@ (define_insn "*subv4_overflow_1" (match_operator:SWI 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r,r,r") (minus:SWI (minus:SWI (match_dup 1) (match_op_dup 5 [(match_dup 3) (const_int 0)])) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subv4_overflow_2" @@ -8054,28 +8068,32 @@ (define_insn "*subv4_overflow_2" (minus: (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,rm")) (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)])) - (match_operand: 6 "const_int_operand" "n")) + (match_operand: 6 "const_int_operand" "n,n")) (sign_extend: (minus:SWI (minus:SWI (match_dup 1) (match_operator:SWI 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])) - (match_operand:SWI 2 "x86_64_immediate_operand" "e"))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm") + (match_operand:SWI 2 "x86_64_immediate_operand" "e,e"))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") (minus:SWI (minus:SWI (match_dup 1) (match_op_dup 5 [(match_dup 3) (const_int 0)])) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[6])" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (if_then_else (match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8360,15 +8378,18 @@ (define_insn "*addsi3_carry_zext_0" (set_attr "mode" "SI")]) (define_insn "*addsi3_carry_zext_0r" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (match_operator:SI 2 "ix86_carry_flag_unset_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "0")))) + (match_operand:SI 1 "register_operand" "0,r")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "sbb{l}\t{$-1, %k0|%k0, -1}" - [(set_attr "type" "alu") + "@ + sbb{l}\t{$-1, %k0|%k0, -1} + sbb{l}\t{$-1, %1, %k0|%k0, %1, -1}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) @@ -8610,17 +8631,23 @@ (define_insn "*addcarry_1" (const_string "4")))]) (define_insn "@sub3_carry" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") (match_operator:SWI 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)])) - (match_operand:SWI 2 "" ","))) + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8707,18 +8734,22 @@ (define_insn "*sub3_carry_0r" (set_attr "mode" "")]) (define_insn "*subsi3_carry_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (minus:SI (minus:SI - (match_operand:SI 1 "register_operand" "0") + (match_operand:SI 1 "register_operand" "0,r") (match_operator:SI 3 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)])) - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sbb{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sbb{l}\t{%2, %k0|%k0, %2} + sbb{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) @@ -8803,21 +8834,27 @@ (define_insn "subborrow" [(set (reg:CCC FLAGS_REG) (compare:CCC (zero_extend: - (match_operand:SWI48 1 "nonimmediate_operand" "0,0")) + (match_operand:SWI48 1 "nonimmediate_operand" "0,0,r,rm")) (plus: (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (zero_extend: - (match_operand:SWI48 2 "nonimmediate_operand" "r,rm"))))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + (match_operand:SWI48 2 "nonimmediate_operand" "r,rm,rm,r"))))) + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") (minus:SWI48 (minus:SWI48 (match_dup 1) (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))" + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8978,7 +9015,8 @@ (define_expand "subborrow_0" (match_operand:SWI48 2 ""))) (set (match_operand:SWI48 0 "register_operand") (minus:SWI48 (match_dup 1) (match_dup 2)))])] - "ix86_binary_operator_ok (MINUS, mode, operands)") + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS))") (define_expand "uaddc5" [(match_operand:SWI48 0 "register_operand") @@ -9404,7 +9442,8 @@ (define_insn_and_split "*sub3_eq" (const_int 0))) (match_operand:SWI 2 ""))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)) && ix86_pre_reload_split ()" "#" "&& 1" @@ -9429,7 +9468,8 @@ (define_insn_and_split "*sub3_ne" "CONST_INT_P (operands[2]) && (mode != DImode || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000)) - && ix86_binary_operator_ok (MINUS, mode, operands) + && ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)) && ix86_pre_reload_split ()" "#" "&& 1" @@ -9458,7 +9498,8 @@ (define_insn_and_split "*sub3_eq_1" "CONST_INT_P (operands[2]) && (mode != DImode || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000)) - && ix86_binary_operator_ok (MINUS, mode, operands) + && ix86_binary_operator_ok (MINUS, mode, operands, + ix86_can_use_ndd_p (MINUS)) && ix86_pre_reload_split ()" "#" "&& 1" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c b/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c new file mode 100644 index 00000000000..662e3c607d8 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c @@ -0,0 +1,6 @@ +/* { dg-do compile { target { int128 && { ! ia32 } } } } */ +/* { dg-options "-mapxf -O2" } */ + +#include "pr91681-2.c" + +/* { dg-final { scan-assembler-times "sbbq\[^\n\r]*0, %rdi, %rdx" 1 } } */ From patchwork Wed Nov 15 09:46:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165233 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2428991vqg; Wed, 15 Nov 2023 01:48:33 -0800 (PST) X-Google-Smtp-Source: AGHT+IHTJEQenb4DOnhj1iunAlC+1JiNQB5/CNduwZDdVvzlja/fxDjmlr9XGowwUn4qKZntTXzV X-Received: by 2002:a81:a549:0:b0:57a:cf8:5b4 with SMTP id v9-20020a81a549000000b0057a0cf805b4mr11675969ywg.51.1700041713459; Wed, 15 Nov 2023 01:48:33 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041713; cv=pass; d=google.com; s=arc-20160816; b=ZFv3CIskFZhYPAXdWN9kY4K7V+fj/LpkOjheEN5FTGTgGFyGpid4iPBQvxemajQUUJ 80P08gJubFRay4klRb99+mkH4erDTb6/Mt6g5BtHzFZ5xWFy8Ie0kTgqqcHhRJmxC9Li aq/AOloPbBghegSC4+Pd0c/TTDewAXJT6l3DpokUMnFZDiuaFIJzgIUEmzkkmb/2RzMd fs/iJrBrt0puTyN6tol4ceieFSH37qM349nRV4jO8X79ixKkUtA8e9cFYVVN2x0melY8 AuPJHskBIIykFdnKaZQtwPzZ9vzPCnIdoIh7Z2L7i5HQf1sYTURfXVm51z/V6XFO3N8I A+YA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=/c6XkyDQ2M1df5G3xKIJVNtVhbXYYM2Nr3L1nggh2sc=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=N6tXSOPNwK9fPsRi38w4ZeK4UrUY4Jl8pJBa7jcOBlGsZ2Ov6Wjl2lZESTqEhbT+BW GcYODehNczekEzAWQvqrYJxovchDzI1c2JGslYe5Ly6J8GBBluY0xoEnJXg7gpCA/4Pf HT3yeHjpUsmt3l9jzZDe9oq6Q5zQu1a/OmPFR4zCxR/lgsSLHJJuAVM9hTQQFaGFZkcx ++eyFHogG8YojI1pI28SPS5A7kxtpYbgQ/xi6e5h8fq6GeI6aEiJHL1sXDa3NyYoPWTL azBFgn9sH2WE0jCFDfLm/4dSeAJDZfNoJ5PQZKxUmwu2FOp9ANFvDLmZF1FoMVzE89zv tUBw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=aHyue0KK; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id x6-20020a0ceb86000000b0064f3b7ec231si8799494qvo.131.2023.11.15.01.48.33 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:48:33 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=aHyue0KK; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 421F53857B9A for ; Wed, 15 Nov 2023 09:48:20 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 5C7A23858006 for ; Wed, 15 Nov 2023 09:47:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5C7A23858006 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 5C7A23858006 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041638; cv=none; b=p/0Jm40PQXCPsUNGRFhOnfiJcY1hsrzKNiciOlLLCYHIDZQlcZaVpl5EGrMShPQmUKWhx3PqAvw/MHIKVeIvk1R/uQCPy4nZ6z8XhBXUR3oIcJByuHySyRPe0vapAHjmu+jd/wplmU4lIR6rOnEpHWetSBMt2HpoMfcEc9cyLHs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041638; c=relaxed/simple; bh=3+ikYa/TD+k1hlUP0oUpg5GFw3Z0IfURtiYfq7IOPh0=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=MpEPoHg6VAT/d9nfAoOXKHDEtzfCz2TwmPX15IDlni01owfrXFlmhv2lZO2OIt+HZUgBu6WS1xwTI+fW2gOTzJGCjD21u6bICvrugt0KYMWrD/IN3X4aNw4G3P3lqYvyFjLNte4uwtm6cHe/kCfYuK3enPcqf52ZYLeRtLL59AM= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041636; x=1731577636; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3+ikYa/TD+k1hlUP0oUpg5GFw3Z0IfURtiYfq7IOPh0=; b=aHyue0KKLRGLE+IOy8ZjMsad75vUl7wTZuJrQle1MbqKzGDFyAfbEyLK eorRmSmk7a2nc0BcpYe92XHsqRmwhUgUnlSD6Z34m9l8iBgeGIrm0+GHO WCc9fCJpfekzLQooyM5+s9jLbmghHIFpZzNgnqG7/0+jJ1tSSfY9L4/PK a++f79PFvVxRtDWmNSk0LFaMTBBjJgX4NVlVCXlVrlAgtzm4S7Qizjf50 JHYfu3uK1/HMciC1xtiO/ltf0VUSd9LfE+pBKTs37yiF4rEmRblwFkMpm Fh92uCiKWn+L9y11aTiVrAgeeWITvuKGA3b8yIopNCc7SQqejIERMy4Wq Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342598" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342598" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:12 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431674" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431674" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:09 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B9A9F100567C; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 08/16] [APX NDD] Support APX NDD for neg insn Date: Wed, 15 Nov 2023 17:46:57 +0800 Message-Id: <20231115094705.3976553-9-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622939898296505 X-GMAIL-MSGID: 1782622939898296505 From: Kong Lingling gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add NEG support. (ix86_expand_unary_operator): Add use_ndd parameter and adjust for NDD. * config/i386/i386-protos.h : Add use_ndd parameter for ix86_unary_operator_ok and ix86_expand_unary_operator. * config/i386/i386.cc (ix86_unary_operator_ok): Add ndd constraints, and add use_ndd parameter. * config/i386/i386.md (neg2): Add ndd constraints. (*neg_1): Likewise. (*neg2_doubleword): Likewise. (*negsi_1_zext): Likewise. (*neg_2): Likewise. (*negsi_2_zext): Likewise. (*neg_ccc_1): Likewise. (*neg_ccc_2): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add neg test. --- gcc/config/i386/i386-expand.cc | 5 +- gcc/config/i386/i386-protos.h | 5 +- gcc/config/i386/i386.cc | 5 +- gcc/config/i386/i386.md | 79 ++++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 29 +++++++++ 5 files changed, 90 insertions(+), 33 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index e5f75875e3b..995cc792c5f 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1271,6 +1271,7 @@ bool ix86_can_use_ndd_p (enum rtx_code code) { case PLUS: case MINUS: + case NEG: return true; default: return false; @@ -1511,7 +1512,7 @@ ix86_binary_operator_ok (enum rtx_code code, machine_mode mode, void ix86_expand_unary_operator (enum rtx_code code, machine_mode mode, - rtx operands[]) + rtx operands[], bool use_ndd) { bool matching_memory = false; rtx src, dst, op, clob; @@ -1530,7 +1531,7 @@ ix86_expand_unary_operator (enum rtx_code code, machine_mode mode, } /* When source operand is memory, destination must match. */ - if (MEM_P (src) && !matching_memory) + if (!use_ndd && MEM_P (src) && !matching_memory) src = force_reg (mode, src); /* Emit the instruction. */ diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index ad895fac72d..0010fd71011 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -128,7 +128,7 @@ extern bool ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high); extern bool ix86_dep_by_shift_count (const_rtx set_insn, const_rtx use_insn); extern bool ix86_agi_dependent (rtx_insn *set_insn, rtx_insn *use_insn); extern void ix86_expand_unary_operator (enum rtx_code, machine_mode, - rtx[]); + rtx[], bool = false); extern rtx ix86_build_const_vector (machine_mode, bool, rtx); extern rtx ix86_build_signbit_mask (machine_mode, bool, bool); extern HOST_WIDE_INT ix86_convert_const_vector_to_integer (rtx, @@ -148,7 +148,8 @@ extern void ix86_split_fp_absneg_operator (enum rtx_code, machine_mode, rtx[]); extern void ix86_expand_copysign (rtx []); extern void ix86_expand_xorsign (rtx []); -extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2]); +extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2], + bool = false); extern bool ix86_match_ccmode (rtx, machine_mode); extern bool ix86_match_ptest_ccmode (rtx); extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 47159b06f7d..9b0715943f7 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -16160,11 +16160,12 @@ ix86_dep_by_shift_count (const_rtx set_insn, const_rtx use_insn) bool ix86_unary_operator_ok (enum rtx_code, machine_mode, - rtx operands[2]) + rtx operands[2], + bool use_ndd) { /* If one of operands is memory, source and destination must match. */ if ((MEM_P (operands[0]) - || MEM_P (operands[1])) + || (!use_ndd && MEM_P (operands[1]))) && ! rtx_equal_p (operands[0], operands[1])) return false; return true; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index c3dcfaf52e1..8ba524e9e44 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -12952,13 +12952,15 @@ (define_expand "neg2" [(set (match_operand:SDWIM 0 "nonimmediate_operand") (neg:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")))] "" - "ix86_expand_unary_operator (NEG, mode, operands); DONE;") + "ix86_expand_unary_operator (NEG, mode, operands, + ix86_can_use_ndd_p (NEG)); DONE;") (define_insn_and_split "*neg2_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro") - (neg: (match_operand: 1 "nonimmediate_operand" "0"))) + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (neg: (match_operand: 1 "nonimmediate_operand" "0,ro"))) (clobber (reg:CC FLAGS_REG))] - "ix86_unary_operator_ok (NEG, mode, operands)" + "ix86_unary_operator_ok (NEG, mode, operands, + ix86_can_use_ndd_p (NEG))" "#" "&& reload_completed" [(parallel @@ -12975,7 +12977,8 @@ (define_insn_and_split "*neg2_doubleword" [(set (match_dup 2) (neg:DWIH (match_dup 2))) (clobber (reg:CC FLAGS_REG))])] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);" + [(set_attr "isa" "*,apx_ndd")]) ;; Convert: ;; mov %esi, %edx @@ -13064,22 +13067,30 @@ (define_peephole2 (clobber (reg:CC FLAGS_REG))])]) (define_insn "*neg_1" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") - (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0"))) + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") + (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm"))) (clobber (reg:CC FLAGS_REG))] - "ix86_unary_operator_ok (NEG, mode, operands)" - "neg{}\t%0" + "ix86_unary_operator_ok (NEG, mode, operands, + ix86_can_use_ndd_p (NEG))" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*negsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI - (neg:SI (match_operand:SI 1 "register_operand" "0")))) + (neg:SI (match_operand:SI 1 "register_operand" "0,r")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_unary_operator_ok (NEG, SImode, operands)" - "neg{l}\t%k0" + "TARGET_64BIT && ix86_unary_operator_ok (NEG, SImode, operands, + ix86_can_use_ndd_p (NEG))" + "@ + neg{l}\t%k0 + neg{l}\t{%k1, %k0|%k0, %k1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) ;; Alternative 1 is needed to work around LRA limitation, see PR82524. @@ -13105,51 +13116,65 @@ (define_insn_and_split "*neg_1_slp" (define_insn "*neg_2" [(set (reg FLAGS_REG) (compare - (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0")) + (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (neg:SWI (match_dup 1)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_unary_operator_ok (NEG, mode, operands)" - "neg{}\t%0" + && ix86_unary_operator_ok (NEG, mode, operands, + ix86_can_use_ndd_p (NEG))" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*negsi_2_zext" [(set (reg FLAGS_REG) (compare - (neg:SI (match_operand:SI 1 "register_operand" "0")) + (neg:SI (match_operand:SI 1 "register_operand" "0,r")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (neg:SI (match_dup 1))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_unary_operator_ok (NEG, SImode, operands)" - "neg{l}\t%k0" + && ix86_unary_operator_ok (NEG, SImode, operands, + ix86_can_use_ndd_p (NEG))" + "@ + neg{l}\t%k0 + neg{l}\t{%k1, %k0|%k0, %k1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*neg_ccc_1" [(set (reg:CCC FLAGS_REG) (unspec:CCC - [(match_operand:SWI 1 "nonimmediate_operand" "0") + [(match_operand:SWI 1 "nonimmediate_operand" "0,rm") (const_int 0)] UNSPEC_CC_NE)) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (neg:SWI (match_dup 1)))] "" - "neg{}\t%0" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*neg_ccc_2" [(set (reg:CCC FLAGS_REG) (unspec:CCC - [(match_operand:SWI 1 "nonimmediate_operand" "0") + [(match_operand:SWI 1 "nonimmediate_operand" "0,r") (const_int 0)] UNSPEC_CC_NE)) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "" - "neg{}\t%0" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_expand "x86_neg_ccc" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 9d444e0d830..18b423258ea 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -27,8 +27,25 @@ foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ { \ TYPE c = *a OP b; \ return c; \ +} + +#define F(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +f_##OP_NAME##_##TYPE (TYPE *a) \ +{ \ + TYPE b = OP*a; \ + return b; \ } +#define F1(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +f1_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = OP a; \ + return b; \ +} FOO (char, add, +) FOO1 (char, add, +) FOO2 (char, add, +) @@ -50,8 +67,20 @@ FOO (int, sub, -) FOO1 (int, sub, -) FOO (long, sub, -) FOO1 (long, sub, -) + +F (char, neg, -) +F1 (char, neg, -) +F (short, neg, -) +F1 (short, neg, -) +F (int, neg, -) +F1 (int, neg, -) +F (long, neg, -) +F1 (long, neg, -) /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "sub(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "sub(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 } } */ +/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ From patchwork Wed Nov 15 09:46:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165236 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429218vqg; Wed, 15 Nov 2023 01:49:13 -0800 (PST) X-Google-Smtp-Source: AGHT+IEO5I9gle08YnOwe5GNzKqfvYW0oq/coGGUnUJYCOGd2RTqAuI4p4+fCAnAwsvqH3Pmb+1K X-Received: by 2002:a05:620a:4d86:b0:76d:bb77:b295 with SMTP id uw6-20020a05620a4d8600b0076dbb77b295mr3813197qkn.78.1700041753698; Wed, 15 Nov 2023 01:49:13 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041753; cv=pass; d=google.com; s=arc-20160816; b=Gpwgcvk5CmMfZAT7hAfJEGaq6fufUarimcpH3UdGuERmKfwXruq4m9z0a46WWwVdvX V2/PieKWeam/q18UzXF+/2qf9jrcj7eEO7meNd2kfqvOUN4Jggak7c9oDliW40GmCd3u QN2EOxzJasRIgb7K6xvmeuKJw4kk4+OK3/qPFBTR4ELM5b7s7W6ulLx9cKfzO+1RsUo7 776kfKlC4du228UD4L9cIVBEDBrhwn/b4AwFha2OSo03jk4nfh4fr13vGNp+7n12IZUa 3CQKhUHj89KdpWh43mtg1wn9LaSudlw5IFw7QABGgehXRWr3oCQlYgUmy4x9wPARcq8t WO2g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=hzytxSxbEkFKx6iB5zy6Z5IbeYcuWou5l3C0bocdfQU=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=xodBX8nYVKBtCpiEB6Zxz81qmskTvlk9ntI7sqgAljVuNkTY2SxGnlCuPtKDKloper As++0cnSiNNlMkzm3X7Ww6wYJpjXzMknDRJmV7Tn2Bhdq9IpdGZ/HjS2PCNYpvKpMBXr 0T3eRpNsyupalpcDCJVzq1dP7yjWohTdBkDEtkTXvqU7ewheY40jig+qXnZ57sxpr5mA KHaU7fBUubritIPjbFw3BlAPBI84mZFtRm1SyIkT4WfBs/AFyUHQTeDmKHczGF8sl0NR cKsCKJCPxsUD3F5Z3HiuLGnAUTS6NyMx0YCZ2pzCuvvvffKpQEt3osj71/oaMSnvVsVH tKDQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=i5cmKpkY; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id i15-20020a05620a248f00b0076db6af9392si9532703qkn.254.2023.11.15.01.49.13 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:49:13 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=i5cmKpkY; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 25ED3385B837 for ; Wed, 15 Nov 2023 09:48:52 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 081CE38582A3 for ; Wed, 15 Nov 2023 09:47:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 081CE38582A3 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 081CE38582A3 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041637; cv=none; b=jNbElwhquL1oK1+HIlDTcRR0uv5Ky4rmzdyGOQlpI70y4xGFfTYoQAPMnHG/oleRXU1czXwwtmBrfjFFXkRIR6Gi2vlEo8gkAPKgKm2B2ujTAFOGZiKDtvXiUie8WmmBUZjpV6RDO2Q7SvkBFC68WKYpYWgJUdT99xxeXEoMdyQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041637; c=relaxed/simple; bh=D8NYfXmMOvuqE7AI6cesfXbyMspO/Lvy2iAjHraiq9I=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=jduUYDUcG5VRRhLnmPz8qZG9MkK9QNkmwuxIzWQaBWgl0Ph914hO68z6VKFdjGQSu7ATCmD+RArnneGgyo6bMoO69kR0kr6cA9/+EZHrp7xMdjNatSM1TV1xOITeZJMtXnbGS7/LRrqkEhTUJjhDnD5UsM5oT9MbO3hE8fDR/lw= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041635; x=1731577635; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=D8NYfXmMOvuqE7AI6cesfXbyMspO/Lvy2iAjHraiq9I=; b=i5cmKpkY+0b4JWhbvfEgi6BQS+lx4YO19N9kED8fxVlSXLvHwyw9By0+ Ph4YI4WR53lPF+92db4r5E4kHy+BeHePvECdnGuvBSmdXKQpwsm2ToU9+ ZiVVWqhjk2GMxq7Wky2UUw1yMC1bD00GUohUbIqeZjsy+FviU/ZRRZ+wH L96wX6izV8Xsd7Z8auR6my9j25msLG6jw/pPwkNQhlc4RWRVNCQeKNiXK wUrFi8F3jjuFpkXV8FkSli2wLXEeUCMaEa/PvhFFOoGdtNYnG6ZXWCSxH FM8/zDBuQTJnQGAW7mKRNhKyrV9MWVXGb9WiH7xb8Qx2FvWzI+/ATcC8Q w==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342596" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342596" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:12 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431668" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431668" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:10 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id BC0F4100567D; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 09/16] [APX NDD] Support APX NDD for not insn Date: Wed, 15 Nov 2023 17:46:58 +0800 Message-Id: <20231115094705.3976553-10-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622982358242055 X-GMAIL-MSGID: 1782622982358242055 From: Kong Lingling gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add NOT support. * config/i386/i386.md (one_cmpl2): Add NDD constraints, adjust output template. (*one_cmpl2_1): Likewise. (*one_cmplqi2_1): Likewise. (*one_cmpl2_doubleword): Likewise. (*one_cmplsi2_1_zext): Likewise. (*one_cmpl2_2): Likewise. (*one_cmplsi2_2_zext): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add not test. --- gcc/config/i386/i386-expand.cc | 1 + gcc/config/i386/i386.md | 73 +++++++++++++++---------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 11 ++++ 3 files changed, 55 insertions(+), 30 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 995cc792c5f..be77ba4a476 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1272,6 +1272,7 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case PLUS: case MINUS: case NEG: + case NOT: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 8ba524e9e44..9758e4e5144 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -13673,64 +13673,73 @@ (define_expand "one_cmpl2" [(set (match_operand:SDWIM 0 "nonimmediate_operand") (not:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")))] "" - "ix86_expand_unary_operator (NOT, mode, operands); DONE;") + "ix86_expand_unary_operator (NOT, mode, operands, + ix86_can_use_ndd_p (NOT)); DONE;") (define_insn_and_split "*one_cmpl2_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro") - (not: (match_operand: 1 "nonimmediate_operand" "0")))] - "ix86_unary_operator_ok (NOT, mode, operands)" + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (not: (match_operand: 1 "nonimmediate_operand" "0,ro")))] + "ix86_unary_operator_ok (NOT, mode, operands, + ix86_can_use_ndd_p (NOT))" "#" "&& reload_completed" [(set (match_dup 0) (not:DWIH (match_dup 1))) (set (match_dup 2) (not:DWIH (match_dup 3)))] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);" + [(set_attr "isa" "*,apx_ndd")]) (define_insn "*one_cmpl2_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,?k") - (not:SWI248 (match_operand:SWI248 1 "nonimmediate_operand" "0,k")))] - "ix86_unary_operator_ok (NOT, mode, operands)" + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + (not:SWI248 (match_operand:SWI248 1 "nonimmediate_operand" "0,rm,k")))] + "ix86_unary_operator_ok (NOT, mode, operands, + ix86_can_use_ndd_p (NOT))" "@ not{}\t%0 + not{}\t{%1, %0|%0, %1} #" - [(set_attr "isa" "*,") - (set_attr "type" "negnot,msklog") + [(set_attr "isa" "*,apx_ndd,") + (set_attr "type" "negnot,negnot,msklog") (set_attr "mode" "")]) (define_insn "*one_cmplsi2_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,?k") + [(set (match_operand:DI 0 "register_operand" "=r,r,?k") (zero_extend:DI - (not:SI (match_operand:SI 1 "register_operand" "0,k"))))] - "TARGET_64BIT && ix86_unary_operator_ok (NOT, SImode, operands)" + (not:SI (match_operand:SI 1 "register_operand" "0,r,k"))))] + "TARGET_64BIT && ix86_unary_operator_ok (NOT, SImode, operands, + ix86_can_use_ndd_p (NOT))" "@ not{l}\t%k0 + not{l}\t{%k1, %k0|%k0, %k1} #" - [(set_attr "isa" "x64,avx512bw_512") - (set_attr "type" "negnot,msklog") - (set_attr "mode" "SI,SI")]) + [(set_attr "isa" "x64,apx_ndd,avx512bw_512") + (set_attr "type" "negnot,negnot,msklog") + (set_attr "mode" "SI,SI,SI")]) (define_insn "*one_cmplqi2_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,?k") - (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,k")))] - "ix86_unary_operator_ok (NOT, QImode, operands)" + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,r,?k") + (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,rm,k")))] + "ix86_unary_operator_ok (NOT, QImode, operands, + ix86_can_use_ndd_p (NOT))" "@ not{b}\t%0 not{l}\t%k0 + not{l}\t{%k1, %k0|%k0, %k1} #" - [(set_attr "isa" "*,*,avx512f") - (set_attr "type" "negnot,negnot,msklog") + [(set_attr "isa" "*,*,apx_ndd,avx512f") + (set_attr "type" "negnot,negnot,negnot,msklog") (set (attr "mode") - (cond [(eq_attr "alternative" "1") + (cond [(eq_attr "alternative" "1,2") (const_string "SI") - (and (eq_attr "alternative" "2") + (and (eq_attr "alternative" "3") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] (const_string "QI"))) ;; Potential partial reg stall on alternative 1. (set (attr "preferred_for_speed") - (cond [(eq_attr "alternative" "1") + (cond [(eq_attr "alternative" "1,2") (symbol_ref "!TARGET_PARTIAL_REG_STALL")] (symbol_ref "true")))]) @@ -13753,14 +13762,16 @@ (define_insn_and_split "*one_cmpl_1_slp" (define_insn "*one_cmpl2_2" [(set (reg FLAGS_REG) - (compare (not:SWI (match_operand:SWI 1 "nonimmediate_operand" "0")) + (compare (not:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (not:SWI (match_dup 1)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_unary_operator_ok (NOT, mode, operands)" + && ix86_unary_operator_ok (NOT, mode, operands, + ix86_can_use_ndd_p (NOT))" "#" [(set_attr "type" "alu1") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_split @@ -13779,14 +13790,16 @@ (define_split (define_insn "*one_cmplsi2_2_zext" [(set (reg FLAGS_REG) - (compare (not:SI (match_operand:SI 1 "register_operand" "0")) + (compare (not:SI (match_operand:SI 1 "register_operand" "0,r")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (not:SI (match_dup 1))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_unary_operator_ok (NOT, SImode, operands)" + && ix86_unary_operator_ok (NOT, SImode, operands, + ix86_can_use_ndd_p (NOT))" "#" [(set_attr "type" "alu1") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_split diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 18b423258ea..9af72d1a46d 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -76,6 +76,15 @@ F (int, neg, -) F1 (int, neg, -) F (long, neg, -) F1 (long, neg, -) + +F (char, not, ~) +F1 (char, not, ~) +F (short, not, ~) +F1 (short, not, ~) +F (int, not, ~) +F1 (int, not, ~) +F (long, not, ~) +F1 (long, not, ~) /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -84,3 +93,5 @@ F1 (long, neg, -) /* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 } } */ /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ From patchwork Wed Nov 15 09:46:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165241 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429806vqg; Wed, 15 Nov 2023 01:51:02 -0800 (PST) X-Google-Smtp-Source: AGHT+IHliB6MKAq6U2lds8QTJr/L4jf5MBpGd2UKrvbptYUOC3WMu8Ny+aTx7/DcIut4ja4ZmH7H X-Received: by 2002:a05:620a:4627:b0:77b:cc56:49f2 with SMTP id br39-20020a05620a462700b0077bcc5649f2mr5476489qkb.61.1700041862407; Wed, 15 Nov 2023 01:51:02 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041862; cv=pass; d=google.com; s=arc-20160816; b=QKR4qzJYvGb3lMG92CeNFRV0/RcjRSAYm4ow2ZIJDu6KboiOmkcXYnauGBSmlLhJQE J5YeCfltz4BG9+bsbxXXKrpzKoV89i8x/Ma32iUXMwssqQSUZAegVk51e/MhH79As4Bf l4HZLoBu7V6mvnBJDokvuwZYNR4+DsF+dZ487sX6fLAhxi3u+PU723kxNdtOT+EzXm9y wuDrWBB50yd8+tYWn4Vej9rGtJ2iBuArB+B1tCqSS7nrLLak1bizt5M9RlBF28Mg2ED7 8V+xbMCVxNfBqyOg1wCLRUXrANmVpotvfyz6g72QcqLV304L0y8Ynu6VN9x4s2WIvkS1 /HvQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=M8yJ7Smpk6dxlWwLXZGYma4715kz8502NAgAMW50quw=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=N7wanmuWhgUMxPt1zZn2G//aEgFhYHSZcbFqTJljCT8iPdL7blAXuRE338T6hPgCR2 lGG3DCLqdyk7uUuLzXZsgdftaGwn5u39AxrfSW5/P1T5LqNZyoHkYvh2qHMyOuw6Ccim u33SftYrzu9v7HJxTft1evuhBz+03MD4vANm87rJLT7qAWsfJYrIYPCRAEEna0oVRSWV 4keM7pS6ECsH0LgDxC6snAZ++DzMpQwDW4omiAhCM4Iw0KqJk2mPHx9BGfZK69PYjxEZ GqitV0yH8nyj6HQqbxMDAqWNY41AGcObxz7hdWID6TUofsEIii8ZXNceNopeGvR4vpfG C0Pw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GnLDduyA; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id h24-20020a05620a21d800b0076eeeaaa07asi8470829qka.492.2023.11.15.01.51.02 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:51:02 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GnLDduyA; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2FC743857C60 for ; Wed, 15 Nov 2023 09:51:02 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 81AA3385802F for ; Wed, 15 Nov 2023 09:47:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 81AA3385802F Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 81AA3385802F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041642; cv=none; b=RjTrKJ5KF5NjBtzPXCfF7jRt2xzcTllnKMZAUfZ6j0J171MfNolUU18p3+3IME/3dGKFbm0R1pSkkXLJQHL4d6rBnZtp1aMjjzuPwcV4W0m6lVTdsNOgRnBh46jVGyHiqF4TupphLfuguUgBtwG/BmWdIH17uZuktVguzDiW2/U= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041642; c=relaxed/simple; bh=kpOHflRJOsh0VzDAL/st4FmYIMQtxPitKliiGz0gVSE=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=CgfcCK9A+OxpR97tOYzzakh2tqPVoqYb6CyklGJEQ2p/klJLOfemaIUYJMwTJn3FTm7FkpJORVMHbdMCVWPQPILumV961i78svYz2Sy6zseKHgB5fgbwqHhbfnmiNWl7Ck8lZbdzEPFylF8rA1fRN6UBzR7XOClRh90G8+TzpPQ= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041639; x=1731577639; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kpOHflRJOsh0VzDAL/st4FmYIMQtxPitKliiGz0gVSE=; b=GnLDduyAnT5Ge1tNm3Q/sYLC044yfcgsrwUzydC5PVDk6qEwNLId5uIJ f8JeFIX87mOaRAm5iR417heKXMt1y2o4Q+NWVAQRYOEYO/cWh2QuQpkRD vZDq73VifRSc856zjkrSc9hx9+de4jItkVpdbgvKbAxBYwbUbAq8BA81+ 3YtOfjvQfToZx/+JPIh6LxOE0Q31xDfutNYiFSujw+a01GUPOntKCOq7K A0H8hEpBtkIOV7BNtx+bKS6dAL7/ShYDzL4ZM+CGCNU3Hg5XCv6a9NyOL qGWS+KoEMJiDCS+f0FBXJj2nBTvJklgt5FcjIDT/e10sNHojOyl0icaKx Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342610" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342610" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431710" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431710" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:10 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id BEA2E100567F; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 10/16] [APX NDD] Support APX NDD for and insn Date: Wed, 15 Nov 2023 17:46:59 +0800 Message-Id: <20231115094705.3976553-11-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623096031561697 X-GMAIL-MSGID: 1782623096031561697 From: Kong Lingling For NDD form AND insn, there are three splitter fixes after extending legacy patterns. 1. APX NDD does not support high QImode registers like ah, bh, ch, dh, so for some optimization splitters that generates highpart zero_extract for QImode need to be prohibited under NDD pattern. 2. Legacy AND insn will use r/qm/L constraint, and a post-reload splitter will transform it into zero_extend move. But for NDD form AND, the splitter is not strict enough as the splitter assum such AND will have the const_int operand matching the constraint "L", then NDD form AND allows const_int with any QI values. Restrict the splitter condition to match "L" constraint that strictly matches zero-extend sematic. 3. Legacy AND insn will adopt r/0/Z constraint, a splitter will try to optimize such form into strict_lowpart QImode AND when 7th bit is not set. But the splitter will wronly convert non-zext form of NDD and with memory src, then the strict_lowpart transform matches alternative 1 of *_slp_1 and generates *movstrict_1 so the zext sematic was omitted. This could cause highpart of dest not cleared and generates wrong code. Disable the splitter when NDD adopted and operands[0] and operands[1] are not equal. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add AND support. * config/i386/i386.md (and3): Add NDD alternatives and adjust output template. (*anddi_1): Likewise. (*and_1): Likewise. (*andqi_1): Likewise. (*andsi_1_zext): Likewise. (*anddi_2): Likewise. (*andsi_2_zext): Likewise. (*andqi_2_maybe_si): Likewise. (*and_2): Likewise. (*and3_doubleword): Add NDD constraints, emit move for optimized case if operands[0] not equal to operands[1]. (define_split for QI highpart AND): Prohibit splitter to split NDD form AND insn to qi_ext_3. (define_split for QI strict_lowpart optimization): Prohibit splitter to split NDD form AND insn to *3_1_slp. (define_split for zero_extend and optimization): Prohibit splitter to split NDD form AND insn to zero_extend insn. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add and test. * gcc.target/i386/apx-spill_to_egprs-1.c: Change some check. --- gcc/config/i386/i386-expand.cc | 1 + gcc/config/i386/i386.md | 177 ++++++++++++------ gcc/testsuite/gcc.target/i386/apx-ndd.c | 13 ++ .../gcc.target/i386/apx-spill_to_egprs-1.c | 8 +- 4 files changed, 135 insertions(+), 64 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index be77ba4a476..662f687abc3 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1273,6 +1273,7 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case MINUS: case NEG: case NOT: + case AND: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 9758e4e5144..4bf0c16f401 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -11471,18 +11471,20 @@ (define_expand "and3" (operands[0], gen_lowpart (mode, operands[1]), mode, mode, 1)); else - ix86_expand_binary_operator (AND, mode, operands); + ix86_expand_binary_operator (AND, mode, operands, + ix86_can_use_ndd_p (AND)); DONE; }) (define_insn_and_split "*and3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (and: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, + ix86_can_use_ndd_p (AND))" "#" "&& reload_completed" [(const_int:DWIH 0)] @@ -11494,39 +11496,53 @@ (define_insn_and_split "*and3_doubleword" if (operands[2] == const0_rtx) emit_move_insn (operands[0], const0_rtx); else if (operands[2] == constm1_rtx) - emit_insn_deleted_note_p = true; + { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + else + emit_insn_deleted_note_p = true; + } else - ix86_expand_binary_operator (AND, mode, &operands[0]); + ix86_expand_binary_operator (AND, mode, &operands[0], + ix86_can_use_ndd_p (AND)); if (operands[5] == const0_rtx) emit_move_insn (operands[3], const0_rtx); else if (operands[5] == constm1_rtx) { - if (emit_insn_deleted_note_p) + if (!rtx_equal_p (operands[3], operands[4])) + emit_move_insn (operands[3], operands[4]); + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); } else - ix86_expand_binary_operator (AND, mode, &operands[3]); + ix86_expand_binary_operator (AND, mode, &operands[3], + ix86_can_use_ndd_p (AND)); DONE; -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*anddi_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,?k") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,rm,r,r,r,r,?k") (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm,k") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,L,k"))) + (match_operand:DI 1 "nonimmediate_operand" "%0,r,0,0,rm,r,qm,k") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,Z,re,m,re,m,L,k"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands, + ix86_can_use_ndd_p (AND))" "@ and{l}\t{%k2, %k0|%k0, %k2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} and{q}\t{%2, %0|%0, %2} and{q}\t{%2, %0|%0, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} # #" - [(set_attr "isa" "x64,x64,x64,x64,avx512bw_512") - (set_attr "type" "alu,alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,*,0,*") + [(set_attr "isa" "x64,apx_ndd,x64,x64,apx_ndd,apx_ndd,x64,avx512bw_512") + (set_attr "type" "alu,alu,alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11534,7 +11550,7 @@ (define_insn "*anddi_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" "SI,DI,DI,SI,DI")]) + (set_attr "mode" "SI,SI,DI,DI,DI,DI,SI,DI")]) (define_insn_and_split "*anddi_1_btr" [(set (match_operand:DI 0 "nonimmediate_operand" "=rm") @@ -11589,36 +11605,46 @@ (define_split ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands, + ix86_can_use_ndd_p (AND))" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*and_1" - [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,Ya,?k") - (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,qm,k") - (match_operand:SWI24 2 "" "r,,L,k"))) + [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,r,r,Ya,?k") + (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,rm,r,qm,k") + (match_operand:SWI24 2 "" "r,,r,,L,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, + ix86_can_use_ndd_p (AND))" "@ and{}\t{%2, %0|%0, %2} and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2} # #" [(set (attr "isa") - (cond [(eq_attr "alternative" "3") + (cond [(eq_attr "alternative" "2,3") + (const_string "apx_ndd") + (eq_attr "alternative" "5") (if_then_else (eq_attr "mode" "SI") (const_string "avx512bw") (const_string "avx512f")) ] (const_string "*"))) - (set_attr "type" "alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,0,*") + (set_attr "type" "alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11626,24 +11652,28 @@ (define_insn "*and_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" ",,SI,")]) + (set_attr "mode" ",,,,SI,")]) (define_insn "*andqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") - (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") + (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, QImode, operands)" + "ix86_binary_operator_ok (AND, QImode, operands, + ix86_can_use_ndd_p (AND))" "@ and{b}\t{%2, %0|%0, %2} and{b}\t{%2, %0|%0, %2} and{l}\t{%k2, %k0|%k0, %k2} + and{b}\t{%2, %1, %0|%0, %1, %2} + and{b}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "type" "alu,alu,alu,msklog") + [(set_attr "type" "alu,alu,alu,alu,alu,msklog") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,*") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -11683,7 +11713,10 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!REG_P (operands[1]) - || REGNO (operands[0]) != REGNO (operands[1]))" + || REGNO (operands[0]) != REGNO (operands[1])) + && (UINTVAL (operands[2]) == GET_MODE_MASK (SImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (HImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (QImode))" [(const_int 0)] { unsigned HOST_WIDE_INT ival = UINTVAL (operands[2]); @@ -11756,10 +11789,10 @@ (define_insn "*anddi_2" [(set (reg FLAGS_REG) (compare (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m")) + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,r,rm,r") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,Z,re,m")) (const_int 0))) - (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r") + (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,r,r") (and:DI (match_dup 1) (match_dup 2)))] "TARGET_64BIT && ix86_match_ccmode @@ -11773,38 +11806,49 @@ (define_insn "*anddi_2" && (!CONST_INT_P (operands[2]) || val_signbit_known_set_p (SImode, INTVAL (operands[2])))) ? CCZmode : CCNOmode) - && ix86_binary_operator_ok (AND, DImode, operands)" + && ix86_binary_operator_ok (AND, DImode, operands, + ix86_can_use_ndd_p (AND))" "@ and{l}\t{%k2, %k0|%k0, %k2} and{q}\t{%2, %0|%0, %2} - and{q}\t{%2, %0|%0, %2}" + and{q}\t{%2, %0|%0, %2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") - (set_attr "mode" "SI,DI,DI")]) + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,apx_ndd") + (set_attr "mode" "SI,DI,DI,SI,DI,DI")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_2_zext" [(set (reg FLAGS_REG) (compare (and:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (AND, SImode, operands, + ix86_can_use_ndd_p (AND))" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*andqi_2_maybe_si" [(set (reg FLAGS_REG) (compare (and:QI - (match_operand:QI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:QI 2 "general_operand" "qn,m,n")) + (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r") + (match_operand:QI 2 "general_operand" "qn,m,n,rn,m")) (const_int 0))) - (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r") + (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r") (and:QI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (AND, QImode, operands) + "ix86_binary_operator_ok (AND, QImode, operands, + ix86_can_use_ndd_p (AND)) && ix86_match_ccmode (insn, CONST_INT_P (operands[2]) && INTVAL (operands[2]) >= 0 ? CCNOmode : CCZmode)" @@ -11815,9 +11859,12 @@ (define_insn "*andqi_2_maybe_si" operands[2] = GEN_INT (INTVAL (operands[2]) & 0xff); return "and{l}\t{%2, %k0|%k0, %2}"; } + if (which_alternative > 2) + return "and{b}\t{%2, %1, %0|%0, %1, %2}"; return "and{b}\t{%2, %0|%0, %2}"; } [(set_attr "type" "alu") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") @@ -11836,15 +11883,21 @@ (define_insn "*andqi_2_maybe_si" (define_insn "*and_2" [(set (reg FLAGS_REG) (compare (and:SWI124 - (match_operand:SWI124 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI124 2 "" ",")) + (match_operand:SWI124 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI124 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,,r,r") (and:SWI124 (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, mode, operands)" - "and{}\t{%2, %0|%0, %2}" + && ix86_binary_operator_ok (AND, mode, operands, + ix86_can_use_ndd_p (AND))" + "@ + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,*,apx_ndd,apx_ndd") (set_attr "mode" "")]) (define_insn "*qi_ext_0" @@ -12057,6 +12110,7 @@ (define_insn_and_split "*qi_ext_3" ;; Don't do the splitting with memory operands, since it introduces risk ;; of memory mismatch stalls. We may want to do the splitting for optimizing ;; for size, but that can (should?) be handled by generic code instead. +;; Don't do the splitting for APX NDD as NDD does not support *h registers. (define_split [(set (match_operand:SWI248 0 "QIreg_operand") (and:SWI248 (match_operand:SWI248 1 "register_operand") @@ -12064,7 +12118,8 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(~INTVAL (operands[2]) & ~(255 << 8))" + && !(~INTVAL (operands[2]) & ~(255 << 8)) + && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))" [(parallel [(set (zero_extract:HI (match_dup 0) (const_int 8) @@ -12093,7 +12148,9 @@ (define_split "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) && !(~INTVAL (operands[2]) & ~255) - && !(INTVAL (operands[2]) & 128)" + && !(INTVAL (operands[2]) & 128) + && !(TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]))" [(parallel [(set (strict_low_part (match_dup 0)) (and:QI (match_dup 1) (match_dup 2))) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 9af72d1a46d..b34194b762d 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -85,6 +85,15 @@ F (int, not, ~) F1 (int, not, ~) F (long, not, ~) F1 (long, not, ~) + +FOO (char, and, &) +FOO1 (char, and, &) +FOO (short, and, &) +FOO1 (short, and, &) +FOO (int, and, &) +FOO1 (int, and, &) +FOO (long, and, &) +FOO1 (long, and, &) /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -95,3 +104,7 @@ F1 (long, not, ~) /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c index 290863d63a7..ecddcd6c14c 100644 --- a/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c +++ b/gcc/testsuite/gcc.target/i386/apx-spill_to_egprs-1.c @@ -3,8 +3,8 @@ #include "spill_to_mask-1.c" -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r16d" } } */ -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r17d" } } */ +/* { dg-final { scan-assembler "(?:movl|rorx)\[ \t]+\[^\\n\\r\]*, %r16d" } } */ +/* { dg-final { scan-assembler "(?:movl|rorx)\[ \t]+\[^\\n\\r\]*, %r17d" } } */ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r18d" } } */ /* { dg-final { scan-assembler "movq\[ \t]+\[^\\n\\r\]*, %r19" } } */ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r20d" } } */ @@ -13,8 +13,8 @@ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r23d" } } */ /* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r24d" } } */ /* { dg-final { scan-assembler "addl\[ \t]+\[^\\n\\r\]*, %r25d" } } */ -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r26d" } } */ -/* { dg-final { scan-assembler "movl\[ \t]+\[^\\n\\r\]*, %r27d" } } */ +/* { dg-final { scan-assembler "(?:movl|movbel)\[ \t]+\[^\\n\\r\]*, %r26d" } } */ +/* { dg-final { scan-assembler "(?:movl|movbel)\[ \t]+\[^\\n\\r\]*, %r27d" } } */ /* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r28d" } } */ /* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r29d" } } */ /* { dg-final { scan-assembler "movbel\[ \t]+\[^\\n\\r\]*, %r30d" } } */ From patchwork Wed Nov 15 09:47:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165242 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2430042vqg; Wed, 15 Nov 2023 01:51:39 -0800 (PST) X-Google-Smtp-Source: AGHT+IFNoX5axuPkNWr/F08xMnHy+QAGgov1a7K6lPXJhOuRjJsgVUqoZXWCjv9wkoe7JbppAuG2 X-Received: by 2002:a25:e00f:0:b0:d9a:49bb:b8b1 with SMTP id x15-20020a25e00f000000b00d9a49bbb8b1mr12802213ybg.4.1700041899183; Wed, 15 Nov 2023 01:51:39 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041899; cv=pass; d=google.com; s=arc-20160816; b=RyJBoyF45fd2rkMgVlz6mvsqnk7MK7koe8KqVWU19PCDNLXua2mPzto6MM9w4c0uYe D6iiMFnJelQbat2u2VG2gubXm6nhZSolfn/IVnxXFfV8BhcJzZogXp5vAqaR0FxPFPfM EdXwB85X3eNeJhYb1lvyyHFGoboElSiCu4WwAb2gJSzcT7PzWUM4k26gKD0IKVtkdKm5 v7FZ55nqoB6mMTmmED9Go0GLWg3F1q5cRV/IA0tH6+fZlIsIk4csCmYnpPi6t0fQ503i yP1G0kN84djbrUUhMk99cK2anvSe9HXbkuHmlzLopE3uVAb4UUryEviZ+5qEnMoeFwzq ZrHQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=l9Lr9Vl2FVzyoJq+BxexnR4anOWCDw3HJarBhPAEit4=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=bOQLyq7i+0wVg1ACrLoVtdz0cu7YoFmvRE92loG01UJ6l73TwNjL6tEgIOKhG/Z9F1 nA/Niq3fAJNjq/G7BF/6VZIWqaZmqU91vyA3C12FosZvWSSpq8Sy/Z/nMqtUWblv9xv0 6CB4MhLBmiB6EQWmIPoBStdOPcbwDr338WqHgi0MuYMblMGfO3ljF+0aTXVUt+oQ3E1s LDTIJdHKYS+JGWb4qmtBYcdoOXctWC5BD0kPAQ4iU4CZ+M0QQP6u0bZQNfhV/JZhFGlW DSaVrBOG4BU+kt4Pc1HDQ1cyh4Zl9HPMvHrZT81wGJegtV/FmIZVGGfuPuDPEYcYfol8 sg3g== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PLbUcTjl; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id c10-20020a05621401ea00b00658900baafasi8607986qvu.609.2023.11.15.01.51.39 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:51:39 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PLbUcTjl; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C8FE83858409 for ; Wed, 15 Nov 2023 09:51:38 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id E8DFC38582BD for ; Wed, 15 Nov 2023 09:47:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E8DFC38582BD Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E8DFC38582BD Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041645; cv=none; b=EfSJJVuP96/vxJcgWg9xUJmL25D2OWPCC+BBA3mjMbzU0IvVqVbtkop7+TW7S7gaHndBPJpNzR8TnpKQNzi4o0i2N5/BmHFYsRC5b0nk45qDeaqKEvlSBq7FEUw2iLYWnx20ui7ArZDjUD4OzWOIm09Yui1WWllSwwwsRUlI27o= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041645; c=relaxed/simple; bh=0DAXHF+Qf/x4Avp3Yw1AAG9NNgY5BV9BfKGqerN6Doc=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=rVEwt4FyiBe/CAlmwch9PFGUHM7ow3X3a0T3oX05kwFm0U5N+Nc1O6A8pwHmHgfm3WdOgsNtz23QSEF/t0+Zgqlg//wWnqsPVhJzVKbApzie0DtDBa1uu8hUMfnQjygQRsAujOwYIxlXRuE28YLTR17iv72yi/V6bncmcMdhg6I= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041641; x=1731577641; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0DAXHF+Qf/x4Avp3Yw1AAG9NNgY5BV9BfKGqerN6Doc=; b=PLbUcTjlPvf0QbYZLRapZfi15LIfoqVgNCRGJajfb3jn++OwrNa0CTYv RredtzBRwL8cSE2HwJq6F2SNANNNAv/pb4hNFacO6iQujPatDRZxpex4P tUmlUVhmkW3BgyUT2dGpW1sxWxCFHCtYtYWAY5qZw69zTtzInIVHknAxD hhMW3y/DpUgHPny71rdWIuOIKFFiMNYptsLwZ94AbZVvRuMFH1gK5SoA5 Nnoc/VDJiCyzjtEPvJgSeFrfBkjh1pJUOO6zBJgZO4/FWmTZipKUI4vUy ArQPp6PRJ3pkmFEL4l53o4zTvzm2laPKPHuq2XS/bfeDye+6VwmxKwcI2 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342617" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342617" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:16 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431716" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431716" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:10 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id C1ACA1005680; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 11/16] [APX NDD] Support APX NDD for or/xor insn Date: Wed, 15 Nov 2023 17:47:00 +0800 Message-Id: <20231115094705.3976553-12-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623134337659380 X-GMAIL-MSGID: 1782623134337659380 From: Kong Lingling Similar to AND insn, two splitters need to be adjusted to prevent misoptimizaiton for NDD OR/XOR. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add IOR/XOR support. * config/i386/i386.md (3): Add NDD alternative and adjust output templates. (*_1): Likewise. (*qi_1): Likewise. (*notxor_1): Likewise. (*si_1_zext): Likewise. (*si_1_zext_imm): Likewise. (*notxorqi_1): Likewise. (*_2): Likewise. (*si_2_zext): Likewise. (*si_2_zext_imm): Likewise. (*3_doubleword): Add NDD constraints, emit move for optimized case if operands[0] != operands[1] or operands[4] != operands[5]. (define_split for QI highpart OR/XOR): Prohibit splitter to split NDD form OR/XOR insn to qi_ext_3. (define_split for QI strict_lowpart optimization): Prohibit splitter to split NDD form AND insn to *3_1_slp. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add or and xor test. --- gcc/config/i386/i386-expand.cc | 2 + gcc/config/i386/i386.md | 180 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 26 ++++ 3 files changed, 143 insertions(+), 65 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 662f687abc3..5f02d557a50 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1274,6 +1274,8 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case NEG: case NOT: case AND: + case IOR: + case XOR: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 4bf0c16f401..cf9842d1a49 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -12372,17 +12372,19 @@ (define_expand "3" && !x86_64_hilo_general_operand (operands[2], mode)) operands[2] = force_reg (mode, operands[2]); - ix86_expand_binary_operator (, mode, operands); + ix86_expand_binary_operator (, mode, operands, + ix86_can_use_ndd_p ()); DONE; }) (define_insn_and_split "*3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (any_or: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + ix86_can_use_ndd_p ())" "#" "&& reload_completed" [(const_int:DWIH 0)] @@ -12394,20 +12396,29 @@ (define_insn_and_split "*3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) - emit_insn_deleted_note_p = true; + { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + else + emit_insn_deleted_note_p = true; + } else if (operands[2] == constm1_rtx) { if ( == IOR) emit_move_insn (operands[0], constm1_rtx); else - ix86_expand_unary_operator (NOT, mode, &operands[0]); + ix86_expand_unary_operator (NOT, mode, &operands[0], + ix86_can_use_ndd_p (NOT)); } else - ix86_expand_binary_operator (, mode, &operands[0]); + ix86_expand_binary_operator (, mode, &operands[0], + ix86_can_use_ndd_p ()); if (operands[5] == const0_rtx) { - if (emit_insn_deleted_note_p) + if (!rtx_equal_p (operands[3], operands[4])) + emit_move_insn (operands[3], operands[4]); + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); } else if (operands[5] == constm1_rtx) @@ -12415,37 +12426,44 @@ (define_insn_and_split "*3_doubleword" if ( == IOR) emit_move_insn (operands[3], constm1_rtx); else - ix86_expand_unary_operator (NOT, mode, &operands[3]); + ix86_expand_unary_operator (NOT, mode, &operands[3], + ix86_can_use_ndd_p (NOT)); } else - ix86_expand_binary_operator (, mode, &operands[3]); + ix86_expand_binary_operator (, mode, &operands[3], + ix86_can_use_ndd_p ()); DONE; -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,r,r,?k") (any_or:SWI248 - (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,k") - (match_operand:SWI248 2 "" "r,,k"))) + (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,rm,r,k") + (match_operand:SWI248 2 "" "r,,r,,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + ix86_can_use_ndd_p ())" "@ {}\t{%2, %0|%0, %2} {}\t{%2, %0|%0, %2} + {}\t{%2, %1, %0|%0, %1, %2} + {}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "isa" "*,*,") - (set_attr "type" "alu, alu, msklog") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd,") + (set_attr "type" "alu, alu, alu, alu, msklog") (set_attr "mode" "")]) (define_insn_and_split "*notxor_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,r,r,?k") (not:SWI248 (xor:SWI248 - (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,k") - (match_operand:SWI248 2 "" "r,,k")))) + (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,rm,r,k") + (match_operand:SWI248 2 "" "r,,r,,k")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (XOR, mode, operands)" + "ix86_binary_operator_ok (XOR, mode, operands, + ix86_can_use_ndd_p (XOR))" "#" "&& reload_completed" [(parallel @@ -12461,8 +12479,8 @@ (define_insn_and_split "*notxor_1" DONE; } } - [(set_attr "isa" "*,*,") - (set_attr "type" "alu, alu, msklog") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd,") + (set_attr "type" "alu, alu, alu, alu, msklog") (set_attr "mode" "")]) (define_insn_and_split "*iordi_1_bts" @@ -12550,44 +12568,56 @@ (define_insn_and_split "*xor2andn" ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*si_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*si_1_zext_imm" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (any_or:DI - (zero_extend:DI (match_operand:SI 1 "register_operand" "%0")) - (match_operand:DI 2 "x86_64_zext_immediate_operand" "Z"))) + (zero_extend:DI (match_operand:SI 1 "register_operand" "%0,r")) + (match_operand:DI 2 "x86_64_zext_immediate_operand" "Z,Z"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*qi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") - (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") + (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, QImode, operands)" + "ix86_binary_operator_ok (, QImode, operands, + ix86_can_use_ndd_p ())" "@ {b}\t{%2, %0|%0, %2} {b}\t{%2, %0|%0, %2} {l}\t{%k2, %k0|%k0, %k2} + {b}\t{%2, %1, %0|%0, %1, %2} + {b}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "isa" "*,*,*,avx512f") - (set_attr "type" "alu,alu,alu,msklog") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd,avx512f") + (set_attr "type" "alu,alu,alu,alu,alu,msklog") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -12599,12 +12629,13 @@ (define_insn "*qi_1" (symbol_ref "true")))]) (define_insn_and_split "*notxorqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") (not:QI - (xor:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k")))) + (xor:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (XOR, QImode, operands)" + "ix86_binary_operator_ok (XOR, QImode, operands, + ix86_can_use_ndd_p (XOR))" "#" "&& reload_completed" [(parallel @@ -12620,12 +12651,12 @@ (define_insn_and_split "*notxorqi_1" DONE; } } - [(set_attr "isa" "*,*,*,avx512f") - (set_attr "type" "alu,alu,alu,msklog") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd,avx512f") + (set_attr "type" "alu,alu,alu,alu,alu,msklog") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -12673,44 +12704,59 @@ (define_split (define_insn "*_2" [(set (reg FLAGS_REG) (compare (any_or:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (any_or:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, mode, operands)" - "{}\t{%2, %0|%0, %2}" + && ix86_binary_operator_ok (, mode, operands, + ix86_can_use_ndd_p ())" + "@ + {}\t{%2, %0|%0, %2} + {}\t{%2, %0|%0, %2} + {}\t{%2, %1, %0|%0, %1, %2} + {}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,*,apx_ndd,apx_ndd") (set_attr "mode" "")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand ;; ??? Special case for immediate operand is missing - it is tricky. (define_insn "*si_2_zext" [(set (reg FLAGS_REG) - (compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (any_or:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*si_2_zext_imm" [(set (reg FLAGS_REG) (compare (any_or:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_zext_immediate_operand" "Z")) + (match_operand:SI 1 "nonimmediate_operand" "%0,rm") + (match_operand:SI 2 "x86_64_zext_immediate_operand" "Z,Z")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (any_or:DI (zero_extend:DI (match_dup 1)) (match_dup 2)))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*_3" @@ -12731,6 +12777,7 @@ (define_insn "*_3" ;; Don't do the splitting with memory operands, since it introduces risk ;; of memory mismatch stalls. We may want to do the splitting for optimizing ;; for size, but that can (should?) be handled by generic code instead. +;; Don't do the splitting for APX NDD as NDD does not support *h registers. (define_split [(set (match_operand:SWI248 0 "QIreg_operand") (any_or:SWI248 (match_operand:SWI248 1 "register_operand") @@ -12738,7 +12785,8 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(INTVAL (operands[2]) & ~(255 << 8))" + && !(INTVAL (operands[2]) & ~(255 << 8)) + && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))" [(parallel [(set (zero_extract:HI (match_dup 0) (const_int 8) @@ -12776,7 +12824,9 @@ (define_split "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) && !(INTVAL (operands[2]) & ~255) - && (INTVAL (operands[2]) & 128)" + && (INTVAL (operands[2]) & 128) + && !(TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]))" [(parallel [(set (strict_low_part (match_dup 0)) (any_or:QI (match_dup 1) (match_dup 2))) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index b34194b762d..7541a41a01e 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -94,6 +94,24 @@ FOO (int, and, &) FOO1 (int, and, &) FOO (long, and, &) FOO1 (long, and, &) + +FOO (char, or, |) +FOO1 (char, or, |) +FOO (short, or, |) +FOO1 (short, or, |) +FOO (int, or, |) +FOO1 (int, or, |) +FOO (long, or, |) +FOO1 (long, or, |) + +FOO (char, xor, ^) +FOO1 (char, xor, ^) +FOO (short, xor, ^) +FOO1 (short, xor, ^) +FOO (int, xor, ^) +FOO1 (int, xor, ^) +FOO (long, xor, ^) +FOO1 (long, xor, ^) /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -108,3 +126,11 @@ FOO1 (long, and, &) /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "orb\[^\n\r]*1, \\(%rdi\\), %al" 2} } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 6 } } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "xorb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */ +/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ From patchwork Wed Nov 15 09:47:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165237 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429342vqg; Wed, 15 Nov 2023 01:49:36 -0800 (PST) X-Google-Smtp-Source: AGHT+IHTcwijRWyxrIdS7ulmy8A5EsSmVHllyUFXlGELozJPEzU+YmClnV/CRe4MR+CzkJkTdq6m X-Received: by 2002:a05:622a:514:b0:41c:d846:4d7 with SMTP id l20-20020a05622a051400b0041cd84604d7mr5166959qtx.65.1700041775626; Wed, 15 Nov 2023 01:49:35 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041775; cv=pass; d=google.com; s=arc-20160816; b=AiUyPP9S8S/3icMBxe+ZFKbZOU3gIAg/0cchqvA3ezZFbPXaiErw4271VdvR+kGaWi 8qmXnSnCybyMSc4hjY5sH0tbOHriZK3AzwXTBqO2NI0ehl3/iNVLIJEkw27AxfYAIEIT 5JzOGzVPmdaW5oj4JM+A8rNm4Bfb7QX1yonJWW6W5MUHyrKy+98kwNKm1rz3yC/pW9oZ bYvz2pqOFRtaJVOU+w0tox7i27AEJ4y74egn9XWb45/T9IAUVbCegh/hDc3QApfd2hov KnV2BbltVG4DgUGY3zKmq9Da3/Jn763LzaaVmpvpt+8fJaU1UsmledyZH+RRhClvMSwk rRkg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=M+I6jYdQQTR2qWEYKSEfFyJDux/U+s1wxm+xsDvjwpI=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=q9ZRD7X40S717owlKapVjkhywp/vWU8O62W4/AzLW1rdlQBps6tz5K95LNN+LUQwOP WuRy/DkCznwckJpmiAhLyi8SzvP2j3d+ZeAx+bWXdHMx6K6F0kGjcWmImJNZymhmzTEa MFAGT3S3NVzYSDDIwAom0GKg0E3F28RBi1fgjaxai3OPqwzQnzjNH0oG6Q3x+m//siI2 DEpwt5DbwrwtO2o2AR2bFJd5joK873zGxwhQu5Qgn/d7bgpHfXQK2m7CSgDF+rAQNEDB acUrU/9Bt+Lp6QpW18/50e07HqT0K4OLFoql9fZc5s9e9AqycCdKojIRUF4QwRFPyeII jI+g== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SATj1LJy; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id t19-20020a05622a149300b004197196b0f0si8558539qtx.617.2023.11.15.01.49.35 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:49:35 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SATj1LJy; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 009DD385E021 for ; Wed, 15 Nov 2023 09:49:17 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 45FBE3857C47 for ; Wed, 15 Nov 2023 09:47:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 45FBE3857C47 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 45FBE3857C47 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041640; cv=none; b=ZWprWoBqQZRiPWmLr+KXrrkV/F6SstlY4wKEZSfXgQrNt0dsKL+zItELInv7UrzpVE/trbeCGmWc8+sQ3yt2cdoi0X1ttgBrGSkTDhE0utVL5ToywozCBjfZEd8eFsbP+h/ed4xW6fO5dqVS5W3OOZ0/eyVcrto+olbixT3gwh4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041640; c=relaxed/simple; bh=owJWzN2RqD9NcTTL+EZpfixd4JiJ3YVBMxQzh8a9Lg8=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=KPrd5JlUJ39XOKAEvidYXENDNopLpDJ6bGU1c0esHzxqU+USBrBWrb1SmSjoZFSsBhovbToFunRbmws5LGanHQgMbCEyr68ID877wLy1FeY/65rVuTtSYJRKafJnUTKkjBwjcnFHy83GabX80UCd4gm3dsnhxBZeJwsEmuPjeE0= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041637; x=1731577637; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=owJWzN2RqD9NcTTL+EZpfixd4JiJ3YVBMxQzh8a9Lg8=; b=SATj1LJyF6r/5HnaJIWp9W7SHuW9Yoc6TmQlQwjkcXC4WCPyBqDBsQZ8 fvrly3Ovs6cgRxe6g1PWe+G9P6ng6p/Llkg83bSBuqxJwM7GiYLyMAbYf OvgdlozkNwblQhqse1SqeXjLgfHso+NKDAFtdwvJ6m6HU5ountStXfmpy qxIWi24j1WFBRMsy1cojFG1OWZzD0RX8oNgNfhKqWOao+LHDSKQu2Hyjv RnvkYFYB3T60EVjKARe6ZJlltVtV53DXKLzrfbb7KainXjw16MTR2KGBV KE7oBHoJUCEQA4BFIUNPfaLeAbtMOJTB8TUOZQo3Cawtt+Mj+1nmfL/en w==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342608" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342608" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431701" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431701" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:10 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id C4A0C1005681; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 12/16] [APX NDD] Support APX NDD for left shift insns Date: Wed, 15 Nov 2023 17:47:01 +0800 Message-Id: <20231115094705.3976553-13-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623004787382886 X-GMAIL-MSGID: 1782623004787382886 For left shift, there is an optimization TARGET_DOUBLE_WITH_ADD that shl 1 can be optimized to add. As NDD form of add requires src operand to be register since NDD cannot take 2 memory src, we currently just keep using NDD form shift instead of add. The optimization TARGET_SHIFT1 will try to remove constant 1, but under NDD it could create ambiguous mnemonic like sal %ecx, %edx, this will be encoded to legacy shift sal %cl, %edx which changes the expected behavior that %ecx is actually considered as NDD src. Under such case we emit $1 explicitly when operands[1] is CX reg. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add ASHIFT. * config/i386/i386.md (*ashl3_1): Extend with new alternatives to support NDD, limit the new alternative to generate sal only, and adjust output template for NDD. (*ashlsi3_1_zext): Likewise. (*ashlhi3_1): Likewise. (*ashlqi3_1): Likewise. (*ashl3_cmp): Likewise. (*ashlsi3_cmp_zext): Likewise. (*ashl3_cconly): Likewise. (*ashl3_doubleword): Likewise. (*ashl3_doubleword_highpart): Adjust codegen for NDD. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add tests for sal. --- gcc/config/i386/i386-expand.cc | 1 + gcc/config/i386/i386.md | 194 ++++++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 22 +++ 3 files changed, 150 insertions(+), 67 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 5f02d557a50..7e3080482a6 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1276,6 +1276,7 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case AND: case IOR: case XOR: + case ASHIFT: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index cf9842d1a49..a0e81545f17 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14103,13 +14103,14 @@ (define_insn_and_split "*ashl3_doubleword_mask_1" }) (define_insn "ashl3_doubleword" - [(set (match_operand:DWI 0 "register_operand" "=&r") - (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n") - (match_operand:QI 2 "nonmemory_operand" "c"))) + [(set (match_operand:DWI 0 "register_operand" "=&r,r") + (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n,r") + (match_operand:QI 2 "nonmemory_operand" "c,c"))) (clobber (reg:CC FLAGS_REG))] "" "#" - [(set_attr "type" "multi")]) + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "multi")]) (define_split [(set (match_operand:DWI 0 "register_operand") @@ -14149,11 +14150,14 @@ (define_insn_and_split "*ashl3_doubleword_highpart" [(const_int 0)] { split_double_mode (mode, &operands[0], 1, &operands[0], &operands[3]); + bool use_ndd = ix86_can_use_ndd_p (ASHIFT) + && !rtx_equal_p (operands[3], operands[1]); int bits = INTVAL (operands[2]) - ( * BITS_PER_UNIT); - if (!rtx_equal_p (operands[3], operands[1])) + if (!rtx_equal_p (operands[3], operands[1]) || !use_ndd) emit_move_insn (operands[3], operands[1]); + rtx op_tmp = use_ndd? operands[1] : operands[3]; if (bits > 0) - emit_insn (gen_ashl3 (operands[3], operands[3], GEN_INT (bits))); + emit_insn (gen_ashl3 (operands[3], op_tmp, GEN_INT (bits))); ix86_expand_clear (operands[0]); DONE; }) @@ -14460,12 +14464,14 @@ (define_insn "*bmi2_ashl3_1" (set_attr "mode" "")]) (define_insn "*ashl3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,?k") - (ashift:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,l,rm,k") - (match_operand:QI 2 "nonmemory_operand" "c,M,r,"))) + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,?k,r") + (ashift:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,l,rm,k,rm") + (match_operand:QI 2 "nonmemory_operand" "c,M,r,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, mode, operands)" + "ix86_binary_operator_ok (ASHIFT, mode, operands, + ix86_can_use_ndd_p (ASHIFT))" { + bool use_ndd = (which_alternative == 4); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14480,18 +14486,24 @@ (define_insn "*ashl3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sal{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sal{}\t{%1, %0|%0, %1}" + : "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,bmi2,") + [(set_attr "isa" "*,*,bmi2,,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "ishiftx") + (eq_attr "alternative" "4") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -14533,13 +14545,15 @@ (define_insn "*bmi2_ashlsi3_1_zext" (set_attr "mode" "SI")]) (define_insn "*ashlsi3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r") (zero_extend:DI - (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,l,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,M,r")))) + (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,l,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,M,r,cI")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (ASHIFT, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (ASHIFT, SImode, operands, + ix86_can_use_ndd_p (ASHIFT))" { + bool use_ndd = (which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14552,18 +14566,24 @@ (define_insn "*ashlsi3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sal{l}\t%k0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sal{l}\t{%1, %k0|%k0, %1}" + : "sal{l}\t%k0"; else - return "sal{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sal{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sal{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,*,bmi2") + [(set_attr "isa" "*,*,bmi2,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "ishiftx") + (eq_attr "alternative" "3") + (const_string "ishift") (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 2 "const1_operand")) (const_string "alu") @@ -14593,12 +14613,14 @@ (define_split "operands[2] = gen_lowpart (SImode, operands[2]);") (define_insn "*ashlhi3_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,Yp,?k") - (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,l,k") - (match_operand:QI 2 "nonmemory_operand" "cI,M,Ww"))) + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,Yp,?k,r") + (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,l,k,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,M,Ww,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, HImode, operands)" + "ix86_binary_operator_ok (ASHIFT, HImode, operands, + ix86_can_use_ndd_p (ASHIFT))" { + bool use_ndd = (which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14611,18 +14633,24 @@ (define_insn "*ashlhi3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sal{w}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sal{w}\t{%1, %0|%0, %1}" + : "sal{w}\t%0"; else - return "sal{w}\t{%2, %0|%0, %2}"; + return use_ndd ? "sal{w}\t{%2, %1, %0|%0, %1, %2}" + : "sal{w}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,avx512f") + [(set_attr "isa" "*,*,avx512f,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "msklog") + (eq_attr "alternative" "3") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -14638,15 +14666,17 @@ (define_insn "*ashlhi3_1" (match_test "optimize_function_for_size_p (cfun)"))))) (const_string "0") (const_string "*"))) - (set_attr "mode" "HI,SI,HI")]) + (set_attr "mode" "HI,SI,HI,HI")]) (define_insn "*ashlqi3_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,Yp,?k") - (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,l,k") - (match_operand:QI 2 "nonmemory_operand" "cI,cI,M,Wb"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,Yp,?k,r") + (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,l,k,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,cI,M,Wb,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, QImode, operands)" + "ix86_binary_operator_ok (ASHIFT, QImode, operands, + ix86_can_use_ndd_p (ASHIFT))" { + bool use_ndd = (which_alternative == 4); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14662,28 +14692,34 @@ (define_insn "*ashlqi3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) { if (get_attr_mode (insn) == MODE_SI) - return "sal{l}\t%k0"; + return use_ndd ? "sal{l}\t{%1, %k0|%k0, %1}" + : "sal{l}\t%k0"; else return "sal{b}\t%0"; } else { if (get_attr_mode (insn) == MODE_SI) - return "sal{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sal{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sal{l}\t{%2, %k0|%k0, %2}"; else return "sal{b}\t{%2, %0|%0, %2}"; } } } - [(set_attr "isa" "*,*,*,avx512dq") + [(set_attr "isa" "*,*,*,avx512dq,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "2") (const_string "lea") (eq_attr "alternative" "3") (const_string "msklog") + (eq_attr "alternative" "4") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -14699,10 +14735,10 @@ (define_insn "*ashlqi3_1" (match_test "optimize_function_for_size_p (cfun)"))))) (const_string "0") (const_string "*"))) - (set_attr "mode" "QI,SI,SI,QI") + (set_attr "mode" "QI,SI,SI,QI,SI") ;; Potential partial reg stall on alternative 1. (set (attr "preferred_for_speed") - (cond [(eq_attr "alternative" "1") + (cond [(eq_attr "alternative" "1,4") (symbol_ref "!TARGET_PARTIAL_REG_STALL")] (symbol_ref "true")))]) @@ -14797,10 +14833,10 @@ (define_split (define_insn "*ashl3_cmp" [(set (reg FLAGS_REG) (compare - (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "" "")) + (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (ashift:SWI (match_dup 1) (match_dup 2)))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL @@ -14808,8 +14844,10 @@ (define_insn "*ashl3_cmp" && (TARGET_SHIFT1 || (TARGET_DOUBLE_WITH_ADD && REG_P (operands[0]))))) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (ASHIFT, mode, operands)" + && ix86_binary_operator_ok (ASHIFT, mode, operands, + ix86_can_use_ndd_p (ASHIFT))" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_ALU: @@ -14818,14 +14856,21 @@ (define_insn "*ashl3_cmp" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sal{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sal{}\t{%1, %0|%0, %1}" + : "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") - (cond [(and (and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) (const_string "alu") @@ -14845,10 +14890,10 @@ (define_insn "*ashl3_cmp" (define_insn "*ashlsi3_cmp_zext" [(set (reg FLAGS_REG) (compare - (ashift:SI (match_operand:SI 1 "register_operand" "0") + (ashift:SI (match_operand:SI 1 "register_operand" "0,r") (match_operand:QI 2 "const_1_to_31_operand")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (ashift:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && (optimize_function_for_size_p (cfun) @@ -14857,8 +14902,10 @@ (define_insn "*ashlsi3_cmp_zext" && (TARGET_SHIFT1 || TARGET_DOUBLE_WITH_ADD))) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (ASHIFT, SImode, operands)" + && ix86_binary_operator_ok (ASHIFT, SImode, operands, + ix86_can_use_ndd_p (ASHIFT))" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_ALU: @@ -14867,14 +14914,20 @@ (define_insn "*ashlsi3_cmp_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sal{l}\t%k0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sal{l}\t{%1, %k0|%k0, %1}" + : "sal{l}\t%k0"; else - return "sal{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sal{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sal{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") - (cond [(and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 2 "const1_operand")) (const_string "alu") ] @@ -14893,10 +14946,10 @@ (define_insn "*ashlsi3_cmp_zext" (define_insn "*ashl3_cconly" [(set (reg FLAGS_REG) (compare - (ashift:SWI (match_operand:SWI 1 "register_operand" "0") - (match_operand:QI 2 "" "")) + (ashift:SWI (match_operand:SWI 1 "register_operand" "0,r") + (match_operand:QI 2 "" ",")) (const_int 0))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx @@ -14904,22 +14957,29 @@ (define_insn "*ashl3_cconly" || TARGET_DOUBLE_WITH_ADD))) && ix86_match_ccmode (insn, CCGOCmode)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_ALU: gcc_assert (operands[2] == const1_rtx); return "add{}\t%0, %0"; - default: + default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sal{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sal{}\t{%1, %0|%0, %1}" + : "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") - (cond [(and (and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) (const_string "alu") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 7541a41a01e..481ec8b00a8 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -29,6 +29,16 @@ foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ return c; \ } +#define FOO3(TYPE, OP_NAME, OP, IMM) \ +TYPE \ +__attribute__ ((noipa)) \ +foo3_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = a OP IMM; \ + return b; \ +} + + #define F(TYPE, OP_NAME, OP) \ TYPE \ __attribute__ ((noipa)) \ @@ -112,6 +122,16 @@ FOO (int, xor, ^) FOO1 (int, xor, ^) FOO (long, xor, ^) FOO1 (long, xor, ^) + +FOO (char, shl, <<) +FOO3 (char, shl, <<, 7) +FOO (short, shl, <<) +FOO3 (short, shl, <<, 7) +FOO (int, shl, <<) +FOO3 (int, shl, <<, 7) +FOO (long, shl, <<) +FOO3 (long, shl, <<, 7) + /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -134,3 +154,5 @@ FOO1 (long, xor, ^) /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ From patchwork Wed Nov 15 09:47:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165244 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2430367vqg; Wed, 15 Nov 2023 01:52:38 -0800 (PST) X-Google-Smtp-Source: AGHT+IFUxiwaSy1k+CEPXJmkgCxIuJHGiFLzRFuB0S8pp8FlICosM8w7NM2chaI+pjLI3UkFGN3B X-Received: by 2002:a05:620a:8a8b:b0:777:1d46:fd4a with SMTP id qu11-20020a05620a8a8b00b007771d46fd4amr3759279qkn.29.1700041958355; Wed, 15 Nov 2023 01:52:38 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041958; cv=pass; d=google.com; s=arc-20160816; b=JAn1VE3v6p0JAglI6o/48DnG8gNjJFyOdgXCWcouZY60LlV8ap8+U428NzphbCoHM2 gE8YG0s8zf6iEQgrg3yPLYl0UxlSpe6hFojBWlF3X7bT5EXwfyM2jtOQzk1xpeALWmBY T1xJf5RX/W+yKFoPkEfD1mPDtrQpzoFKTxTXHEk19JjjVEyDWO+77Xc3HBoc1KadPFRr dFe6YNyZzLUW7NoGl2sfKYaK9dZySqXSEQx8BpIB9hHBo9uGQnxpuWlludpte7El67vf 9eGqq58KoqaDPuNPy4v0NU0721Sy0mh48kd2z4f1JKkSrbOycFh31XbYMn1Fyqi1L5Ra Zikg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=jnSDj9iMZGcJAvT/I6rt+WaF7OlMTJmxfGxqWpMQu4E=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=nwx9Qiwsi3iih5Cu38uZSCgKZkMkD/EimFCTAa7GWDxeAYLj6gsjdHe54hKameJRyF ZP/WBWRqUcQrxZMt5zD3Inzc+LfYMltufaq2sCcDyPXUQGb1YTEMVq9BVN9mpj1GYV5u zWNNGstcN0znbB9kMgBkd3t3ZvjjlWvG3TtTTJ66BKg+jYmX39IO8pRUNHeLuGRjJNpb au+0UB6mHUegc9sPYf23dLp9WcWdNwkeaIBGxUqDMn2KIY6LKSeoof0h8fnVntTZmeRH 8K4ltVyXx4WXx8dIfcATfSp77CpKUDaoVTdqQgNcs8EW+N4u+BYhaTfEMu3YondBD6D2 iTuQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PUXZSmPm; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id y7-20020ae9f407000000b0076d1c784892si7920720qkl.478.2023.11.15.01.52.38 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:52:38 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PUXZSmPm; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 213CE3858C42 for ; Wed, 15 Nov 2023 09:52:38 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 8FDB63857C44 for ; Wed, 15 Nov 2023 09:47:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8FDB63857C44 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8FDB63857C44 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041646; cv=none; b=JRwFwi++gqLkWaEu744EDX/6ya+4MttchLTEnnaB26KVEaSn9uKXfVacD4eIxqUPR6ZSeJQ+/cS7uCbPhdOpKUasx8uQeIyLFkKQHW9iDLBe+8jnlApyOWBF9nRVuEVtM9TsB33etaeTXK/YthshK3UVLsHHWsELW+wvLvEbWcY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041646; c=relaxed/simple; bh=VpNCWGB9NMslHvCUVLHNlyvvNT5JOpyi/vLRAZ7ryZQ=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=SLaOPFo8xGYMY2iN/VQe9f9Lb61o65NiFPmCqyvwK2h68UqFcshMtCBfWwS2jEtwDLnSM+PSLcQczV96r8HXmt9ucKvQ4fD5XkPLXtnlqqNnk4VU/mtZ+QUlr4uusUWvfhM1AiptfI9+Pr8l4SB4ncG2wiXuUuP5zk8HO/wPb+0= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041643; x=1731577643; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VpNCWGB9NMslHvCUVLHNlyvvNT5JOpyi/vLRAZ7ryZQ=; b=PUXZSmPmQLaGX65XitX0JyONtsH7joGrFDtOZenoc5lAk0xqqAcIZIzh RC9v7PRbDtDG2Z9mgopR6h2Xvq/wXhp+n1RwN8K2dqpP1PEctmj6goYna ORDIKO09JL/XEAv7HRPbo10Om2hDqvYGa+XJ6Pp007m/OS1o1gVIy4AA0 k7NB011mvTKBL8eYvD3xaZhSiD+2NhPm7fhGCaQqian4LzEOLbARtdiTg MrmWGwxT20YJ3WQu3MKlhsqV/mRCXSLmjdXEOEHKcMOYBBk9MDGf32Oir hbgAvFKvEkkRYi6zc+eGf/YVbakJ2WRUkm6htoAPjT72pElZjiflp3HC8 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342622" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342622" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:16 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431725" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431725" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:12 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id C7A5D1005682; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 13/16] [APX NDD] Support APX NDD for right shift insns Date: Wed, 15 Nov 2023 17:47:02 +0800 Message-Id: <20231115094705.3976553-14-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623196322819617 X-GMAIL-MSGID: 1782623196322819617 Similar to LSHIFT, rshift should also emit $1 for NDD form with CX_REG as operands[1]. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add LSHIFTRT and RSHIFTRT. * config/i386/i386.md (ashr3_cvt): Extend with new alternatives to support NDD, and adjust output templates. (*ashrsi3_cvt_zext): Likewise. (*ashr3_1): Likewise for SI/DI mode. (*highpartdisi2): Likewise. (*lshr3_1): Likewise. (*si3_1_zext): Likewise. (*ashr3_1): Likewise for QI/HI mode. (*lshrqi3_1): Likewise. (*lshrhi3_1): Likewise. (3_cmp): Likewise. (*si3_cmp_zext): Likewise. (*3_cconly): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add l/ashiftrt tests. --- gcc/config/i386/i386-expand.cc | 2 + gcc/config/i386/i386.md | 265 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 24 +++ 3 files changed, 191 insertions(+), 100 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 7e3080482a6..8e040346fbb 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1277,6 +1277,8 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case IOR: case XOR: case ASHIFT: + case ASHIFTRT: + case LSHIFTRT: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index a0e81545f17..3ff333d4a41 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -15490,39 +15490,45 @@ (define_mode_attr cvt_mnemonic [(SI "{cltd|cdq}") (DI "{cqto|cqo}")]) (define_insn "ashr3_cvt" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "*a,0") + (match_operand:SWI48 1 "nonimmediate_operand" "*a,0,rm") (match_operand:QI 2 "const_int_operand"))) (clobber (reg:CC FLAGS_REG))] "INTVAL (operands[2]) == GET_MODE_BITSIZE (mode)-1 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + && ix86_binary_operator_ok (ASHIFTRT, mode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" "@ - sar{}\t{%2, %0|%0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr "modrm" "0,1") + sar{}\t{%2, %0|%0, %2} + sar{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "")]) (define_insn "*ashrsi3_cvt_zext" - [(set (match_operand:DI 0 "register_operand" "=*d,r") + [(set (match_operand:DI 0 "register_operand" "=*d,r,r") (zero_extend:DI - (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0") + (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0,r") (match_operand:QI 2 "const_int_operand")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && INTVAL (operands[2]) == 31 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, SImode, operands)" + && ix86_binary_operator_ok (ASHIFTRT, SImode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" "@ {cltd|cdq} - sar{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr "modrm" "0,1") + sar{l}\t{%2, %k0|%k0, %2} + sar{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "SI")]) (define_expand "@x86_shift_adj_3" @@ -15564,13 +15570,15 @@ (define_insn "*bmi2_3_1" (set_attr "mode" "")]) (define_insn "*ashr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "c,r"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15578,14 +15586,18 @@ (define_insn "*ashr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sar{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "sar{}\t{%1, %0|%0, %1}" + : "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15598,8 +15610,8 @@ (define_insn "*ashr3_1" ;; Specialization of *lshr3_1 below, extracting the SImode ;; highpart of a DI to be extracted, but allowing it to be clobbered. (define_insn_and_split "*highpartdisi2" - [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k") 0) - (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k") + [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k,r") 0) + (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k,r") (const_int 32))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" @@ -15618,16 +15630,20 @@ (define_insn_and_split "*highpartdisi2" DONE; } operands[0] = gen_rtx_REG (DImode, REGNO (operands[0])); -}) +} +[(set_attr "isa" "*,*,*,apx_ndd")]) + (define_insn "*lshr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k,r") (lshiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k") - (match_operand:QI 2 "nonmemory_operand" "c,r,"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, mode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, mode, operands, + ix86_can_use_ndd_p (LSHIFTRT))" { + bool use_ndd = (which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15636,14 +15652,18 @@ (define_insn "*lshr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "shr{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "shr{}\t{%1, %0|%0, %1}" + : "shr{}\t%0"; else - return "shr{}\t{%2, %0|%0, %2}"; + return use_ndd ? "shr{}\t{%2, %1, %0|%0, %1, %2}" + : "shr{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2,") - (set_attr "type" "ishift,ishiftx,msklog") + [(set_attr "isa" "*,bmi2,,apx_ndd") + (set_attr "type" "ishift,ishiftx,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15676,13 +15696,15 @@ (define_insn "*bmi2_si3_1_zext" (set_attr "mode" "SI")]) (define_insn "*si3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,r")))) + (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,r,cI")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15690,14 +15712,18 @@ (define_insn "*si3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{l}\t%k0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "{l}\t{%1, %k0|%k0, %1}" + : "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15720,20 +15746,26 @@ (define_split "operands[2] = gen_lowpart (SImode, operands[2]);") (define_insn "*ashr3_1" - [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m") + [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m, r") (ashiftrt:SWI12 - (match_operand:SWI12 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + (match_operand:SWI12 1 "nonimmediate_operand" "0, rm") + (match_operand:QI 2 "nonmemory_operand" "c, c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + ix86_can_use_ndd_p (ASHIFTRT))" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "sar{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "sar{}\t{%1, %0|%0, %1}" + : "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return which_alternative ? "sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15744,29 +15776,35 @@ (define_insn "*ashr3_1" (set_attr "mode" "")]) (define_insn "*lshrqi3_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k") + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k,r") (lshiftrt:QI - (match_operand:QI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI,Wb"))) + (match_operand:QI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI,Wb,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, QImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, QImode, operands, + ix86_can_use_ndd_p (LSHIFTRT))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "shr{b}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "shr{b}\t{%1, %0|%0, %1}" + : "shr{b}\t%0"; else - return "shr{b}\t{%2, %0|%0, %2}"; + return use_ndd ? "shr{b}\t{%2, %1, %0|%0, %1, %2}" + : "shr{b}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*,avx512dq") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*,avx512dq,apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15778,29 +15816,35 @@ (define_insn "*lshrqi3_1" (set_attr "mode" "QI")]) (define_insn "*lshrhi3_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k") + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k, r") (lshiftrt:HI - (match_operand:HI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI, Ww"))) + (match_operand:HI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI, Ww, cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, HImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, HImode, operands, + ix86_can_use_ndd_p (LSHIFTRT))" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "shr{w}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "shr{w}\t{%1, %0|%0, %1}" + : "shr{w}\t%0"; else - return "shr{w}\t{%2, %0|%0, %2}"; + return use_ndd ? "shr{w}\t{%2, %1, %0|%0, %1, %2}" + : "shr{w}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*, avx512f") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*, avx512f, apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15853,25 +15897,31 @@ (define_insn "*3_cmp" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (any_shiftrt:SWI (match_dup 1) (match_dup 2)))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, mode, operands)" + && ix86_binary_operator_ok (, mode, operands, + ix86_can_use_ndd_p ())" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return which_alternative ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15884,10 +15934,10 @@ (define_insn "*3_cmp" (define_insn "*si3_cmp_zext" [(set (reg FLAGS_REG) (compare - (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0") + (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0,r") (match_operand:QI 2 "const_1_to_31_operand")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (any_shiftrt:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && (optimize_function_for_size_p (cfun) @@ -15895,15 +15945,20 @@ (define_insn "*si3_cmp_zext" || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, SImode, operands)" + && ix86_binary_operator_ok (, SImode, operands, + ix86_can_use_ndd_p ())" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{l}\t%k0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "{l}\t{%1, %k0|%k0, %1}" + : "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return which_alternative ? "{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15917,10 +15972,10 @@ (define_insn "*3_cconly" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "register_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "register_operand" "0,r") + (match_operand:QI 2 "" ",")) (const_int 0))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx @@ -15928,12 +15983,18 @@ (define_insn "*3_cconly" && ix86_match_ccmode (insn, CCGOCmode)" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REGNO (operands[1]) == CX_REG)) + return which_alternative + ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return which_alternative + ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16537,18 +16598,22 @@ (define_insn "rcrdi2" ;; Versions of sar and shr that set the carry flag. (define_insn "3_carry" [(set (reg:CCC FLAGS_REG) - (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0") + (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0,r") (const_int 1)) (const_int 0)] UNSPEC_CC_NE)) - (set (match_operand:SWI48 0 "register_operand" "=r") + (set (match_operand:SWI48 0 "register_operand" "=r,r") (any_shiftrt:SWI48 (match_dup 1) (const_int 1)))] "" { - if (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) - return "{}\t%0"; - return "{}\t{1, %0|%0, 1}"; + if ((TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REGNO (operands[1]) == CX_REG)) + return which_alternative ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; + return which_alternative ? "{}\t{$1, %1, %0|%0, %1, 1}" + : "{}\t{$1, %0|%0, 1}"; } - [(set_attr "type" "ishift1") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift1") (set (attr "length_immediate") (if_then_else (ior (match_test "TARGET_SHIFT1") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 481ec8b00a8..28c0df72988 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -2,6 +2,8 @@ /* { dg-options "-mapxf -O2" } */ /* { dg-final { scan-assembler-not "movl"} } */ +#include + #define FOO(TYPE, OP_NAME, OP) \ TYPE \ __attribute__ ((noipa)) \ @@ -132,6 +134,24 @@ FOO3 (int, shl, <<, 7) FOO (long, shl, <<) FOO3 (long, shl, <<, 7) +FOO (char, sar, >>) +FOO3 (char, sar, >>, 7) +FOO (short, sar, >>) +FOO3 (short, sar, >>, 7) +FOO (int, sar, >>) +FOO3 (int, sar, >>, 7) +FOO (long, sar, >>) +FOO3 (long, sar, >>, 7) + +FOO (uint8_t, shr, >>) +FOO3 (uint8_t, shr, >>, 7) +FOO (uint16_t, shr, >>) +FOO3 (uint16_t, shr, >>, 7) +FOO (uint32_t, shr, >>) +FOO3 (uint32_t, shr, >>, 7) +FOO (uint64_t, shr, >>) +FOO3 (uint64_t, shr, >>, 7) + /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -156,3 +176,7 @@ FOO3 (long, shl, <<, 7) /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ From patchwork Wed Nov 15 09:47:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165234 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429151vqg; Wed, 15 Nov 2023 01:49:01 -0800 (PST) X-Google-Smtp-Source: AGHT+IFA32pUtaSs3b3tmOFST6u4bFb1iZkiH1O3lduGbZ8c9MI3Pb2Md+VzSWTnexYHP2UNzmVt X-Received: by 2002:a05:620a:2492:b0:77b:c740:eb93 with SMTP id i18-20020a05620a249200b0077bc740eb93mr5994183qkn.59.1700041740872; Wed, 15 Nov 2023 01:49:00 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041740; cv=pass; d=google.com; s=arc-20160816; b=X1WkJAxWA+lVv4LGPPI0qDeFmG7CXff1kssC5GyVV/2WzoJsPcLc6hT9GSzOGOdDfp nqMZiDn03LG0FelcEYDElqL6wmhDEFcJ0a5+Hc15AIAnYh9DpKqjgUQRA5c298DDkRwP Q2KNxq04tg9WMxhu6F16Kk8I/r3xSLEpgNu28qKc/YhVcGZxMSiaYgzOSlvUzHpmAJhO BeLWqmtddWrOKZdyM3fzgZQA5EHuL40pKZzKNLTUHMIf2YYcNn3lC/q7gdXCLIuDBX13 noNfuK4+IVNJtRx/052nHjFLtR3W7t9G9uzNdyrKUrMXqxVyxZWrDbupMO7aUBNTx/ye ZMJw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=MVn62WTk5hLZmS9SRv+wGdOwf4TvqcPepfDU5EukGHM=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=TkXwBsc+86fZEDXnQC1YHF3jOqys0m9RHIFw9KaRu9GKICKDhqcvoY7l8HN6TJj08w LEqHR6D9V5R0z+h3xS38WttfEc/Os9h4pFZN9mlTNKMt8t9XWu/nUmHmqOPKmVlFNy9c Ii3i9KYgZr/FJsQtlhFlL5P9biQ9LesPBm7nVkI0FEeTXzUjlqyA6VJSZJsqiX4Syxb7 MhXVzmvb2AgYkTau05xrkS+/yR23VMgH+SQ10JyA/jf7+6KXw8j9vQ1VfVtKdXVDDyoV ga2RM62iXtcyL8uLQw8UTmqrHZSWvI68uVQ/d6BfyvQ6lzWLM40xMmwdQlnGQcLy7vh3 W44w== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TXEOyMya; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id qb5-20020a05620a650500b007759c22e907si8933038qkn.142.2023.11.15.01.49.00 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:49:00 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TXEOyMya; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 930E53861825 for ; Wed, 15 Nov 2023 09:48:41 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id BA3333857C64 for ; Wed, 15 Nov 2023 09:47:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BA3333857C64 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BA3333857C64 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041643; cv=none; b=viLTd+syubUIVkPrJWOGgqzLwKHQEuT0CxxWCt/iyKMM0l5SJCQzHfQ4/Ufuh3O0IqpJ6ligM7X1jGKmPGHq2ElFmNKAEGOBkGdkSNAIWhA3vGnEgPGdMynMmDI5Zp1GoeuXpAAUe08oC/0jM2mXwrL7Y6c/UBF/65acF0XDz7I= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041643; c=relaxed/simple; bh=hUGBxDXsdbPm2jiO+1gEBrDOdvirm52KKdq5VzLNTTk=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Wif3vqFmQOFoz/CQCQadMaGwTCWMrlIqIOKdBUO00F8NF+SSZRoCxYVnLE7syIOuD8bUTffaEuIKio1M6/1ag/+X298G0Mf79j/2mJQa1Qto91P5uSz2dwfszP/92Yc41i02pluVF3dF6kfHYBndSh5nwlwl1O2nR3YtfkBQo4s= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041640; x=1731577640; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hUGBxDXsdbPm2jiO+1gEBrDOdvirm52KKdq5VzLNTTk=; b=TXEOyMyaE/Z3FaXF3qRi7ShtcwjHk8OKxd3aLJg62XGXoA3qNQTVWOPN /o87hnTkLR8HObbyEWkhMoepqXz7ZNxr8IJhdYKR10uHYw3WV09N71ilJ ThcHeLDMahLVLbyx6rcfIxUZa1+GU46ZNqwhMtDOUvGZ/JDHe3mWrwAhe EvBfhiAHE+Fk5IR+SA+JCFoqsUXFFqOyD8rrzF6v8fIw7/PsJZZAp9QCl iS0UdVq/fBlgbM+IgiVZAFz5xPthn2ZQhzuruQaLOGrIfa+cdXjvymE+F K3N3eUq7Oq1VBipXsZov8yiC5hlrUjRgF6+qjPpcWUdKi0P2QzOnu3Asv w==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342614" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342614" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431714" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431714" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:13 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id CA85E1005684; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 14/16] [APX NDD] Support APX NDD for rotate insns Date: Wed, 15 Nov 2023 17:47:03 +0800 Message-Id: <20231115094705.3976553-15-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782622968569457772 X-GMAIL-MSGID: 1782622968569457772 gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add ROTATE and ROTATERT. * config/i386/i386.md (*3_1): Extend with a new alternative to support NDD for SI/DI rotate, and adjust output template. (*si3_1_zext): Likewise. (*3_1): Likewise for QI/HI modes. (rcrsi2): Likewise. (rcrdi2): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add test for left/right rotate. --- gcc/config/i386/i386-expand.cc | 2 + gcc/config/i386/i386.md | 91 ++++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 20 ++++++ 3 files changed, 80 insertions(+), 33 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 8e040346fbb..ab6f14485d6 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1279,6 +1279,8 @@ bool ix86_can_use_ndd_p (enum rtx_code code) case ASHIFT: case ASHIFTRT: case LSHIFTRT: + case ROTATE: + case ROTATERT: return true; default: return false; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 3ff333d4a41..760c0d32f4d 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -16362,13 +16362,15 @@ (define_insn "*bmi2_rorx3_1" (set_attr "mode" "")]) (define_insn "*3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r") (any_rotate:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "c,"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "c,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + ix86_can_use_ndd_p ())" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ROTATEX: @@ -16376,14 +16378,18 @@ (define_insn "*3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "rotate,rotatex") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "rotate,rotatex,rotate") (set (attr "preferred_for_size") (cond [(eq_attr "alternative" "0") (symbol_ref "true")] @@ -16433,13 +16439,14 @@ (define_insn "*bmi2_rorxsi3_1_zext" (set_attr "mode" "SI")]) (define_insn "*si3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,I")))) + (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,I,cI")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ROTATEX: @@ -16447,14 +16454,18 @@ (define_insn "*si3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{l}\t%k0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(use_ndd && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return use_ndd ? "{l}\t{%1, %k0|%k0, %1}" + : "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "rotate,rotatex") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "rotate,rotatex,rotate") (set (attr "preferred_for_size") (cond [(eq_attr "alternative" "0") (symbol_ref "true")] @@ -16498,19 +16509,27 @@ (define_split (zero_extend:DI (rotatert:SI (match_dup 1) (match_dup 2))))]) (define_insn "*3_1" - [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m") - (any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m,r") + (any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "nonmemory_operand" "c,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + ix86_can_use_ndd_p ())" { if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) - return "{}\t%0"; + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !(which_alternative && REG_P (operands[1]) + && REGNO (operands[1]) == CX_REG)) + return which_alternative + ? "{}\t{%1, %0|%0, %1}" + : "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return which_alternative + ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "rotate") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "rotate") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16567,31 +16586,37 @@ (define_split ;; Rotations through carry flag (define_insn "rcrsi2" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,r") (plus:SI - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0") + (lshiftrt:SI (match_operand:SI 1 "register_operand" "0,r") (const_int 1)) (ashift:SI (ltu:SI (reg:CCC FLAGS_REG) (const_int 0)) (const_int 31)))) (clobber (reg:CC FLAGS_REG))] "" - "rcr{l}\t%0" - [(set_attr "type" "ishift1") + "@ + rcr{l}\t{%1, %0|%0, %1} + rcr{l}\t%0" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift1") (set_attr "memory" "none") (set_attr "length_immediate" "0") (set_attr "mode" "SI")]) (define_insn "rcrdi2" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (plus:DI - (lshiftrt:DI (match_operand:DI 1 "register_operand" "0") + (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,r") (const_int 1)) (ashift:DI (ltu:DI (reg:CCC FLAGS_REG) (const_int 0)) (const_int 63)))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "rcr{q}\t%0" - [(set_attr "type" "ishift1") + "@ + rcr{q}\t{%1, %0|%0, %1} + rcr{q}\t%0" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift1") (set_attr "length_immediate" "0") (set_attr "mode" "DI")]) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 28c0df72988..b8b70511023 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -40,6 +40,14 @@ foo3_##OP_NAME##_##TYPE (TYPE a) \ return b; \ } +#define FOO4(TYPE, OP_NAME, OP1, OP2, IMM1) \ +TYPE \ +__attribute__ ((noipa)) \ +foo4_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = (a OP1 IMM1 | a OP2 (8 * sizeof(TYPE) - IMM1)); \ + return b; \ +} #define F(TYPE, OP_NAME, OP) \ TYPE \ @@ -152,6 +160,16 @@ FOO3 (uint32_t, shr, >>, 7) FOO (uint64_t, shr, >>) FOO3 (uint64_t, shr, >>, 7) +FOO4 (uint8_t, ror, >>, <<, 1) +FOO4 (uint16_t, ror, >>, <<, 1) +FOO4 (uint32_t, ror, >>, <<, 1) +FOO4 (uint64_t, ror, >>, <<, 1) + +FOO4 (uint8_t, rol, <<, >>, 1) +FOO4 (uint16_t, rol, <<, >>, 1) +FOO4 (uint32_t, rol, <<, >>, 1) +FOO4 (uint64_t, rol, <<, >>, 1) + /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */ @@ -180,3 +198,5 @@ FOO3 (uint64_t, shr, >>, 7) /* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "ror(?:b|l|w|q)\[^\n\r]%(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "rol(?:b|l|w|q)\[^\n\r]%(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ From patchwork Wed Nov 15 09:47:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165243 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2430214vqg; Wed, 15 Nov 2023 01:52:09 -0800 (PST) X-Google-Smtp-Source: AGHT+IGY4UuAWgHn+9C52hnymXJJTcp5I09/8hpyqbSy5U4gU6Z+IXEjrhSkdWnJNeJiR9GxaWrb X-Received: by 2002:a67:c196:0:b0:45d:9d27:215c with SMTP id h22-20020a67c196000000b0045d9d27215cmr12059848vsj.1.1700041929168; Wed, 15 Nov 2023 01:52:09 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041929; cv=pass; d=google.com; s=arc-20160816; b=uRMVWV0zi4bcePc5Hwon7NznzOOJWk+HUeL5u7s0qeuUtyaU3DXVGOE2yKSQTYKDgP HZnBQMI/SWpfjqe5TcuiaVeOJuKSvQAQixnaM7EvvruE0YhutavjdSHZXu10pu5SOlgt 5NL4Q2XknuPVGnQ5vVvDLfe2JWcqz5X6Ly9PoJYpK11q96PgQerwKsRoX11UEPjgCWls plIx7VeGYNPEg/R16vwZl/CepQ8nHXhKwwlRvWLuWFDXL+9g3CiPIGYpBs+sQlUJlLlv 5UJNBFdlgMVYGdKuEyYh8cZnUwjwsHWU4i5dJYtf/jdAHsZe011/An6CwNB4wj4SoRfv obXg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=rWW3bnQHalUWOXB/lhUNdRjYEftq3ToxFX1moPfjUes=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=aMLPY5Afa6o1JKKrOJRBwVT7CtERQKd8sJg39QF8UAH8JAeBBWxOSNstCE3insheZL GGpmA61HLKXVlW1InEEhlm+16lk3VYAjArj/1Ga/DyOe4j3ZJgNV3Sn4Yp31jChP5cNR a2cPL/UE72rk3scu8Qfz4UakJ34hD8h/cJAiVJ37AigYLjN/A1kmU4HTe9b6PmICvr1W /3WTcoJQ4SDf3Vzlwo8Mr6IKbPJzJ9G9l0lABf5SV/xPzIhlGx17HIcuSy+jvC3Hsu7v 36+fDKaCFprdur9tYetwy8awvsUvsatqbOdj8FUaf/BbWPC0vt2XL0uWsdIExkO5G9WP kk4w== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=lmoe3uQx; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id k10-20020a05622a03ca00b004053819f665si8450238qtx.608.2023.11.15.01.52.09 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:52:09 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=lmoe3uQx; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D6AA63858283 for ; Wed, 15 Nov 2023 09:52:08 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id 080A63857C4B for ; Wed, 15 Nov 2023 09:47:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 080A63857C4B Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 080A63857C4B Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041645; cv=none; b=IlGxhfxGshgNpmKqn05kDZgz8pxb7DgqAyLhyuiCxKuS+9nwgZ+/GcFs4w01YCvNJSNZs9j/fPvGpDx02rBDNMi4N05k/CdxS9CZuZG0zgBVFnr+CCAEuTzz3DHFytvBBH+ZvA4tWdftz7gKnS5vWZZDyI8hKa7XkMPeW9B6/eM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041645; c=relaxed/simple; bh=2be3W85hD1jUMYCdJoxqreOSwF5889G+qsLBJdr1Vao=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=wsUn8+tCOZOU+oVVR7D1nl6ZjaKo6tus0r6hE2//KJWcUgdDu0kVIWiu2imSPO/Bykahy+D8iRX7Tq7WLxepxAK3cCMfCr0BCud+c+KlYHRye/Ax+hsE/FrrAuE2s5WTzFiZyD09AWvlOuxxbSMpfOucZg8oUVy2sWRlkutptCk= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041643; x=1731577643; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2be3W85hD1jUMYCdJoxqreOSwF5889G+qsLBJdr1Vao=; b=lmoe3uQx/QA/4vlnocRB6s3Fr4wjnUIe+G3lPjtrho5ViNn7tKdYuUDf ZTsROL9Ts4RhOGIMYv+HBQaejL1R72Ze+kD4v6/RNkJFAWCFPAcRjXUqh 3AJd+avt09xk8tG9T2JVVVfF76nzXoSDBFFFEOuNc6XxiOGPm22ChHNYj /d9TwBUAAIew0R/mtZFLGjWSzRKlKGBxQ+lIrkhWADXWj0Hc4ZLvh9wEQ QUMvHwGeQyjrDOSwdbrXlewrRISGjRlOoEV3BwkcZq/rR48+OwDn4rzBm ZGpnrnx+Zkh96iAPokuEZkDBSGfjJCfsAC0gsnKdj7YSlwaa+xpHzW9jI w==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342620" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342620" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:16 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431722" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431722" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:12 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id CCD791005686; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 15/16] [APX NDD] Support APX NDD for shld/shrd insns Date: Wed, 15 Nov 2023 17:47:04 +0800 Message-Id: <20231115094705.3976553-16-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623165451334275 X-GMAIL-MSGID: 1782623165451334275 For shld/shrd insns, the old pattern use match_dup 0 as its shift src and use +r*m as its constraint. To support NDD we added new define_insns to handle NDD form pattern with extra input and dest operand to be fixed in register. gcc/ChangeLog: * config/i386/i386.md (x86_64_shld_ndd): New define_insn. (x86_64_shld_ndd_1): Likewise. (*x86_64_shld_ndd_2): Likewise. (x86_shld_ndd): Likewise. (x86_shld_ndd_1): Likewise. (*x86_shld_ndd_2): Likewise. (x86_64_shrd_ndd): Likewise. (x86_64_shrd_ndd_1): Likewise. (*x86_64_shrd_ndd_2): Likewise. (x86_shrd_ndd): Likewise. (x86_shrd_ndd_1): Likewise. (*x86_shrd_ndd_2): Likewise. (*x86_64_shld_shrd_1_nozext): Adjust codegen under TARGET_APX_NDD. (*x86_shld_shrd_1_nozext): Likewise. (*x86_64_shrd_shld_1_nozext): Likewise. (*x86_shrd_shld_1_nozext): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-shld-shrd.c: New test. --- gcc/config/i386/i386.md | 323 +++++++++++++++++- .../gcc.target/i386/apx-ndd-shld-shrd.c | 24 ++ 2 files changed, 345 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 760c0d32f4d..2e3d37d08b0 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14183,6 +14183,24 @@ (define_insn "x86_64_shld" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shld_ndd" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Jc") + (const_int 63))) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (minus:QI (const_int 64) + (and:QI (match_dup 3) (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD" + "shld{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "prefix_0f" "1") + (set_attr "mode" "DI")]) + (define_insn "x86_64_shld_1" [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m") (ior:DI (ashift:DI (match_dup 0) @@ -14204,6 +14222,24 @@ (define_insn "x86_64_shld_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shld_ndd_1" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_63_operand")) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_255_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD + && INTVAL (operands[4]) == 64 - INTVAL (operands[3])" + "shld{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI") + (set_attr "length_immediate" "1")]) + + (define_insn_and_split "*x86_64_shld_shrd_1_nozext" [(set (match_operand:DI 0 "nonimmediate_operand") (ior:DI (ashift:DI (match_operand:DI 4 "nonimmediate_operand") @@ -14229,6 +14265,23 @@ (define_insn_and_split "*x86_64_shld_shrd_1_nozext" operands[4] = force_reg (DImode, operands[4]); emit_insn (gen_x86_64_shrd_1 (operands[0], operands[4], operands[3], operands[2])); } + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (DImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (DImode, operands[1]); + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } else { operands[1] = force_reg (DImode, operands[1]); @@ -14261,6 +14314,33 @@ (define_insn_and_split "*x86_64_shld_2" (const_int 63)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_64_shld_ndd_2" + [(set (match_operand:DI 0 "nonimmediate_operand") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (lshiftrt:DI (match_operand:DI 2 "register_operand") + (minus:QI (const_int 64) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:DI (ashift:DI (match_dup 1) + (and:QI (match_dup 3) (const_int 63))) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI (match_dup 2)) + (minus:QI (const_int 64) + (and:QI (match_dup 3) + (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (DImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_insn "x86_shld" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (ashift:SI (match_dup 0) @@ -14283,6 +14363,24 @@ (define_insn "x86_shld" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shld_ndd" + [(set (match_operand:SI 0 "nonimmediate_operand" "=r") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Ic") + (const_int 31))) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (minus:QI (const_int 32) + (and:QI (match_dup 3) (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shld{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "SI")]) + + (define_insn "x86_shld_1" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (ashift:SI (match_dup 0) @@ -14304,6 +14402,24 @@ (define_insn "x86_shld_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shld_ndd_1" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_31_operand")) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_63_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD && + INTVAL (operands[4]) == 32 - INTVAL (operands[3])" + "shld{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "SI")]) + + (define_insn_and_split "*x86_shld_shrd_1_nozext" [(set (match_operand:SI 0 "nonimmediate_operand") (ior:SI (ashift:SI (match_operand:SI 4 "nonimmediate_operand") @@ -14328,7 +14444,24 @@ (define_insn_and_split "*x86_shld_shrd_1_nozext" operands[4] = force_reg (SImode, operands[4]); emit_insn (gen_x86_shrd_1 (operands[0], operands[4], operands[3], operands[2])); } - else + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (SImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } + else { operands[1] = force_reg (SImode, operands[1]); rtx tmp = gen_reg_rtx (SImode); @@ -14360,6 +14493,33 @@ (define_insn_and_split "*x86_shld_2" (const_int 31)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_shld_ndd_2" + [(set (match_operand:SI 0 "nonimmediate_operand") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (lshiftrt:SI (match_operand:SI 2 "register_operand") + (minus:QI (const_int 32) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:SI (ashift:SI (match_dup 1) + (and:QI (match_dup 3) (const_int 31))) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI (match_dup 2)) + (minus:QI (const_int 32) + (and:QI (match_dup 3) + (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (SImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_expand "@x86_shift_adj_1" [(set (reg:CCZ FLAGS_REG) (compare:CCZ (and:QI (match_operand:QI 2 "register_operand") @@ -15308,6 +15468,24 @@ (define_insn "x86_64_shrd" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shrd_ndd" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Jc") + (const_int 63))) + (subreg:DI + (ashift:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (minus:QI (const_int 64) + (and:QI (match_dup 3) (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD" + "shrd{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI")]) + + (define_insn "x86_64_shrd_1" [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m") (ior:DI (lshiftrt:DI (match_dup 0) @@ -15329,6 +15507,24 @@ (define_insn "x86_64_shrd_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shrd_ndd_1" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_63_operand")) + (subreg:DI + (ashift:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_255_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD + && INTVAL (operands[4]) == 64 - INTVAL (operands[3])" + "shrd{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "DI")]) + + (define_insn_and_split "*x86_64_shrd_shld_1_nozext" [(set (match_operand:DI 0 "nonimmediate_operand") (ior:DI (lshiftrt:DI (match_operand:DI 4 "nonimmediate_operand") @@ -15354,6 +15550,23 @@ (define_insn_and_split "*x86_64_shrd_shld_1_nozext" operands[4] = force_reg (DImode, operands[4]); emit_insn (gen_x86_64_shld_1 (operands[0], operands[4], operands[3], operands[2])); } + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (DImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (DImode, operands[1]); + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } else { operands[1] = force_reg (DImode, operands[1]); @@ -15386,6 +15599,33 @@ (define_insn_and_split "*x86_64_shrd_2" (const_int 63)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_64_shrd_ndd_2" + [(set (match_operand:DI 0 "nonimmediate_operand") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (ashift:DI (match_operand:DI 2 "register_operand") + (minus:QI (const_int 64) (match_dup 2))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:DI (lshiftrt:DI (match_dup 1) + (and:QI (match_dup 3) (const_int 63))) + (subreg:DI + (ashift:TI + (zero_extend:TI (match_dup 2)) + (minus:QI (const_int 64) + (and:QI (match_dup 3) + (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (DImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_insn "x86_shrd" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (lshiftrt:SI (match_dup 0) @@ -15408,6 +15648,23 @@ (define_insn "x86_shrd" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shrd_ndd" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Ic") + (const_int 31))) + (subreg:SI + (ashift:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (minus:QI (const_int 32) + (and:QI (match_dup 3) (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shrd{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "SI")]) + (define_insn "x86_shrd_1" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (lshiftrt:SI (match_dup 0) @@ -15429,6 +15686,24 @@ (define_insn "x86_shrd_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shrd_ndd_1" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_31_operand")) + (subreg:SI + (ashift:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_63_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && (INTVAL (operands[4]) == 32 - INTVAL (operands[3]))" + "shrd{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "SI")]) + + (define_insn_and_split "*x86_shrd_shld_1_nozext" [(set (match_operand:SI 0 "nonimmediate_operand") (ior:SI (lshiftrt:SI (match_operand:SI 4 "nonimmediate_operand") @@ -15453,7 +15728,24 @@ (define_insn_and_split "*x86_shrd_shld_1_nozext" operands[4] = force_reg (SImode, operands[4]); emit_insn (gen_x86_shld_1 (operands[0], operands[4], operands[3], operands[2])); } - else + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (SImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } + else { operands[1] = force_reg (SImode, operands[1]); rtx tmp = gen_reg_rtx (SImode); @@ -15485,6 +15777,33 @@ (define_insn_and_split "*x86_shrd_2" (const_int 31)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_shrd_ndd_2" + [(set (match_operand:SI 0 "nonimmediate_operand") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (ashift:SI (match_operand:SI 2 "register_operand") + (minus:QI (const_int 32) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT && TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:SI (lshiftrt:SI (match_dup 1) + (and:QI (match_dup 3) (const_int 31))) + (subreg:SI + (ashift:DI + (zero_extend:DI (match_dup 2)) + (minus:QI (const_int 32) + (and:QI (match_dup 3) + (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (SImode); + emit_move_insn (operands[4], operands[0]); +}) + ;; Base name for insn mnemonic. (define_mode_attr cvt_mnemonic [(SI "{cltd|cdq}") (DI "{cqto|cqo}")]) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c b/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c new file mode 100644 index 00000000000..87068ea31aa --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c @@ -0,0 +1,24 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -Wno-shift-count-overflow -m64 -mapxf" } */ +/* { dg-final { scan-assembler-times {(?n)shld[ql]?[\t ]*\$2} 4 } } */ +/* { dg-final { scan-assembler-times {(?n)shrd[ql]?[\t ]*\$2} 4 } } */ + +typedef unsigned long u64; +typedef unsigned int u32; + +long a; +int c; +const char n = 2; + +long test64r (long e) { long t = ((u64)a >> n) | (e << (64 - n)); return t;} +long test64l (u64 e) { long t = (a << n) | (e >> (64 - n)); return t;} +int test32r (int f) { int t = ((u32)c >> n) | (f << (32 - n)); return t; } +int test32l (u32 f) { int t = (c << n) | (f >> (32 - n)); return t; } + +u64 ua; +u32 uc; + +u64 testu64l (u64 ue) { u64 ut = (ua << n) | (ue >> (64 - n)); return ut; } +u64 testu64r (u64 ue) { u64 ut = (ua >> n) | (ue << (64 - n)); return ut; } +u32 testu32l (u32 uf) { u32 ut = (uc << n) | (uf >> (32 - n)); return ut; } +u32 testu32r (u32 uf) { u32 ut = (uc >> n) | (uf << (32 - n)); return ut; } From patchwork Wed Nov 15 09:47:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 165240 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b909:0:b0:403:3b70:6f57 with SMTP id t9csp2429642vqg; Wed, 15 Nov 2023 01:50:31 -0800 (PST) X-Google-Smtp-Source: AGHT+IEHkGikpdJ4pogY/YNwrOajxLZnmUbigOM+iWWvncvhhkocq9UCP3rXqpae8UtdxCbNMxRd X-Received: by 2002:a05:6214:21ea:b0:670:85e1:3902 with SMTP id p10-20020a05621421ea00b0067085e13902mr6026036qvj.57.1700041831747; Wed, 15 Nov 2023 01:50:31 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1700041831; cv=pass; d=google.com; s=arc-20160816; b=b9cpMWc30Qc0iyWHPWbAqSC9EySdLNAySieUrywtqaPDJFgFvtP+Se3Xz+qDnDpcvA ctw3rn1KtYsFEePC4xOTrKFybgycs8jo8g4Pkx0Y8jES07d/a11MV8QOvwT00pPOOPjW QU6HtjqQOyBPUBDiw1mndYsV306AJy3ItMQef1K8+lhp1dnKyaNbFRidFngT7AkfmPh+ abRGc7iOdN0WAd8cxC3J8RvjgkBCHVm4lqJdzRdKyFo2gO7L5o0MQ6I1mwklYbJF8A+G BaeysqUDzs97hO5NBwduxEV1SXJM3/pHDXfg09VxpBKx9NzCxPcdDlLxYam8ssFPR5jL 2MwQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=+xIKs1WKZyydd2L/Ur942H5HaVL0dsdNTQL65SufBcc=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=m3qUgiDJ5m9+rVe63o5zwsH5K5TrhkzRURzJ5KsnMiPrQg1DjOZK5T8Iz2gKdgk48F sL8PxHe+mlt9mSWoMZU12mn1jlJ6LoN6Gw3tjTdhf0PZmCPsbflGmq04NpXmr0SiJKFl PX5mWiC5e8tqJDbdnM4/pnrIsXyss7qNBPXC81SQpeblEJa7hDbIQSVoSumL1drp4wM+ SE+Tqeo69rUMJufLxJzZALkluHmcjIJmGM53W6ZRb5ydlPMub5Z3WCfXWzIq0uLJ2p59 q9/Y9dBGuGrqYMHwQ88ElbXy3nIeBRfoiGxyKmZhppiKm6ar39IE9LLlcslcR0V4ELGH PQ2w== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=akEtIAAh; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id k11-20020a0cc78b000000b0066d0543b654si8844905qvj.341.2023.11.15.01.50.31 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Nov 2023 01:50:31 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=akEtIAAh; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 846763857BA6 for ; Wed, 15 Nov 2023 09:50:31 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by sourceware.org (Postfix) with ESMTPS id DF7413857C56 for ; Wed, 15 Nov 2023 09:47:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DF7413857C56 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DF7413857C56 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.100 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041640; cv=none; b=W4Ta8PhQWQ673GhOgJIEbHhzhMV8104AytZc8YlxHqUd8ZQaMYY9mur6dq5gpe3QQggwpUm8BrmPIxn9cKWMNpNIQRUEK8yMdPZXdVLvL5TA15NwF2o6qKGlxk+OzQ0/KUN4NVoWxXVfADf5xvpcLVdTOB7W+WDIr+nRAoWCvGc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700041640; c=relaxed/simple; bh=Fk8T9wjTQs/tbyNOKyUAwa1q1KGR3r3K5TeJZkiMBYU=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=sESyQlpoLz/HrjRFOaCojTFmpwxBAznalYTCeDgjBlAYO12tEJ/t4CbYqDzzwTVAC6cxXPodiSWgBVnPDuGrhOJKVja8eN7eFL6qN399NYhSBHyBF/vXzu7pa370xjr2q20LGCC8Cia1wkNdf7upflxSOUTzbElDEwczg15T4Zo= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1700041639; x=1731577639; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Fk8T9wjTQs/tbyNOKyUAwa1q1KGR3r3K5TeJZkiMBYU=; b=akEtIAAhWJcbiDcCh5yR/LYEnX2wJkQhNf50xpkJ0nWncULOYQs1bDnK rrIr/SNbErlzIH26yNbSQE4mTht2Qpn8zx699CHQDqytsdQC6QKBBL7EE jXFyLyEtKMCoPULqY53W1fo1v9lZIkSRgpofKrbkWRFipK8HfGIXf/pG5 F1Rs4NzXJKyV95I95S6E0AjA80VmGGqVVBK37WvJEwFuKynWTOKToXgS9 X81RthHp9+WoF35CnWd7VmDZIJam7PIS+vE0o0XuVLQaZelK7VjMEgOQM 10fFz0vi9lpoO5ePSCPLmtRwSc8bt5m3ptlZPHs8vZrvAZGbXQD35zcuw Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="457342613" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="457342613" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Nov 2023 01:47:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10894"; a="938431712" X-IronPort-AV: E=Sophos;i="6.03,304,1694761200"; d="scan'208";a="938431712" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga005.jf.intel.com with ESMTP; 15 Nov 2023 01:47:13 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id CF9021005687; Wed, 15 Nov 2023 17:47:05 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 16/16] [APX NDD] Support APX NDD for cmove insns Date: Wed, 15 Nov 2023 17:47:05 +0800 Message-Id: <20231115094705.3976553-17-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com> References: <20231115094705.3976553-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1782623063877453558 X-GMAIL-MSGID: 1782623063877453558 gcc/ChangeLog: * config/i386/i386.md (*movcc_noc): Extend with new constraints to support NDD. (*movsicc_noc_zext): Likewise. (*movsicc_noc_zext_1): Likewise. (*movqicc_noc): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-cmov.c: New test. --- gcc/config/i386/i386.md | 48 ++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c | 16 +++++++ 2 files changed, 45 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 2e3d37d08b0..2ae9aaf59fb 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -24119,47 +24119,56 @@ (define_split (neg:SWI (ltu:SWI (reg:CCC FLAGS_REG) (const_int 0))))]) (define_insn "*movcc_noc" - [(set (match_operand:SWI248 0 "register_operand" "=r,r") + [(set (match_operand:SWI248 0 "register_operand" "=r,r,r,r") (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SWI248 2 "nonimmediate_operand" "rm,0") - (match_operand:SWI248 3 "nonimmediate_operand" "0,rm")))] + (match_operand:SWI248 2 "nonimmediate_operand" "rm,0,rm,r") + (match_operand:SWI248 3 "nonimmediate_operand" "0,rm,r,rm")))] "TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %0|%0, %2} - cmov%O2%c1\t{%3, %0|%0, %3}" - [(set_attr "type" "icmov") + cmov%O2%c1\t{%3, %0|%0, %3} + cmov%O2%C1\t{%2, %3, %0|%0, %3, %2} + cmov%O2%c1\t{%3, %2, %0|%0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "")]) (define_insn "*movsicc_noc_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r") (if_then_else:DI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) (zero_extend:DI - (match_operand:SI 2 "nonimmediate_operand" "rm,0")) + (match_operand:SI 2 "nonimmediate_operand" "rm,0,rm,r")) (zero_extend:DI - (match_operand:SI 3 "nonimmediate_operand" "0,rm"))))] + (match_operand:SI 3 "nonimmediate_operand" "0,rm,r,rm"))))] "TARGET_64BIT && TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %k0|%k0, %2} - cmov%O2%c1\t{%3, %k0|%k0, %3}" - [(set_attr "type" "icmov") + cmov%O2%c1\t{%3, %k0|%k0, %3} + cmov%O2%C1\t{%2, %3, %k0|%k0, %3, %2} + cmov%O2%c1\t{%3, %2, %k0|%k0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "SI")]) (define_insn "*movsicc_noc_zext_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,r") (zero_extend:DI (if_then_else:SI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 2 "nonimmediate_operand" "rm,0") - (match_operand:SI 3 "nonimmediate_operand" "0,rm"))))] + (match_operand:SI 2 "nonimmediate_operand" "rm,0,rm,r") + (match_operand:SI 3 "nonimmediate_operand" "0,rm,r,rm"))))] "TARGET_64BIT && TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %k0|%k0, %2} - cmov%O2%c1\t{%3, %k0|%k0, %3}" - [(set_attr "type" "icmov") + cmov%O2%c1\t{%3, %k0|%k0, %3} + cmov%O2%C1\t{%2, %3, %k0|%k0, %3, %2} + cmov%O2%c1\t{%3, %2, %k0|%k0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "SI")]) @@ -24184,14 +24193,15 @@ (define_split }) (define_insn "*movqicc_noc" - [(set (match_operand:QI 0 "register_operand" "=r,r") + [(set (match_operand:QI 0 "register_operand" "=r,r,r") (if_then_else:QI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:QI 2 "register_operand" "r,0") - (match_operand:QI 3 "register_operand" "0,r")))] + (match_operand:QI 2 "register_operand" "r,0,r") + (match_operand:QI 3 "register_operand" "0,r,r")))] "TARGET_CMOVE && !TARGET_PARTIAL_REG_STALL" "#" - [(set_attr "type" "icmov") + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "QI")]) (define_split diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c b/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c new file mode 100644 index 00000000000..459dc965342 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c @@ -0,0 +1,16 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -m64 -mapxf" } */ +/* { dg-final { scan-assembler-times "cmove\[^\n\r]*, %eax" 1 } } */ +/* { dg-final { scan-assembler-times "cmovge\[^\n\r]*, %eax" 1 } } */ + +unsigned int c[4]; + +unsigned long long foo1 (int a, unsigned int b) +{ + return a ? b : c[1]; +} + +unsigned int foo3 (int a, int b, unsigned int c, unsigned int d) +{ + return a < b ? c : d; +}