From patchwork Wed Dec 6 08:06:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 174383
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 01/16] [APX NDD] Support Intel APX NDD for legacy add insn
Date: Wed, 6 Dec 2023 16:06:21 +0800
Message-Id: <20231206080636.178863-2-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com>
References: <20231206080636.178863-1-hongyu.wang@intel.com>

From: Kong Lingling

APX NDD provides an extra destination register operand for several
GPR-related legacy insns, so a new alternative with the "r" constraint
can be added for operand 1.

This first patch supports NDD for the add instruction. It keeps using
lea when all operands are registers, since lea has a shorter encoding.
For add operations with a memory operand, NDD is used to save an extra
move.

The legacy x86 binary-operation expander forces operands[0] and
operands[1] to be the same, so extend the helper functions with a
use_ndd flag that allows NDD-form patterns in which operands[0] and
operands[1] differ.

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_fixup_binary_operands): Add
	new use_ndd flag to check whether ndd can be used for this binop
	and adjust operand emit.
	(ix86_binary_operator_ok): Likewise.
	(ix86_expand_binary_operator): Likewise, and avoid generating the
	lea pattern in post-reload expand when use_ndd is explicitly
	passed.
	* config/i386/i386-options.cc (ix86_option_override_internal):
	Prohibit apx subfeatures when not in 64bit mode.
	* config/i386/i386-protos.h (ix86_binary_operator_ok): Add
	use_ndd flag.
	(ix86_fixup_binary_operands): Likewise.
	(ix86_expand_binary_operator): Likewise.
	* config/i386/i386.md (*add_1): Extend with new alternatives to
	support NDD, and adjust output template.
	(*addhi_1): Likewise.
	(*addqi_1): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: New test.
---
 gcc/config/i386/i386-expand.cc          |  19 ++---
 gcc/config/i386/i386-options.cc         |   2 +
 gcc/config/i386/i386-protos.h           |   6 +-
 gcc/config/i386/i386.md                 | 102 ++++++++++++++----------
 gcc/testsuite/gcc.target/i386/apx-ndd.c |  21 +++++
 5 files changed, 96 insertions(+), 54 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 4bd7d4f39c8..3ecda989cf8 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -1260,14 +1260,14 @@ ix86_swap_binary_operands_p (enum rtx_code code, machine_mode mode,
   return false;
 }

-
 /* Fix up OPERANDS to satisfy ix86_binary_operator_ok.  Return the
    destination to use for the operation.  If different from the true
-   destination in operands[0], a copy operation will be required.  */
+   destination in operands[0], a copy operation will be required except
+   under TARGET_APX_NDD.  */

 rtx
 ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode,
-                            rtx operands[])
+                            rtx operands[], bool use_ndd)
 {
   rtx dst = operands[0];
   rtx src1 = operands[1];
@@ -1307,7 +1307,7 @@ ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode,
     src1 = force_reg (mode, src1);

   /* Source 1 cannot be a non-matching memory.  */
-  if (MEM_P (src1) && !rtx_equal_p (dst, src1))
+  if (!use_ndd && MEM_P (src1) && !rtx_equal_p (dst, src1))
     src1 = force_reg (mode, src1);

   /* Improve address combine.  */
@@ -1338,11 +1338,11 @@ ix86_fixup_binary_operands_no_copy (enum rtx_code code,

 void
 ix86_expand_binary_operator (enum rtx_code code, machine_mode mode,
-                             rtx operands[])
+                             rtx operands[], bool use_ndd)
 {
   rtx src1, src2, dst, op, clob;

-  dst = ix86_fixup_binary_operands (code, mode, operands);
+  dst = ix86_fixup_binary_operands (code, mode, operands, use_ndd);
   src1 = operands[1];
   src2 = operands[2];

@@ -1352,7 +1352,8 @@ ix86_expand_binary_operator (enum rtx_code code, machine_mode mode,

   if (reload_completed
       && code == PLUS
-      && !rtx_equal_p (dst, src1))
+      && !rtx_equal_p (dst, src1)
+      && !use_ndd)
     {
       /* This is going to be an LEA; avoid splitting it later.  */
       emit_insn (op);
@@ -1451,7 +1452,7 @@ ix86_expand_vector_logical_operator (enum rtx_code code, machine_mode mode,

 bool
 ix86_binary_operator_ok (enum rtx_code code, machine_mode mode,
-                         rtx operands[3])
+                         rtx operands[3], bool use_ndd)
 {
   rtx dst = operands[0];
   rtx src1 = operands[1];
@@ -1475,7 +1476,7 @@ ix86_binary_operator_ok (enum rtx_code code, machine_mode mode,
     return false;

   /* Source 1 cannot be a non-matching memory.  */
-  if (MEM_P (src1) && !rtx_equal_p (dst, src1))
+  if (!use_ndd && MEM_P (src1) && !rtx_equal_p (dst, src1))
     /* Support "andhi/andsi/anddi" as a zero-extending move.  */
     return (code == AND
             && (mode == HImode
diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index f86ad332aad..7d0a253e07f 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -2129,6 +2129,8 @@ ix86_option_override_internal (bool main_args_p,

   if (TARGET_APX_F && !TARGET_64BIT)
     error ("%<-mapxf%> is not supported for 32-bit code");
+  else if (opts->x_ix86_apx_features != apx_none && !TARGET_64BIT)
+    error ("%<-mapx-features=%> option is not supported for 32-bit code");

   if (TARGET_UINTR && !TARGET_64BIT)
     error ("%<-muintr%> not supported for 32-bit code");
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 28d0eab11d5..a9d0c568bba 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -108,14 +108,14 @@ extern void ix86_expand_move (machine_mode, rtx[]);
 extern void ix86_expand_vector_move (machine_mode, rtx[]);
 extern void ix86_expand_vector_move_misalign (machine_mode, rtx[]);
 extern rtx ix86_fixup_binary_operands (enum rtx_code,
-                                       machine_mode, rtx[]);
+                                       machine_mode, rtx[], bool = false);
 extern void ix86_fixup_binary_operands_no_copy (enum rtx_code,
                                                 machine_mode, rtx[]);
 extern void ix86_expand_binary_operator (enum rtx_code,
-                                         machine_mode, rtx[]);
+                                         machine_mode, rtx[], bool = false);
 extern void ix86_expand_vector_logical_operator (enum rtx_code,
                                                  machine_mode, rtx[]);
-extern bool ix86_binary_operator_ok (enum rtx_code, machine_mode, rtx[3]);
+extern bool ix86_binary_operator_ok (enum rtx_code, machine_mode, rtx[3], bool = false);
 extern bool ix86_avoid_lea_for_add (rtx_insn *, rtx[]);
 extern bool ix86_use_lea_for_mov (rtx_insn *, rtx[]);
 extern bool ix86_avoid_lea_for_addr (rtx_insn *, rtx[]);
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index df7f9172381..a5b123a51bd 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -562,7 +562,7 @@ (define_attr "unit" "integer,i387,sse,mmx,unknown"

 ;; Used to control the "enabled" attribute on a per-instruction basis.
 (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
-                    x64_avx,x64_avx512bw,x64_avx512dq,aes,
+                    x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd,
                     sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
                     avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512,
                     noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq,
@@ -960,6 +960,8 @@ (define_attr "enabled" ""
            (symbol_ref "TARGET_AVX512BF16 && TARGET_AVX512VL")
          (eq_attr "isa" "vpclmulqdqvl")
            (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL")
+         (eq_attr "isa" "apx_ndd")
+           (symbol_ref "TARGET_APX_NDD")

          (eq_attr "mmx_isa" "native")
            (symbol_ref "!TARGET_MMX_WITH_SSE")
@@ -6288,7 +6290,8 @@ (define_expand "add3"
         (plus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")
                     (match_operand:SDWIM 2 "")))]
   ""
-  "ix86_expand_binary_operator (PLUS, mode, operands); DONE;")
+  "ix86_expand_binary_operator (PLUS, mode, operands,
+                                TARGET_APX_NDD); DONE;")

 (define_insn_and_split "*add3_doubleword"
   [(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
@@ -6415,26 +6418,29 @@ (define_insn_and_split "*add3_doubleword_concat_zext"
  "split_double_mode (mode, &operands[0], 1, &operands[0], &operands[5]);")

 (define_insn "*add_1"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r")
+  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r")
         (plus:SWI48
-          (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r")
-          (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le")))
+          (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r")
+          (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,re,BM")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (PLUS, mode, operands)"
+  "ix86_binary_operator_ok (PLUS, mode, operands,
+                            TARGET_APX_NDD)"
 {
+  bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD;
   switch (get_attr_type (insn))
     {
     case TYPE_LEA:
       return "#";

     case TYPE_INCDEC:
-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (operands[2] == const1_rtx)
-        return "inc{}\t%0";
+        return use_ndd ? "inc{}\t{%1, %0|%0, %1}"
+                       : "inc{}\t%0";
       else
         {
           gcc_assert (operands[2] == constm1_rtx);
-          return "dec{}\t%0";
+          return use_ndd ? "dec{}\t{%1, %0|%0, %1}"
+                         : "dec{}\t%0";
         }

     default:
@@ -6443,14 +6449,16 @@ (define_insn "*add_1"
       if (which_alternative == 2)
         std::swap (operands[1], operands[2]);

-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (x86_maybe_negate_const_int (&operands[2], mode))
-        return "sub{}\t{%2, %0|%0, %2}";
+        return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}"
+                       : "sub{}\t{%2, %0|%0, %2}";

-      return "add{}\t{%2, %0|%0, %2}";
+      return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}"
+                     : "add{}\t{%2, %0|%0, %2}";
     }
 }
-  [(set (attr "type")
+  [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd")
+   (set (attr "type")
      (cond [(eq_attr "alternative" "3")
               (const_string "lea")
             (match_operand:SWI48 2 "incdec_operand")
@@ -6519,25 +6527,26 @@ (define_insn "addsi_1_zext"
    (set_attr "mode" "SI")])

 (define_insn "*addhi_1"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,r,Yp")
-        (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,Yp")
-                 (match_operand:HI 2 "general_operand" "rn,m,0,ln")))
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,r,Yp,r,r")
+        (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,Yp,rm,r")
+                 (match_operand:HI 2 "general_operand" "rn,m,0,ln,rn,m")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (PLUS, HImode, operands)"
+  "ix86_binary_operator_ok (PLUS, HImode, operands,
+                            TARGET_APX_NDD)"
 {
+  bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD;
   switch (get_attr_type (insn))
     {
     case TYPE_LEA:
       return "#";

     case TYPE_INCDEC:
-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (operands[2] == const1_rtx)
-        return "inc{w}\t%0";
+        return use_ndd ? "inc{w}\t{%1, %0|%0, %1}" : "inc{w}\t%0";
       else
         {
           gcc_assert (operands[2] == constm1_rtx);
-          return "dec{w}\t%0";
+          return use_ndd ? "dec{w}\t{%1, %0|%0, %1}" : "dec{w}\t%0";
         }

     default:
@@ -6546,14 +6555,16 @@ (define_insn "*addhi_1"
       if (which_alternative == 2)
         std::swap (operands[1], operands[2]);

-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (x86_maybe_negate_const_int (&operands[2], HImode))
-        return "sub{w}\t{%2, %0|%0, %2}";
+        return use_ndd ? "sub{w}\t{%2, %1, %0|%0, %1, %2}"
+                       : "sub{w}\t{%2, %0|%0, %2}";

-      return "add{w}\t{%2, %0|%0, %2}";
+      return use_ndd ? "add{w}\t{%2, %1, %0|%0, %1, %2}"
+                     : "add{w}\t{%2, %0|%0, %2}";
     }
 }
-  [(set (attr "type")
+  [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd")
+   (set (attr "type")
      (cond [(eq_attr "alternative" "3")
               (const_string "lea")
             (match_operand:HI 2 "incdec_operand")
@@ -6565,30 +6576,35 @@ (define_insn "*addhi_1"
            (and (eq_attr "type" "alu") (match_operand 2 "const128_operand"))
               (const_string "1")
               (const_string "*")))
-   (set_attr "mode" "HI,HI,HI,SI")])
+   (set_attr "mode" "HI,HI,HI,SI,HI,HI")])

 (define_insn "*addqi_1"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,q,r,r,Yp")
-        (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,q,0,r,Yp")
-                 (match_operand:QI 2 "general_operand" "qn,m,0,rn,0,ln")))
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,q,r,r,Yp,r,r")
+        (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,q,0,r,Yp,rm,r")
+                 (match_operand:QI 2 "general_operand" "qn,m,0,rn,0,ln,rn,m")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (PLUS, QImode, operands)"
+  "ix86_binary_operator_ok (PLUS, QImode, operands, TARGET_APX_NDD)"
 {
   bool widen = (get_attr_mode (insn) != MODE_QI);
-
+  bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD;
   switch (get_attr_type (insn))
     {
     case TYPE_LEA:
       return "#";

     case TYPE_INCDEC:
-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (operands[2] == const1_rtx)
-        return widen ? "inc{l}\t%k0" : "inc{b}\t%0";
+        if (use_ndd)
+          return "inc{b}\t{%1, %0|%0, %1}";
+        else
+          return widen ? "inc{l}\t%k0" : "inc{b}\t%0";
       else
         {
           gcc_assert (operands[2] == constm1_rtx);
-          return widen ? "dec{l}\t%k0" : "dec{b}\t%0";
+          if (use_ndd)
+            return "dec{b}\t{%1, %0|%0, %1}";
+          else
+            return widen ? "dec{l}\t%k0" : "dec{b}\t%0";
         }

     default:
@@ -6597,21 +6613,23 @@ (define_insn "*addqi_1"
       if (which_alternative == 2 || which_alternative == 4)
         std::swap (operands[1], operands[2]);

-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (x86_maybe_negate_const_int (&operands[2], QImode))
         {
-          if (widen)
-            return "sub{l}\t{%2, %k0|%k0, %2}";
+          if (use_ndd)
+            return "sub{b}\t{%2, %1, %0|%0, %1, %2}";
           else
-            return "sub{b}\t{%2, %0|%0, %2}";
+            return widen ? "sub{l}\t{%2, %k0|%k0, %2}"
+                         : "sub{b}\t{%2, %0|%0, %2}";
         }
-      if (widen)
-        return "add{l}\t{%k2, %k0|%k0, %k2}";
+      if (use_ndd)
+        return "add{b}\t{%2, %1, %0|%0, %1, %2}";
       else
-        return "add{b}\t{%2, %0|%0, %2}";
+        return widen ? "add{l}\t{%k2, %k0|%k0, %k2}"
+                     : "add{b}\t{%2, %0|%0, %2}";
     }
 }
-  [(set (attr "type")
+  [(set_attr "isa" "*,*,*,*,*,*,apx_ndd,apx_ndd")
+   (set (attr "type")
      (cond [(eq_attr "alternative" "5")
               (const_string "lea")
             (match_operand:QI 2 "incdec_operand")
@@ -6623,7 +6641,7 @@ (define_insn "*addqi_1"
            (and (eq_attr "type" "alu") (match_operand 2 "const128_operand"))
               (const_string "1")
               (const_string "*")))
-   (set_attr "mode" "QI,QI,QI,SI,SI,SI")
+   (set_attr "mode" "QI,QI,QI,SI,SI,SI,QI,QI")
    ;; Potential partial reg stall on alternatives 3 and 4.
    (set (attr "preferred_for_speed")
      (cond [(eq_attr "alternative" "3,4")
diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c
new file mode 100644
index 00000000000..056a323a647
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapxf -march=x86-64 -O2" } */
+/* { dg-final { scan-assembler-not "movl"} } */
+
+int foo (int *a)
+{
+  int b = *a - 1;
+  return b;
+}
+
+int foo2 (int a, int b)
+{
+  int c = a + b;
+  return c;
+}
+
+int foo3 (int *a, int b)
+{
+  int c = *a + b;
+  return c;
+}

From patchwork Wed Dec 6 08:06:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 174396
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 02/16] [APX NDD] Support APX NDD for optimization patterns of add
Date: Wed, 6 Dec 2023 16:06:22 +0800
Message-Id: <20231206080636.178863-3-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com>
References: <20231206080636.178863-1-hongyu.wang@intel.com>

From: Kong Lingling

gcc/ChangeLog:

	* config/i386/i386.md (addsi_1_zext): Add new alternatives for
	NDD and adjust output templates.
	(*add_2): Likewise.
	(*addsi_2_zext): Likewise.
	(*add_3): Likewise.
	(*addsi_3_zext): Likewise.
	(*adddi_4): Likewise.
	(*add_4): Likewise.
	(*add_5): Likewise.
	(*addv4): Likewise.
	(*addv4_1): Likewise.
	(*add3_cconly_overflow_1): Likewise.
	(*add3_cc_overflow_1): Likewise.
	(*addsi3_zext_cc_overflow_1): Likewise.
	(*add3_cconly_overflow_2): Likewise.
	(*add3_cc_overflow_2): Likewise.
	(*addsi3_zext_cc_overflow_2): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add more tests.
---
 gcc/config/i386/i386.md                 | 310 +++++++++++++++---------
 gcc/testsuite/gcc.target/i386/apx-ndd.c |  53 ++--
 2 files changed, 232 insertions(+), 131 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a5b123a51bd..1e846183347 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -6479,13 +6479,15 @@ (define_insn "*add_1"
 ;; patterns constructed from addsi_1 to match.

 (define_insn "addsi_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r,r,r")
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r,r")
         (zero_extend:DI
-          (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r")
-                   (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,le"))))
+          (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r,r,rm")
+                   (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,le,rBMe,re"))))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
+  "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands,
+                                            TARGET_APX_NDD)"
 {
+  bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD;
   switch (get_attr_type (insn))
     {
     case TYPE_LEA:
@@ -6493,11 +6495,13 @@ (define_insn "addsi_1_zext"

     case TYPE_INCDEC:
       if (operands[2] == const1_rtx)
-        return "inc{l}\t%k0";
+        return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}"
+                       : "inc{l}\t%k0";
       else
         {
           gcc_assert (operands[2] == constm1_rtx);
-          return "dec{l}\t%k0";
+          return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}"
+                         : "dec{l}\t%k0";
         }

     default:
@@ -6507,12 +6511,15 @@ (define_insn "addsi_1_zext"
         std::swap (operands[1], operands[2]);

       if (x86_maybe_negate_const_int (&operands[2], SImode))
-        return "sub{l}\t{%2, %k0|%k0, %2}";
+        return use_ndd ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}"
+                       : "sub{l}\t{%2, %k0|%k0, %2}";

-      return "add{l}\t{%2, %k0|%k0, %2}";
+      return use_ndd ? "add{l}\t{%2, %1, %k0|%k0, %1, %2}"
+                     : "add{l}\t{%2, %k0|%k0, %2}";
     }
 }
-  [(set (attr "type")
+  [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd")
+   (set (attr "type")
      (cond [(eq_attr "alternative" "2")
               (const_string "lea")
             (match_operand:SI 2 "incdec_operand")
@@ -6814,37 +6821,42 @@ (define_insn "*add_2"
   [(set (reg FLAGS_REG)
         (compare
           (plus:SWI
-            (match_operand:SWI 1 "nonimmediate_operand" "%0,0,")
-            (match_operand:SWI 2 "" ",,0"))
+            (match_operand:SWI 1 "nonimmediate_operand" "%0,0,,rm,r")
+            (match_operand:SWI 2 "" ",,0,r,"))
           (const_int 0)))
-   (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,")
+   (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,,r,r")
         (plus:SWI (match_dup 1) (match_dup 2)))]
   "ix86_match_ccmode (insn, CCGOCmode)
-   && ix86_binary_operator_ok (PLUS, mode, operands)"
+   && ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)"
 {
+  bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD;
   switch (get_attr_type (insn))
     {
     case TYPE_INCDEC:
       if (operands[2] == const1_rtx)
-        return "inc{}\t%0";
+        return use_ndd ? "inc{}\t{%1, %0|%0, %1}"
+                       : "inc{}\t%0";
       else
         {
           gcc_assert (operands[2] == constm1_rtx);
-          return "dec{}\t%0";
+          return use_ndd ? "dec{}\t{%1, %0|%0, %1}"
+                         : "dec{}\t%0";
         }

     default:
       if (which_alternative == 2)
         std::swap (operands[1], operands[2]);

-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (x86_maybe_negate_const_int (&operands[2], mode))
-        return "sub{}\t{%2, %0|%0, %2}";
+        return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}"
+                       : "sub{}\t{%2, %0|%0, %2}";

-      return "add{}\t{%2, %0|%0, %2}";
+      return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}"
+                     : "add{}\t{%2, %0|%0, %2}";
     }
 }
-  [(set (attr "type")
+  [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd")
+   (set (attr "type")
      (if_then_else (match_operand:SWI 2 "incdec_operand")
         (const_string "incdec")
         (const_string "alu")))
@@ -6859,23 +6871,26 @@ (define_insn "*add_2"
 (define_insn "*addsi_2_zext"
   [(set (reg FLAGS_REG)
         (compare
-          (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r")
-                   (match_operand:SI 2 "x86_64_general_operand" "rBMe,0"))
+          (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r,rm")
+                   (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,rBMe,re"))
           (const_int 0)))
-   (set (match_operand:DI 0 "register_operand" "=r,r")
+   (set (match_operand:DI 0 "register_operand" "=r,r,r,r")
         (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))]
   "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode)
-   && ix86_binary_operator_ok (PLUS, SImode, operands)"
+   && ix86_binary_operator_ok (PLUS, SImode, operands, TARGET_APX_NDD)"
 {
+  bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD;
   switch (get_attr_type (insn))
     {
     case TYPE_INCDEC:
       if (operands[2] == const1_rtx)
-        return "inc{l}\t%k0";
+        return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}"
+                       : "inc{l}\t%k0";
       else
         {
           gcc_assert (operands[2] == constm1_rtx);
-          return "dec{l}\t%k0";
+          return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}"
+                         : "dec{l}\t%k0";
         }

     default:
@@ -6883,12 +6898,15 @@ (define_insn "*addsi_2_zext"
         std::swap (operands[1], operands[2]);

       if (x86_maybe_negate_const_int (&operands[2], SImode))
-        return "sub{l}\t{%2, %k0|%k0, %2}";
+        return use_ndd ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}"
+                       : "sub{l}\t{%2, %k0|%k0, %2}";

-      return "add{l}\t{%2, %k0|%k0, %2}";
+      return use_ndd ? "add{l}\t{%2, %1, %k0|%k0, %1, %2}"
+                     : "add{l}\t{%2, %k0|%k0, %2}";
     }
 }
-  [(set (attr "type")
+  [(set_attr "isa" "*,*,apx_ndd,apx_ndd")
+   (set (attr "type")
      (if_then_else (match_operand:SI 2 "incdec_operand")
         (const_string "incdec")
         (const_string "alu")))
@@ -6902,35 +6920,40 @@ (define_insn "*addsi_2_zext"
 (define_insn "*add_3"
   [(set (reg FLAGS_REG)
         (compare
-          (neg:SWI (match_operand:SWI 2 "" ",0"))
-          (match_operand:SWI 1 "nonimmediate_operand" "%0,")))
-   (clobber (match_scratch:SWI 0 "=,"))]
+          (neg:SWI (match_operand:SWI 2 "" ",0,,re"))
+          (match_operand:SWI 1 "nonimmediate_operand" "%0,,r,rm")))
+   (clobber (match_scratch:SWI 0 "=,,r,r"))]
   "ix86_match_ccmode (insn, CCZmode)
    && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
 {
+  bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD;
   switch (get_attr_type (insn))
     {
     case TYPE_INCDEC:
       if (operands[2] == const1_rtx)
-        return "inc{}\t%0";
+        return use_ndd ? "inc{}\t{%1, %0|%0, %1}"
+                       : "inc{}\t%0";
       else
         {
           gcc_assert (operands[2] == constm1_rtx);
-          return "dec{}\t%0";
+          return use_ndd ? "dec{}\t{%1, %0|%0, %1}"
+                         : "dec{}\t%0";
         }

     default:
       if (which_alternative == 1)
         std::swap (operands[1], operands[2]);

-      gcc_assert (rtx_equal_p (operands[0], operands[1]));
       if (x86_maybe_negate_const_int (&operands[2], mode))
-        return "sub{}\t{%2, %0|%0, %2}";
+        return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}"
+                       : "sub{}\t{%2, %0|%0, %2}";

-      return "add{}\t{%2, %0|%0, %2}";
+      return use_ndd ?
"add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6945,22 +6968,23 @@ (define_insn "*add_3" (define_insn "*addsi_3_zext" [(set (reg FLAGS_REG) (compare - (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rBMe,0")) - (match_operand:SI 1 "nonimmediate_operand" "%0,r"))) - (set (match_operand:DI 0 "register_operand" "=r,r") + (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,rBMe,re")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,r,rm"))) + (set (match_operand:DI 0 "register_operand" "=r,r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCZmode) - && ix86_binary_operator_ok (PLUS, SImode, operands)" + && ix86_binary_operator_ok (PLUS, SImode, operands, TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{l}\t%k0"; + return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}" : "inc{l}\t%k0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{l}\t%k0"; + return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}" : "dec{l}\t%k0"; } default: @@ -6968,12 +6992,15 @@ (define_insn "*addsi_3_zext" std::swap (operands[1], operands[2]); if (x86_maybe_negate_const_int (&operands[2], SImode)) - return "sub{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sub{l}\t{%2, %k0|%k0, %2}"; - return "add{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"add{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "add{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6994,31 +7021,35 @@ (define_insn "*addsi_3_zext" (define_insn "*adddi_4" [(set (reg FLAGS_REG) (compare - (match_operand:DI 1 "nonimmediate_operand" "0") - (match_operand:DI 2 "x86_64_immediate_operand" "e"))) - (clobber (match_scratch:DI 0 "=r"))] + (match_operand:DI 1 "nonimmediate_operand" "0,rm") + (match_operand:DI 2 "x86_64_immediate_operand" "e,e"))) + (clobber (match_scratch:DI 0 "=r,r"))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGCmode)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == constm1_rtx) - return "inc{q}\t%0"; + return use_ndd ? "inc{q}\t{%1, %0|%0, %1}" : "inc{q}\t%0"; else { gcc_assert (operands[2] == const1_rtx); - return "dec{q}\t%0"; + return use_ndd ? "dec{q}\t{%1, %0|%0, %1}" : "dec{q}\t%0"; } default: if (x86_maybe_negate_const_int (&operands[2], DImode)) - return "add{q}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{q}\t{%2, %1, %0|%0, %1, %2}" + : "add{q}\t{%2, %0|%0, %2}"; - return "sub{q}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sub{q}\t{%2, %1, %0|%0, %1, %2}" + : "sub{q}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") (if_then_else (match_operand:DI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -7039,30 +7070,36 @@ (define_insn "*adddi_4" (define_insn "*add_4" [(set (reg FLAGS_REG) (compare - (match_operand:SWI124 1 "nonimmediate_operand" "0") + (match_operand:SWI124 1 "nonimmediate_operand" "0,rm") (match_operand:SWI124 2 "const_int_operand"))) - (clobber (match_scratch:SWI124 0 "="))] + (clobber (match_scratch:SWI124 0 "=,r"))] "ix86_match_ccmode (insn, CCGCmode)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == constm1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == const1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (x86_maybe_negate_const_int (&operands[2], mode)) - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") (if_then_else (match_operand: 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -7077,36 +7114,41 @@ (define_insn "*add_5" [(set (reg FLAGS_REG) (compare (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,") - (match_operand:SWI 2 "" ",0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,,r,rm") + (match_operand:SWI 2 "" ",0,,re")) (const_int 0))) - (clobber (match_scratch:SWI 0 "=,"))] + (clobber (match_scratch:SWI 0 "=,,r,r"))] "ix86_match_ccmode (insn, CCGOCmode) && !(MEM_P (operands[1]) && MEM_P (operands[2]))" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (which_alternative == 1) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -7319,35 +7361,43 @@ (define_insn "*addv4" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) (sign_extend: - (match_operand:SWI 2 "" "We,m"))) + (match_operand:SWI 2 "" "We,m,rWe,m"))) (sign_extend: (plus:SWI (match_dup 1) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "addv4_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (match_operand: 3 "const_int_operand")) (sign_extend: (plus:SWI (match_dup 1) - (match_operand:SWI 2 "x86_64_immediate_operand" ""))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (match_operand:SWI 2 "x86_64_immediate_operand" ","))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[3])" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, 
%0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (cond [(match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -9190,27 +9240,36 @@ (define_insn "*add3_cconly_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0") - (match_operand:SWI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,r,rm") + (match_operand:SWI 2 "" ",,re")) (match_dup 1))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r,r"))] "!(MEM_P (operands[1]) && MEM_P (operands[2]))" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "@add3_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (match_dup 1))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_peephole2 @@ -9255,55 +9314,74 @@ (define_insn "*addsi3_zext_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,rm") + 
(match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")) (match_dup 1))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "add{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + TARGET_APX_NDD)" + "@ + add{l}\t{%2, %k0|%k0, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn "*add3_cconly_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0") - (match_operand:SWI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,r,rm") + (match_operand:SWI 2 "" ",,re")) (match_dup 2))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r,r"))] "!(MEM_P (operands[1]) && MEM_P (operands[2]))" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*add3_cc_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (match_dup 2))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, 
%2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*addsi3_zext_cc_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")) (match_dup 2))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "add{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + TARGET_APX_NDD)" + "@ + add{l}\t{%2, %k0|%k0, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn_and_split "*add3_doubleword_cc_overflow_1" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 056a323a647..c1049022f2a 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -2,20 +2,43 @@ /* { dg-options "-mapxf -march=x86-64 -O2" } */ /* { dg-final { scan-assembler-not "movl"} } */ -int foo (int *a) -{ - int b = *a - 1; - return b; -} +#define FOO(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo_##OP_NAME##_##TYPE (TYPE *a) \ +{ \ + TYPE b = *a OP 1; \ + return b; \ +} -int foo2 (int a, int b) -{ - int c = a + b; - return c; -} +#define FOO1(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo1_##OP_NAME##_##TYPE (TYPE a, TYPE b) \ +{ \ + TYPE c = a OP b; \ + return c; \ +} + +#define FOO2(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ +{ \ + TYPE c = *a OP b; \ + 
return c; \ +} + +FOO (char, add, +) +FOO1 (char, add, +) +FOO2 (char, add, +) +FOO (short, add, +) +FOO1 (short, add, +) +FOO2 (short, add, +) +FOO (int, add, +) +FOO1 (int, add, +) +FOO2 (int, add, +) +FOO (long, add, +) +FOO1 (long, add, +) +FOO2 (long, add, +) -int foo3 (int *a, int b) -{ - int c = *a + b; - return c; -} From patchwork Wed Dec 6 08:06:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174385 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3954555vqy; Wed, 6 Dec 2023 00:10:03 -0800 (PST) X-Google-Smtp-Source: AGHT+IF4AaqYn6ROhdNlJt5taT3xj1sogXjoYOkwnx8EpwlAZ9gNSsz6su30SoTAzuuHVBrEimRX X-Received: by 2002:a0c:f60a:0:b0:67a:a721:b18c with SMTP id r10-20020a0cf60a000000b0067aa721b18cmr460093qvm.71.1701850203236; Wed, 06 Dec 2023 00:10:03 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850203; cv=pass; d=google.com; s=arc-20160816; b=n6hUtKwZGsQkcvSQsr19ExLfdCA/zB8zAIMRSxeByPPFLDuZUmqXalVkWWxO4NbsDS bN3etyHrA+q7impjzHJTRmtTUlrTXf8W6OCOM+g8C1JHE9wBaANKROMjJq06A9SPrBxk ZtPPaP54/+rKoMIhlKqm8yBTMCWdJCSBL4JXhvfFjURg8GMYcUe31BD73dIpsUoThEK3 7Aa6ty26fnGknkkG5YvO1nHZ79kof8bxK2WldjMpHBucbtXsduqLCHBRFuX0Te+lPwpK sKap98ti/dYYm6Ee6M8BhyC0PTu9SNpKjlh8mw4sVkjFGkeibf39+cu81ITPql2Pmcyb m8wA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=+NqvhAe3TDML1urkH80iEDa4FnrsTPmcK9NIi976JrA=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=ZsEZuBHVrfSAdEf/e1FkPgm/Qo+uKokugOMC0KwvfD+my1rhrQm5K6GLYPs4wNsL9d P+yyz/BdP1Mikxs70RwfKxTQuervevSW5NGscZTs6BsOGxc/YY8auxC7tVEs4Him+xwo 
x6nRm03QteLFYfofgWZ9WSqk0EmtaYgMMCTWDbC3Y+jWQWqkbPOkB9Rw0mkxefgAe5Jc SFn59GOS6gSDhhd9oZehsE2rGUnYivbhpHNMeGklB8M7RIemxzLJ1DJ0r4Sw6gvl8R/s cze/opue9aunrzhc+njIFyKt2AYHBSVXTaptf0Y6OFLWx8vGdaUkrYtdk1UaGaXNhYpn FDfg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=f1B9CGtq; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id l10-20020a0ce6ca000000b0067a235c49dfsi13291000qvn.542.2023.12.06.00.10.03 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:10:03 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=f1B9CGtq; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 79117384F987 for ; Wed, 6 Dec 2023 08:09:45 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id 9BD413858020 for ; Wed, 6 Dec 2023 08:08:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9BD413858020 
Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 9BD413858020 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850124; cv=none; b=ubt5swoOLb9QOEA8w0qiaTgk6qmefKvfHUdOUtG0goNNW96X2OLU3pOe87Rbw6/AvqVNCP5iL5UKHJOJGxgPX5e/3IIlQtsWx0U9rdJY3o4AvwR2XQxuh4wcSyUSPqGmbpq/pPrE/pMtiTo621kCL0gOdWb0ef0QuKIiqRGK/K8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850124; c=relaxed/simple; bh=VmodS8O6oqWK9x2p7cugr2PrJmtY8up3GEdC6M/YPXg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=wcJwVwUH7+GAjHlXkSuUURpMvuIt2y8dr9L2+LE+9lo7hAEtBdNtxQ203BCx9wdfwNelV2GgA2O857UOaLaferdjiuRDKsfuyNzU4gZ81mCUkdco+RrXtpNItjLMR1nP8X31+tW5NHuZKUEqDOxKpYrbDtUeO0QC5zvfb/RhzGU= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850123; x=1733386123; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VmodS8O6oqWK9x2p7cugr2PrJmtY8up3GEdC6M/YPXg=; b=f1B9CGtqD0zy7qZtH0J5jphCHNoGKfmqKlb0MuHRrbe4MUPaBcjm3wND bgbktstpeSoSwOUUM9k4kbvIfftObYu4WmbF1E1AaGez9dCllPvt2P5Du xar03MHgGA8RAfQQXrmAAa/HdewYar7QKAQIm0KGWW/VjJQ4sqU3TtoNf 4Dr+pCRf01kMasNSR/aSH9GpRw9BPs4ZUPFTMPINy4+uD+Mh8VIr+XlSe u+xRAN14bjRGHBvPMVOhVxT4a8BggVLI8mAtn9sh3OidAz2Uca1vu+XCe mNXCuIJFweZ7OBGeH/VLtQAuJ3znBDpFC4ERIWHrPTSdaRuIdC1fwcY1h g==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085453" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085453" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:40 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1102737736" 
X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1102737736" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga005.fm.intel.com with ESMTP; 06 Dec 2023 00:08:37 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 8DA25100568F; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 03/16] [APX NDD] Disable seg_prefixed memory usage for NDD add Date: Wed, 6 Dec 2023 16:06:23 +0800 Message-Id: <20231206080636.178863-4-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519278604015348 X-GMAIL-MSGID: 1784519278604015348

NDD uses the EVEX prefix, so when a segment prefix is also applied, the instruction can exceed the 15-byte length limit, especially when an immediate is added. This can happen when the "e" constraint accepts an UNSPEC_TPOFF/UNSPEC_NTPOFF constant: the offset is then added to the segment register, which is encoded using a segment prefix. Disable the use of these *POFF constants in the NDD add alternatives with a new constraint.
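The length concern can be made concrete with a rough byte count. The sketch below is only an illustration: the per-field sizes are an assumed worst-case encoding of an NDD add of a 32-bit immediate to an absolute, segment-prefixed memory operand, and the helper names are hypothetical, not part of the patch.

```c
#include <stdbool.h>

/* x86 instructions may not be longer than 15 bytes.  */
enum { X86_MAX_INSN_BYTES = 15 };

/* Assumed field sizes, for illustration only (not taken from the patch).  */
enum {
  SEG_PREFIX_BYTES  = 1,  /* e.g. an %fs override */
  EVEX_PREFIX_BYTES = 4,  /* 0x62 plus three payload bytes */
  OPCODE_BYTES      = 1,
  MODRM_BYTES       = 1,
  SIB_BYTES         = 1,  /* absolute disp32 addressing in 64-bit mode */
  DISP32_BYTES      = 4,
  IMM32_BYTES       = 4   /* the sym@tpoff immediate */
};

/* Hypothetical helper: encoded length of an NDD add of a 32-bit immediate
   to an absolute memory operand, with or without a segment override.  */
int
ndd_add_imm_mem_length (bool has_segment_prefix)
{
  int len = EVEX_PREFIX_BYTES + OPCODE_BYTES + MODRM_BYTES + SIB_BYTES
            + DISP32_BYTES + IMM32_BYTES;
  if (has_segment_prefix)
    len += SEG_PREFIX_BYTES;
  return len;
}

/* Return true when LEN fits within the architectural length limit.  */
bool
insn_length_ok (int len)
{
  return len <= X86_MAX_INSN_BYTES;
}
```

Under these assumptions the form without a segment override sits exactly at the limit, and the override pushes it one byte over, which is the case the new constraint is meant to exclude.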
gcc/ChangeLog:

	* config/i386/constraints.md (je): New constraint.
	* config/i386/i386-protos.h (x86_poff_operand_p): New prototype.
	* config/i386/i386.cc (x86_poff_operand_p): New function to check
	any *POFF constant in operand.
	* config/i386/i386.md (*add_1): Split out je alternative for add.
---
 gcc/config/i386/constraints.md |  5 +++++
 gcc/config/i386/i386-protos.h  |  1 +
 gcc/config/i386/i386.cc        | 25 +++++++++++++++++++++++++
 gcc/config/i386/i386.md        |  8 ++++----
 4 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index cbee31fa40a..f4c3c3dd952 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -433,3 +433,8 @@ (define_address_constraint "jb"
 (define_register_constraint "jc"
  "TARGET_APX_EGPR && !TARGET_AVX ? GENERAL_GPR16 : GENERAL_REGS")
+
+(define_constraint "je"
+  "@internal Constant that does not allow any unspec global offsets"
+  (and (match_operand 0 "x86_64_immediate_operand")
+       (match_test "!x86_poff_operand_p (op)")))
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index a9d0c568bba..7dfeb6af225 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -66,6 +66,7 @@ extern bool x86_extended_QIreg_mentioned_p (rtx_insn *);
 extern bool x86_extended_reg_mentioned_p (rtx);
 extern bool x86_extended_rex2reg_mentioned_p (rtx);
 extern bool x86_evex_reg_mentioned_p (rtx [], int);
+extern bool x86_poff_operand_p (rtx);
 extern bool x86_maybe_negate_const_int (rtx *, machine_mode);
 extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx);
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 7c5cab4e2c6..8aa33aef7e1 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -23331,6 +23331,31 @@ x86_evex_reg_mentioned_p (rtx operands[], int nops)
   return false;
 }
+/* Return true when rtx operand contains any UNSPEC_*POFF related
+   constant, which APX_NDD alternatives must reject so as not to exceed the
encoding length limit. */ +bool +x86_poff_operand_p (rtx operand) +{ + if (GET_CODE (operand) == CONST) + { + rtx op = XEXP (operand, 0); + if (GET_CODE (op) == PLUS) + op = XEXP (op, 0); + + if (GET_CODE (op) == UNSPEC) + { + int unspec = XINT (op, 1); + return (unspec == UNSPEC_NTPOFF + || unspec == UNSPEC_TPOFF + || unspec == UNSPEC_DTPOFF + || unspec == UNSPEC_GOTTPOFF + || unspec == UNSPEC_GOTNTPOFF + || unspec == UNSPEC_INDNTPOFF); + } + } + return false; +} + /* If profitable, negate (without causing overflow) integer constant of mode MODE at location LOC. Return true in this case. */ bool diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 1e846183347..a1626121227 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6418,10 +6418,10 @@ (define_insn_and_split "*add3_doubleword_concat_zext" "split_double_mode (mode, &operands[0], 1, &operands[0], &operands[5]);") (define_insn "*add_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r,r,r") (plus:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r") - (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,re,BM"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r,m,r") + (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,r,e,je,BM"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" @@ -6457,7 +6457,7 @@ (define_insn "*add_1" : "add{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd") + [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd,apx_ndd,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") From patchwork Wed Dec 6 08:06:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174390 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 
2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3954695vqy; Wed, 6 Dec 2023 00:10:22 -0800 (PST) X-Google-Smtp-Source: AGHT+IE9bUoZMb4Miv2zeJgbmwN53sDwhHrzdDWrzxgIwl9qUvN3d+gmOm3HGv8CttJlxcDH2S7o X-Received: by 2002:a05:620a:2702:b0:77d:c0dd:5ba1 with SMTP id b2-20020a05620a270200b0077dc0dd5ba1mr638593qkp.66.1701850221904; Wed, 06 Dec 2023 00:10:21 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850221; cv=pass; d=google.com; s=arc-20160816; b=gENkS5Q29WBMyffmT+0Sjl0/6GkNU82F0PCFu1jLxDpI5229T/i/+nTX5w0Pa6rhaB w22GzUY/0oHVYBZVRL4ivuv8keTqJY5y7JkyOSY29W7aQib+ds0vC7dv8swvvE49viC+ U+nUeql+EqF/YvJJcc/dqHk0zunp72GJ40lEU14PmL9Y+9oDQdmEwoY33l5bGizLLejb YReBiXj/u+BQhg29628SlvDeR6d1y+MAV//+vJnJaDCvhOQ1q/90WMwrH8eo0yRU2K4x x+UQ/dLHcsTGvAtCu6618viDNedR4zsn19Vg1hEev4lk8aJfLbe62ep3hJVnIJzkXBVM DC8A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=4wxpNXYWGkuhGkOuQJgtaOijkl1oPWB1y5jAMD/uWzQ=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=C5XyOPo+0avTw3jgdFipn+OQCjX3BBuOqrtbdK3Tf867IfDUkorSUZeVvuHYhtKa7u k8SYiLW7ZaogOtnLFPg3jtGAnxVSHNePgDbol4OLDOlqQt+IwZ7iQR8nGT6Zn+Xoh4lO i+d4Ts6AVMT7bQuNuCwWb9xEGfO4ADNSMJLpk0o7SkcpCKC57mJSdxvc4Lr0px2auPsi WWi5CM/rthFMRbpNDdSTCCZV9R511SFa/TPOAzeuhZpmysKsVQ7oiuGPf1/WuIJBPuwt GnljLVswm9M1NMO5NLFtQ+eDi3LjsClpSxVlqANkOB3z9VyyuJsis28IWoE1F+tEBljf mByw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QhBQ2xST; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com 
Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id ee26-20020a05620a801a00b0076f18ad2e65si14380743qkb.520.2023.12.06.00.10.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:10:21 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QhBQ2xST; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9076E384F00C for ; Wed, 6 Dec 2023 08:09:58 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id A6B1D38618A2 for ; Wed, 6 Dec 2023 08:08:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A6B1D38618A2 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A6B1D38618A2 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850127; cv=none; b=hN0ziUPK1vKZf8S+EC9d0Mj3D1lPrrJwmV7Zjk/51OboxQ3+P2y1ks1Mbkr5OrIKbIyyY7roehlB68YiX+DWPOF4fNfUwfyy9Kv9pGkIm63qAPfJt+WPumGrXuJ49kXp0R3sqcc53WFStlONHzJZMHqm8x1Td57ynGowbppeGq4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; 
t=1701850127; c=relaxed/simple; bh=xwFpZ/bUe3/J66FeQnuQOTEsdfVC11uuVndefPpf8C8=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=KUuaR1Gh4Mq+i8UTqYr43cilUXKx9HMuyoyVcAOV1WDZeG4Ji+6YPPwTsRGIKiddwK7uAJJVQgZYeQ7OUFVRh1dbeX+cGNTBQ99zI94cuPh+FgYo/VoyIvbP0Jh5UNuDcCOOG5Sr7sQVz8dMjQOdlKCBSyHv73OJ1UzNlYwD13A= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850125; x=1733386125; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xwFpZ/bUe3/J66FeQnuQOTEsdfVC11uuVndefPpf8C8=; b=QhBQ2xSTZjC9lTQsgU1LTTALymgcIN0owEWXqoAEWNYOjD5JMa8GI3oj R9Vn6rjHcQ28LDBAAb/irpZ2J4xLZcyzCRuRwmL+LNrH57VO45k/uSVsJ yFdC9cz9ZqdM8Z0UFrDbK5As4nebgiLwadluognkfmzJmMyMN/KsP24C9 TCJ1LmpR/qD5nm3bxUrFiQxK92/ExvFBW3WuyQdMsTeaNeDFkMoo7R2O+ X9w4nel6TEFBhMdVj4A9Kba4w5InDRhyQ1A31gbmLE2RW/6HWCEiQg5NJ GjjwYHUtRNzRO8UkBWuILo9JljY72tHZpK++kRzSqEBPpQztINE1fuK+w g==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085457" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085457" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1102737747" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1102737747" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga005.fm.intel.com with ESMTP; 06 Dec 2023 00:08:37 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 8F8C61005129; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 04/16] [APX NDD] Support APX NDD for adc insns Date: Wed, 6 Dec 2023 16:06:24 +0800 Message-Id: 
<20231206080636.178863-5-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519298021578766 X-GMAIL-MSGID: 1784519298021578766 From: Kong Lingling Legacy adc patterns are commonly used to implement TImode add. When TImode add is extended to the NDD form, operands[0] and operands[1] can differ, so an extra move must be emitted in those patterns that take a shortcut when adding const0_rtx. For TImode insns, operands[0] and operands[1] can also overlap in registers, because x86 allocates TImode registers sequentially, e.g. rax:rdi, rdi:rdx. After the post-reload split of a TImode insn, the write to the first highpart rdi is overridden by the second lowpart rdi if that lowpart has a different source as input; the first write is then lost, which causes a miscompilation. In addition, when the input operands contain memory, the address register may also overlap with the dest register if it is marked dead after one of the highpart/lowpart operations is done. So the earlyclobber modifier '&' should be added to the NDD dest to avoid overlap between dest and src operands.
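As a rough illustration of why overlap between dest and src matters once a doubleword add has been split (this is not part of the patch; the tiny register model and the names split_add128 and run_split are hypothetical, and the sequence is a simplified instance of the hazard rather than the exact one seen in the compiler), consider a three-operand low-half add followed by a high-half add-with-carry:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical four-register file, for illustration only.  */
enum { RAX, RDI, RDX, RCX, NREGS };

/* Model of a 128-bit add after it has been split into a low-half add
   and a high-half add-with-carry, both in three-operand (NDD-like)
   form: dst = src + addend.  Registers are indices into regs[].  */
static void split_add128(uint64_t regs[NREGS],
                         int dst_lo, int dst_hi,
                         int src_lo, int src_hi,
                         uint64_t add_lo, uint64_t add_hi)
{
    assert(dst_lo != dst_hi && src_lo != src_hi);

    uint64_t lo = regs[src_lo] + add_lo;
    unsigned carry = lo < add_lo;   /* unsigned wrap-around means carry out */

    regs[dst_lo] = lo;              /* low half is written first ...       */
    /* ... so if dst_lo == src_hi, the source high half read here has
       already been clobbered by the write above.  */
    regs[dst_hi] = regs[src_hi] + add_hi + carry;
}

/* Add 1 to the 128-bit value held in rax (low) : rdi (high) and return
   the high half of the result as found in the chosen dest pair.  */
static uint64_t run_split(int dst_lo, int dst_hi)
{
    uint64_t regs[NREGS] = {0};
    regs[RAX] = UINT64_MAX;         /* source low half: add 1 -> carry */
    regs[RDI] = 5;                  /* source high half                */
    split_add128(regs, dst_lo, dst_hi, RAX, RDI, 1, 0);
    return regs[dst_hi];
}
```

In this model, run_split (RDX, RCX) uses a dest pair disjoint from the source and yields the expected high half 6, while run_split (RDI, RDX) lets the dest low half share rdi with the source high half and yields 1. Forcing dest and src apart, which is what the earlyclobber '&' asks of the register allocator, rules out the second case.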
NDD instructions will automatically zero-extend dest register to 64bit, so for zext patterns it can adopt all NDD form that have memory src input. gcc/ChangeLog: * config/i386/i386.md (*add3_doubleword): Add ndd alternatives, adopt '&' to ndd dest and move operands[1] to operands[0] when they are not equal. (*add3_doubleword_cc_overflow_1): Likewise. (*addv4_doubleword): Likewise. (*addv4_doubleword_1): Likewise. (*add3_doubleword_zext): Likewise. (addv4_overflow_1): Add ndd alternatives. (*addv4_overflow_2): Likewise. (@add3_carry): Likewise. (*add3_carry_0): Likewise. (*addsi3_carry_zext): Likewise. (addcarry): Likewise. (addcarry_0): Likewise. (*addcarry_1): Likewise. (*add3_eq): Likewise. (*add3_ne): Likewise. (*addsi3_carry_zext_0): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-adc.c: New test. --- gcc/config/i386/i386.md | 193 ++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd-adc.c | 15 ++ 2 files changed, 136 insertions(+), 72 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-adc.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index a1626121227..8dd8216041e 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6294,12 +6294,12 @@ (define_expand "add3" TARGET_APX_NDD); DONE;") (define_insn_and_split "*add3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r") (plus: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,r"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -6319,24 
+6319,34 @@ (define_insn_and_split "*add3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + /* Under NDD op0 and op1 may not equal, do not delete insn then. */ + bool emit_insn_deleted_note_p = true; + if (!rtx_equal_p (operands[0], operands[1])) + { + emit_move_insn (operands[0], operands[1]); + emit_insn_deleted_note_p = false; + } if (operands[5] != const0_rtx) - ix86_expand_binary_operator (PLUS, mode, &operands[3]); + ix86_expand_binary_operator (PLUS, mode, &operands[3], + TARGET_APX_NDD); else if (!rtx_equal_p (operands[3], operands[4])) emit_move_insn (operands[3], operands[4]); - else + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); DONE; } -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*add3_doubleword_zext" - [(set (match_operand: 0 "nonimmediate_operand" "=r,o") + [(set (match_operand: 0 "nonimmediate_operand" "=r,o,&r,&r") (plus: (zero_extend: - (match_operand:DWIH 2 "nonimmediate_operand" "rm,r")) - (match_operand: 1 "nonimmediate_operand" "0,0"))) + (match_operand:DWIH 2 "nonimmediate_operand" "rm,r,rm,r")) + (match_operand: 1 "nonimmediate_operand" "0,0,r,m"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (UNKNOWN, mode, operands)" + "ix86_binary_operator_ok (UNKNOWN, mode, operands, + TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -6352,7 +6362,8 @@ (define_insn_and_split "*add3_doubleword_zext" (match_dup 4)) (const_int 0))) (clobber (reg:CC FLAGS_REG))])] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*add3_doubleword_concat" [(set (match_operand: 0 "register_operand" "=&r") @@ -7414,14 +7425,14 @@ (define_insn_and_split "*addv4_doubleword" (eq:CCO (plus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" 
"%0,0")) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r")) (sign_extend: - (match_operand: 2 "nonimmediate_operand" "r,o"))) + (match_operand: 2 "nonimmediate_operand" "r,o,r,o"))) (sign_extend: (plus: (match_dup 1) (match_dup 2))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -7451,22 +7462,23 @@ (define_insn_and_split "*addv4_doubleword" (match_dup 5)))])] { split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*addv4_doubleword_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "%0")) - (match_operand: 3 "const_scalar_int_operand" "n")) + (match_operand: 1 "nonimmediate_operand" "%0,rm")) + (match_operand: 3 "const_scalar_int_operand" "n,n")) (sign_extend: (plus: (match_dup 1) - (match_operand: 2 "x86_64_hilo_general_operand" ""))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro") + (match_operand: 2 "x86_64_hilo_general_operand" ","))))) + (set (match_operand: 0 "nonimmediate_operand" "=ro,&r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_SCALAR_INT_P (operands[2]) && rtx_equal_p (operands[2], operands[3])" "#" @@ -7500,11 +7512,14 @@ (define_insn_and_split "*addv4_doubleword_1" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); emit_insn (gen_addv4_1 (operands[3], operands[4], operands[5], operands[5])); DONE; } -}) +} +[(set_attr "isa" "*,apx_ndd")]) 
(define_insn "*addv4_overflow_1" [(set (reg:CCO FLAGS_REG) @@ -7514,9 +7529,9 @@ (define_insn "*addv4_overflow_1" (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0,0"))) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r"))) (sign_extend: - (match_operand:SWI 2 "" "rWe,m"))) + (match_operand:SWI 2 "" "rWe,m,rWe,m"))) (sign_extend: (plus:SWI (plus:SWI @@ -7524,15 +7539,20 @@ (define_insn "*addv4_overflow_1" [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r,r,r") (plus:SWI (plus:SWI (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*addv4_overflow_2" @@ -7543,26 +7563,29 @@ (define_insn "*addv4_overflow_2" (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0"))) - (match_operand: 6 "const_int_operand" "n")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,rm"))) + (match_operand: 6 "const_int_operand" "n,n")) (sign_extend: (plus:SWI (plus:SWI (match_operator:SWI 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)]) (match_dup 1)) - (match_operand:SWI 2 "x86_64_immediate_operand" "e"))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm") + (match_operand:SWI 2 "x86_64_immediate_operand" "e,e"))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") (plus:SWI (plus:SWI 
(match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[6])" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (if_then_else (match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8384,17 +8407,22 @@ (define_insn "*subsi_3_zext" ;; Add with carry and subtract with borrow (define_insn "@add3_carry" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (plus:SWI (match_operator:SWI 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) - (match_operand:SWI 2 "" ","))) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8481,31 +8509,39 @@ (define_insn "*add3_carry_0r" (set_attr "mode" "")]) (define_insn "*addsi3_carry_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (plus:SI (match_operator:SI 3 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "%0")) - (match_operand:SI 
2 "x86_64_general_operand" "rBMe")))) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,rm")) + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "adc{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + TARGET_APX_NDD)" + "@ + adc{l}\t{%2, %k0|%k0, %2} + adc{l}\t{%2, %1, %k0|%k0, %1, %2} + adc{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) (define_insn "*addsi3_carry_zext_0" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (match_operator:SI 2 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "0")))) + (match_operand:SI 1 "nonimmediate_operand" "0,rm")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "adc{l}\t{$0, %k0|%k0, 0}" - [(set_attr "type" "alu") + "@ + adc{l}\t{$0, %k0|%k0, 0} + adc{l}\t{$0, %1, %k0|%k0, %1, 0}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) @@ -8534,20 +8570,25 @@ (define_insn "addcarry" (plus:SWI48 (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0")) - (match_operand:SWI48 2 "nonimmediate_operand" "r,rm"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,rm,r")) + (match_operand:SWI48 2 "nonimmediate_operand" "r,rm,r,m"))) (plus: (zero_extend: (match_dup 2)) (match_operator: 4 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") (plus:SWI48 (plus:SWI48 
(match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8705,7 +8746,8 @@ (define_expand "addcarry_0" (match_dup 1))) (set (match_operand:SWI48 0 "nonimmediate_operand") (plus:SWI48 (match_dup 1) (match_dup 2)))])] - "ix86_binary_operator_ok (PLUS, mode, operands)") + "ix86_binary_operator_ok (PLUS, mode, operands, + TARGET_APX_NDD)") (define_insn "*addcarry_1" [(set (reg:CCC FLAGS_REG) @@ -8715,18 +8757,18 @@ (define_insn "*addcarry_1" (plus:SWI48 (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI48 1 "nonimmediate_operand" "%0")) - (match_operand:SWI48 2 "x86_64_immediate_operand" "e"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,rm")) + (match_operand:SWI48 2 "x86_64_immediate_operand" "e,e"))) (plus: (match_operand: 6 "const_scalar_int_operand") (match_operator: 4 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm") + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") (plus:SWI48 (plus:SWI48 (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_INT_P (operands[2]) /* Check that operands[6] is operands[2] zero extended from mode to mode. 
*/ @@ -8739,8 +8781,11 @@ (define_insn "*addcarry_1" && ((unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (operands[6], 0) == UINTVAL (operands[2])) && CONST_WIDE_INT_ELT (operands[6], 1) == 0))" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "") @@ -9388,12 +9433,12 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o")) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o")) (match_dup 1))) - (set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -9422,6 +9467,8 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); emit_insn (gen_addcarry_0 (operands[3], operands[4], operands[5])); DONE; } @@ -9430,7 +9477,8 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1" operands[5], mode); else operands[6] = gen_rtx_ZERO_EXTEND (mode, operands[5]); -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) ;; x == 0 with zero flag test can be done also as x < 1U with carry flag ;; test, where the latter is preferrable if we have some carry consuming @@ -9445,7 +9493,7 @@ (define_insn_and_split "*add3_eq" (match_operand:SWI 1 "nonimmediate_operand")) (match_operand:SWI 2 ""))) (clobber 
(reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && ix86_pre_reload_split ()" "#" "&& 1" @@ -9469,7 +9517,8 @@ (define_insn_and_split "*add3_ne" "CONST_INT_P (operands[2]) && (mode != DImode || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000)) - && ix86_binary_operator_ok (PLUS, mode, operands) + && ix86_binary_operator_ok (PLUS, mode, operands, + TARGET_APX_NDD) && ix86_pre_reload_split ()" "#" "&& 1" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c b/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c new file mode 100644 index 00000000000..9d5991457da --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c @@ -0,0 +1,15 @@ +/* { dg-do compile { target { int128 && { ! ia32 } } } } */ +/* { dg-options "-mapxf -O2" } */ + +#include "pr91681-1.c" +// *addti3_doubleword +// *addti3_doubleword_zext +// *adddi3_cc_overflow_1 +// *adddi3_carry + +int foo3 (int *a, int b) +{ + int c = *a + b + (a > b); /* { dg-warning "comparison between pointer and integer" } */ + return c; +} +/* { dg-final { scan-assembler-not "xor" } } */ From patchwork Wed Dec 6 08:06:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174399 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3955473vqy; Wed, 6 Dec 2023 00:12:09 -0800 (PST) X-Google-Smtp-Source: AGHT+IGdTc5yXuYSsYpHD7q6HoplVnjpqNhC21nMQ7OKgcbSgLT2VWXw11VnElAZFTcRUb9TOUqk X-Received: by 2002:a81:ae45:0:b0:5d7:1941:ab4 with SMTP id g5-20020a81ae45000000b005d719410ab4mr346373ywk.79.1701850329321; Wed, 06 Dec 2023 00:12:09 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850329; cv=pass; d=google.com; s=arc-20160816; b=zmWW+QvaUCN4Dtrc0X+Fv6iO43EZlNwOg+PuqNdXzQMQ3aTQnSvZgvz6eEXLlnM0Jp Is+lU241ZyR/5ebAnvlW098mFD1pigUQXsaDOiTYSjZLqtH3ZK40M2otd9hBEOmtBgxh 
y/Ch3P03+rEP5Xu8P0jA0w1dot0kenoN5LrqljfgBEwEHoANg1I4543p2CvDF1o6r5gP kuw5HXc0OGu5jThoYn3FSejC5NgYw/m/WMHqKJr8/8GUW3ExGMX/K9i1yGw7DrJvCfno kv3M/KkEIStwISRdKO4BGTg/s1L/bN5tKdLlIt7ZwxYY+Lw4M5GCcyyuClXOMPeN19h3 BsLQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=+ESaU059n5KdNvrdSuXCnApEXpIp0K5gObW7qy0LT6k=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=yus/A9qMxkXUuqyR3r2cg7jEZkCMNBWVJRYkEqJL+Bm4oNoWb5fd/mOh/K9EbGs5WQ LzJPdf4jgy5Hol8lJdeuTkPTZMxYFnOkpdaCfpVo9MX9CHyY0BBzA/C89B9wmBvL+vxm EPvi9L5CfjTQSTVt7zWvXrLUQG9UMzlKcZmTl112IKgOYUB/f3+Je0ms/1ZApzU6KC1m Iv/D9sw4jcmGc26AWa3jKCXU6OzTg6uzRzP2svaKmG9iND55T2C8LJ/8ofCHR3/X44/S XZjBZzfhw2ch6C6h8+kj1ft0P2PzoXzR6Cc+Bf0B6PDxC880GXhc/pYiLjJ5KC2BOb9x fDbA== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kx5+V93x; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id c12-20020a0cd60c000000b0067ab207026esi8736100qvj.128.2023.12.06.00.12.09 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:12:09 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kx5+V93x; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 920F9384DEF1 for ; Wed, 6 Dec 2023 08:11:36 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id C35CC384F98C for ; Wed, 6 Dec 2023 08:08:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C35CC384F98C Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C35CC384F98C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850132; cv=none; b=n/iVAuK+zlCUOwgCNkP5nbRFcZ6AmWGmyOHAs3ZrBetmg4naxKwMcav+fJJWozNuxNlk3YYAHFXAaLfdVxrCB/tfML3OjgUHsbDIy3ESNhvMjrsdqwwdmE4qEARijsZ2Zd/qzEDu03+rqpm9PgCcH+M1KBf33vNArSnBMwiHCG8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850132; c=relaxed/simple; bh=ywVdMzIq0LctSHEa5xoJ5FPi/LeDcraVXkHFmmAVqDE=; 
h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=iW2FPcC1hHFGz9OfsZlLy+GfVNUeX4Dtp6BGfhQ6SoLCO7K63BvAJmBcuAjKceqIK7GlKqJYbpN5RJd4uxYLtg0F03ftxM5vjXgGy5C9rFpB8cdgdLB/vAvWQcrnZiV006F12TKHB53kEQYj1igJ4n6SXhio9fEkEhyo+8A1QPY= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850130; x=1733386130; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ywVdMzIq0LctSHEa5xoJ5FPi/LeDcraVXkHFmmAVqDE=; b=kx5+V93x+aj/rVTaHQxF4usVhROwinyBX+riXzbKI0j+a/LJnYIzSiKF 1vVaW9Ej3j3UoIHBh46qrrvgp4/KgcTuUfi7UE9GISO2HFuex2dXuxJm2 xEUCcF89LPyxQZYleSrFwBlngb97bFNcCdgBHzhe7Is+hBTrZOjNsvP6a 904RFJTRJaNwRgaEmDfKZsRFiaq2FVekS026XcXWbLrsKCDx4rJtWpa1c Izik093T/5124nERrxOTJqr9oM7XbMIf8y2W2kll9WAwSH7VbTbMu33Bf ynjR7/Fo38lbfXSlssGjq1+KWnf0U8GPovm/ORCEg2df+M99OAAKn6tnG Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085470" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085470" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1102737765" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1102737765" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga005.fm.intel.com with ESMTP; 06 Dec 2023 00:08:39 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 92DA5100512E; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 05/16] [APX NDD] Support APX NDD for sub insns Date: Wed, 6 Dec 2023 16:06:25 +0800 Message-Id: <20231206080636.178863-6-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: 
<20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519411124574987 X-GMAIL-MSGID: 1784519411124574987 From: Kong Lingling gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_fixup_binary_operands_no_copy): Add use_ndd parameter and parse it. * config/i386/i386-protos.h (ix86_fixup_binary_operands_no_copy): Change define. * config/i386/i386.md (sub3): Add new alternatives for NDD and adjust output templates. (*sub_1): Likewise. (*sub_2): Likewise. (subv4): Likewise. (*subv4): Likewise. (subv4_1): Likewise. (usubv4): Likewise. (*sub_3): Likewise. (*subsi_1_zext): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternatives. (*subsi_2_zext): Likewise. (*subsi_3_zext): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add test for ndd sub. 
--- gcc/config/i386/i386-expand.cc | 5 +- gcc/config/i386/i386-protos.h | 2 +- gcc/config/i386/i386.md | 155 ++++++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 13 ++ 4 files changed, 120 insertions(+), 55 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 3ecda989cf8..93ecde4b4a8 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1326,9 +1326,10 @@ ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode, void ix86_fixup_binary_operands_no_copy (enum rtx_code code, - machine_mode mode, rtx operands[]) + machine_mode mode, rtx operands[], + bool use_ndd) { - rtx dst = ix86_fixup_binary_operands (code, mode, operands); + rtx dst = ix86_fixup_binary_operands (code, mode, operands, use_ndd); gcc_assert (dst == operands[0]); } diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 7dfeb6af225..481527872e8 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -111,7 +111,7 @@ extern void ix86_expand_vector_move_misalign (machine_mode, rtx[]); extern rtx ix86_fixup_binary_operands (enum rtx_code, machine_mode, rtx[], bool = false); extern void ix86_fixup_binary_operands_no_copy (enum rtx_code, - machine_mode, rtx[]); + machine_mode, rtx[], bool = false); extern void ix86_expand_binary_operator (enum rtx_code, machine_mode, rtx[], bool = false); extern void ix86_expand_vector_logical_operator (enum rtx_code, diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 8dd8216041e..6ec498725aa 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -7777,7 +7777,8 @@ (define_expand "sub3" (minus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand") (match_operand:SDWIM 2 "")))] "" - "ix86_expand_binary_operator (MINUS, mode, operands); DONE;") + "ix86_expand_binary_operator (MINUS, mode, operands, + TARGET_APX_NDD); DONE;") (define_insn_and_split "*sub3_doubleword" [(set (match_operand: 0 
"nonimmediate_operand" "=ro,r") @@ -7803,7 +7804,10 @@ (define_insn_and_split "*sub3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { - ix86_expand_binary_operator (MINUS, mode, &operands[3]); + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + ix86_expand_binary_operator (MINUS, mode, &operands[3], + TARGET_APX_NDD); DONE; } }) @@ -7832,25 +7836,36 @@ (define_insn_and_split "*sub3_doubleword_zext" "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") (define_insn "*sub_1" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ","))) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (minus:SI (match_operand:SI 1 "nonimmediate_operand" "0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands, + TARGET_APX_NDD)" + "@ + 
sub{l}\t{%2, %k0|%k0, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) ;; Alternative 1 is needed to work around LRA limitation, see PR82524. @@ -7941,31 +7956,42 @@ (define_insn "*sub_2" [(set (reg FLAGS_REG) (compare (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subsi_2_zext" [(set (reg FLAGS_REG) (compare - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (minus:SI (match_operand:SI 1 "nonimmediate_operand" "0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (minus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, SImode, operands, + TARGET_APX_NDD)" + "@ + sub{l}\t{%2, %k0|%k0, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" 
"*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn "*subqi_ext_0" @@ -8077,7 +8103,8 @@ (define_expand "subv4" (pc)))] "" { - ix86_fixup_binary_operands_no_copy (MINUS, mode, operands); + ix86_fixup_binary_operands_no_copy (MINUS, mode, operands, + TARGET_APX_NDD); if (CONST_SCALAR_INT_P (operands[2])) operands[4] = operands[2]; else @@ -8088,35 +8115,45 @@ (define_insn "*subv4" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r")) (sign_extend: - (match_operand:SWI 2 "" "We,m"))) + (match_operand:SWI 2 "" "We,m,rWe,m"))) (sign_extend: (minus:SWI (match_dup 1) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "subv4_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (match_operand: 3 "const_int_operand")) (sign_extend: (minus:SWI (match_dup 1) - (match_operand:SWI 2 "x86_64_immediate_operand" ""))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (match_operand:SWI 2 "x86_64_immediate_operand" ","))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (minus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) 
== INTVAL (operands[3])" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (cond [(match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8212,6 +8249,8 @@ (define_insn_and_split "*subv4_doubleword_1" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); emit_insn (gen_subv4_1 (operands[3], operands[4], operands[5], operands[5])); DONE; @@ -8293,18 +8332,25 @@ (define_expand "usubv4" (label_ref (match_operand 3)) (pc)))] "" - "ix86_fixup_binary_operands_no_copy (MINUS, mode, operands);") + "ix86_fixup_binary_operands_no_copy (MINUS, mode, operands, + TARGET_APX_NDD);") (define_insn "*sub_3" [(set (reg FLAGS_REG) - (compare (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ","))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (compare (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,"))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,i,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCmode) - && ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_peephole2 @@ -8392,16 +8438,21 @@ (define_insn_and_split "*dec_cmov" (define_insn "*subsi_3_zext" [(set (reg FLAGS_REG) - (compare (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe"))) - (set (match_operand:DI 
0 "register_operand" "=r") + (compare (match_operand:SI 1 "nonimmediate_operand" "0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re"))) + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (minus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCmode) - && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %1|%1, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, SImode, operands, + TARGET_APX_NDD)" + "@ + sub{l}\t{%2, %1|%1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) ;; Add with carry and subtract with borrow diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index c1049022f2a..0c7952ef018 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -42,3 +42,16 @@ FOO (long, add, +) FOO1 (long, add, +) FOO2 (long, add, +) +FOO (char, sub, -) +FOO1 (char, sub, -) +FOO (short, sub, -) +FOO1 (short, sub, -) +FOO (int, sub, -) +FOO1 (int, sub, -) +FOO (long, sub, -) +FOO1 (long, sub, -) +/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), %(?:|r|e)di, %(?:|r|e)a(?:x|l)" 4 } } */ From patchwork Wed Dec 6 08:06:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174391 Return-Path: Delivered-To: 
ouuuleilei@gmail.com
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 06/16] [APX NDD] Support APX NDD for sbb insn
Date: Wed, 6 Dec 2023 16:06:26 +0800
Message-Id: <20231206080636.178863-7-hongyu.wang@intel.com>
In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com>
References: <20231206080636.178863-1-hongyu.wang@intel.com>

From: Kong Lingling

Similar to *add3_doubleword, operands[1] may not be equal to
operands[0], so an extra move and earlyclobber are required.

gcc/ChangeLog:

	* config/i386/i386.md (*sub3_doubleword): Add new alternative for
	NDD, use the '&' modifier on the NDD dest, and emit a move when
	operands[0] is not equal to operands[1].
	(*sub3_doubleword_zext): Likewise.
	(*subv4_doubleword): Likewise.
	(*subv4_doubleword_1): Likewise.
	(*subv4_overflow_1): Add NDD alternatives and adjust output
	templates.
	(*subv4_overflow_2): Likewise.
	(@sub3_carry): Likewise.
	(*addsi3_carry_zext_0r): Likewise, and use nonimmediate_operand
	for operands[1] to accept memory input for NDD alternative.
	(*subsi3_carry_zext): Likewise.
	(subborrow): Pass TARGET_APX_NDD to ix86_binary_operator_ok.
	(subborrow_0): Likewise.
	(*sub3_eq): Likewise.
	(*sub3_ne): Likewise.
	(*sub3_eq_1): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd-sbb.c: New test.
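A note on the earlyclobber requirement mentioned in the commit message: the following is a minimal C sketch, not GCC code, of why a split doubleword subtraction needs its destination kept apart from its inputs once the destination may differ from operands[1]. The register file is modelled as a plain array, and `split_sub` and its index parameters are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Illustrative model only, not GCC code.  r[] stands in for a register
   file; the index arguments are hypothetical register numbers.  */

/* Doubleword subtraction d = a - b, performed the way a post-reload
   split would do it: low-half "sub" first, then high-half "sbb".  */
static void
split_sub (uint64_t *r, int d_lo, int d_hi,
	   int a_lo, int a_hi, int b_lo, int b_hi)
{
  /* Borrow out of the low half, i.e. the carry flag set by "sub".  */
  uint64_t borrow = r[a_lo] < r[b_lo];

  /* Low half: the destination is written here ...  */
  r[d_lo] = r[a_lo] - r[b_lo];

  /* ... and the high-half inputs are only read afterwards.  If d_lo
     aliases a_hi or b_hi, this reads an already clobbered value.  */
  r[d_hi] = r[a_hi] - r[b_hi] - borrow;
}
```

With r = {0, 5, 1, 2, 0, 0}, where a lives in r[0]/r[1] and b in r[2]/r[3], calling split_sub (r, 4, 5, 0, 1, 2, 3) leaves UINT64_MAX in r[4] and 2 in r[5]. Choosing d_lo = 1 instead, so that the low destination shares a register with the high half of a, makes the high result differ from 2. The '&' modifier exists to keep the register allocator from making that second choice.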
--- gcc/config/i386/i386.md | 160 ++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c | 6 + 2 files changed, 107 insertions(+), 59 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 6ec498725aa..90981e733bd 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -7781,12 +7781,13 @@ (define_expand "sub3" TARGET_APX_NDD); DONE;") (define_insn_and_split "*sub3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r") (minus: - (match_operand: 1 "nonimmediate_operand" "0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands)" + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CC FLAGS_REG) @@ -7810,16 +7811,18 @@ (define_insn_and_split "*sub3_doubleword" TARGET_APX_NDD); DONE; } -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*sub3_doubleword_zext" - [(set (match_operand: 0 "nonimmediate_operand" "=r,o") + [(set (match_operand: 0 "nonimmediate_operand" "=r,o,&r,&r") (minus: - (match_operand: 1 "nonimmediate_operand" "0,0") + (match_operand: 1 "nonimmediate_operand" "0,0,r,o") (zero_extend: - (match_operand:DWIH 2 "nonimmediate_operand" "rm,r")))) + (match_operand:DWIH 2 "nonimmediate_operand" "rm,r,rm,r")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (UNKNOWN, mode, operands)" + "ix86_binary_operator_ok (UNKNOWN, mode, operands, + TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CC FLAGS_REG) @@ -7833,7 +7836,8 @@ (define_insn_and_split "*sub3_doubleword_zext" (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))) (const_int 0))) (clobber (reg:CC FLAGS_REG))])] 
- "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);" +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*sub_1" [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") @@ -8167,14 +8171,15 @@ (define_insn_and_split "*subv4_doubleword" (eq:CCO (minus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "0,0")) + (match_operand: 1 "nonimmediate_operand" "0,0,ro,r")) (sign_extend: - (match_operand: 2 "nonimmediate_operand" "r,o"))) + (match_operand: 2 "nonimmediate_operand" "r,o,r,o"))) (sign_extend: (minus: (match_dup 1) (match_dup 2))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r") (minus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CC FLAGS_REG) @@ -8202,22 +8207,24 @@ (define_insn_and_split "*subv4_doubleword" (match_dup 5)))])] { split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*subv4_doubleword_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "0")) + (match_operand: 1 "nonimmediate_operand" "0,ro")) (match_operand: 3 "const_scalar_int_operand")) (sign_extend: (minus: (match_dup 1) - (match_operand: 2 "x86_64_hilo_general_operand" ""))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro") + (match_operand: 2 "x86_64_hilo_general_operand" ","))))) + (set (match_operand: 0 "nonimmediate_operand" "=ro,&r") (minus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD) && CONST_SCALAR_INT_P (operands[2]) && rtx_equal_p (operands[2], operands[3])" "#" @@ -8255,7 +8262,8 @@ 
(define_insn_and_split "*subv4_doubleword_1" operands[5])); DONE; } -}) +} +[(set_attr "isa" "*,apx_ndd")]) (define_insn "*subv4_overflow_1" [(set (reg:CCO FLAGS_REG) @@ -8263,11 +8271,11 @@ (define_insn "*subv4_overflow_1" (minus: (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)])) (sign_extend: - (match_operand:SWI 2 "" "rWe,m"))) + (match_operand:SWI 2 "" "rWe,m,rWe,m"))) (sign_extend: (minus:SWI (minus:SWI @@ -8275,15 +8283,21 @@ (define_insn "*subv4_overflow_1" (match_operator:SWI 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r,r,r") (minus:SWI (minus:SWI (match_dup 1) (match_op_dup 5 [(match_dup 3) (const_int 0)])) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subv4_overflow_2" @@ -8292,28 +8306,32 @@ (define_insn "*subv4_overflow_2" (minus: (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,rm")) (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)])) - (match_operand: 6 "const_int_operand" "n")) + (match_operand: 6 "const_int_operand" "n,n")) (sign_extend: (minus:SWI (minus:SWI (match_dup 1) (match_operator:SWI 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])) - (match_operand:SWI 2 "x86_64_immediate_operand" "e"))))) - (set 
(match_operand:SWI 0 "nonimmediate_operand" "=rm") + (match_operand:SWI 2 "x86_64_immediate_operand" "e,e"))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") (minus:SWI (minus:SWI (match_dup 1) (match_op_dup 5 [(match_dup 3) (const_int 0)])) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[6])" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (if_then_else (match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8598,15 +8616,18 @@ (define_insn "*addsi3_carry_zext_0" (set_attr "mode" "SI")]) (define_insn "*addsi3_carry_zext_0r" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (match_operator:SI 2 "ix86_carry_flag_unset_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "0")))) + (match_operand:SI 1 "nonimmediate_operand" "0,rm")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "sbb{l}\t{$-1, %k0|%k0, -1}" - [(set_attr "type" "alu") + "@ + sbb{l}\t{$-1, %k0|%k0, -1} + sbb{l}\t{$-1, %1, %k0|%k0, %1, -1}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) @@ -8846,17 +8867,23 @@ (define_insn "*addcarry_1" (const_string "4")))]) (define_insn "@sub3_carry" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") (match_operator:SWI 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)])) - 
(match_operand:SWI 2 "" ","))) + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8943,18 +8970,23 @@ (define_insn "*sub3_carry_0r" (set_attr "mode" "")]) (define_insn "*subsi3_carry_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (minus:SI (minus:SI - (match_operand:SI 1 "register_operand" "0") + (match_operand:SI 1 "nonimmediate_operand" "0,r,rm") (match_operator:SI 3 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)])) - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sbb{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands, + TARGET_APX_NDD)" + "@ + sbb{l}\t{%2, %k0|%k0, %2} + sbb{l}\t{%2, %1, %k0|%k0, %1, %2} + sbb{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) @@ -9039,21 +9071,27 @@ (define_insn "subborrow" [(set (reg:CCC FLAGS_REG) (compare:CCC (zero_extend: - (match_operand:SWI48 1 "nonimmediate_operand" "0,0")) + (match_operand:SWI48 1 "nonimmediate_operand" "0,0,r,rm")) (plus: (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (zero_extend: - (match_operand:SWI48 2 "nonimmediate_operand" 
"r,rm"))))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + (match_operand:SWI48 2 "nonimmediate_operand" "r,rm,rm,r"))))) + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") (minus:SWI48 (minus:SWI48 (match_dup 1) (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sbb{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %0|%0, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2} + sbb{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -9214,7 +9252,8 @@ (define_expand "subborrow_0" (match_operand:SWI48 2 ""))) (set (match_operand:SWI48 0 "register_operand") (minus:SWI48 (match_dup 1) (match_dup 2)))])] - "ix86_binary_operator_ok (MINUS, mode, operands)") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)") (define_expand "uaddc5" [(match_operand:SWI48 0 "register_operand") @@ -9639,7 +9678,8 @@ (define_insn_and_split "*sub3_eq" (const_int 0))) (match_operand:SWI 2 ""))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD) && ix86_pre_reload_split ()" "#" "&& 1" @@ -9664,7 +9704,8 @@ (define_insn_and_split "*sub3_ne" "CONST_INT_P (operands[2]) && (mode != DImode || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000)) - && ix86_binary_operator_ok (MINUS, mode, operands) + && ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD) && ix86_pre_reload_split ()" "#" "&& 1" @@ -9693,7 +9734,8 @@ (define_insn_and_split "*sub3_eq_1" "CONST_INT_P (operands[2]) && (mode != DImode || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000)) - && ix86_binary_operator_ok (MINUS, mode, operands) + && 
ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD) && ix86_pre_reload_split ()" "#" "&& 1" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c b/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c new file mode 100644 index 00000000000..662e3c607d8 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c @@ -0,0 +1,6 @@ +/* { dg-do compile { target { int128 && { ! ia32 } } } } */ +/* { dg-options "-mapxf -O2" } */ + +#include "pr91681-2.c" + +/* { dg-final { scan-assembler-times "sbbq\[^\n\r]*0, %rdi, %rdx" 1 } } */ From patchwork Wed Dec 6 08:06:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174393 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3954924vqy; Wed, 6 Dec 2023 00:10:55 -0800 (PST) X-Google-Smtp-Source: AGHT+IGRuyp6ldLR01Vqr/195ILv3Ae6VAzomjRxzqhfUdD39z9BmWKkOek6am4VrxIqIv82JLbb X-Received: by 2002:a05:622a:1650:b0:425:4043:8d69 with SMTP id y16-20020a05622a165000b0042540438d69mr515171qtj.132.1701850255344; Wed, 06 Dec 2023 00:10:55 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850255; cv=pass; d=google.com; s=arc-20160816; b=eID1Alxp61vOctqVo2ghgCLl9gHuEukSpb+ld1x4B/LTlFnBYhXZ/E4Zi3FK502rUn 2TmiIQj/0D1L6I3qwGOtSBlXb95vCTe/hzmnF01uWmtBkGbRufFfLYCC5P9fw07PHEdb QqbFMATWsixYGZsfGYmDncJKje6TH126yYu9MJ83/qDq1PrcmumB2HHLTvNyr0Kjdj/4 +OO/YVTd1+aV0eR57EEMS/97s9HyhJb0sWQ9AL0pXql7Nw6YLo1wFiMwufvx0tOrnsHj /l9RHi59jZMnACDR8tFIV9efrwsZSKklsx8UUi7VtG81XCbnzaB+n8h1WCUagTdBhvZm DWGg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=5Qj/2krKDWfWBNX9M+qZ2GFS1lrVgoFvKOKj3k5QDHs=; 
fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=FuyrbXBtpbzbcKGw4IMUOkMkmK6GQepbOTsJhJMZYvuwcGW3+2isdhQ6MGrxzL6CLg +KkIj6NP8poEoittoalT9TXDAQYSF5/ZEdskh/LCKlhYtGpGFD7JpcD3oJt/6g++Uhw5 zVQ8Vr1uPZgr42CxqPHeIyThLwueIgHKFA5jGeMZ6LPHanPcDCeFW+gE0wZgW7KsXDVB zrNzbQ/kBWJxUH8x9ZR1O/2Fry4LnBj/rLNR0WTEN/WDxCu9Uq0Nlo0DRwgSpn8LVFtz 4Br/djsxFKYc8NO5JItteCJN04M2tVSygIT/FX4Mgh8BvkxjcS85A/9HESOtOK7j0qiy 8kgg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kbtt9NB3; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id bs17-20020ac86f11000000b004254cb576f4si8595506qtb.792.2023.12.06.00.10.55 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:10:55 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kbtt9NB3; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4072E38708F9 for ; Wed, 6 Dec 2023 08:10:25 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id 1B7CC384F9A5 for ; Wed, 6 Dec 2023 08:08:47 
+0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1B7CC384F9A5 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1B7CC384F9A5 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850129; cv=none; b=kgz78h6k/uTjp1NCnyjsEz5j63NSkValHwmRNUqkLhuPE+Q9enZ8Wrp+80cPcjXYSKhaY0hdaN68b8y/ucnTtHp+G4KOxZ974MkaQoVr6Vj1Jm3pVmtmcrztGPLyW0/iLz2/u3fFcnZ135PlEHlr1DS3fyGjJAObHFBYiJAEMcE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850129; c=relaxed/simple; bh=hVtzY5eE0bFq8CNnqt3d2bueHMn/TQZUu7aO3yw+80k=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=C9OMHC+sDdkxHaBzHjyt7g0rj0ZQRpj2sp2MhnvowgP49GIvzkk1PoVBZAyReTz9bOe2jLSdctgL5eqyoOmCIrlfTGBcOO7GfLbUhvzvU5HNyO5Ado31GHfy9BD3oettoQDG7fVXDQCLTnlVvJx5gpdocM9lRoUFY4nOoi05fDs= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850128; x=1733386128; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hVtzY5eE0bFq8CNnqt3d2bueHMn/TQZUu7aO3yw+80k=; b=kbtt9NB38inzrJsu2EJ7vWCCJQbu3lrB4j85pyHRGcU95xrgK/DrJiyg qcIBLMTrdxWVezX985ayfWLxVf4fi9RV/mIVn+4qF4t9C50q5rNxTC8Vl 34zIIAZAx+t5zBQgFMbZJ4kS0ixCu72wJaHHNPdUtUoE4h1V+sR/bYqjP DJ+ZaUMu6O3Wkju60NseLeibldNueopEHZBQdgEYZvRAmHogxv2Su4yh8 ZrvRY6gVAzww8FXuo2eT/CUj9Uiab2F5HI9nncALhM/rwDWig2yHpvl0S Cfd6aEsAjtPmZpRDb1HV1GczS8y0z3jQG7LB4uOj9S7BH6Eo+SvnLd1uB g==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085465" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085465" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:43 -0800 
From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 07/16] [APX NDD] Support APX NDD for neg insn Date: Wed, 6 Dec 2023 16:06:27 +0800 Message-Id: <20231206080636.178863-8-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0
From: Kong Lingling
gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_unary_operator): Add use_ndd parameter and adjust for NDD. * config/i386/i386-protos.h: Add use_ndd parameter for ix86_unary_operator_ok and ix86_expand_unary_operator.
* config/i386/i386.cc (ix86_unary_operator_ok): Add use_ndd parameter and adjust for NDD. * config/i386/i386.md (neg2): Add new constraint for NDD and adjust output template. (*neg_1): Likewise. (*neg2_doubleword): Likewise and adopt '&' to NDD dest. (*neg_2): Likewise. (*neg_ccc_1): Likewise. (*neg_ccc_2): Likewise. (*negsi_1_zext): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternatives. (*negsi_2_zext): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add neg test. --- gcc/config/i386/i386-expand.cc | 4 +- gcc/config/i386/i386-protos.h | 5 +- gcc/config/i386/i386.cc | 5 +- gcc/config/i386/i386.md | 77 ++++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 29 ++++++++++ 5 files changed, 87 insertions(+), 33 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 93ecde4b4a8..d4bbd33ce07 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1494,7 +1494,7 @@ ix86_binary_operator_ok (enum rtx_code code, machine_mode mode, void ix86_expand_unary_operator (enum rtx_code code, machine_mode mode, - rtx operands[]) + rtx operands[], bool use_ndd) { bool matching_memory = false; rtx src, dst, op, clob; @@ -1513,7 +1513,7 @@ ix86_expand_unary_operator (enum rtx_code code, machine_mode mode, } /* When source operand is memory, destination must match. */ - if (MEM_P (src) && !matching_memory) + if (!use_ndd && MEM_P (src) && !matching_memory) src = force_reg (mode, src); /* Emit the instruction. 
*/ diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 481527872e8..fa952409729 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -127,7 +127,7 @@ extern bool ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high); extern bool ix86_dep_by_shift_count (const_rtx set_insn, const_rtx use_insn); extern bool ix86_agi_dependent (rtx_insn *set_insn, rtx_insn *use_insn); extern void ix86_expand_unary_operator (enum rtx_code, machine_mode, - rtx[]); + rtx[], bool = false); extern rtx ix86_build_const_vector (machine_mode, bool, rtx); extern rtx ix86_build_signbit_mask (machine_mode, bool, bool); extern HOST_WIDE_INT ix86_convert_const_vector_to_integer (rtx, @@ -147,7 +147,8 @@ extern void ix86_split_fp_absneg_operator (enum rtx_code, machine_mode, rtx[]); extern void ix86_expand_copysign (rtx []); extern void ix86_expand_xorsign (rtx []); -extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2]); +extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2], + bool = false); extern bool ix86_match_ccmode (rtx, machine_mode); extern bool ix86_match_ptest_ccmode (rtx); extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 8aa33aef7e1..4b6bad37c8f 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -16209,11 +16209,12 @@ ix86_dep_by_shift_count (const_rtx set_insn, const_rtx use_insn) bool ix86_unary_operator_ok (enum rtx_code, machine_mode, - rtx operands[2]) + rtx operands[2], + bool use_ndd) { /* If one of operands is memory, source and destination must match. */ if ((MEM_P (operands[0]) - || MEM_P (operands[1])) + || (!use_ndd && MEM_P (operands[1]))) && ! 
rtx_equal_p (operands[0], operands[1])) return false; return true; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 90981e733bd..e97c1784e9a 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -13287,13 +13287,14 @@ (define_expand "neg2" [(set (match_operand:SDWIM 0 "nonimmediate_operand") (neg:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")))] "" - "ix86_expand_unary_operator (NEG, mode, operands); DONE;") + "ix86_expand_unary_operator (NEG, mode, operands, + TARGET_APX_NDD); DONE;") (define_insn_and_split "*neg2_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro") - (neg: (match_operand: 1 "nonimmediate_operand" "0"))) + [(set (match_operand: 0 "nonimmediate_operand" "=ro,&r") + (neg: (match_operand: 1 "nonimmediate_operand" "0,ro"))) (clobber (reg:CC FLAGS_REG))] - "ix86_unary_operator_ok (NEG, mode, operands)" + "ix86_unary_operator_ok (NEG, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel @@ -13310,7 +13311,8 @@ (define_insn_and_split "*neg2_doubleword" [(set (match_dup 2) (neg:DWIH (match_dup 2))) (clobber (reg:CC FLAGS_REG))])] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);" + [(set_attr "isa" "*,apx_ndd")]) ;; Convert: ;; mov %esi, %edx @@ -13399,22 +13401,29 @@ (define_peephole2 (clobber (reg:CC FLAGS_REG))])]) (define_insn "*neg_1" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") - (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0"))) + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") + (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm"))) (clobber (reg:CC FLAGS_REG))] - "ix86_unary_operator_ok (NEG, mode, operands)" - "neg{}\t%0" + "ix86_unary_operator_ok (NEG, mode, operands, TARGET_APX_NDD)" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn 
"*negsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI - (neg:SI (match_operand:SI 1 "register_operand" "0")))) + (neg:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_unary_operator_ok (NEG, SImode, operands)" - "neg{l}\t%k0" + "TARGET_64BIT && ix86_unary_operator_ok (NEG, SImode, operands, + TARGET_APX_NDD)" + "@ + neg{l}\t%k0 + neg{l}\t{%k1, %k0|%k0, %k1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) ;; Alternative 1 is needed to work around LRA limitation, see PR82524. @@ -13440,51 +13449,65 @@ (define_insn_and_split "*neg_1_slp" (define_insn "*neg_2" [(set (reg FLAGS_REG) (compare - (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0")) + (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (neg:SWI (match_dup 1)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_unary_operator_ok (NEG, mode, operands)" - "neg{}\t%0" + && ix86_unary_operator_ok (NEG, mode, operands, + TARGET_APX_NDD)" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*negsi_2_zext" [(set (reg FLAGS_REG) (compare - (neg:SI (match_operand:SI 1 "register_operand" "0")) + (neg:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (neg:SI (match_dup 1))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_unary_operator_ok (NEG, SImode, operands)" - "neg{l}\t%k0" + && ix86_unary_operator_ok (NEG, SImode, operands, + TARGET_APX_NDD)" + "@ + neg{l}\t%k0 + neg{l}\t{%1, %k0|%k0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") 
(set_attr "mode" "SI")]) (define_insn "*neg_ccc_1" [(set (reg:CCC FLAGS_REG) (unspec:CCC - [(match_operand:SWI 1 "nonimmediate_operand" "0") + [(match_operand:SWI 1 "nonimmediate_operand" "0,rm") (const_int 0)] UNSPEC_CC_NE)) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (neg:SWI (match_dup 1)))] "" - "neg{}\t%0" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*neg_ccc_2" [(set (reg:CCC FLAGS_REG) (unspec:CCC - [(match_operand:SWI 1 "nonimmediate_operand" "0") + [(match_operand:SWI 1 "nonimmediate_operand" "0,rm") (const_int 0)] UNSPEC_CC_NE)) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "" - "neg{}\t%0" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_expand "x86_neg_ccc" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 0c7952ef018..c351f71265e 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -27,8 +27,25 @@ foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ { \ TYPE c = *a OP b; \ return c; \ +} + +#define F(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +f_##OP_NAME##_##TYPE (TYPE *a) \ +{ \ + TYPE b = OP*a; \ + return b; \ } +#define F1(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +f1_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = OP a; \ + return b; \ +} FOO (char, add, +) FOO1 (char, add, +) FOO2 (char, add, +) @@ -50,8 +67,20 @@ FOO (int, sub, -) FOO1 (int, sub, -) FOO (long, sub, -) FOO1 (long, sub, -) + +F (char, neg, -) +F1 (char, neg, -) +F (short, neg, -) +F1 (short, neg, -) +F (int, neg, -) +F1 (int, neg, -) +F (long, neg, -) +F1 (long, neg, -) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { 
scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), %(?:|r|e)di, %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 } } */ +/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */
From patchwork Wed Dec 6 08:06:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174382
From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 08/16] [APX NDD] Support APX NDD for not insn Date: Wed, 6 Dec 2023 16:06:28 +0800 Message-Id: <20231206080636.178863-9-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To:
<20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0
From: Kong Lingling
*one_cmplsi2_2_zext will be split into xor, so its NDD form will be added together with xor NDD support.
gcc/ChangeLog: * config/i386/i386.md (one_cmpl2): Add new constraints for NDD and adjust output template. (*one_cmpl2_1): Likewise. (*one_cmplqi2_1): Likewise. (*one_cmpl2_doubleword): Likewise, and adopt '&' to NDD dest. (*one_cmpl2_2): Likewise. (*one_cmplsi2_1_zext): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add not test.
--- gcc/config/i386/i386.md | 58 ++++++++++++++----------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 11 +++++ 2 files changed, 44 insertions(+), 25 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index e97c1784e9a..61b7b79543b 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14006,57 +14006,63 @@ (define_expand "one_cmpl2" [(set (match_operand:SDWIM 0 "nonimmediate_operand") (not:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")))] "" - "ix86_expand_unary_operator (NOT, mode, operands); DONE;") + "ix86_expand_unary_operator (NOT, mode, operands, + TARGET_APX_NDD); DONE;") (define_insn_and_split "*one_cmpl2_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro") - (not: (match_operand: 1 "nonimmediate_operand" "0")))] - "ix86_unary_operator_ok (NOT, mode, operands)" + [(set (match_operand: 0 "nonimmediate_operand" "=ro,&r") + (not: (match_operand: 1 "nonimmediate_operand" "0,ro")))] + "ix86_unary_operator_ok (NOT, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(set (match_dup 0) (not:DWIH (match_dup 1))) (set (match_dup 2) (not:DWIH (match_dup 3)))] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);" + [(set_attr "isa" "*,apx_ndd")]) (define_insn "*one_cmpl2_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,?k") - (not:SWI248 (match_operand:SWI248 1 "nonimmediate_operand" "0,k")))] - "ix86_unary_operator_ok (NOT, mode, operands)" + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + (not:SWI248 (match_operand:SWI248 1 "nonimmediate_operand" "0,rm,k")))] + "ix86_unary_operator_ok (NOT, mode, operands, TARGET_APX_NDD)" "@ not{}\t%0 + not{}\t{%1, %0|%0, %1} #" - [(set_attr "isa" "*,") - (set_attr "type" "negnot,msklog") + [(set_attr "isa" "*,apx_ndd,") + (set_attr "type" "negnot,negnot,msklog") (set_attr "mode" "")]) (define_insn "*one_cmplsi2_1_zext" - [(set 
(match_operand:DI 0 "register_operand" "=r,?k") + [(set (match_operand:DI 0 "register_operand" "=r,r,?k") (zero_extend:DI - (not:SI (match_operand:SI 1 "register_operand" "0,k"))))] - "TARGET_64BIT && ix86_unary_operator_ok (NOT, SImode, operands)" + (not:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,k"))))] + "TARGET_64BIT && ix86_unary_operator_ok (NOT, SImode, operands, + TARGET_APX_NDD)" "@ not{l}\t%k0 + not{l}\t{%1, %k0|%k0, %1} #" - [(set_attr "isa" "x64,avx512bw_512") - (set_attr "type" "negnot,msklog") - (set_attr "mode" "SI,SI")]) + [(set_attr "isa" "x64,apx_ndd,avx512bw_512") + (set_attr "type" "negnot,negnot,msklog") + (set_attr "mode" "SI,SI,SI")]) (define_insn "*one_cmplqi2_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,?k") - (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,k")))] - "ix86_unary_operator_ok (NOT, QImode, operands)" + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,r,?k") + (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,rm,k")))] + "ix86_unary_operator_ok (NOT, QImode, operands, TARGET_APX_NDD)" "@ not{b}\t%0 not{l}\t%k0 + not{b}\t{%1, %0|%0, %1} #" - [(set_attr "isa" "*,*,avx512f") - (set_attr "type" "negnot,negnot,msklog") + [(set_attr "isa" "*,*,apx_ndd,avx512f") + (set_attr "type" "negnot,negnot,negnot,msklog") (set (attr "mode") (cond [(eq_attr "alternative" "1") (const_string "SI") - (and (eq_attr "alternative" "2") + (and (eq_attr "alternative" "3") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -14086,14 +14092,16 @@ (define_insn_and_split "*one_cmpl_1_slp" (define_insn "*one_cmpl2_2" [(set (reg FLAGS_REG) - (compare (not:SWI (match_operand:SWI 1 "nonimmediate_operand" "0")) + (compare (not:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (not:SWI (match_dup 1)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_unary_operator_ok (NOT, 
mode, operands)" + && ix86_unary_operator_ok (NOT, mode, operands, + TARGET_APX_NDD)" "#" [(set_attr "type" "alu1") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_split diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index c351f71265e..2bd551614c4 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -76,6 +76,15 @@ F (int, neg, -) F1 (int, neg, -) F (long, neg, -) F1 (long, neg, -) + +F (char, not, ~) +F1 (char, not, ~) +F (short, not, ~) +F1 (short, not, ~) +F (int, not, ~) +F1 (int, not, ~) +F (long, not, ~) +F1 (long, not, ~) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -84,3 +93,5 @@ F1 (long, neg, -) /* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 } } */ /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */
From patchwork Wed Dec 6 08:06:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174392
From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 09/16] [APX NDD] Support APX NDD for and insn Date: Wed, 6 Dec 2023 16:06:29 +0800 Message-Id: <20231206080636.178863-10-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To:
<20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0
From: Kong Lingling
For the NDD form of the AND insn, three splitter fixes are needed after extending the legacy patterns.
1. APX NDD does not support the high QImode registers (ah, bh, ch, dh), so optimization splitters that generate a highpart zero_extract for QImode must be prohibited for NDD patterns.
2. The legacy AND insn uses the r/qm/L constraint, and a post-reload splitter transforms it into a zero_extend move. For the NDD form of AND, that splitter is not strict enough: it assumes such an AND has a const_int operand matching constraint "L", whereas the NDD form allows a const_int with any QI value. Restrict the splitter condition to match the "L" constraint, which strictly matches zero-extend semantics.
3. The legacy AND insn uses the r/0/Z constraint, and a splitter tries to optimize such a form into a strict_lowpart QImode AND when the 7th bit is not set. But the splitter wrongly converts the non-zext NDD form of AND with a memory source; the strict_lowpart transform then matches alternative 1 of *_slp_1 and generates *movstrict_1, so the zext semantics are lost.
This can leave the highpart of dest uncleared and generate wrong code. Disable the splitter when NDD is enabled and operands[0] and operands[1] are not equal.

gcc/ChangeLog:

	* config/i386/i386.md (and3): Add NDD alternatives and adjust output template.
	(*anddi_1): Likewise.
	(*and_1): Likewise.
	(*andqi_1): Likewise.
	(*andsi_1_zext): Likewise.
	(*anddi_2): Likewise.
	(*andsi_2_zext): Likewise.
	(*andqi_2_maybe_si): Likewise.
	(*and_2): Likewise.
	(*and3_doubleword): Add NDD alternative, adopt '&' to NDD dest and emit move for optimized case if operands[0] not equal to operands[1].
	(define_split for QI highpart AND): Prohibit splitter to split NDD form AND insn to qi_ext_3.
	(define_split for QI strict_lowpart optimization): Prohibit splitter to split NDD form AND insn to *3_1_slp.
	(define_split for zero_extend and optimization): Prohibit splitter to split NDD form AND insn to zero_extend insn.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add and test.

--- gcc/config/i386/i386.md | 175 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 13 ++ 2 files changed, 127 insertions(+), 61 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 61b7b79543b..d2528e0dcf6 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -11710,18 +11710,19 @@ (define_expand "and3" (operands[0], gen_lowpart (mode, operands[1]), mode, mode, 1)); else - ix86_expand_binary_operator (AND, mode, operands); + ix86_expand_binary_operator (AND, mode, operands, + TARGET_APX_NDD); DONE; }) (define_insn_and_split "*and3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r") (and: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, 
mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(const_int:DWIH 0)] @@ -11733,39 +11734,53 @@ (define_insn_and_split "*and3_doubleword" if (operands[2] == const0_rtx) emit_move_insn (operands[0], const0_rtx); else if (operands[2] == constm1_rtx) - emit_insn_deleted_note_p = true; + { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + else + emit_insn_deleted_note_p = true; + } else - ix86_expand_binary_operator (AND, mode, &operands[0]); + ix86_expand_binary_operator (AND, mode, &operands[0], + TARGET_APX_NDD); if (operands[5] == const0_rtx) emit_move_insn (operands[3], const0_rtx); else if (operands[5] == constm1_rtx) { - if (emit_insn_deleted_note_p) + if (!rtx_equal_p (operands[3], operands[4])) + emit_move_insn (operands[3], operands[4]); + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); } else - ix86_expand_binary_operator (AND, mode, &operands[3]); + ix86_expand_binary_operator (AND, mode, &operands[3], + TARGET_APX_NDD); DONE; -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*anddi_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,?k") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,rm,r,r,r,r,?k") (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm,k") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,L,k"))) + (match_operand:DI 1 "nonimmediate_operand" "%0,r,0,0,rm,r,qm,k") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,Z,re,m,re,m,L,k"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands, + TARGET_APX_NDD)" "@ and{l}\t{%k2, %k0|%k0, %k2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} and{q}\t{%2, %0|%0, %2} and{q}\t{%2, %0|%0, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} # #" - [(set_attr "isa" "x64,x64,x64,x64,avx512bw_512") - 
(set_attr "type" "alu,alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,*,0,*") + [(set_attr "isa" "x64,apx_ndd,x64,x64,apx_ndd,apx_ndd,x64,avx512bw_512") + (set_attr "type" "alu,alu,alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11773,7 +11788,7 @@ (define_insn "*anddi_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" "SI,DI,DI,SI,DI")]) + (set_attr "mode" "SI,SI,DI,DI,DI,DI,SI,DI")]) (define_insn_and_split "*anddi_1_btr" [(set (match_operand:DI 0 "nonimmediate_operand" "=rm") @@ -11828,36 +11843,45 @@ (define_split ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands, + TARGET_APX_NDD)" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*and_1" - [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,Ya,?k") - (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,qm,k") - (match_operand:SWI24 2 "" "r,,L,k"))) + [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,r,r,Ya,?k") + (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,rm,r,qm,k") + (match_operand:SWI24 2 "" "r,,r,,L,k"))) (clobber (reg:CC FLAGS_REG))] - 
"ix86_binary_operator_ok (AND, mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, TARGET_APX_NDD)" "@ and{}\t{%2, %0|%0, %2} and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2} # #" [(set (attr "isa") - (cond [(eq_attr "alternative" "3") + (cond [(eq_attr "alternative" "2,3") + (const_string "apx_ndd") + (eq_attr "alternative" "5") (if_then_else (eq_attr "mode" "SI") (const_string "avx512bw") (const_string "avx512f")) ] (const_string "*"))) - (set_attr "type" "alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,0,*") + (set_attr "type" "alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11865,24 +11889,27 @@ (define_insn "*and_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" ",,SI,")]) + (set_attr "mode" ",,,,SI,")]) (define_insn "*andqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") - (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") + (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, QImode, operands)" + "ix86_binary_operator_ok (AND, QImode, operands, TARGET_APX_NDD)" "@ and{b}\t{%2, %0|%0, %2} and{b}\t{%2, %0|%0, %2} and{l}\t{%k2, %k0|%k0, %k2} + and{b}\t{%2, %1, %0|%0, %1, %2} + and{b}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "type" "alu,alu,alu,msklog") + [(set_attr "type" "alu,alu,alu,alu,alu,msklog") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,*") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -11985,7 
+12012,10 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!REG_P (operands[1]) - || REGNO (operands[0]) != REGNO (operands[1]))" + || REGNO (operands[0]) != REGNO (operands[1])) + && (UINTVAL (operands[2]) == GET_MODE_MASK (SImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (HImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (QImode))" [(const_int 0)] { unsigned HOST_WIDE_INT ival = UINTVAL (operands[2]); @@ -12058,10 +12088,10 @@ (define_insn "*anddi_2" [(set (reg FLAGS_REG) (compare (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m")) + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,r,rm,r") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,Z,re,m")) (const_int 0))) - (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r") + (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,r,r") (and:DI (match_dup 1) (match_dup 2)))] "TARGET_64BIT && ix86_match_ccmode @@ -12075,38 +12105,46 @@ (define_insn "*anddi_2" && (!CONST_INT_P (operands[2]) || val_signbit_known_set_p (SImode, INTVAL (operands[2])))) ? 
CCZmode : CCNOmode) - && ix86_binary_operator_ok (AND, DImode, operands)" + && ix86_binary_operator_ok (AND, DImode, operands, TARGET_APX_NDD)" "@ and{l}\t{%k2, %k0|%k0, %k2} and{q}\t{%2, %0|%0, %2} - and{q}\t{%2, %0|%0, %2}" + and{q}\t{%2, %0|%0, %2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") - (set_attr "mode" "SI,DI,DI")]) + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,apx_ndd") + (set_attr "mode" "SI,DI,DI,SI,DI,DI")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_2_zext" [(set (reg FLAGS_REG) (compare (and:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (AND, SImode, operands, TARGET_APX_NDD)" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*andqi_2_maybe_si" [(set (reg FLAGS_REG) (compare (and:QI - (match_operand:QI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:QI 2 "general_operand" "qn,m,n")) + (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r") + (match_operand:QI 2 "general_operand" "qn,m,n,rn,m")) (const_int 0))) - (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r") + (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r") (and:QI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (AND, QImode, operands) + 
"ix86_binary_operator_ok (AND, QImode, operands, TARGET_APX_NDD) && ix86_match_ccmode (insn, CONST_INT_P (operands[2]) && INTVAL (operands[2]) >= 0 ? CCNOmode : CCZmode)" @@ -12117,11 +12155,16 @@ (define_insn "*andqi_2_maybe_si" operands[2] = GEN_INT (INTVAL (operands[2]) & 0xff); return "and{l}\t{%2, %k0|%k0, %2}"; } + if (which_alternative > 2) + return "and{b}\t{%2, %1, %0|%0, %1, %2}"; return "and{b}\t{%2, %0|%0, %2}"; } [(set_attr "type" "alu") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd") (set (attr "mode") - (cond [(eq_attr "alternative" "2") + (cond [(eq_attr "alternative" "3,4") + (const_string "QI") + (eq_attr "alternative" "2") (const_string "SI") (and (match_test "optimize_insn_for_size_p ()") (and (match_operand 0 "ext_QIreg_operand") @@ -12138,15 +12181,21 @@ (define_insn "*andqi_2_maybe_si" (define_insn "*and_2" [(set (reg FLAGS_REG) (compare (and:SWI124 - (match_operand:SWI124 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI124 2 "" ",")) + (match_operand:SWI124 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI124 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,,r,r") (and:SWI124 (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, mode, operands)" - "and{}\t{%2, %0|%0, %2}" + && ix86_binary_operator_ok (AND, mode, operands, + TARGET_APX_NDD)" + "@ + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,*,apx_ndd,apx_ndd") (set_attr "mode" "")]) (define_insn "*qi_ext_0" @@ -12392,6 +12441,7 @@ (define_insn_and_split "*qi_ext_3" ;; Don't do the splitting with memory operands, since it introduces risk ;; of memory mismatch stalls. We may want to do the splitting for optimizing ;; for size, but that can (should?) be handled by generic code instead. 
+;; Don't do the splitting for APX NDD as NDD does not support *h registers. (define_split [(set (match_operand:SWI248 0 "QIreg_operand") (and:SWI248 (match_operand:SWI248 1 "register_operand") @@ -12399,7 +12449,8 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(~INTVAL (operands[2]) & ~(255 << 8))" + && !(~INTVAL (operands[2]) & ~(255 << 8)) + && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))" [(parallel [(set (zero_extract:HI (match_dup 0) (const_int 8) @@ -12428,7 +12479,9 @@ (define_split "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) && !(~INTVAL (operands[2]) & ~255) - && !(INTVAL (operands[2]) & 128)" + && !(INTVAL (operands[2]) & 128) + && !(TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]))" [(parallel [(set (strict_low_part (match_dup 0)) (and:QI (match_dup 1) (match_dup 2))) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 2bd551614c4..be436d57bdf 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -85,6 +85,15 @@ F (int, not, ~) F1 (int, not, ~) F (long, not, ~) F1 (long, not, ~) + +FOO (char, and, &) +FOO1 (char, and, &) +FOO (short, and, &) +FOO1 (short, and, &) +FOO (int, and, &) +FOO1 (int, and, &) +FOO (long, and, &) +FOO1 (long, and, &) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -95,3 +104,7 @@ F1 (long, not, ~) /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* 
{ dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ From patchwork Wed Dec 6 08:06:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174389 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3954643vqy; Wed, 6 Dec 2023 00:10:14 -0800 (PST) X-Google-Smtp-Source: AGHT+IFZ1ysdQHTjP817rlnSYteQ5F+QouwnNyqF6yLuaYcnJ8+fZ650yJLNzTjh0lxt5dWAvGWc X-Received: by 2002:a05:620a:8705:b0:77d:78af:c518 with SMTP id px5-20020a05620a870500b0077d78afc518mr513963qkn.49.1701850214267; Wed, 06 Dec 2023 00:10:14 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850214; cv=pass; d=google.com; s=arc-20160816; b=s3E9VJ9Ua0cEooF5avDZSANVLp8NxpxCJ7df1VNe74y93WKxLJkPrC2e7cBb+aX/FB u1a05hujI9+laeqKP0/IHQEDsiDr5KxYICnkgM4dFdCO07sJTXGG1mdFHR3t15+KZ03H 6Trw2JEKOma6sICPYI9C3zd8E8NJvIK0awskY58NTRFIflhiu1zRNK2t3Jc7g77iK2O8 RvJAfJCoC9AlrTGD7QqyfAK9gnmjgCtMiQPQIHDff4sF7dlOqkuTVcF7fwt7lefjLgjw HdsC4XPFE0tFCTUbFvTGs7d/0OkH3QEFvsU0wH9oO7n9Tl0ZIHcqWnQuCndhPzx9ltlJ nerQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=JCgrVcKMYg1pId7E5rnhUAuGe5CKM4J30nxDlY7ehNM=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; 
b=goI+O4iosQqKGOuucM3GGMeAse6g/xhNpRBUz3vQrO8O7GeGIZTMzt11bQOjAhhxGh HdKIgixOyNXodKFclkFNt/X4gkii/2Mk+5WNa4D2PwvRr5PtjFCKLVO53lbAzfWeFJXM FbAhnHJcedvp9ytB38p1wnJxyyN+Tvmgs4egg/xEXjP0G5TOdH0jmG+ZrxKaPIqGr5Mm p9HkSGblL2zTqxn3AQScbvWiBlGj5DlqBjqd4vwdd5v9Rf03xkLUXz/A1TrsYc7VvtVq bX/nJc/WwfL+LYiPAM44/KWGl89IGou0B9CLUpID2Q1cdIlWlepIZgL0Pjx/Yv89uZby bS/A== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TAnc7gpb; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id bs19-20020a05620a471300b0077dd2def9b5si14239794qkb.330.2023.12.06.00.10.14 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:10:14 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TAnc7gpb; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8574E383B263 for ; Wed, 6 Dec 2023 08:09:53 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id 
EE781384DEC6 for ; Wed, 6 Dec 2023 08:08:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EE781384DEC6 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EE781384DEC6 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850136; cv=none; b=Lo4acQKmouQnYscL+nWmaZlHGgNy3xKJ+g20YCkMn06o2X2mE0E5yPqT/tU7cZjiZuJs5IFDia9ZOXAarj0rVpOkZLWauTKDmJtIrSGHdKi08Koff6mnYdFFv3A7O6LVI++aQV3jY5jqSBeLlSfYTIAZAKWhabGh52TfbZ4ZNsA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850136; c=relaxed/simple; bh=cjfLcONTMgagMdRvxj10lII5iV4+9gqiT0Q5st3mZCc=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=XhhrYT7su6DBesoBRhN4lNKWs+65qHBsxkXvNLJcVvwGjcH+kX7EXLhdLtpdUdlzZfwj25tISSSwaWk2SGKDvcJKSDAvtGV3g+WQR5h8HTonndMEQSgZHCA6D3XUTyD1yEwxzAbXlpxvsy6st8+0pBAgNMdFHY9kedIEMKM4iDw= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850131; x=1733386131; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cjfLcONTMgagMdRvxj10lII5iV4+9gqiT0Q5st3mZCc=; b=TAnc7gpbb/ClETivHvn/XzCcD/f/7swaPtGVRgT9RUG6cPCERKFw6GTo 9W7L7GSoEbvPv7jWPow/X/7Vs8Ty0oUNF07AommTDU9FUtCjvy1GInbBK ZSO/YxRSs4dmErTcHq6KLcI1XUM4oRcWGThIZ9tQgKSHQbCrXMPL6831a G5UC52waEqwbr2nPUs9s2vnpMAzfjMedwHRlONHEW835D5jBpztuAct5t AV/aRdprWPp5T7ck24kKn/n+hYGawDM+Xm3FLraTqUSCanGnSMmTyeknk Z8XmFv4msJeC53sduyaJ4vvCx7UFDEuAOkrHdP9eYBtUs5owStGKC8frg g==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085473" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085473" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1102737770" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1102737770" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga005.fm.intel.com with ESMTP; 06 Dec 2023 00:08:40 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id A4B61100780E; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 10/16] [APX NDD] Support APX NDD for or/xor insn Date: Wed, 6 Dec 2023 16:06:30 +0800 Message-Id: <20231206080636.178863-11-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519290315850848 X-GMAIL-MSGID: 1784519290315850848 From: Kong Lingling

Similar to the AND insn, two splitters need to be adjusted to prevent misoptimization for NDD OR/XOR. Also adjust *one_cmplsi2_2_zext and its corresponding splitter, which will generate an xor insn.
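
The strict_lowpart hazard can be illustrated outside the compiler with a small C model. This is only a sketch under stated assumptions, not GCC code: the helper names ndd_or, slp_or and model_self_check are hypothetical, registers are modelled as plain 32-bit integers, and the immediate 0x80 is chosen merely because it fits in 8 bits and has bit 7 set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: a 32-bit register is a uint32_t.  */

/* NDD form "or $imm, src, dest": the whole destination is rewritten
   from src, so the previous contents of dest do not matter.  */
static uint32_t
ndd_or (uint32_t dest_old, uint32_t src, uint32_t imm)
{
  (void) dest_old;
  return src | imm;
}

/* What a strict_low_part QImode OR does in this model: only the low
   byte of dest is written; bits 8..31 keep what dest held before.  */
static uint32_t
slp_or (uint32_t dest_old, uint32_t src, uint32_t imm)
{
  uint32_t low = (src | imm) & 0xffu;
  return (dest_old & ~0xffu) | low;
}

/* Self-check: the two forms agree only when dest already holds src.  */
static void
model_self_check (void)
{
  /* Legacy two-operand case, dest == src: both give the same value.  */
  assert (ndd_or (0x1234u, 0x1234u, 0x80u)
	  == slp_or (0x1234u, 0x1234u, 0x80u));
  /* NDD case, dest != src: the low-byte form keeps stale high bits.  */
  assert (ndd_or (0xdeadbeefu, 0x1234u, 0x80u)
	  != slp_or (0xdeadbeefu, 0x1234u, 0x80u));
}
```

In this model the two forms only diverge when the destination differs from the first source, which is the case the added `!(TARGET_APX_NDD && !rtx_equal_p (operands[0], operands[1]))` condition excludes from the split.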

gcc/ChangeLog:

	* config/i386/i386.md (3): Add new alternative for NDD and adjust output templates.
	(*_1): Likewise.
	(*qi_1): Likewise.
	(*notxor_1): Likewise.
	(*si_1_zext): Likewise.
	(*notxorqi_1): Likewise.
	(*_2): Likewise.
	(*si_2_zext): Likewise.
	(*si_2_zext_imm): Likewise.
	(*si_1_zext_imm): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative.
	(*one_cmplsi2_2_zext): Likewise.
	(define_split for *one_cmplsi2_2_zext): Use nonimmediate_operand for operands[3].
	(*3_doubleword): Add NDD constraints, adopt '&' to NDD dest and emit move for optimized case if operands[0] != operands[1] or operands[3] != operands[4].
	(define_split for QI highpart OR/XOR): Prohibit splitter to split NDD form OR/XOR insn to qi_ext_3.
	(define_split for QI strict_lowpart optimization): Prohibit splitter to split NDD form OR/XOR insn to *3_1_slp.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add or and xor test.

--- gcc/config/i386/i386.md | 186 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 26 ++++ 2 files changed, 143 insertions(+), 69 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index d2528e0dcf6..ad4c958a1e8 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -12703,17 +12703,19 @@ (define_expand "3" && !x86_64_hilo_general_operand (operands[2], mode)) operands[2] = force_reg (mode, operands[2]); - ix86_expand_binary_operator (, mode, operands); + ix86_expand_binary_operator (, mode, operands, + TARGET_APX_NDD); DONE; }) (define_insn_and_split "*3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,&r,&r") (any_or: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, 
mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" "#" "&& reload_completed" [(const_int:DWIH 0)] @@ -12725,20 +12727,29 @@ (define_insn_and_split "*3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) - emit_insn_deleted_note_p = true; + { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + else + emit_insn_deleted_note_p = true; + } else if (operands[2] == constm1_rtx) { if ( == IOR) emit_move_insn (operands[0], constm1_rtx); else - ix86_expand_unary_operator (NOT, mode, &operands[0]); + ix86_expand_unary_operator (NOT, mode, &operands[0], + TARGET_APX_NDD); } else - ix86_expand_binary_operator (, mode, &operands[0]); + ix86_expand_binary_operator (, mode, &operands[0], + TARGET_APX_NDD); if (operands[5] == const0_rtx) { - if (emit_insn_deleted_note_p) + if (!rtx_equal_p (operands[3], operands[4])) + emit_move_insn (operands[3], operands[4]); + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); } else if (operands[5] == constm1_rtx) @@ -12746,37 +12757,43 @@ (define_insn_and_split "*3_doubleword" if ( == IOR) emit_move_insn (operands[3], constm1_rtx); else - ix86_expand_unary_operator (NOT, mode, &operands[3]); + ix86_expand_unary_operator (NOT, mode, &operands[3], + TARGET_APX_NDD); } else - ix86_expand_binary_operator (, mode, &operands[3]); + ix86_expand_binary_operator (, mode, &operands[3], + TARGET_APX_NDD); DONE; -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,r,r,?k") (any_or:SWI248 - (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,k") - (match_operand:SWI248 2 "" "r,,k"))) + (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,rm,r,k") + (match_operand:SWI248 2 "" "r,,r,,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + 
"ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" "@ {}\t{%2, %0|%0, %2} {}\t{%2, %0|%0, %2} + {}\t{%2, %1, %0|%0, %1, %2} + {}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "isa" "*,*,") - (set_attr "type" "alu, alu, msklog") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd,") + (set_attr "type" "alu, alu, alu, alu, msklog") (set_attr "mode" "")]) (define_insn_and_split "*notxor_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,r,r,?k") (not:SWI248 (xor:SWI248 - (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,k") - (match_operand:SWI248 2 "" "r,,k")))) + (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,rm,r,k") + (match_operand:SWI248 2 "" "r,,r,,k")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (XOR, mode, operands)" + "ix86_binary_operator_ok (XOR, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel @@ -12792,8 +12809,8 @@ (define_insn_and_split "*notxor_1" DONE; } } - [(set_attr "isa" "*,*,") - (set_attr "type" "alu, alu, msklog") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd,") + (set_attr "type" "alu, alu, alu, alu, msklog") (set_attr "mode" "")]) (define_insn_and_split "*iordi_1_bts" @@ -12881,44 +12898,55 @@ (define_insn_and_split "*xor2andn" ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*si_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + 
{l}\t{%2, %1, %k0|%k0, %1, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*si_1_zext_imm" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (any_or:DI - (zero_extend:DI (match_operand:SI 1 "register_operand" "%0")) - (match_operand:DI 2 "x86_64_zext_immediate_operand" "Z"))) + (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "%0,rm")) + (match_operand:DI 2 "x86_64_zext_immediate_operand" "Z,Z"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*qi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") - (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") + (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, QImode, operands)" + "ix86_binary_operator_ok (, QImode, operands, TARGET_APX_NDD)" "@ {b}\t{%2, %0|%0, %2} {b}\t{%2, %0|%0, %2} {l}\t{%k2, %k0|%k0, %k2} + {b}\t{%2, %1, %0|%0, %1, %2} + {b}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "isa" "*,*,*,avx512f") - (set_attr "type" "alu,alu,alu,msklog") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd,avx512f") + (set_attr "type" "alu,alu,alu,alu,alu,msklog") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -12930,12 
+12958,12 @@ (define_insn "*qi_1" (symbol_ref "true")))]) (define_insn_and_split "*notxorqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") (not:QI - (xor:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k")))) + (xor:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (XOR, QImode, operands)" + "ix86_binary_operator_ok (XOR, QImode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel @@ -12951,12 +12979,12 @@ (define_insn_and_split "*notxorqi_1" DONE; } } - [(set_attr "isa" "*,*,*,avx512f") - (set_attr "type" "alu,alu,alu,msklog") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd,avx512f") + (set_attr "type" "alu,alu,alu,alu,alu,msklog") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -13004,44 +13032,59 @@ (define_split (define_insn "*_2" [(set (reg FLAGS_REG) (compare (any_or:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (any_or:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, mode, operands)" - "{}\t{%2, %0|%0, %2}" + && ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" + "@ + {}\t{%2, %0|%0, %2} + {}\t{%2, %0|%0, %2} + {}\t{%2, %1, %0|%0, %1, %2} + {}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,*,apx_ndd,apx_ndd") (set_attr "mode" "")]) ;; See comment for addsi_1_zext why 
we do use nonimmediate_operand ;; ??? Special case for immediate operand is missing - it is tricky. (define_insn "*si_2_zext" [(set (reg FLAGS_REG) - (compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (any_or:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*si_2_zext_imm" [(set (reg FLAGS_REG) (compare (any_or:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_zext_immediate_operand" "Z")) + (match_operand:SI 1 "nonimmediate_operand" "%0,rm") + (match_operand:SI 2 "x86_64_zext_immediate_operand" "Z,Z")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (any_or:DI (zero_extend:DI (match_dup 1)) (match_dup 2)))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*_3" @@ -13062,6 +13105,7 @@ (define_insn "*_3" ;; Don't do the splitting with memory operands, since it introduces risk ;; of memory mismatch stalls. 
We may want to do the splitting for optimizing ;; for size, but that can (should?) be handled by generic code instead. +;; Don't do the splitting for APX NDD as NDD does not support *h registers. (define_split [(set (match_operand:SWI248 0 "QIreg_operand") (any_or:SWI248 (match_operand:SWI248 1 "register_operand") @@ -13069,7 +13113,8 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(INTVAL (operands[2]) & ~(255 << 8))" + && !(INTVAL (operands[2]) & ~(255 << 8)) + && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))" [(parallel [(set (zero_extract:HI (match_dup 0) (const_int 8) @@ -13107,7 +13152,9 @@ (define_split "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) && !(INTVAL (operands[2]) & ~255) - && (INTVAL (operands[2]) & 128)" + && (INTVAL (operands[2]) & 128) + && !(TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]))" [(parallel [(set (strict_low_part (match_dup 0)) (any_or:QI (match_dup 1) (match_dup 2))) @@ -14173,20 +14220,21 @@ (define_split (define_insn "*one_cmplsi2_2_zext" [(set (reg FLAGS_REG) - (compare (not:SI (match_operand:SI 1 "register_operand" "0")) + (compare (not:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (not:SI (match_dup 1))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_unary_operator_ok (NOT, SImode, operands)" + && ix86_unary_operator_ok (NOT, SImode, operands, TARGET_APX_NDD)" "#" [(set_attr "type" "alu1") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_split [(set (match_operand 0 "flags_reg_operand") (match_operator 2 "compare_operator" - [(not:SI (match_operand:SI 3 "register_operand")) + [(not:SI (match_operand:SI 3 "nonimmediate_operand")) (const_int 0)])) (set (match_operand:DI 1 "register_operand") 
(zero_extend:DI (not:SI (match_dup 3))))] diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index be436d57bdf..d97648c876d 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -94,6 +94,24 @@ FOO (int, and, &) FOO1 (int, and, &) FOO (long, and, &) FOO1 (long, and, &) + +FOO (char, or, |) +FOO1 (char, or, |) +FOO (short, or, |) +FOO1 (short, or, |) +FOO (int, or, |) +FOO1 (int, or, |) +FOO (long, or, |) +FOO1 (long, or, |) + +FOO (char, xor, ^) +FOO1 (char, xor, ^) +FOO (short, xor, ^) +FOO1 (short, xor, ^) +FOO (int, xor, ^) +FOO1 (int, xor, ^) +FOO (long, xor, ^) +FOO1 (long, xor, ^) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -108,3 +126,11 @@ FOO1 (long, and, &) /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "orb\[^\n\r]*1, \\(%rdi\\), %al" 2} } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 6 } } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "xorb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */ +/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times 
"xor(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ From patchwork Wed Dec 6 08:06:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174397 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3955344vqy; Wed, 6 Dec 2023 00:11:50 -0800 (PST) X-Google-Smtp-Source: AGHT+IEzTcFDCMhKyr0Q0b0/DHXWAe1Rh8atZ0Fk/AptYroJTuxndxO4ztrEDxGADEd8rCp1nRuK X-Received: by 2002:a25:d4c9:0:b0:db7:d007:d29 with SMTP id m192-20020a25d4c9000000b00db7d0070d29mr426829ybf.5.1701850310758; Wed, 06 Dec 2023 00:11:50 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850310; cv=pass; d=google.com; s=arc-20160816; b=szWO8nfnjyQ8p3zBV9H867i5dfzG/hMJ8qj8SFwX+qfQH1JyWdle5Ipe/b+3ajLuWD knmow7FLZUs0B//wXhHt2EYFoEJZn1FBhs2yhATv9wteoUZmnoJKaT/CSzvIjUoeW8+8 YSZQE5NFOs03h1jhbB5hG5AsYu15m7fPxFN3RytCr5W+at0hgVdeZ3guftTpIY5X47QO 37VzHkkRW5svzd/sg7HJMEuQP0J8J7exZ79SbAajCKGxHYEKYcvSa6LQXVD3YgBhOGI3 AjTQkBdYnSwSdkH5MvCOzq9gyxmfOwO51T2W8/x49lJV479ZyRpqFK3BprmqvdF2qGN2 PtZQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=KxiJR/tigEFJoDX/gVCMCsCjxgfTfgJGNQSXvKWK4Pc=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=NGXa2GjQweSeXw1D+CIm171GksYiWKCTITYKNJIMep7yGt6WjdWpIm44f2S+MOF/7E GnD7X4049kH/OOLefVbfnzlTiPoAz/VFpN995yR3jc7rCjzOMYVl/Q9jdXJilqZ+JBAb ZJaokpo9OB5P1nh26TWF3iliVEl3imjtwPB3WUFgRn/mxenkLQgB0nMnB+coIwyEev4p OSRayPHac202KYvIiqLRFBvTeadN5XDYSx84H3M3ErIONWgScA+WP6Qlf8lXwGygzZ7i 
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 11/16] [APX NDD] Support APX NDD for left shift insns
Date: Wed, 6 Dec 2023 16:06:31 +0800
Message-Id: <20231206080636.178863-12-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com>
References: <20231206080636.178863-1-hongyu.wang@intel.com>

For left shift, the TARGET_DOUBLE_WITH_ADD optimization turns a shift left by 1 into an add. The NDD form of add requires a source operand to be a register, since NDD cannot take two memory sources, so for now we keep using the NDD form of the shift instead of add. The TARGET_SHIFT1 optimization drops the constant 1 to use a shorter opcode, but for NDD the assembler picks that encoding automatically whether or not $1 is present, so the NDD alternatives are excluded from it.
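As an illustration only (not part of the patch), the equivalence that TARGET_DOUBLE_WITH_ADD relies on can be sketched in C. The helper names below are hypothetical. The first function has the load-and-shift shape that the new tests scan for as a single sal with a memory source and a separate register destination; the second is the add form, which would need the same memory operand twice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shift a value loaded from memory left by one.
   Under APX NDD this is the shape the tests expect to stay a shift,
   e.g. something matching "sal... $1, (%rdi), %eax".  */
static uint32_t
shl1_via_shift (const uint32_t *src)
{
  return *src << 1;
}

/* Hypothetical helper: the same value computed as src + src, which is
   what the TARGET_DOUBLE_WITH_ADD rewrite corresponds to.  An NDD add
   would need the memory operand as both sources.  */
static uint32_t
shl1_via_add (const uint32_t *src)
{
  return *src + *src;
}

/* Check that both forms agree for one input (unsigned arithmetic wraps,
   so the comparison is well defined for every value).  */
static void
check_shift_equals_add (uint32_t value)
{
  assert (shl1_via_shift (&value) == shl1_via_add (&value));
}
```

Unsigned types are used so that the overflow case is defined behaviour; the actual instruction selection depends on the compiler flags and is not asserted here.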
The doubleword insns for left shift call ix86_expand_ashl, which assumes that all shift-related patterns have the same operands[0] and operands[1]. These patterns will be supported in a standalone patch.

gcc/ChangeLog:

 * config/i386/i386.md (*ashl3_1): Extend with new alternatives to support NDD, limit the new alternative to generate sal only, and adjust output template for NDD.
 (*ashlsi3_1_zext): Likewise.
 (*ashlhi3_1): Likewise.
 (*ashlqi3_1): Likewise.
 (*ashl3_cmp): Likewise.
 (*ashlsi3_cmp_zext): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative.
 (*ashl3_cconly): Likewise.
 (*ashl3_doubleword_highpart): Adjust codegen for NDD.

gcc/testsuite/ChangeLog:

 * gcc.target/i386/apx-ndd.c: Add tests for sal.
---
 gcc/config/i386/i386.md | 172 ++++++++++++++++--------
 gcc/testsuite/gcc.target/i386/apx-ndd.c | 22 +++
 2 files changed, 136 insertions(+), 58 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ad4c958a1e8..c67896cf97c 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -14472,10 +14472,19 @@ (define_insn_and_split "*ashl3_doubleword_highpart"
{ split_double_mode (mode, &operands[0], 1, &operands[0], &operands[3]); int bits = INTVAL (operands[2]) - ( * BITS_PER_UNIT); - if (!rtx_equal_p (operands[3], operands[1])) - emit_move_insn (operands[3], operands[1]); - if (bits > 0) - emit_insn (gen_ashl3 (operands[3], operands[3], GEN_INT (bits))); + bool op_equal_p = rtx_equal_p (operands[3], operands[1]); + if (bits == 0) + { + if (!op_equal_p) + emit_move_insn (operands[3], operands[1]); + } + else + { + if (!op_equal_p && !TARGET_APX_NDD) + emit_move_insn (operands[3], operands[1]); + rtx op_tmp = TARGET_APX_NDD ?
operands[1] : operands[3]; + emit_insn (gen_ashl3 (operands[3], op_tmp, GEN_INT (bits))); + } ix86_expand_clear (operands[0]); DONE; }) @@ -14782,12 +14791,14 @@ (define_insn "*bmi2_ashl3_1" (set_attr "mode" "")]) (define_insn "*ashl3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,?k") - (ashift:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,l,rm,k") - (match_operand:QI 2 "nonmemory_operand" "c,M,r,"))) + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,?k,r") + (ashift:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,l,rm,k,rm") + (match_operand:QI 2 "nonmemory_operand" "c,M,r,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, mode, operands)" + "ix86_binary_operator_ok (ASHIFT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14802,18 +14813,25 @@ (define_insn "*ashl3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + /* For NDD form instructions related to TARGET_SHIFT1, the $1 + immediate do not need to be omitted as assembler will map it + to use shorter encoding. */ + && !use_ndd) return "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,bmi2,") + [(set_attr "isa" "*,*,bmi2,,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "ishiftx") + (eq_attr "alternative" "4") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -14855,13 +14873,15 @@ (define_insn "*bmi2_ashlsi3_1_zext" (set_attr "mode" "SI")]) (define_insn "*ashlsi3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r") (zero_extend:DI - (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,l,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,M,r")))) + (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,l,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,M,r,cI")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (ASHIFT, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (ASHIFT, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14874,18 +14894,22 @@ (define_insn "*ashlsi3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{l}\t%k0"; else - return "sal{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"sal{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sal{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,*,bmi2") + [(set_attr "isa" "*,*,bmi2,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "ishiftx") + (eq_attr "alternative" "3") + (const_string "ishift") (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 2 "const1_operand")) (const_string "alu") @@ -14915,12 +14939,14 @@ (define_split "operands[2] = gen_lowpart (SImode, operands[2]);") (define_insn "*ashlhi3_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,Yp,?k") - (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,l,k") - (match_operand:QI 2 "nonmemory_operand" "cI,M,Ww"))) + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,Yp,?k,r") + (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,l,k,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,M,Ww,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, HImode, operands)" + "ix86_binary_operator_ok (ASHIFT, HImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14933,18 +14959,22 @@ (define_insn "*ashlhi3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{w}\t%0"; else - return "sal{w}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sal{w}\t{%2, %1, %0|%0, %1, %2}" + : "sal{w}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,avx512f") + [(set_attr "isa" "*,*,avx512f,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "msklog") + (eq_attr "alternative" "3") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -14960,15 +14990,17 @@ (define_insn "*ashlhi3_1" (match_test "optimize_function_for_size_p (cfun)"))))) (const_string "0") (const_string "*"))) - (set_attr "mode" "HI,SI,HI")]) + (set_attr "mode" "HI,SI,HI,HI")]) (define_insn "*ashlqi3_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,Yp,?k") - (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,l,k") - (match_operand:QI 2 "nonmemory_operand" "cI,cI,M,Wb"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,Yp,?k,r") + (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,l,k,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,cI,M,Wb,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, QImode, operands)" + "ix86_binary_operator_ok (ASHIFT, QImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14984,7 +15016,8 @@ (define_insn "*ashlqi3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) { if (get_attr_mode (insn) == MODE_SI) return "sal{l}\t%k0"; @@ -14996,16 +15029,19 @@ (define_insn "*ashlqi3_1" if (get_attr_mode (insn) == MODE_SI) return "sal{l}\t{%2, %k0|%k0, %2}"; else - return "sal{b}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sal{b}\t{%2, %1, %0|%0, %1, %2}" + : "sal{b}\t{%2, %0|%0, %2}"; } } } - [(set_attr "isa" "*,*,*,avx512dq") + [(set_attr "isa" "*,*,*,avx512dq,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "2") (const_string "lea") (eq_attr "alternative" "3") (const_string "msklog") + (eq_attr "alternative" "4") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -15021,10 +15057,10 @@ (define_insn "*ashlqi3_1" (match_test "optimize_function_for_size_p (cfun)"))))) (const_string "0") (const_string "*"))) - (set_attr "mode" "QI,SI,SI,QI") + (set_attr "mode" "QI,SI,SI,QI,QI") ;; Potential partial reg stall on alternative 1. (set (attr "preferred_for_speed") - (cond [(eq_attr "alternative" "1") + (cond [(eq_attr "alternative" "1,4") (symbol_ref "!TARGET_PARTIAL_REG_STALL")] (symbol_ref "true")))]) @@ -15119,10 +15155,10 @@ (define_split (define_insn "*ashl3_cmp" [(set (reg FLAGS_REG) (compare - (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "" "")) + (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (ashift:SWI (match_dup 1) (match_dup 2)))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL @@ -15130,8 +15166,10 @@ (define_insn "*ashl3_cmp" && (TARGET_SHIFT1 || (TARGET_DOUBLE_WITH_ADD && REG_P (operands[0]))))) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (ASHIFT, mode, operands)" + && ix86_binary_operator_ok (ASHIFT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ALU: @@ -15140,14 +15178,19 @@ (define_insn "*ashl3_cmp" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 
|| optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") - (cond [(and (and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) (const_string "alu") @@ -15167,10 +15210,10 @@ (define_insn "*ashl3_cmp" (define_insn "*ashlsi3_cmp_zext" [(set (reg FLAGS_REG) (compare - (ashift:SI (match_operand:SI 1 "register_operand" "0") + (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") (match_operand:QI 2 "const_1_to_31_operand")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (ashift:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && (optimize_function_for_size_p (cfun) @@ -15179,8 +15222,10 @@ (define_insn "*ashlsi3_cmp_zext" && (TARGET_SHIFT1 || TARGET_DOUBLE_WITH_ADD))) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (ASHIFT, SImode, operands)" + && ix86_binary_operator_ok (ASHIFT, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ALU: @@ -15189,14 +15234,19 @@ (define_insn "*ashlsi3_cmp_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{l}\t%k0"; else - return "sal{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"sal{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sal{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") - (cond [(and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 2 "const1_operand")) (const_string "alu") ] @@ -15215,10 +15265,10 @@ (define_insn "*ashlsi3_cmp_zext" (define_insn "*ashl3_cconly" [(set (reg FLAGS_REG) (compare - (ashift:SWI (match_operand:SWI 1 "register_operand" "0") - (match_operand:QI 2 "" "")) + (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx @@ -15226,22 +15276,28 @@ (define_insn "*ashl3_cconly" || TARGET_DOUBLE_WITH_ADD))) && ix86_match_ccmode (insn, CCGOCmode)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ALU: gcc_assert (operands[2] == const1_rtx); return "add{}\t%0, %0"; - default: + default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") - (cond [(and (and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) (const_string "alu") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index d97648c876d..9951fb00a4c 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -29,6 +29,16 @@ foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ return c; \ } +#define FOO3(TYPE, OP_NAME, OP, IMM) \ +TYPE \ +__attribute__ ((noipa)) \ +foo3_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = a OP IMM; \ + return b; \ +} + + #define F(TYPE, OP_NAME, OP) \ TYPE \ __attribute__ ((noipa)) \ @@ -112,6 +122,16 @@ FOO (int, xor, ^) FOO1 (int, xor, ^) FOO (long, xor, ^) FOO1 (long, xor, ^) + +FOO (char, shl, <<) +FOO3 (char, shl, <<, 7) +FOO (short, shl, <<) +FOO3 (short, shl, <<, 7) +FOO (int, shl, <<) +FOO3 (int, shl, <<, 7) +FOO (long, shl, <<) +FOO3 (long, shl, <<, 7) + /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -134,3 +154,5 @@ FOO1 (long, xor, ^) /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "sal(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } 
*/ +/* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ From patchwork Wed Dec 6 08:06:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174398 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3955436vqy; Wed, 6 Dec 2023 00:12:04 -0800 (PST) X-Google-Smtp-Source: AGHT+IEkNGPxpLqxUqLixOyk73KncZF7C1iGc4zve4igNhGUt7muZKJKinvpdlKc5Eo10hLaBpuc X-Received: by 2002:a25:ad17:0:b0:db7:dacf:4d62 with SMTP id y23-20020a25ad17000000b00db7dacf4d62mr213362ybi.94.1701850324453; Wed, 06 Dec 2023 00:12:04 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850324; cv=pass; d=google.com; s=arc-20160816; b=A84HG3hcJPfwL4pKQamC5uGHyKJ30fYPjb+UyrVPx+M4m+Avm9Yd+B6L0OWbCILmA/ 6qOYMDxUNghymnac8JvIY8bLRUe8B8XgQphzU05kk6rrLg/ZZNz9aHE4sR2Eq0BwgQ9E v39ctqce3xoHC2qruRG0bnN6wONN0LpVLbe6aUttLneDrRxyVjikPC0ZJk6hBfz0lCM4 Kn76g6bPNk1ML+4nZnIsdoQtUVChHpffBmw4ZTB9MFTPOvjHzXn0iPbWodc721toyrA0 ZP6biywbluNUsSh5jBMlVi8YA4cOccWjnPf6DiQykN5jUdsbf5gjB62TmnOJ8Ln1hNI+ duew== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=DxxyMgPGv9Cd+AvbKrwq+KDSQew88YiMzN+b1YnW/J8=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=yW+4tWOKSx3aynAUqXYFChF40GM2UWb581HywK9AqFmUUKHT0kHyBlG4siZxkMb5w2 +GF1STyXHpTYuG9qXZnAUVlx1YWIZfZ0jXB38LCv8MK0IVzMHZrp7tmAQE44YYR0fq6q JP+827OGneutaxdV48duGtSrVCLK6dWi1oV1Bk++pRrnCi3iztpEa7vVa0bJqb3WKeDw K3aueDUiTAcwA74QBYhwhMR5bjPRQJQPRm2SbdqDgk7Izce+rD9darXhuvKP7AQOruIG ao/u3JDFVaRsmsFcBAw+gcILCPlaCkNm//aXdmdKW9Hab2gnKXppAx8rrxeDSCX9JVQ1 cBXw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass 
shvmail03.sh.intel.com (Postfix) with ESMTP id AA9731007810; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 12/16] [APX NDD] Support APX NDD for right shift insns Date: Wed, 6 Dec 2023 16:06:32 +0800 Message-Id: <20231206080636.178863-13-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519405709240391 X-GMAIL-MSGID: 1784519405709240391 Similar to LSHIFT, rshift do not need to omit $1 for NDD form. gcc/ChangeLog: * config/i386/i386.md (ashr3_cvt): Extend with new alternatives to support NDD, and adjust output templates. (*ashr3_1): Likewise for SI/DI mode. (*lshr3_1): Likewise. (*si3_1_zext): Likewise. (*ashr3_1): Likewise for QI/HI mode. (*lshrqi3_1): Likewise. (*lshrhi3_1): Likewise. (3_cmp): Likewise. (*3_cconly): Likewise. (*ashrsi3_cvt_zext): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative. (*highpartdisi2): Likewise. (*si3_cmp_zext): Likewise. (3_carry): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add l/ashiftrt tests. 
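For illustration only, here is a minimal C sketch of the source shapes that the new sar/shr tests exercise. The function names are hypothetical stand-ins and are not the FOO/FOO3 expansions from apx-ndd.c. The assumption is that, with -mapxf, the compiler may pick the three-operand NDD form (for example "sar $7, %edi, %eax") instead of a mov followed by a two-operand shift; the sketch only checks the C-level results, not the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the test macros; names are illustrative.  */

/* Memory source, count 1: the shape the "1, (%rdi), %eax" scans look for.  */
int32_t
sar1_from_mem (const int32_t *a)
{
  return *a >> 1;
}

/* Register source, count 7: the shape the "7, %edi, %eax" scans look for.  */
int32_t
sar7_from_reg (int32_t a)
{
  return a >> 7;
}

/* Unsigned operand, so the shift is logical (shr rather than sar).  */
uint32_t
shr7_from_reg (uint32_t a)
{
  return a >> 7;
}

/* Small self-check of the arithmetic, independent of instruction selection.  */
void
check_right_shifts (void)
{
  int32_t v = 256;
  assert (sar1_from_mem (&v) == 128);
  assert (sar7_from_reg (1024) == 8);
  assert (shr7_from_reg (0x80000000u) == 0x01000000u);
  /* Right-shifting a negative value is implementation-defined in ISO C;
     GCC documents it as an arithmetic (sign-extending) shift.  */
  assert (sar7_from_reg (-1024) == -8);
}
```

Whether the NDD encoding is actually chosen depends on the target flags and register allocation, which is why the real test relies on scan-assembler-times patterns rather than runtime checks.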
--- gcc/config/i386/i386.md | 232 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 24 +++ 2 files changed, 166 insertions(+), 90 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index c67896cf97c..d1eae7248d9 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -15808,39 +15808,45 @@ (define_mode_attr cvt_mnemonic [(SI "{cltd|cdq}") (DI "{cqto|cqo}")]) (define_insn "ashr3_cvt" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "*a,0") + (match_operand:SWI48 1 "nonimmediate_operand" "*a,0,rm") (match_operand:QI 2 "const_int_operand"))) (clobber (reg:CC FLAGS_REG))] "INTVAL (operands[2]) == GET_MODE_BITSIZE (mode)-1 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + && ix86_binary_operator_ok (ASHIFTRT, mode, operands, + TARGET_APX_NDD)" "@ - sar{}\t{%2, %0|%0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr "modrm" "0,1") + sar{}\t{%2, %0|%0, %2} + sar{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "")]) (define_insn "*ashrsi3_cvt_zext" - [(set (match_operand:DI 0 "register_operand" "=*d,r") + [(set (match_operand:DI 0 "register_operand" "=*d,r,r") (zero_extend:DI - (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0") + (ashiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "*a,0,rm") (match_operand:QI 2 "const_int_operand")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && INTVAL (operands[2]) == 31 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, SImode, operands)" + && ix86_binary_operator_ok 
(ASHIFTRT, SImode, operands, + TARGET_APX_NDD)" "@ {cltd|cdq} - sar{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr "modrm" "0,1") + sar{l}\t{%2, %k0|%k0, %2} + sar{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "SI")]) (define_expand "@x86_shift_adj_3" @@ -15882,13 +15888,15 @@ (define_insn "*bmi2_3_1" (set_attr "mode" "")]) (define_insn "*ashr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "c,r"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15896,14 +15904,16 @@ (define_insn "*ashr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15916,8 +15926,8 @@ (define_insn "*ashr3_1" ;; Specialization of *lshr3_1 below, extracting the SImode ;; highpart of a DI to be extracted, but allowing it to be clobbered. (define_insn_and_split "*highpartdisi2" - [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k") 0) - (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k") + [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k,r") 0) + (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "0,0,k,rm") (const_int 32))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" @@ -15936,16 +15946,20 @@ (define_insn_and_split "*highpartdisi2" DONE; } operands[0] = gen_rtx_REG (DImode, REGNO (operands[0])); -}) +} +[(set_attr "isa" "*,*,*,apx_ndd")]) + (define_insn "*lshr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k,r") (lshiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k") - (match_operand:QI 2 "nonmemory_operand" "c,r,"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, mode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15954,14 +15968,16 @@ (define_insn "*lshr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "shr{}\t%0"; else - return "shr{}\t{%2, %0|%0, %2}"; + return use_ndd 
? "shr{}\t{%2, %1, %0|%0, %1, %2}" + : "shr{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2,") - (set_attr "type" "ishift,ishiftx,msklog") + [(set_attr "isa" "*,bmi2,,apx_ndd") + (set_attr "type" "ishift,ishiftx,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15994,13 +16010,15 @@ (define_insn "*bmi2_si3_1_zext" (set_attr "mode" "SI")]) (define_insn "*si3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,r")))) + (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,r,cI")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -16008,14 +16026,16 @@ (define_insn "*si3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16038,20 +16058,25 @@ (define_split "operands[2] = gen_lowpart (SImode, operands[2]);") (define_insn "*ashr3_1" - [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m") + [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m, r") (ashiftrt:SWI12 - (match_operand:SWI12 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + (match_operand:SWI12 1 "nonimmediate_operand" "0, rm") + (match_operand:QI 2 "nonmemory_operand" "c, c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16062,29 +16087,33 @@ (define_insn "*ashr3_1" (set_attr "mode" "")]) (define_insn "*lshrqi3_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k") + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k,r") (lshiftrt:QI - (match_operand:QI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI,Wb"))) + (match_operand:QI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI,Wb,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, QImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, QImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "shr{b}\t%0"; else - return "shr{b}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"shr{b}\t{%2, %1, %0|%0, %1, %2}" + : "shr{b}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*,avx512dq") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*,avx512dq,apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -16096,29 +16125,33 @@ (define_insn "*lshrqi3_1" (set_attr "mode" "QI")]) (define_insn "*lshrhi3_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k") + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k, r") (lshiftrt:HI - (match_operand:HI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI, Ww"))) + (match_operand:HI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI, Ww, cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, HImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, HImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "shr{w}\t%0"; else - return "shr{w}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"shr{w}\t{%2, %1, %0|%0, %1, %2}" + : "shr{w}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*, avx512f") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*, avx512f, apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -16171,25 +16204,30 @@ (define_insn "*3_cmp" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (any_shiftrt:SWI (match_dup 1) (match_dup 2)))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, mode, operands)" + && ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16202,10 +16240,10 @@ (define_insn "*3_cmp" (define_insn "*si3_cmp_zext" [(set (reg FLAGS_REG) (compare - (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0") + (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") (match_operand:QI 2 "const_1_to_31_operand")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (any_shiftrt:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && (optimize_function_for_size_p (cfun) @@ -16213,15 +16251,20 @@ (define_insn "*si3_cmp_zext" || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, SImode, operands)" + && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16235,23 +16278,28 @@ (define_insn "*3_cconly" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "register_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd + ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16855,18 +16903,22 @@ (define_insn "rcrdi2" ;; Versions of sar and shr that set the carry flag. 
(define_insn "3_carry" [(set (reg:CCC FLAGS_REG) - (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0") + (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") (const_int 1)) (const_int 0)] UNSPEC_CC_NE)) - (set (match_operand:SWI48 0 "register_operand" "=r") + (set (match_operand:SWI48 0 "register_operand" "=r,r") (any_shiftrt:SWI48 (match_dup 1) (const_int 1)))] "" { - if (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; + if ((TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; - return "{}\t{1, %0|%0, 1}"; + return use_ndd ? "{}\t{$1, %1, %0|%0, %1, 1}" + : "{}\t{$1, %0|%0, 1}"; } - [(set_attr "type" "ishift1") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift1") (set (attr "length_immediate") (if_then_else (ior (match_test "TARGET_SHIFT1") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 9951fb00a4c..239c427514a 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -2,6 +2,8 @@ /* { dg-options "-mapxf -march=x86-64 -O2" } */ /* { dg-final { scan-assembler-not "movl"} } */ +#include + #define FOO(TYPE, OP_NAME, OP) \ TYPE \ __attribute__ ((noipa)) \ @@ -132,6 +134,24 @@ FOO3 (int, shl, <<, 7) FOO (long, shl, <<) FOO3 (long, shl, <<, 7) +FOO (char, sar, >>) +FOO3 (char, sar, >>, 7) +FOO (short, sar, >>) +FOO3 (short, sar, >>, 7) +FOO (int, sar, >>) +FOO3 (int, sar, >>, 7) +FOO (long, sar, >>) +FOO3 (long, sar, >>, 7) + +FOO (uint8_t, shr, >>) +FOO3 (uint8_t, shr, >>, 7) +FOO (uint16_t, shr, >>) +FOO3 (uint16_t, shr, >>, 7) +FOO (uint32_t, shr, >>) +FOO3 (uint32_t, shr, >>, 7) +FOO (uint64_t, shr, >>) +FOO3 (uint64_t, shr, >>, 7) + /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), 
%(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -156,3 +176,7 @@ FOO3 (long, shl, <<, 7) /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "sal(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */

From patchwork Wed Dec 6 08:06:33 2023
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 174401
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 13/16] [APX NDD] Support APX NDD for rotate insns
Date: Wed, 6 Dec 2023 16:06:33 +0800
Message-Id: <20231206080636.178863-14-hongyu.wang@intel.com>
In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com>
References: <20231206080636.178863-1-hongyu.wang@intel.com>

gcc/ChangeLog:

	* config/i386/i386.md (*<insn><mode>3_1): Extend with a new
	alternative to support NDD for SI/DI rotate, and adjust output
	template.
	(*<insn>si3_1_zext): Likewise.
	(*<insn><mode>3_1): Likewise for QI/HI modes.
	(rcrsi2): Likewise, and use nonimmediate_operand for operands[1]
	to accept memory input for NDD alternative.
	(rcrdi2): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add test for left/right rotate.
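As a rough illustration of the rotate tests, the sketch below spells out the shift-and-or idiom that the FOO4 macro in this patch builds, (a OP1 IMM1 | a OP2 (8 * sizeof(TYPE) - IMM1)), for a count of 1. The function names are hypothetical and chosen only for this example. The assumption is that GCC recognizes this idiom as a rotate and, with -mapxf, may emit the NDD form of ror/rol; the checks below only confirm the C-level results.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for FOO4 expansions; names are illustrative.  */

/* Rotate right by 1: low bit wraps around to the top bit.  */
uint32_t
ror1_u32 (uint32_t a)
{
  return (a >> 1) | (a << (8 * sizeof (uint32_t) - 1));
}

/* Rotate left by 1: top bit wraps around to the low bit.  */
uint32_t
rol1_u32 (uint32_t a)
{
  return (a << 1) | (a >> (8 * sizeof (uint32_t) - 1));
}

/* Narrow types are promoted to int before shifting, so the result is
   truncated back to 8 bits, as the assignment to TYPE does in FOO4.  */
uint8_t
ror1_u8 (uint8_t a)
{
  return (uint8_t) ((a >> 1) | (a << (8 * sizeof (uint8_t) - 1)));
}

uint8_t
rol1_u8 (uint8_t a)
{
  return (uint8_t) ((a << 1) | (a >> (8 * sizeof (uint8_t) - 1)));
}

/* Small self-check of the rotate arithmetic.  */
void
check_rotates (void)
{
  assert (ror1_u32 (1u) == 0x80000000u);
  assert (rol1_u32 (0x80000000u) == 1u);
  assert (ror1_u8 (0x01) == 0x80);
  assert (rol1_u8 (0x81) == 0x03);
}
```

The count is kept at 1 to match the FOO4 invocations in the test; the same idiom works for any count strictly between 0 and the bit width of the type.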
--- gcc/config/i386/i386.md | 79 +++++++++++++++---------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 20 +++++++ 2 files changed, 69 insertions(+), 30 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index d1eae7248d9..6e4ac776f8a 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -16667,13 +16667,15 @@ (define_insn "*bmi2_rorx3_1" (set_attr "mode" "")]) (define_insn "*3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r") (any_rotate:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "c,"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "c,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ROTATEX: @@ -16681,14 +16683,16 @@ (define_insn "*3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "rotate,rotatex") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "rotate,rotatex,rotate") (set (attr "preferred_for_size") (cond [(eq_attr "alternative" "0") (symbol_ref "true")] @@ -16738,13 +16742,14 @@ (define_insn "*bmi2_rorxsi3_1_zext" (set_attr "mode" "SI")]) (define_insn "*si3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,I")))) + (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,I,cI")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; switch (get_attr_type (insn)) { case TYPE_ROTATEX: @@ -16752,14 +16757,16 @@ (define_insn "*si3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "rotate,rotatex") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "rotate,rotatex,rotate") (set (attr "preferred_for_size") (cond [(eq_attr "alternative" "0") (symbol_ref "true")] @@ -16803,19 +16810,25 @@ (define_split (zero_extend:DI (rotatert:SI (match_dup 1) (match_dup 2))))]) (define_insn "*3_1" - [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m") - (any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m,r") + (any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "nonmemory_operand" "c,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = get_attr_isa (insn) == ISA_APX_NDD; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd + ? 
"{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "rotate") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "rotate") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16872,31 +16885,37 @@ (define_split ;; Rotations through carry flag (define_insn "rcrsi2" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,r") (plus:SI - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0") + (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") (const_int 1)) (ashift:SI (ltu:SI (reg:CCC FLAGS_REG) (const_int 0)) (const_int 31)))) (clobber (reg:CC FLAGS_REG))] "" - "rcr{l}\t%0" - [(set_attr "type" "ishift1") + "@ + rcr{l}\t%0 + rcr{l}\t{%1, %0|%0, %1}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift1") (set_attr "memory" "none") (set_attr "length_immediate" "0") (set_attr "mode" "SI")]) (define_insn "rcrdi2" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (plus:DI - (lshiftrt:DI (match_operand:DI 1 "register_operand" "0") + (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "0,rm") (const_int 1)) (ashift:DI (ltu:DI (reg:CCC FLAGS_REG) (const_int 0)) (const_int 63)))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "rcr{q}\t%0" - [(set_attr "type" "ishift1") + "@ + rcr{q}\t%0 + rcr{q}\t{%1, %0|%0, %1}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift1") (set_attr "length_immediate" "0") (set_attr "mode" "DI")]) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 239c427514a..b215f66d3e2 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -40,6 +40,14 @@ foo3_##OP_NAME##_##TYPE (TYPE a) \ return b; \ } +#define FOO4(TYPE, OP_NAME, OP1, OP2, IMM1) \ +TYPE \ +__attribute__ ((noipa)) \ +foo4_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = (a OP1 IMM1 | a OP2 (8 * 
sizeof(TYPE) - IMM1)); \ + return b; \ +} #define F(TYPE, OP_NAME, OP) \ TYPE \ @@ -152,6 +160,16 @@ FOO3 (uint32_t, shr, >>, 7) FOO (uint64_t, shr, >>) FOO3 (uint64_t, shr, >>, 7) +FOO4 (uint8_t, ror, >>, <<, 1) +FOO4 (uint16_t, ror, >>, <<, 1) +FOO4 (uint32_t, ror, >>, <<, 1) +FOO4 (uint64_t, ror, >>, <<, 1) + +FOO4 (uint8_t, rol, <<, >>, 1) +FOO4 (uint16_t, rol, <<, >>, 1) +FOO4 (uint32_t, rol, <<, >>, 1) +FOO4 (uint64_t, rol, <<, >>, 1) + /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -180,3 +198,5 @@ FOO3 (uint64_t, shr, >>, 7) /* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "ror(?:b|l|w|q)\[^\n\r]*1, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "rol(?:b|l|w|q)\[^\n\r]*1, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ From patchwork Wed Dec 6 08:06:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174395 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3955195vqy; Wed, 6 Dec 2023 00:11:30 -0800 (PST) X-Google-Smtp-Source: AGHT+IGMW1iXNGbu7/Nu7rgWDJb9uah1SYQrC2Ci7zbrUz/ABiAbCmQgn7Pg3kfTCGgew41jKf6m X-Received: by 2002:a25:ac27:0:b0:db5:4ef8:540 with SMTP id w39-20020a25ac27000000b00db54ef80540mr358972ybi.41.1701850289845; Wed, 06 Dec 2023 00:11:29 -0800 (PST) ARC-Seal: 
i=2; a=rsa-sha256; t=1701850289; cv=pass; d=google.com; s=arc-20160816; b=uDNB16iEw3xhZQkIeni6n5VUeWILwLRm1HqXwnkd4YgbyWH/lGbobHlZ6nhgpPBKdG SEf+jRJdOvSmWGYGk7ne0u6wVS976whGR8LR8thsjBeihlwTr0rt2kmiyAfxyW36PlhO /JULXGHWjN67FtBClXGETGEZTJymhg1/e9fxnRwYKRyi01wPTb2Fjfc5lDUvAYTCkTmK VvWlaU6CIkOzSagKyZ4WvfeLnTknTtpvgWIeUYtFLPuQiOu8YKx0dVjTLvTtIGS6YkUO jeyt+fDyvTi4T4YCVoKGYO23mv2z2FinfopNg5DBAkBhn4mioKoA4EvtyLGkWTXxzgWs o6Lw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=Q4+UL1h/SLicjgyOACM0ZJFGVLSf6kYMw1oLJ7odQ1U=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=e3MUEntEZRfbJcgMbEHMZOW1X4omHeM6RAmkF45yjbNqGkIbDkQEwi02kOFmcgIVC7 UOY4/11Aje/8/NbhUOY4YjHMrJ6Xi5X0AX1D8tfObqVL14QfD5o9Qp93sacPMawKF87n RmoCgCDmjTARE7928gBAUBhcC1RhJ7Aou7xC3wV0t/RO8ajY2he5r+SBYc4mYZZbsDIp IDN5CHeTzaCQrAE6egPgxGX7xiP/WAwYVQuxNh2vXQ1kpt2zbyt5e/MmFctptEiJz5UZ /avlhjKvBt30hZGTOPc2DHA4s/QAS41MAwpida9HLoBL5+uy+u+lolUjbzDS9TqYV3Iz VuVQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="TuA2Z/zg"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id f2-20020a05620a280200b0077d7d163d7esi14955983qkp.148.2023.12.06.00.11.29 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:11:29 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="TuA2Z/zg"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 762FA3835026 for ; Wed, 6 Dec 2023 08:10:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id 10C103861895 for ; Wed, 6 Dec 2023 08:08:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 10C103861895 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 10C103861895 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850140; cv=none; b=TY+uvsOAFtUsE/IeJjtF6ui8aRaazE5d7pil8KFrbsSPQtZLYiggXtHs6aaBYTanrSMera+tsvwqqzz7t9E6w0hqkWR6q5VnZ9xzPcAmni+7f4hzYwf0wEKXgSHl6LXEWGPoB9aw25dwKrNIwTjg7vZZQDQEaQfmvlmqYe9f7SQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850140; c=relaxed/simple; 
bh=f8A9HdiT48dOUFOKSRu3sdeUI/tt5BBmbU7dn7Ho7Z0=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=vZDLOGjNwN27DwnjpzmW6yKjEFw9uHmHgPoSs70xH7aRl8juzZk5NQ7PyiaA7J7+lPahnkTXoNyY9Y2HUJbvs8xkp7Sprj0yraPFtk+J56pTO1h1Nl6WmTnUlS+Y9jKsa+gIAHNf4fnYB9B/icAMmf44SJfTkn02ZHUxGhC81Js= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850138; x=1733386138; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=f8A9HdiT48dOUFOKSRu3sdeUI/tt5BBmbU7dn7Ho7Z0=; b=TuA2Z/zgNAvdR3UwEqsVYdYvn1Ybx2vc4f+KWthkZqYQCfPk4P8iipWp lTr3wReCtjD7O7AExHSEZhzCXGpjYZtyhnKkkdvYwPUOmlTj/zoxU4yEO RYxJ3pC5R1AQ2pbnBkE8J+SCXAS46YSZtQpPyeOecdJAEIbmKoBcNtX1n aA2svQ2Sw2jO8Jf3WMiJIOqCGpG2d/dS8CwxtkjR22JBLjJt4d+gyBJjE SB/4Jfv1iDkUB9GEQZfPE7nzsfWmX/rFHsTw5EJEeRliIkQAsYzEGlKh5 kyAFkYuBZkG/jTqePBnFAL6BfuttUp+BIQ3dVPVI4f+5DWN6goXVPV9yW w==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085489" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085489" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:48 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1102737787" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1102737787" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga005.fm.intel.com with ESMTP; 06 Dec 2023 00:08:42 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id AFAA51007812; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 14/16] [APX NDD] Support APX NDD for shld/shrd insns Date: Wed, 6 Dec 2023 16:06:34 +0800 Message-Id: <20231206080636.178863-15-hongyu.wang@intel.com> X-Mailer: 
git-send-email 2.31.1 In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519369654397536 X-GMAIL-MSGID: 1784519369654397536 For shld/shrd insns, the old patterns use match_dup 0 as the shift source and +r*m as the constraint of that operand. To support NDD, add new define_insns that handle the NDD form, which takes an extra input operand and requires the destination operand to be a register. gcc/ChangeLog: * config/i386/i386.md (x86_64_shld_ndd): New define_insn. (x86_64_shld_ndd_1): Likewise. (*x86_64_shld_ndd_2): Likewise. (x86_shld_ndd): Likewise. (x86_shld_ndd_1): Likewise. (*x86_shld_ndd_2): Likewise. (x86_64_shrd_ndd): Likewise. (x86_64_shrd_ndd_1): Likewise. (*x86_64_shrd_ndd_2): Likewise. (x86_shrd_ndd): Likewise. (x86_shrd_ndd_1): Likewise. (*x86_shrd_ndd_2): Likewise. (*x86_64_shld_shrd_1_nozext): Adjust codegen under TARGET_APX_NDD. (*x86_shld_shrd_1_nozext): Likewise. (*x86_64_shrd_shld_1_nozext): Likewise. (*x86_shrd_shld_1_nozext): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-shld-shrd.c: New test.
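For readers less familiar with the double-shift insns, the difference between the legacy form and the NDD form described above can be sketched in plain C. This is only an illustrative model, not code from the patch: the function names (shld64_legacy_model, shld64_ndd_model, shrd64_ndd_model) are hypothetical, and the count handling assumes the usual 64-bit masking of the shift count to 6 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the legacy two-operand shld: the destination is
   also the value being shifted, which is what match_dup 0 expresses in
   the old patterns.  */
static void
shld64_legacy_model (uint64_t *dst, uint64_t src, unsigned count)
{
  count &= 63;                  /* 64-bit forms mask the count to 6 bits.  */
  if (count != 0)               /* Avoid an undefined shift by 64 in C.  */
    *dst = (*dst << count) | (src >> (64 - count));
}

/* Hypothetical model of the NDD form: the shifted source (src1) is a
   separate input, and the result is returned instead of overwriting
   either source.  */
static uint64_t
shld64_ndd_model (uint64_t src1, uint64_t src2, unsigned count)
{
  count &= 63;
  if (count == 0)
    return src1;
  return (src1 << count) | (src2 >> (64 - count));
}

/* Same idea for shrd: shift src1 right and fill from the low bits of
   src2.  */
static uint64_t
shrd64_ndd_model (uint64_t src1, uint64_t src2, unsigned count)
{
  count &= 63;
  if (count == 0)
    return src1;
  return (src1 >> count) | (src2 << (64 - count));
}

/* The NDD model matches the legacy model when the legacy destination
   starts out holding src1.  */
static void
shld_model_self_check (void)
{
  uint64_t dst = 1;
  shld64_legacy_model (&dst, UINT64_C (0x8000000000000000), 1);
  assert (dst == shld64_ndd_model (1, UINT64_C (0x8000000000000000), 1));
}
```

In this model the NDD variants leave both sources untouched, which corresponds to the new patterns using a separate "=r" output operand with an "rm" first input instead of a single "+r*m" operand.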
--- gcc/config/i386/i386.md | 322 +++++++++++++++++- .../gcc.target/i386/apx-ndd-shld-shrd.c | 24 ++ 2 files changed, 344 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 6e4ac776f8a..5c6275430d6 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14510,6 +14510,23 @@ (define_insn "x86_64_shld" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shld_ndd" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Jc") + (const_int 63))) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (minus:QI (const_int 64) + (and:QI (match_dup 3) (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shld{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI")]) + (define_insn "x86_64_shld_1" [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m") (ior:DI (ashift:DI (match_dup 0) @@ -14531,6 +14548,24 @@ (define_insn "x86_64_shld_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shld_ndd_1" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_63_operand")) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_255_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && INTVAL (operands[4]) == 64 - INTVAL (operands[3])" + "shld{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI") + (set_attr "length_immediate" "1")]) + + (define_insn_and_split "*x86_64_shld_shrd_1_nozext" [(set (match_operand:DI 0 
"nonimmediate_operand") (ior:DI (ashift:DI (match_operand:DI 4 "nonimmediate_operand") @@ -14556,6 +14591,23 @@ (define_insn_and_split "*x86_64_shld_shrd_1_nozext" operands[4] = force_reg (DImode, operands[4]); emit_insn (gen_x86_64_shrd_1 (operands[0], operands[4], operands[3], operands[2])); } + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (DImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (DImode, operands[1]); + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } else { operands[1] = force_reg (DImode, operands[1]); @@ -14588,6 +14640,33 @@ (define_insn_and_split "*x86_64_shld_2" (const_int 63)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_64_shld_ndd_2" + [(set (match_operand:DI 0 "nonimmediate_operand") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (lshiftrt:DI (match_operand:DI 2 "register_operand") + (minus:QI (const_int 64) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:DI (ashift:DI (match_dup 1) + (and:QI (match_dup 3) (const_int 63))) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI (match_dup 2)) + (minus:QI (const_int 64) + (and:QI (match_dup 3) + (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (DImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_insn "x86_shld" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (ashift:SI (match_dup 0) @@ -14610,6 +14689,24 @@ (define_insn "x86_shld" (set_attr "amdfam10_decode" 
"vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shld_ndd" + [(set (match_operand:SI 0 "nonimmediate_operand" "=r") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Ic") + (const_int 31))) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (minus:QI (const_int 32) + (and:QI (match_dup 3) (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shld{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "SI")]) + + (define_insn "x86_shld_1" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (ashift:SI (match_dup 0) @@ -14631,6 +14728,24 @@ (define_insn "x86_shld_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shld_ndd_1" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_31_operand")) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_63_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && INTVAL (operands[4]) == 32 - INTVAL (operands[3])" + "shld{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "SI")]) + + (define_insn_and_split "*x86_shld_shrd_1_nozext" [(set (match_operand:SI 0 "nonimmediate_operand") (ior:SI (ashift:SI (match_operand:SI 4 "nonimmediate_operand") @@ -14655,7 +14770,24 @@ (define_insn_and_split "*x86_shld_shrd_1_nozext" operands[4] = force_reg (SImode, operands[4]); emit_insn (gen_x86_shrd_1 (operands[0], operands[4], operands[3], operands[2])); } - else + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (SImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_x86_shld_ndd_1 (tmp, 
operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } + else { operands[1] = force_reg (SImode, operands[1]); rtx tmp = gen_reg_rtx (SImode); @@ -14687,6 +14819,33 @@ (define_insn_and_split "*x86_shld_2" (const_int 31)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_shld_ndd_2" + [(set (match_operand:SI 0 "nonimmediate_operand") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (lshiftrt:SI (match_operand:SI 2 "register_operand") + (minus:QI (const_int 32) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:SI (ashift:SI (match_dup 1) + (and:QI (match_dup 3) (const_int 31))) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI (match_dup 2)) + (minus:QI (const_int 32) + (and:QI (match_dup 3) + (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (SImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_expand "@x86_shift_adj_1" [(set (reg:CCZ FLAGS_REG) (compare:CCZ (and:QI (match_operand:QI 2 "register_operand") @@ -15626,6 +15785,24 @@ (define_insn "x86_64_shrd" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shrd_ndd" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Jc") + (const_int 63))) + (subreg:DI + (ashift:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (minus:QI (const_int 64) + (and:QI (match_dup 3) (const_int 63)))) 0))) + (clobber 
(reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shrd{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI")]) + + (define_insn "x86_64_shrd_1" [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m") (ior:DI (lshiftrt:DI (match_dup 0) @@ -15647,6 +15824,24 @@ (define_insn "x86_64_shrd_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shrd_ndd_1" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_63_operand")) + (subreg:DI + (ashift:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_255_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && INTVAL (operands[4]) == 64 - INTVAL (operands[3])" + "shrd{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "DI")]) + + (define_insn_and_split "*x86_64_shrd_shld_1_nozext" [(set (match_operand:DI 0 "nonimmediate_operand") (ior:DI (lshiftrt:DI (match_operand:DI 4 "nonimmediate_operand") @@ -15672,6 +15867,23 @@ (define_insn_and_split "*x86_64_shrd_shld_1_nozext" operands[4] = force_reg (DImode, operands[4]); emit_insn (gen_x86_64_shld_1 (operands[0], operands[4], operands[3], operands[2])); } + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (DImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (DImode, operands[1]); + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } else { operands[1] = force_reg (DImode, operands[1]); @@ -15704,6 +15916,33 @@ (define_insn_and_split 
"*x86_64_shrd_2" (const_int 63)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_64_shrd_ndd_2" + [(set (match_operand:DI 0 "nonimmediate_operand") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (ashift:DI (match_operand:DI 2 "register_operand") + (minus:QI (const_int 64) (match_dup 2))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:DI (lshiftrt:DI (match_dup 1) + (and:QI (match_dup 3) (const_int 63))) + (subreg:DI + (ashift:TI + (zero_extend:TI (match_dup 2)) + (minus:QI (const_int 64) + (and:QI (match_dup 3) + (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (DImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_insn "x86_shrd" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (lshiftrt:SI (match_dup 0) @@ -15726,6 +15965,23 @@ (define_insn "x86_shrd" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shrd_ndd" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Ic") + (const_int 31))) + (subreg:SI + (ashift:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (minus:QI (const_int 32) + (and:QI (match_dup 3) (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shrd{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "SI")]) + (define_insn "x86_shrd_1" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (lshiftrt:SI (match_dup 0) @@ -15747,6 +16003,24 @@ (define_insn "x86_shrd_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shrd_ndd_1" + [(set (match_operand:SI 0 "register_operand" 
"=r") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_31_operand")) + (subreg:SI + (ashift:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_63_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && (INTVAL (operands[4]) == 32 - INTVAL (operands[3]))" + "shrd{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "SI")]) + + (define_insn_and_split "*x86_shrd_shld_1_nozext" [(set (match_operand:SI 0 "nonimmediate_operand") (ior:SI (lshiftrt:SI (match_operand:SI 4 "nonimmediate_operand") @@ -15771,7 +16045,24 @@ (define_insn_and_split "*x86_shrd_shld_1_nozext" operands[4] = force_reg (SImode, operands[4]); emit_insn (gen_x86_shld_1 (operands[0], operands[4], operands[3], operands[2])); } - else + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (SImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } + else { operands[1] = force_reg (SImode, operands[1]); rtx tmp = gen_reg_rtx (SImode); @@ -15803,6 +16094,33 @@ (define_insn_and_split "*x86_shrd_2" (const_int 31)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_shrd_ndd_2" + [(set (match_operand:SI 0 "nonimmediate_operand") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (ashift:SI (match_operand:SI 2 "register_operand") + (minus:QI (const_int 32) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + 
"#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:SI (lshiftrt:SI (match_dup 1) + (and:QI (match_dup 3) (const_int 31))) + (subreg:SI + (ashift:DI + (zero_extend:DI (match_dup 2)) + (minus:QI (const_int 32) + (and:QI (match_dup 3) + (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (SImode); + emit_move_insn (operands[4], operands[0]); +}) + ;; Base name for insn mnemonic. (define_mode_attr cvt_mnemonic [(SI "{cltd|cdq}") (DI "{cqto|cqo}")]) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c b/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c new file mode 100644 index 00000000000..87068ea31aa --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c @@ -0,0 +1,24 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -Wno-shift-count-overflow -m64 -mapxf" } */ +/* { dg-final { scan-assembler-times {(?n)shld[ql]?[\t ]*\$2} 4 } } */ +/* { dg-final { scan-assembler-times {(?n)shrd[ql]?[\t ]*\$2} 4 } } */ + +typedef unsigned long u64; +typedef unsigned int u32; + +long a; +int c; +const char n = 2; + +long test64r (long e) { long t = ((u64)a >> n) | (e << (64 - n)); return t;} +long test64l (u64 e) { long t = (a << n) | (e >> (64 - n)); return t;} +int test32r (int f) { int t = ((u32)c >> n) | (f << (32 - n)); return t; } +int test32l (u32 f) { int t = (c << n) | (f >> (32 - n)); return t; } + +u64 ua; +u32 uc; + +u64 testu64l (u64 ue) { u64 ut = (ua << n) | (ue >> (64 - n)); return ut; } +u64 testu64r (u64 ue) { u64 ut = (ua >> n) | (ue << (64 - n)); return ut; } +u32 testu32l (u32 uf) { u32 ut = (uc << n) | (uf >> (32 - n)); return ut; } +u32 testu32r (u32 uf) { u32 ut = (uc >> n) | (uf << (32 - n)); return ut; } From patchwork Wed Dec 6 08:06:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174394 Return-Path: Delivered-To: ouuuleilei@gmail.com 
Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3955095vqy; Wed, 6 Dec 2023 00:11:17 -0800 (PST) X-Google-Smtp-Source: AGHT+IElJq5nxEXC/3adZtRWxYXdIrKvUCddIT86TAoN8YRJWAn2euvgdaYzhqoEcJwMaszNXRY0 X-Received: by 2002:ac8:5a4c:0:b0:425:4043:1db6 with SMTP id o12-20020ac85a4c000000b0042540431db6mr689154qta.137.1701850277348; Wed, 06 Dec 2023 00:11:17 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850277; cv=pass; d=google.com; s=arc-20160816; b=amOYg0teRtPHAdWkibA6IC64qvrtycFAd+ZLx7wJB6BX+PdQyxDETcSKPaNNQWz3MS CMY4WkN79Q7eecXDXOEdR+xOZsIy36fKyv6L38Ai34LqC4y0HtzkWxCqvnbrB3d3RweX SDs3KiLyInPWrqHJk3n4W8ZHP25YL79A9DtDV4FIycmYUjVukdLlueg+pbA4gbn8W8Nd OwNradfy4MkElaalW8Qkzo7MP8mt2VMXjNNutAL0aRwBkttmYBK6BHTCUhs770JWyIZS Pm2f3vr8wSqjg70ylNJPITE+eFRKzzsDmHh6SB0Z2D6NweanZDL/J1tPdjNrSvJHcpNg 0EoA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=daKwnvLQsVR4bMwmYig2MuyqFvuup3P6XnFS+zCkVd8=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=gZSs/LFkrzLY/MkufWmij05tSuFaKprxmAAOu1St+YInbK+/VhVZpFhbmaFTE0nz0P Ym7d3F7DPgvWjglUWrPuBVRuCs6+bdJ7M77S9z0Shlp3Wj8E5VuXC7FPlnJ23XqC6ODy 3bA+/RaK59dXoH1LxwLas0k/HvM5LfC3t4mr1i4kWes7JA0Jb6DPnvIAiohKPBKgPVVt 1yE5wI/Mn1O6yslChFmC7XLvh/zaNEe00EHXKp1nezslEVBgTDcDYzKKdEwJ5kjLZB89 c0NwAzFftDsEM1YbyJ6loo1z5GgN1eCYPw0bja31wEpmejzl0+zPTDeIreSlFOLZ+t/u wISg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iQxAWrYz; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: 
from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id i7-20020a05622a08c700b00425525e7db8si7386606qte.461.2023.12.06.00.11.17 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:11:17 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iQxAWrYz; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E37663841919 for ; Wed, 6 Dec 2023 08:10:44 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id E4A8E384F980 for ; Wed, 6 Dec 2023 08:08:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E4A8E384F980 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E4A8E384F980 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850131; cv=none; b=O2lCoeiziTF2Ug2fFkD7Bwjp5meZAjuCfCmZ9MOQj+ezeMU+AawFTKvil02qOACRwqpx/q9gAYwRIwReVfXGCTIGSeECsWGY82hbFp+ySH5jHaSn2hyneoAX0wzP3/NYj7mk9gnowGB6OaWrP0YbN4vYreM8L8gEYvKR07TZeug= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850131; c=relaxed/simple; bh=I9t96uZOxYhOurLG9MHPsj6DR6VTPGqvXyHzjun1hFg=; 
h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=wf7w/nF7xteDJIPJNZhA0UKYcocuwOXV3zDQmavRl/3PWv9tPVexMe8o4ndqXkvQVdH0K2644y1aXOGAVgusuLIOy0hCEW3f2jEsUIcUKqZaL55Ne3YwNdZM4G24okctknE1q9hywi/ditdLjKnptUzGE4uKGqvHAMlj3kWsUks= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850130; x=1733386130; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=I9t96uZOxYhOurLG9MHPsj6DR6VTPGqvXyHzjun1hFg=; b=iQxAWrYzx+6JRyxfDgCJFCSvd+DaC9wILjGpl1o+GEQSvso7bvtZQf/n gTJ+FeChAKVAMW04G48MHZeZsgQOA+hZBPwKGR+8NqQ2PCSnyZI/0KKWj YHaS3b3t/lo0UKwoy6BSp9kjgc93ySeO8ATrF9wyFmWec9M3WDjSKwX0v Zp7hsN3KLlQQV7Uc+7JAMVmPJHr4VlclZ1nheOmH+ciUmkWE15ZjosjoX RkekKAI76Rnr7/ZIwNQPMS15mCPpjb+maWMcfyDdrq+YXo9wIX6W6FnwF HEHdeiB3+yzBOi1IS5Hu7dYSSmyw2GuwoKtE7fTJvfvf2HwDpirGKVj64 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085471" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085471" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1102737768" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1102737768" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga005.fm.intel.com with ESMTP; 06 Dec 2023 00:08:42 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B22B41007813; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 15/16] [APX NDD] Support APX NDD for cmove insns Date: Wed, 6 Dec 2023 16:06:35 +0800 Message-Id: <20231206080636.178863-16-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: 
<20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519356052084520 X-GMAIL-MSGID: 1784519356052084520 gcc/ChangeLog: * config/i386/i386.md (*movcc_noc): Extend with new constraints to support NDD. (*movsicc_noc_zext): Likewise. (*movsicc_noc_zext_1): Likewise. (*movqicc_noc): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-cmov.c: New test. 
--- gcc/config/i386/i386.md | 48 ++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c | 16 +++++++ 2 files changed, 45 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 5c6275430d6..017ab720293 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -24417,47 +24417,56 @@ (define_split (neg:SWI (ltu:SWI (reg:CCC FLAGS_REG) (const_int 0))))]) (define_insn "*movcc_noc" - [(set (match_operand:SWI248 0 "register_operand" "=r,r") + [(set (match_operand:SWI248 0 "register_operand" "=r,r,r,r") (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SWI248 2 "nonimmediate_operand" "rm,0") - (match_operand:SWI248 3 "nonimmediate_operand" "0,rm")))] + (match_operand:SWI248 2 "nonimmediate_operand" "rm,0,rm,r") + (match_operand:SWI248 3 "nonimmediate_operand" "0,rm,r,rm")))] "TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %0|%0, %2} - cmov%O2%c1\t{%3, %0|%0, %3}" - [(set_attr "type" "icmov") + cmov%O2%c1\t{%3, %0|%0, %3} + cmov%O2%C1\t{%2, %3, %0|%0, %3, %2} + cmov%O2%c1\t{%3, %2, %0|%0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "")]) (define_insn "*movsicc_noc_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r") (if_then_else:DI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) (zero_extend:DI - (match_operand:SI 2 "nonimmediate_operand" "rm,0")) + (match_operand:SI 2 "nonimmediate_operand" "rm,0,rm,r")) (zero_extend:DI - (match_operand:SI 3 "nonimmediate_operand" "0,rm"))))] + (match_operand:SI 3 "nonimmediate_operand" "0,rm,r,rm"))))] "TARGET_64BIT && TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %k0|%k0, %2} - cmov%O2%c1\t{%3, %k0|%k0, %3}" - [(set_attr 
"type" "icmov") + cmov%O2%c1\t{%3, %k0|%k0, %3} + cmov%O2%C1\t{%2, %3, %k0|%k0, %3, %2} + cmov%O2%c1\t{%3, %2, %k0|%k0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "SI")]) (define_insn "*movsicc_noc_zext_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,r") (zero_extend:DI (if_then_else:SI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 2 "nonimmediate_operand" "rm,0") - (match_operand:SI 3 "nonimmediate_operand" "0,rm"))))] + (match_operand:SI 2 "nonimmediate_operand" "rm,0,rm,r") + (match_operand:SI 3 "nonimmediate_operand" "0,rm,r,rm"))))] "TARGET_64BIT && TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %k0|%k0, %2} - cmov%O2%c1\t{%3, %k0|%k0, %3}" - [(set_attr "type" "icmov") + cmov%O2%c1\t{%3, %k0|%k0, %3} + cmov%O2%C1\t{%2, %3, %k0|%k0, %3, %2} + cmov%O2%c1\t{%3, %2, %k0|%k0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "SI")]) @@ -24482,14 +24491,15 @@ (define_split }) (define_insn "*movqicc_noc" - [(set (match_operand:QI 0 "register_operand" "=r,r") + [(set (match_operand:QI 0 "register_operand" "=r,r,r") (if_then_else:QI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:QI 2 "register_operand" "r,0") - (match_operand:QI 3 "register_operand" "0,r")))] + (match_operand:QI 2 "register_operand" "r,0,r") + (match_operand:QI 3 "register_operand" "0,r,r")))] "TARGET_CMOVE && !TARGET_PARTIAL_REG_STALL" "#" - [(set_attr "type" "icmov") + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "QI")]) (define_split diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c b/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c new file mode 100644 index 00000000000..459dc965342 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c @@ -0,0 +1,16 
@@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -m64 -mapxf" } */ +/* { dg-final { scan-assembler-times "cmove\[^\n\r]*, %eax" 1 } } */ +/* { dg-final { scan-assembler-times "cmovge\[^\n\r]*, %eax" 1 } } */ + +unsigned int c[4]; + +unsigned long long foo1 (int a, unsigned int b) +{ + return a ? b : c[1]; +} + +unsigned int foo3 (int a, int b, unsigned int c, unsigned int d) +{ + return a < b ? c : d; +} From patchwork Wed Dec 6 08:06:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 174400 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3955571vqy; Wed, 6 Dec 2023 00:12:23 -0800 (PST) X-Google-Smtp-Source: AGHT+IF7XXizcYaHhr5QTqlJx0K044tE9IPN6VSXuMMFlWgiuXNPTetdsEoU+11ZwWCDeU7JoEed X-Received: by 2002:ac8:5c41:0:b0:425:4042:f454 with SMTP id j1-20020ac85c41000000b004254042f454mr657300qtj.56.1701850342975; Wed, 06 Dec 2023 00:12:22 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701850342; cv=pass; d=google.com; s=arc-20160816; b=YN/F5Ru8K/TX3G2OhzuHA2ra1pG1/vPhbar+Gt9Lc0d762WWvMpYZz8yTS2MrNwYaO qZZVWbS/GmrYsSxbO+UVEBuAu7x0FhG49cxwWHoEXR/yDRBQgjtkwFWeAIH7OOhBhqYX x1o/hGTZGuuCS9dPEBY6mfE+yWjEo23Y/tjBS3GLxm5fPr2Ec8W3lBIhB0DNrbdCFQ96 3okpbOt/2XOX8KwheSEZ4p4VnVMO6spIhtc15eg3vror+ttuGIekESjxVUdnoDIpfkiC rOD4f6xjo2Q4tv6kN7ZsN/UP7DjSK2IOxL1ChfaQPbZllxnz4TMa1IhQ+KgPQGBgK1So KcTg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=ebWrPfDhmtoZODLpaHxhme6QhgsDdtMSUv5GVj2ou2w=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=Laet0AZE45dJxsHbQRKxFiBEQI6yT/Pwx/EYecCQcpsH7OrWgejK6xdLAIg6ouZ6Yy 
/37jVnRgH7mf3jb92IBUaBxDkelsfXj6zpfTZDbHyMq3WXTZapxWjuAwuUc8izcgar68 B7GoS5dQFCzJZ8EHbBK/n+j6dHZT99mpwTihkwGEtYCOH0M5IE7i+Csa+rljEVBwgZNd 8XRWgDV3HKntjRt5//yoE2JLnDaU20M+Tkhi2n6K36JY1RqALVgKPCC7dYoBRZKFcgoC 117V86QebTAM8Bvlz1wASBWT0zcDZqX7sa4QKhQzDUaoHrBeIfvrfQhh9TXfcwBUYGGr hPpQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="OLzZ/JJU"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id v9-20020ac87289000000b0042580ff41e5si224794qto.43.2023.12.06.00.12.22 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Dec 2023 00:12:22 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="OLzZ/JJU"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7392D384F498 for ; Wed, 6 Dec 2023 08:12:06 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id 91518384DEE0 for ; Wed, 6 Dec 2023 08:08:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 91518384DEE0 Authentication-Results: sourceware.org; 
dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 91518384DEE0 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850136; cv=none; b=ZORaUxpt3r+wJTe/7a6dhbgN/BzfdxgmnU/msB4usJJWsrD6z5WoSxGvolrody84Vt6/96y1N2glnoaZbb2twanQ2GxP8ERraP4NfZD+GafX4Cmt14de3FwQM7TtpWHyTe1vSFyZ1Jxg2IOPLz/MpM2KikBh+3DPv1lUnxLOZdM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701850136; c=relaxed/simple; bh=euyg9t3pxTCawTmbGdXw+L/+pvuFnv6BYcoeBDv1j9k=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=w8sXveqjK9qV/H43OOH7Wuo+pMpXZv91h4UwPEgkC9hhgDzVbwlKhCFEmn1iZw4RBbvaza3KOJOaDgPORmakxHcIABbE2l+Aqr4XQW7TqQufKJWwYUdu5lvZ9POS1xKMfoBpUxIVeoiNYgULEgQ+M2ri0Nz7XT9rgLewh+Dr8yo= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701850133; x=1733386133; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=euyg9t3pxTCawTmbGdXw+L/+pvuFnv6BYcoeBDv1j9k=; b=OLzZ/JJUgUVe69mIFe6lg7ZNRZ/iA8708d48RDDc2Q7axn2jRoBEr9+i Up09CfKUVmi9GqgGwH89Hio6Bzkvb1lhar3M4yE+VZ7pDWlPyD9w7jb3z H66hHAwzv0duJWI9oGdPYAen6b/6nNjDQrTFtm18ryLvoKr6U6f7ImiJv cCz3rOoh2HZZ7+n4phlfauN7vtmFFuvIR7dGNyhZkbjEzEjZrQZL5hjI5 mnshmlCb3xSKTLhB4Jxf5NJ11xPgpd4ZYcZ3EXrfR4nNbfH0kkIflvPyI p4yBHI3Orj3dJ7X7wHoUbsyXNdb4CTlw+UrVf011aFbgb/SPh8/qmblx+ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1085478" X-IronPort-AV: E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1085478" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Dec 2023 00:08:46 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10915"; a="1102737776" X-IronPort-AV: 
E=Sophos;i="6.04,254,1695711600"; d="scan'208";a="1102737776" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga005.fm.intel.com with ESMTP; 06 Dec 2023 00:08:42 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id B3FB51007814; Wed, 6 Dec 2023 16:08:36 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 16/16] [APX NDD] Support TImode shift for NDD Date: Wed, 6 Dec 2023 16:06:36 +0800 Message-Id: <20231206080636.178863-17-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231206080636.178863-1-hongyu.wang@intel.com> References: <20231206080636.178863-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784519425411062765 X-GMAIL-MSGID: 1784519425411062765 TImode shifts are split by splitter functions that assume operands[0] and operands[1] are the same. For the NDD alternative this assumption may not hold, so add split functions for NDD that emit the NDD form instructions and omit the handling of the !64bit target split. Although the NDD form allows a memory src, the post-reload splitter has no extra register available to accept an NDD form shift, especially shld/shrd. 
So only accept the register alternative for the shift src under NDD. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_split_ashl_ndd): New function to split NDD form lshift. (ix86_split_rshift_ndd): Likewise for l/ashiftrt. * config/i386/i386-protos.h (ix86_split_ashl_ndd): New prototype. (ix86_split_rshift_ndd): Likewise. * config/i386/i386.md (ashl3_doubleword): Add NDD alternative, call NDD split function when operands[0] is not equal to operands[1]. (define_split for doubleword lshift): Likewise. (define_peephole for doubleword lshift): Likewise. (3_doubleword): Likewise for l/ashiftrt. (define_split for doubleword l/ashiftrt): Likewise. (define_peephole for doubleword l/ashiftrt): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-ti-shift.c: New test. --- gcc/config/i386/i386-expand.cc | 136 ++++++++++++++++++ gcc/config/i386/i386-protos.h | 2 + gcc/config/i386/i386.md | 56 ++++++-- .../gcc.target/i386/apx-ndd-ti-shift.c | 91 ++++++++++++ 4 files changed, 273 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index d4bbd33ce07..a53d69d5400 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -6678,6 +6678,142 @@ ix86_split_lshr (rtx *operands, rtx scratch, machine_mode mode) } } +/* Helper function to split TImode ashl under NDD. 
*/ +void +ix86_split_ashl_ndd (rtx *operands, rtx scratch) +{ + gcc_assert (TARGET_APX_NDD); + int half_width = GET_MODE_BITSIZE (TImode) >> 1; + + rtx low[2], high[2]; + int count; + + split_double_mode (TImode, operands, 2, low, high); + if (CONST_INT_P (operands[2])) + { + count = INTVAL (operands[2]) & (GET_MODE_BITSIZE (TImode) - 1); + + if (count >= half_width) + { + count = count - half_width; + if (count == 0) + { + if (!rtx_equal_p (high[0], low[1])) + emit_move_insn (high[0], low[1]); + } + else if (count == 1) + emit_insn (gen_adddi3 (high[0], low[1], low[1])); + else + emit_insn (gen_ashldi3 (high[0], low[1], GEN_INT (count))); + + ix86_expand_clear (low[0]); + } + else if (count == 1) + { + rtx x3 = gen_rtx_REG (CCCmode, FLAGS_REG); + rtx x4 = gen_rtx_LTU (TImode, x3, const0_rtx); + emit_insn (gen_add3_cc_overflow_1 (DImode, low[0], + low[1], low[1])); + emit_insn (gen_add3_carry (DImode, high[0], high[1], high[1], + x3, x4)); + } + else + { + emit_insn (gen_x86_64_shld_ndd (high[0], high[1], low[1], + GEN_INT (count))); + emit_insn (gen_ashldi3 (low[0], low[1], GEN_INT (count))); + } + } + else + { + emit_insn (gen_x86_64_shld_ndd (high[0], high[1], low[1], + operands[2])); + emit_insn (gen_ashldi3 (low[0], low[1], operands[2])); + if (TARGET_CMOVE && scratch) + { + ix86_expand_clear (scratch); + emit_insn (gen_x86_shift_adj_1 + (DImode, high[0], low[0], operands[2], scratch)); + } + else + emit_insn (gen_x86_shift_adj_2 (DImode, high[0], low[0], operands[2])); + } +} + +/* Helper function to split TImode l/ashr under NDD. */ +void +ix86_split_rshift_ndd (enum rtx_code code, rtx *operands, rtx scratch) +{ + gcc_assert (TARGET_APX_NDD); + int half_width = GET_MODE_BITSIZE (TImode) >> 1; + bool ashr_p = code == ASHIFTRT; + rtx (*gen_shr)(rtx, rtx, rtx) = ashr_p ? 
gen_ashrdi3 + : gen_lshrdi3; + + rtx low[2], high[2]; + int count; + + split_double_mode (TImode, operands, 2, low, high); + if (CONST_INT_P (operands[2])) + { + count = INTVAL (operands[2]) & (GET_MODE_BITSIZE (TImode) - 1); + + if (ashr_p && (count == GET_MODE_BITSIZE (TImode) - 1)) + { + emit_insn (gen_shr (high[0], high[1], + GEN_INT (half_width - 1))); + emit_move_insn (low[0], high[0]); + } + else if (count >= half_width) + { + if (ashr_p) + emit_insn (gen_shr (high[0], high[1], + GEN_INT (half_width - 1))); + else + ix86_expand_clear (high[0]); + + if (count > half_width) + emit_insn (gen_shr (low[0], high[1], + GEN_INT (count - half_width))); + else + emit_move_insn (low[0], high[1]); + } + else + { + emit_insn (gen_x86_64_shrd_ndd (low[0], low[1], high[1], + GEN_INT (count))); + emit_insn (gen_shr (high[0], high[1], GEN_INT (count))); + } + } + else + { + emit_insn (gen_x86_64_shrd_ndd (low[0], low[1], high[1], + operands[2])); + emit_insn (gen_shr (high[0], high[1], operands[2])); + + if (TARGET_CMOVE && scratch) + { + if (ashr_p) + { + emit_move_insn (scratch, high[0]); + emit_insn (gen_shr (scratch, scratch, + GEN_INT (half_width - 1))); + } + else + ix86_expand_clear (scratch); + + emit_insn (gen_x86_shift_adj_1 + (DImode, low[0], high[0], operands[2], scratch)); + } + else if (ashr_p) + emit_insn (gen_x86_shift_adj_3 + (DImode, low[0], high[0], operands[2])); + else + emit_insn (gen_x86_shift_adj_2 + (DImode, low[0], high[0], operands[2])); + } +} + /* Expand move of V1TI mode register X to a new TI mode register. 
*/ static rtx ix86_expand_v1ti_to_ti (rtx x) diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index fa952409729..56349064a6c 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -174,8 +174,10 @@ extern void x86_initialize_trampoline (rtx, rtx, rtx); extern rtx ix86_zero_extend_to_Pmode (rtx); extern void ix86_split_long_move (rtx[]); extern void ix86_split_ashl (rtx *, rtx, machine_mode); +extern void ix86_split_ashl_ndd (rtx *, rtx); extern void ix86_split_ashr (rtx *, rtx, machine_mode); extern void ix86_split_lshr (rtx *, rtx, machine_mode); +extern void ix86_split_rshift_ndd (enum rtx_code, rtx *, rtx); extern void ix86_expand_v1ti_shift (enum rtx_code, rtx[]); extern void ix86_expand_v1ti_rotate (enum rtx_code, rtx[]); extern void ix86_expand_v1ti_ashiftrt (rtx[]); diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 017ab720293..b4db50f61cd 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14425,13 +14425,14 @@ (define_insn_and_split "*ashl3_doubleword_mask_1" }) (define_insn "ashl3_doubleword" - [(set (match_operand:DWI 0 "register_operand" "=&r") - (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n") - (match_operand:QI 2 "nonmemory_operand" "c"))) + [(set (match_operand:DWI 0 "register_operand" "=&r,&r") + (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n,r") + (match_operand:QI 2 "nonmemory_operand" "c,c"))) (clobber (reg:CC FLAGS_REG))] "" "#" - [(set_attr "type" "multi")]) + [(set_attr "type" "multi") + (set_attr "isa" "*,apx_ndd")]) (define_split [(set (match_operand:DWI 0 "register_operand") @@ -14440,7 +14441,15 @@ (define_split (clobber (reg:CC FLAGS_REG))] "epilogue_completed" [(const_int 0)] - "ix86_split_ashl (operands, NULL_RTX, mode); DONE;") +{ + if (TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]) + && REG_P (operands[1])) + ix86_split_ashl_ndd (operands, NULL_RTX); + else + ix86_split_ashl (operands, NULL_RTX, mode); + DONE; 
+}) ;; By default we don't ask for a scratch register, because when DWImode ;; values are manipulated, registers are already at a premium. But if @@ -14456,7 +14465,15 @@ (define_peephole2 (match_dup 3)] "TARGET_CMOVE" [(const_int 0)] - "ix86_split_ashl (operands, operands[3], mode); DONE;") +{ + if (TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]) + && (REG_P (operands[1]))) + ix86_split_ashl_ndd (operands, operands[3]); + else + ix86_split_ashl (operands, operands[3], mode); + DONE; +}) (define_insn_and_split "*ashl3_doubleword_highpart" [(set (match_operand: 0 "register_operand" "=r") @@ -15713,16 +15730,24 @@ (define_insn_and_split "*3_doubleword_mask_1" }) (define_insn_and_split "3_doubleword" - [(set (match_operand:DWI 0 "register_operand" "=&r") - (any_shiftrt:DWI (match_operand:DWI 1 "register_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + [(set (match_operand:DWI 0 "register_operand" "=&r,&r") + (any_shiftrt:DWI (match_operand:DWI 1 "register_operand" "0,r") + (match_operand:QI 2 "nonmemory_operand" "c,c"))) (clobber (reg:CC FLAGS_REG))] "" "#" "epilogue_completed" [(const_int 0)] - "ix86_split_ (operands, NULL_RTX, mode); DONE;" - [(set_attr "type" "multi")]) +{ + if (TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1])) + ix86_split_rshift_ndd (, operands, NULL_RTX); + else + ix86_split_ (operands, NULL_RTX, mode); + DONE; +} + [(set_attr "type" "multi") + (set_attr "isa" "*,apx_ndd")]) ;; By default we don't ask for a scratch register, because when DWImode ;; values are manipulated, registers are already at a premium. But if @@ -15738,7 +15763,14 @@ (define_peephole2 (match_dup 3)] "TARGET_CMOVE" [(const_int 0)] - "ix86_split_ (operands, operands[3], mode); DONE;") +{ + if (TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1])) + ix86_split_rshift_ndd (, operands, operands[3]); + else + ix86_split_ (operands, operands[3], mode); + DONE; +}) ;; Split truncations of double word right shifts into x86_shrd_1. 
(define_insn_and_split "3_doubleword_lowpart" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c b/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c new file mode 100644 index 00000000000..0489712b7f6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c @@ -0,0 +1,91 @@ +/* { dg-do run { target { int128 && { ! ia32 } } } } */ +/* { dg-require-effective-target apxf } */ +/* { dg-options "-O2" } */ + +#include + +#define APX_TARGET __attribute__((noinline, target("apxf"))) +#define NO_APX __attribute__((noinline, target("no-apxf"))) +typedef __uint128_t u128; +typedef __int128 i128; + +#define TI_SHIFT_FUNC(TYPE, op, name) \ +APX_TARGET \ +TYPE apx_##name##TYPE (TYPE a, char b) \ +{ \ + return a op b; \ +} \ +TYPE noapx_##name##TYPE (TYPE a, char b) \ +{ \ + return a op b; \ +} \ + +#define TI_SHIFT_FUNC_CONST(TYPE, i, op, name) \ +APX_TARGET \ +TYPE apx_##name##TYPE##_const (TYPE a) \ +{ \ + return a op i; \ +} \ +NO_APX \ +TYPE noapx_##name##TYPE##_const (TYPE a) \ +{ \ + return a op i; \ +} + +#define TI_SHIFT_TEST(TYPE, name, val) \ +{\ + if (apx_##name##TYPE (val, b) != noapx_##name##TYPE (val, b)) \ + abort (); \ +} + +#define TI_SHIFT_CONST_TEST(TYPE, name, val) \ +{\ + if (apx_##name##1##TYPE##_const (val) \ + != noapx_##name##1##TYPE##_const (val)) \ + abort (); \ + if (apx_##name##2##TYPE##_const (val) \ + != noapx_##name##2##TYPE##_const (val)) \ + abort (); \ + if (apx_##name##3##TYPE##_const (val) \ + != noapx_##name##3##TYPE##_const (val)) \ + abort (); \ + if (apx_##name##4##TYPE##_const (val) \ + != noapx_##name##4##TYPE##_const (val)) \ + abort (); \ +} + +TI_SHIFT_FUNC(i128, <<, ashl) +TI_SHIFT_FUNC(i128, >>, ashr) +TI_SHIFT_FUNC(u128, >>, lshr) + +TI_SHIFT_FUNC_CONST(i128, 1, <<, ashl1) +TI_SHIFT_FUNC_CONST(i128, 65, <<, ashl2) +TI_SHIFT_FUNC_CONST(i128, 64, <<, ashl3) +TI_SHIFT_FUNC_CONST(i128, 87, <<, ashl4) +TI_SHIFT_FUNC_CONST(i128, 127, >>, ashr1) +TI_SHIFT_FUNC_CONST(i128, 87, >>, ashr2) +TI_SHIFT_FUNC_CONST(i128, 
27, >>, ashr3) +TI_SHIFT_FUNC_CONST(i128, 64, >>, ashr4) +TI_SHIFT_FUNC_CONST(u128, 127, >>, lshr1) +TI_SHIFT_FUNC_CONST(u128, 87, >>, lshr2) +TI_SHIFT_FUNC_CONST(u128, 27, >>, lshr3) +TI_SHIFT_FUNC_CONST(u128, 64, >>, lshr4) + +int main (void) +{ + if (!__builtin_cpu_supports ("apxf")) + return 0; + + u128 ival = 0x123456788765432FLL; + u128 uval = 0xF234567887654321ULL; + char b = 28; + + TI_SHIFT_TEST(i128, ashl, ival) + TI_SHIFT_TEST(i128, ashr, ival) + TI_SHIFT_TEST(u128, lshr, uval) + TI_SHIFT_CONST_TEST(i128, ashl, ival) + TI_SHIFT_CONST_TEST(i128, ashr, ival) + TI_SHIFT_CONST_TEST(u128, lshr, uval) + + return 0; +}
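For readers following the variable-count path of ix86_split_ashl_ndd, the sketch below models the same idea in plain C. It is only an illustration under stated assumptions, not the code GCC emits. The names u128_halves, shld64, shl128_split and shl128_selfcheck are hypothetical and do not appear in the patch. The model keeps source and destination separate (as the NDD form allows), computes the high half with a shld-style double shift, shifts the low half, then applies an adjustment when bit 6 of the count is set, which is roughly the role the x86_shift_adj patterns play after the two shifts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 128-bit value held as two 64-bit halves.  */
typedef struct { uint64_t lo, hi; } u128_halves;

/* Model of a 64-bit shld: shift HI left by N (masked to 0..63) and
   fill the vacated low bits from the top bits of LO.  */
static uint64_t
shld64 (uint64_t hi, uint64_t lo, unsigned n)
{
  n &= 63;
  if (n == 0)
    return hi;			/* Avoid an undefined shift by 64.  */
  return (hi << n) | (lo >> (64 - n));
}

/* Non-destructive 128-bit left shift: SRC is left untouched and the
   result is built in a separate destination.  */
static u128_halves
shl128_split (u128_halves src, unsigned count)
{
  u128_halves dst;
  count &= 127;
  dst.hi = shld64 (src.hi, src.lo, count);
  dst.lo = src.lo << (count & 63);
  if (count & 64)
    {
      /* Adjustment for counts of 64 or more: the shifted low half
	 becomes the high half and the low half is cleared.  */
      dst.hi = dst.lo;
      dst.lo = 0;
    }
  return dst;
}

/* Compare the split form against the compiler's native 128-bit shift
   for every count (needs a target with __int128, as the test does).  */
static void
shl128_selfcheck (void)
{
  u128_halves src = { 0x123456788765432FULL, 0xF234567887654321ULL };
  unsigned __int128 v = ((unsigned __int128) src.hi << 64) | src.lo;

  for (unsigned count = 0; count < 128; count++)
    {
      unsigned __int128 want = v << count;
      u128_halves got = shl128_split (src, count);
      assert (got.lo == (uint64_t) want);
      assert (got.hi == (uint64_t) (want >> 64));
    }
}
```

Calling shl128_selfcheck () from a small main is enough to exercise every count, including the 1, 64, 65 and 87 cases that the constant tests in apx-ndd-ti-shift.c cover.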