From patchwork Tue Dec 5 02:29:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173688 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3176155vqy; Mon, 4 Dec 2023 18:40:14 -0800 (PST) X-Google-Smtp-Source: AGHT+IGBOHeNUVh/xw9m6I04UJyknSu/fbiYyRll0XrFPRZvM7511i67RhPmSsZH8r0pzsH4vzCp X-Received: by 2002:a0c:e784:0:b0:67a:cf28:39c3 with SMTP id x4-20020a0ce784000000b0067acf2839c3mr622353qvn.37.1701744012838; Mon, 04 Dec 2023 18:40:12 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701744012; cv=pass; d=google.com; s=arc-20160816; b=pLMo61F6hmYGIGqYqGnmZ1uq81aslxRR9701jPYtgQMynqsHooqIZBLUDtfFFB63Kn ry6xXlW+U+ANdgmTuG6N2own9rXatbWqgd2CuBeF8Zrhs4cqrcoYkgHQMYwqBK2F4Rh/ M/FnyjRHBN/22+YHtJBuCNg/rh2aJQLlJnOa6YFjMVhcZJqU2LG+xPb6A6Cu2HC8Jxw8 schJw/2Gu/zSK8qeNr8U/yP7OxBOTXAOzNiouoZaBjRYDJ/UzYTIEBiDHff4LqzE6N82 8NnNWa9pqMXiT6G6GZPP8g/YyerMb0TuiWgNDBNsTMtmkt7puIjRZbhUPx8YjZVnSI5Y tNbQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=bnqcxZX4IpobxhoAa4rNj+S7m/5ol4kMitPjxWabhTk=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=xwTHrhuO4sqWNOjqrY9J1+09qG+vA+wdD9fApxHYR8CUuBosMz97tc0V2uOubZnLX5 vrN42mff74YV9ppl5T/kMU1GPpZX6Typ9NfsyM0z76Mr2b+j5Jqz8VyB78c2GZXb2Kjc 7HVt7zWZseob++IQYWcpXiFfNVMqsL/+xYggHsPeS9cI28+LRHXKqdxi5iKBXSrzNxwD hlBbj7lJfY19B2nVW9PjZV+KLQCkeVxxBslDQRnxY0OnYjIecWi4fEffQsrXWTkUTftc lI3iWp5gZPY+BMdzPs/fKw1KOjmPEQgYTPl/0sTo7S40rFKJykDOeA/vHw769UdgpVmA Hq+Q== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="MDX/ysbr"; arc=pass (i=1); spf=pass (google.com: domain of 
gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id c3-20020a05621401e300b0066d0819ca14si10881925qvu.380.2023.12.04.18.40.12 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:40:12 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="MDX/ysbr"; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 836893A485F7 for ; Tue, 5 Dec 2023 02:33:01 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 5315E382B32F for ; Tue, 5 Dec 2023 02:31:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5315E382B32F Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 5315E382B32F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743484; 
cv=none; b=LwHRT89S7VKxRyqoGBZDPKMBcT7WPBgx+GxkVZL6PWwF+xj7z38KHRb+EvpYX3lSLp9UQ3CMG5Aejc8nhAOBjr0pR8mHgCpCMscZLv+WfKb3uYOor8IYVJsv6Dh1bVp1f47FEs0tD3XuHcEYKz2vSHviUy3X9egbwKvniuFBMr0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743484; c=relaxed/simple; bh=GoFpIJ+ttg/xtdo6DjR4Ps1h8JdTGVm7t+Dr/qbQozk=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Y6XXjKuZzn9+JgUKkV72pW4zf6UVT0F9cg3vt7xhJRuDUvM/pJDjPDT3yqpY7gKygZZY2J+ZZOGq7HvcqemcZtdU1Vg1qXBHn/yxgggStXK89usD4+BdFR3Vfvb8aWhAeCqEAo8T7KmoJuKe4PF064XH+7FS51ZLvjai+arfklU= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDB-0001YN-Vw for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743461; x=1733279461; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GoFpIJ+ttg/xtdo6DjR4Ps1h8JdTGVm7t+Dr/qbQozk=; b=MDX/ysbrHJ4MmZSuxbnmoB8G/WM90eeVnOZSKO4xFWUlE3PdhNg8G3LA xYVK6Z9n+noQNhtreDqRJ6lzMMCxGGY52UPL/NdK4rIwr6S4VRVs//RVy aiXTHSEzl37e65c7p16QTpeOV4cyQ9i8bas04ZK2f+UhFjbXDj/vy2qu/ SX3UApki6q5tRJKcFsax1hfPhI8lpK+6lXOU7DyVgwOjc4q0Px1uXKPg1 KqlL2BiCXATxMNkvjXf0Uhan8kAYoPz04AFNDx4So1KOpGelyIKxuYHSV TYbgziUXlqvVX164P9c/s3CUlDaeYJ1ec4Dq17XcnQXjEvjkbJJxfMSlL g==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277788" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277788" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:53 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275507" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275507" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by 
fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:49 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 372681005663; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 01/17] [APX NDD] Support Intel APX NDD for legacy add insn Date: Tue, 5 Dec 2023 10:29:32 +0800 Message-Id: <20231205022948.504790-2-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -2 X-Spam_score: -0.3 X-Spam_bar: / X-Spam_report: (-0.3 / 5.0 requ) BAYES_00=-1.9, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784407930287259052 X-GMAIL-MSGID: 1784407930287259052 From: Kong Lingling APX NDD 
provides an extra destination register operand for several GPR-related
legacy insns, so a new alternative that uses an "r" constraint for
operand 1 can be added. This first patch supports NDD for the add
instruction and keeps using lea when all operands are registers, since
lea has a shorter encoding. For add operations with a memory operand,
NDD is used to save an extra move.

The legacy x86 binary operation expander forces operands[0] and
operands[1] to be the same, so extend the helper functions with a
use_ndd flag that allows NDD-form patterns in which operands[0] and
operands[1] differ.

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_fixup_binary_operands): Add new
	use_ndd flag to check whether NDD can be used for this binop and
	adjust operand emit.
	(ix86_binary_operator_ok): Likewise.
	(ix86_expand_binary_operator): Likewise, and avoid generating the
	lea pattern in postreload expand when use_ndd is explicitly passed.
	* config/i386/i386-options.cc (ix86_option_override_internal):
	Prohibit APX subfeatures when not in 64-bit mode.
	* config/i386/i386-protos.h (ix86_binary_operator_ok): Add use_ndd
	flag.
	(ix86_fixup_binary_operands): Likewise.
	(ix86_expand_binary_operator): Likewise.
	* config/i386/i386.md (*add_1): Extend with new alternatives to
	support NDD, and adjust output template.
	(*addhi_1): Likewise.
	(*addqi_1): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: New test.
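
The choice between lea, the legacy two-operand add and the NDD add
described in this message can be sketched as a toy model in C. This is
an illustration only, not GCC code: the names opnd, same_opnd,
pick_add_form and the add_form labels are hypothetical, immediates and
memory destinations are ignored, and in the compiler the decision is
made by the constraint alternatives in i386.md together with register
allocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy operand description; this is not GCC's rtx representation.  */
enum opnd_kind { OP_REG, OP_MEM };

struct opnd
{
  enum opnd_kind kind;
  int id;	/* Register number, or an arbitrary tag for a memory slot.  */
};

/* Hypothetical labels for the ways "dst = src1 + src2" can be emitted.  */
enum add_form
{
  FORM_ADD,	/* Legacy two-operand add, dst matches one source.  */
  FORM_LEA,	/* lea, all operands are registers.  */
  FORM_NDD_ADD,	/* APX NDD three-operand add.  */
  FORM_MOV_ADD	/* Legacy fallback: extra move, then two-operand add.  */
};

static bool
same_opnd (struct opnd a, struct opnd b)
{
  return a.kind == b.kind && a.id == b.id;
}

/* Pick a form for "dst = src1 + src2" where dst is a register.  */
enum add_form
pick_add_form (struct opnd dst, struct opnd src1, struct opnd src2,
	       bool use_ndd)
{
  assert (dst.kind == OP_REG);
  /* At most one source may live in memory.  */
  assert (!(src1.kind == OP_MEM && src2.kind == OP_MEM));

  /* Add is commutative, so a destination that matches either source
     fits the legacy two-operand form.  */
  if (same_opnd (dst, src1) || same_opnd (dst, src2))
    return FORM_ADD;

  /* Distinct registers everywhere: keep lea for its shorter encoding.  */
  if (src1.kind == OP_REG && src2.kind == OP_REG)
    return FORM_LEA;

  /* A memory source with a distinct destination: NDD saves the move.  */
  return use_ndd ? FORM_NDD_ADD : FORM_MOV_ADD;
}
```

In this model the foo2 case from the new test (register plus register)
maps to FORM_LEA, and the foo3 case (memory plus register) maps to
FORM_NDD_ADD when use_ndd is true and to FORM_MOV_ADD otherwise.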
--- gcc/config/i386/i386-expand.cc | 19 ++--- gcc/config/i386/i386-options.cc | 2 + gcc/config/i386/i386-protos.h | 6 +- gcc/config/i386/i386.md | 102 ++++++++++++++---------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 21 +++++ 5 files changed, 96 insertions(+), 54 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd.c diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 4bd7d4f39c8..3ecda989cf8 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1260,14 +1260,14 @@ ix86_swap_binary_operands_p (enum rtx_code code, machine_mode mode, return false; } - /* Fix up OPERANDS to satisfy ix86_binary_operator_ok. Return the destination to use for the operation. If different from the true - destination in operands[0], a copy operation will be required. */ + destination in operands[0], a copy operation will be required except + under TARGET_APX_NDD. */ rtx ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode, - rtx operands[]) + rtx operands[], bool use_ndd) { rtx dst = operands[0]; rtx src1 = operands[1]; @@ -1307,7 +1307,7 @@ ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode, src1 = force_reg (mode, src1); /* Source 1 cannot be a non-matching memory. */ - if (MEM_P (src1) && !rtx_equal_p (dst, src1)) + if (!use_ndd && MEM_P (src1) && !rtx_equal_p (dst, src1)) src1 = force_reg (mode, src1); /* Improve address combine. 
*/ @@ -1338,11 +1338,11 @@ ix86_fixup_binary_operands_no_copy (enum rtx_code code, void ix86_expand_binary_operator (enum rtx_code code, machine_mode mode, - rtx operands[]) + rtx operands[], bool use_ndd) { rtx src1, src2, dst, op, clob; - dst = ix86_fixup_binary_operands (code, mode, operands); + dst = ix86_fixup_binary_operands (code, mode, operands, use_ndd); src1 = operands[1]; src2 = operands[2]; @@ -1352,7 +1352,8 @@ ix86_expand_binary_operator (enum rtx_code code, machine_mode mode, if (reload_completed && code == PLUS - && !rtx_equal_p (dst, src1)) + && !rtx_equal_p (dst, src1) + && !use_ndd) { /* This is going to be an LEA; avoid splitting it later. */ emit_insn (op); @@ -1451,7 +1452,7 @@ ix86_expand_vector_logical_operator (enum rtx_code code, machine_mode mode, bool ix86_binary_operator_ok (enum rtx_code code, machine_mode mode, - rtx operands[3]) + rtx operands[3], bool use_ndd) { rtx dst = operands[0]; rtx src1 = operands[1]; @@ -1475,7 +1476,7 @@ ix86_binary_operator_ok (enum rtx_code code, machine_mode mode, return false; /* Source 1 cannot be a non-matching memory. */ - if (MEM_P (src1) && !rtx_equal_p (dst, src1)) + if (!use_ndd && MEM_P (src1) && !rtx_equal_p (dst, src1)) /* Support "andhi/andsi/anddi" as a zero-extending move. 
*/ return (code == AND && (mode == HImode diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index 877659229d2..27f078790e7 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -2129,6 +2129,8 @@ ix86_option_override_internal (bool main_args_p, if (TARGET_APX_F && !TARGET_64BIT) error ("%<-mapxf%> is not supported for 32-bit code"); + else if (opts->x_ix86_apx_features != apx_none && !TARGET_64BIT) + error ("%<-mapx-features=%> option is not supported for 32-bit code"); if (TARGET_UINTR && !TARGET_64BIT) error ("%<-muintr%> not supported for 32-bit code"); diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 28d0eab11d5..a9d0c568bba 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -108,14 +108,14 @@ extern void ix86_expand_move (machine_mode, rtx[]); extern void ix86_expand_vector_move (machine_mode, rtx[]); extern void ix86_expand_vector_move_misalign (machine_mode, rtx[]); extern rtx ix86_fixup_binary_operands (enum rtx_code, - machine_mode, rtx[]); + machine_mode, rtx[], bool = false); extern void ix86_fixup_binary_operands_no_copy (enum rtx_code, machine_mode, rtx[]); extern void ix86_expand_binary_operator (enum rtx_code, - machine_mode, rtx[]); + machine_mode, rtx[], bool = false); extern void ix86_expand_vector_logical_operator (enum rtx_code, machine_mode, rtx[]); -extern bool ix86_binary_operator_ok (enum rtx_code, machine_mode, rtx[3]); +extern bool ix86_binary_operator_ok (enum rtx_code, machine_mode, rtx[3], bool = false); extern bool ix86_avoid_lea_for_add (rtx_insn *, rtx[]); extern bool ix86_use_lea_for_mov (rtx_insn *, rtx[]); extern bool ix86_avoid_lea_for_addr (rtx_insn *, rtx[]); diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 7641b479670..cb227d19f40 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -562,7 +562,7 @@ (define_attr "unit" "integer,i387,sse,mmx,unknown" ;; Used to control 
the "enabled" attribute on a per-instruction basis. (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx, - x64_avx,x64_avx512bw,x64_avx512dq,aes, + x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd, sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx, avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512, noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq, @@ -960,6 +960,8 @@ (define_attr "enabled" "" (symbol_ref "TARGET_AVX512BF16 && TARGET_AVX512VL") (eq_attr "isa" "vpclmulqdqvl") (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL") + (eq_attr "isa" "apx_ndd") + (symbol_ref "TARGET_APX_NDD") (eq_attr "mmx_isa" "native") (symbol_ref "!TARGET_MMX_WITH_SSE") @@ -6285,7 +6287,8 @@ (define_expand "add3" (plus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand") (match_operand:SDWIM 2 "")))] "" - "ix86_expand_binary_operator (PLUS, mode, operands); DONE;") + "ix86_expand_binary_operator (PLUS, mode, operands, + TARGET_APX_NDD); DONE;") (define_insn_and_split "*add3_doubleword" [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") @@ -6412,26 +6415,29 @@ (define_insn_and_split "*add3_doubleword_concat_zext" "split_double_mode (mode, &operands[0], 1, &operands[0], &operands[5]);") (define_insn "*add_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r") (plus:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r") - (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r") + (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,re,BM"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 4 || which_alternative == 5); switch (get_attr_type (insn)) { case TYPE_LEA: return "#"; case TYPE_INCDEC: - gcc_assert (rtx_equal_p (operands[0], 
operands[1])); if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: @@ -6440,14 +6446,16 @@ (define_insn "*add_1" if (which_alternative == 2) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") (match_operand:SWI48 2 "incdec_operand") @@ -6516,25 +6524,26 @@ (define_insn "addsi_1_zext" (set_attr "mode" "SI")]) (define_insn "*addhi_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,r,Yp") - (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,Yp") - (match_operand:HI 2 "general_operand" "rn,m,0,ln"))) + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,r,Yp,r,r") + (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,Yp,rm,r") + (match_operand:HI 2 "general_operand" "rn,m,0,ln,rn,m"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, HImode, operands)" + "ix86_binary_operator_ok (PLUS, HImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 4 || which_alternative == 5); switch (get_attr_type (insn)) { case TYPE_LEA: return "#"; case TYPE_INCDEC: - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (operands[2] == const1_rtx) - return "inc{w}\t%0"; + return use_ndd ? "inc{w}\t{%1, %0|%0, %1}" : "inc{w}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{w}\t%0"; + return use_ndd ? 
"dec{w}\t{%1, %0|%0, %1}" : "dec{w}\t%0"; } default: @@ -6543,14 +6552,16 @@ (define_insn "*addhi_1" if (which_alternative == 2) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], HImode)) - return "sub{w}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{w}\t{%2, %1, %0|%0, %1, %2}" + : "sub{w}\t{%2, %0|%0, %2}"; - return "add{w}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{w}\t{%2, %1, %0|%0, %1, %2}" + : "add{w}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") (match_operand:HI 2 "incdec_operand") @@ -6562,30 +6573,35 @@ (define_insn "*addhi_1" (and (eq_attr "type" "alu") (match_operand 2 "const128_operand")) (const_string "1") (const_string "*"))) - (set_attr "mode" "HI,HI,HI,SI")]) + (set_attr "mode" "HI,HI,HI,SI,HI,HI")]) (define_insn "*addqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,q,r,r,Yp") - (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,q,0,r,Yp") - (match_operand:QI 2 "general_operand" "qn,m,0,rn,0,ln"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,q,r,r,Yp,r,r") + (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,q,0,r,Yp,rm,r") + (match_operand:QI 2 "general_operand" "qn,m,0,rn,0,ln,rn,m"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, QImode, operands)" + "ix86_binary_operator_ok (PLUS, QImode, operands, TARGET_APX_NDD)" { bool widen = (get_attr_mode (insn) != MODE_QI); - + bool use_ndd = (which_alternative == 6 || which_alternative == 7); switch (get_attr_type (insn)) { case TYPE_LEA: return "#"; case TYPE_INCDEC: - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (operands[2] == const1_rtx) - return widen ? "inc{l}\t%k0" : "inc{b}\t%0"; + if (use_ndd) + return "inc{b}\t{%1, %0|%0, %1}"; + else + return widen ? 
"inc{l}\t%k0" : "inc{b}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return widen ? "dec{l}\t%k0" : "dec{b}\t%0"; + if (use_ndd) + return "dec{b}\t{%1, %0|%0, %1}"; + else + return widen ? "dec{l}\t%k0" : "dec{b}\t%0"; } default: @@ -6594,21 +6610,23 @@ (define_insn "*addqi_1" if (which_alternative == 2 || which_alternative == 4) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], QImode)) { - if (widen) - return "sub{l}\t{%2, %k0|%k0, %2}"; + if (use_ndd) + return "sub{b}\t{%2, %1, %0|%0, %1, %2}"; else - return "sub{b}\t{%2, %0|%0, %2}"; + return widen ? "sub{l}\t{%2, %k0|%k0, %2}" + : "sub{b}\t{%2, %0|%0, %2}"; } - if (widen) - return "add{l}\t{%k2, %k0|%k0, %k2}"; + if (use_ndd) + return "add{b}\t{%2, %1, %0|%0, %1, %2}"; else - return "add{b}\t{%2, %0|%0, %2}"; + return widen ? "add{l}\t{%k2, %k0|%k0, %k2}" + : "add{b}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "5") (const_string "lea") (match_operand:QI 2 "incdec_operand") @@ -6620,7 +6638,7 @@ (define_insn "*addqi_1" (and (eq_attr "type" "alu") (match_operand 2 "const128_operand")) (const_string "1") (const_string "*"))) - (set_attr "mode" "QI,QI,QI,SI,SI,SI") + (set_attr "mode" "QI,QI,QI,SI,SI,SI,QI,QI") ;; Potential partial reg stall on alternatives 3 and 4. (set (attr "preferred_for_speed") (cond [(eq_attr "alternative" "3,4") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c new file mode 100644 index 00000000000..056a323a647 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target { ! 
ia32 } } } */ +/* { dg-options "-mapxf -march=x86-64 -O2" } */ +/* { dg-final { scan-assembler-not "movl"} } */ + +int foo (int *a) +{ + int b = *a - 1; + return b; +} + +int foo2 (int a, int b) +{ + int c = a + b; + return c; +} + +int foo3 (int *a, int b) +{ + int c = *a + b; + return c; +} From patchwork Tue Dec 5 02:29:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173685 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3175890vqy; Mon, 4 Dec 2023 18:39:07 -0800 (PST) X-Google-Smtp-Source: AGHT+IEAn017QElvyLd7lfTlOrDKhAOYnBlQMBGT0XSr2dY5zsl4Qxb+s23aBEV4Px9kC0fhNte6 X-Received: by 2002:ac8:580e:0:b0:425:4043:18d0 with SMTP id g14-20020ac8580e000000b00425404318d0mr746628qtg.131.1701743947240; Mon, 04 Dec 2023 18:39:07 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701743947; cv=pass; d=google.com; s=arc-20160816; b=ujnVVGYNR93Het7Ot81L/M//DZscecAs1zAYGaAf3+qcVZKq+IrLjYj146UHzi8yLc h6JHIs1HQkvCOFa6i6cslcGMC3nzRobKe3IYTKuqXvJ1VpKevZqYzim3KRnQ60LijuPw fjnx+kzCuoVuibVkKk35urxiFnrBVPIcIVy0wzGO72Bn5XU3vukJ5tpcREjFgAjbGnA4 4j/LTc3t2cOhzhkm5AwfSkncgW5keDm2rqDVT0yDralCVbTCVYWHeao4FVSHWCk/lLJ8 U4rtrddLpFJWVzKFCByLcZFZe9m7k8tjINEAJfpxLN320qVRC2bkYPCy8jk8dNb8WS/W ZMzg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=XRL3tBOV+e7U/3RgUcYJnYESYC8ZAFUsbML3LdnNAI0=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=f4VDf6CxbG4DUnWX1nTL0rEZdX1OGxEZPmTyZ9lQL0NEJ3viI44sQB9E2sSmo0NnLg vdZotoTicjJ4Su66cZDoNxwVZzwHuIVIkdiXHtzSIK1zecDIIy8RwgSmHM04G6++UJNz hxpijHxJq6w9a7UKmrTH2McoazfsklJIyMsO9+9InmVaeoMtWcn1sScLcNvviQIFRXZ9 
QNlNAVkyhIis8kDKLvJrZX96RlXOaOKQAxEHg7Tt/ChAevTvqpL9eLzsCmoo+qsZzG6P j/S9UtapRlIMRj4vugwRiOZvYqNdvLQqkfnGEUQuVQqHeWcJtwpHtmX7cQesIb3Zaaue xE8g== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=l9MVf2ox; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id bs15-20020ac86f0f000000b004254fdbbd17si5197163qtb.300.2023.12.04.18.39.07 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:39:07 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=l9MVf2ox; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B18C63B61C0B for ; Tue, 5 Dec 2023 02:32:44 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 52DD83882124 for ; Tue, 5 Dec 2023 02:31:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 52DD83882124 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: 
OpenARC Filter v1.0.0 sourceware.org 52DD83882124 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743470; cv=none; b=aWsS8tc9+z53A0TM3Iujo+IwXc49DPX4iz8GQVF2yQhzo/13s2wftYYW7vg/IHM67K8hl5GR7ZYwVjN2TtjunBJ/lSrgJWFYfzLwpCRBtkyDtvri5R2L5c3xXLikrK4kzO4m1i6lpLiLFhdwq6EpiTK8kzyFZmgx7WHrMF7tLhs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743470; c=relaxed/simple; bh=1R8R1T66Xd5tmZC/q/CB3N0p4deraFzMzBiiH5YWoV8=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Ty9sFATxUPUFR13JR7CaoF/xDoWwH9HVcRLCAP2tzz/3iKqfCdXFkMG1Ikbc1G/81HYGbe/wz4tJi9+724B89vjRX4FL8mWAVMydqNrifRokQXajjwTR2emELgqtMyuiACHXlN1yB155fVHVR/1G/KqmdQagYZ/fXx9JrfZSIvY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALD5-0001YN-UO for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:30:59 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743455; x=1733279455; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1R8R1T66Xd5tmZC/q/CB3N0p4deraFzMzBiiH5YWoV8=; b=l9MVf2oxM0OS+j5fQBwZHWmyW7gxFNGrvOdt7oXyHLAQPcbk7eRIDVqB MTbfuL+zA1Uz+/zmDA+bNeqSy+U9ynHfjO5bO8VcHOuYr9AXG1rRRfZme 8GIKtHOtQ2j+zYntpNUa2eAE2OQpNwoFJZ+SzYlAj1pT5clLUXy68nZbN JmQs9v/CqXE8NpYnu+Ht48rxjFFnajkfarbCGwMuMnAvGk2miMMTmbut8 zMMWCE7k8R1ddkUOhNyq04J5DQ84QXbYJahKn1RUGOv1C2WJ6uTHEHAfW X426XnxfjMXNDwFJsstcYaJcPILNbY++Tub21yDvg0FpY9oeg+mHyGk1M Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277776" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277776" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:50 -0800 X-ExtLoop1: 
1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275489" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275489" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:49 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 39F0F1005665; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 02/17] [APX NDD] Restrict TImode register usage when NDD enabled Date: Tue, 5 Dec 2023 10:29:33 +0800 Message-Id: <20231205022948.504790-3-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784407861193593814 X-GMAIL-MSGID: 1784407861193593814

Under APX NDD, the existing TImode register allocation is a problem
because TImode values can be allocated to overlapping consecutive
register pairs, like rax:rdi, rdi:rdx. This breaks all TImode NDD
patterns. With NDD we no longer assume that arithmetic operations like
add have a dependency between dest and src1. The write to the highpart
(rdi) of the first pair can then be overridden by the write to the
lowpart (rdi) of the second pair when that lowpart takes a different
source as input, so the first write is lost, which causes a
miscompilation.

To resolve this, under TARGET_APX_NDD only registers with an even regno
may be allocated for TImode, so that TImode values occupy
non-overlapping pairs. Inline assembly that forces an __int128 into an
odd-numbered general register may produce an error.

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_hard_regno_mode_ok): Restrict TImode
	to even regnos when APX NDD is enabled.
--- gcc/config/i386/i386.cc | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 93a9cb556a5..3efeed396c4 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -20873,6 +20873,16 @@ ix86_hard_regno_mode_ok (unsigned int regno, machine_mode mode) return true; return !can_create_pseudo_p (); } + /* With TImode we previously have assumption that src1/dest will use same + register, so the allocation of highpart/lowpart can be consecutive, and + 2 TImode insn would held their low/highpart in continuous sequence like + rax:rdx, rdx:rcx. This will not work for APX_NDD since NDD allows + different registers as dest/src1, when writes to 2nd lowpart will impact + the writes to 1st highpart, then the insn will be optimized out. 
So for + TImode pattern if we support NDD form, the allowed register number should + be even to avoid such mixed high/low part override. */ + else if (TARGET_APX_NDD && mode == TImode) + return regno % 2 == 0; /* We handle both integer and floats in the general purpose registers. */ else if (VALID_INT_MODE_P (mode) || VALID_FP_MODE_P (mode))
From patchwork Tue Dec 5 02:29:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173681 From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 03/17] [APX NDD] Support APX NDD for optimization patterns of add Date: Tue, 5 Dec 2023 10:29:34 +0800 Message-Id: <20231205022948.504790-4-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> X-Spam-Checker-Version: SpamAssassin
3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784407774000941211 X-GMAIL-MSGID: 1784407774000941211

From: Kong Lingling

gcc/ChangeLog:

	* config/i386/i386.md (addsi_1_zext): Add new alternatives for NDD and adjust output templates.
	(*add_2): Likewise.
	(*addsi_2_zext): Likewise.
	(*add_3): Likewise.
	(*addsi_3_zext): Likewise.
	(*adddi_4): Likewise.
	(*add_4): Likewise.
	(*add_5): Likewise.
	(*addv4): Likewise.
	(addv4_1): Likewise.
	(*add3_cconly_overflow_1): Likewise.
	(@add3_cc_overflow_1): Likewise.
	(*addsi3_zext_cc_overflow_1): Likewise.
	(*add3_cconly_overflow_2): Likewise.
	(*add3_cc_overflow_2): Likewise.
	(*addsi3_zext_cc_overflow_2): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add more tests.
--- gcc/config/i386/i386.md | 310 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 53 ++-- 2 files changed, 232 insertions(+), 131 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index cb227d19f40..2a73f6dcaec 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6476,13 +6476,15 @@ (define_insn "*add_1" ;; patterns constructed from addsi_1 to match. 
(define_insn "addsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r,r") (zero_extend:DI - (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r") - (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,le")))) + (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,le,rBMe,re")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 3 || which_alternative == 4); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -6490,11 +6492,13 @@ (define_insn "addsi_1_zext" case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{l}\t%k0"; + return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}" + : "inc{l}\t%k0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{l}\t%k0"; + return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}" + : "dec{l}\t%k0"; } default: @@ -6504,12 +6508,15 @@ (define_insn "addsi_1_zext" std::swap (operands[1], operands[2]); if (x86_maybe_negate_const_int (&operands[2], SImode)) - return "sub{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sub{l}\t{%2 ,%1, %k0|%k0, %1, %2}" + : "sub{l}\t{%2, %k0|%k0, %2}"; - return "add{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"add{l}\t{%2 ,%1, %k0|%k0, %1, %2}" + : "add{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (cond [(eq_attr "alternative" "2") (const_string "lea") (match_operand:SI 2 "incdec_operand") @@ -6811,37 +6818,42 @@ (define_insn "*add_2" [(set (reg FLAGS_REG) (compare (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0,") - (match_operand:SWI 2 "" ",,0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,,rm,r") + (match_operand:SWI 2 "" ",,0,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (PLUS, mode, operands)" + && ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 3 || which_alternative == 4); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (which_alternative == 2) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6856,23 +6868,26 @@ (define_insn "*add_2" (define_insn "*addsi_2_zext" [(set (reg FLAGS_REG) (compare - (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r") - (match_operand:SI 2 "x86_64_general_operand" "rBMe,0")) + (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,rBMe,re")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r,r") + (set (match_operand:DI 0 "register_operand" "=r,r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (PLUS, SImode, operands)" + && ix86_binary_operator_ok (PLUS, SImode, operands, TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 2 || which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{l}\t%k0"; + return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}" + : "inc{l}\t%k0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{l}\t%k0"; + return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}" + : "dec{l}\t%k0"; } default: @@ -6880,12 +6895,15 @@ (define_insn "*addsi_2_zext" std::swap (operands[1], operands[2]); if (x86_maybe_negate_const_int (&operands[2], SImode)) - return "sub{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sub{l}\t{%2, %k0|%k0, %2}"; - return "add{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"add{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "add{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6899,35 +6917,40 @@ (define_insn "*addsi_2_zext" (define_insn "*add_3" [(set (reg FLAGS_REG) (compare - (neg:SWI (match_operand:SWI 2 "" ",0")) - (match_operand:SWI 1 "nonimmediate_operand" "%0,"))) - (clobber (match_scratch:SWI 0 "=,"))] + (neg:SWI (match_operand:SWI 2 "" ",0,,re")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,,r,rm"))) + (clobber (match_scratch:SWI 0 "=,,r,r"))] "ix86_match_ccmode (insn, CCZmode) && !(MEM_P (operands[1]) && MEM_P (operands[2]))" { + bool use_ndd = (which_alternative == 2 || which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (which_alternative == 1) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6942,22 +6965,23 @@ (define_insn "*add_3" (define_insn "*addsi_3_zext" [(set (reg FLAGS_REG) (compare - (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rBMe,0")) - (match_operand:SI 1 "nonimmediate_operand" "%0,r"))) - (set (match_operand:DI 0 "register_operand" "=r,r") + (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rBMe,0,rBMe,re")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,r,rm"))) + (set (match_operand:DI 0 "register_operand" "=r,r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCZmode) - && ix86_binary_operator_ok (PLUS, SImode, operands)" + && ix86_binary_operator_ok (PLUS, SImode, operands, TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 2 || which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{l}\t%k0"; + return use_ndd ? "inc{l}\t{%1, %k0|%k0, %1}" : "inc{l}\t%k0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{l}\t%k0"; + return use_ndd ? "dec{l}\t{%1, %k0|%k0, %1}" : "dec{l}\t%k0"; } default: @@ -6965,12 +6989,15 @@ (define_insn "*addsi_3_zext" std::swap (operands[1], operands[2]); if (x86_maybe_negate_const_int (&operands[2], SImode)) - return "sub{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? "sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sub{l}\t{%2, %k0|%k0, %2}"; - return "add{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"add{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "add{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -6991,31 +7018,35 @@ (define_insn "*addsi_3_zext" (define_insn "*adddi_4" [(set (reg FLAGS_REG) (compare - (match_operand:DI 1 "nonimmediate_operand" "0") - (match_operand:DI 2 "x86_64_immediate_operand" "e"))) - (clobber (match_scratch:DI 0 "=r"))] + (match_operand:DI 1 "nonimmediate_operand" "0,rm") + (match_operand:DI 2 "x86_64_immediate_operand" "e,e"))) + (clobber (match_scratch:DI 0 "=r,r"))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGCmode)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == constm1_rtx) - return "inc{q}\t%0"; + return use_ndd ? "inc{q}\t{%1, %0|%0, %1}" : "inc{q}\t%0"; else { gcc_assert (operands[2] == const1_rtx); - return "dec{q}\t%0"; + return use_ndd ? "dec{q}\t{%1, %0|%0, %1}" : "dec{q}\t%0"; } default: if (x86_maybe_negate_const_int (&operands[2], DImode)) - return "add{q}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{q}\t{%2, %1, %0|%0, %1, %2}" + : "add{q}\t{%2, %0|%0, %2}"; - return "sub{q}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sub{q}\t{%2, %1, %0|%0, %1, %2}" + : "sub{q}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") (if_then_else (match_operand:DI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -7036,30 +7067,36 @@ (define_insn "*adddi_4" (define_insn "*add_4" [(set (reg FLAGS_REG) (compare - (match_operand:SWI124 1 "nonimmediate_operand" "0") + (match_operand:SWI124 1 "nonimmediate_operand" "0,rm") (match_operand:SWI124 2 "const_int_operand"))) - (clobber (match_scratch:SWI124 0 "="))] + (clobber (match_scratch:SWI124 0 "=,r"))] "ix86_match_ccmode (insn, CCGCmode)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == constm1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == const1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (x86_maybe_negate_const_int (&operands[2], mode)) - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? "add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") (if_then_else (match_operand: 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -7074,36 +7111,41 @@ (define_insn "*add_5" [(set (reg FLAGS_REG) (compare (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,") - (match_operand:SWI 2 "" ",0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,,r,rm") + (match_operand:SWI 2 "" ",0,,re")) (const_int 0))) - (clobber (match_scratch:SWI 0 "=,"))] + (clobber (match_scratch:SWI 0 "=,,r,r"))] "ix86_match_ccmode (insn, CCGOCmode) && !(MEM_P (operands[1]) && MEM_P (operands[2]))" { + bool use_ndd = (which_alternative == 2 || which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_INCDEC: if (operands[2] == const1_rtx) - return "inc{}\t%0"; + return use_ndd ? "inc{}\t{%1, %0|%0, %1}" + : "inc{}\t%0"; else { gcc_assert (operands[2] == constm1_rtx); - return "dec{}\t%0"; + return use_ndd ? "dec{}\t{%1, %0|%0, %1}" + : "dec{}\t%0"; } default: if (which_alternative == 1) std::swap (operands[1], operands[2]); - gcc_assert (rtx_equal_p (operands[0], operands[1])); if (x86_maybe_negate_const_int (&operands[2], mode)) - return "sub{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sub{}\t{%2, %1, %0|%0, %1, %2}" + : "sub{}\t{%2, %0|%0, %2}"; - return "add{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"add{}\t{%2, %1, %0|%0, %1, %2}" + : "add{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set (attr "type") (if_then_else (match_operand:SWI 2 "incdec_operand") (const_string "incdec") (const_string "alu"))) @@ -7316,35 +7358,43 @@ (define_insn "*addv4" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) (sign_extend: - (match_operand:SWI 2 "" "We,m"))) + (match_operand:SWI 2 "" "We,m,rWe,m"))) (sign_extend: (plus:SWI (match_dup 1) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "addv4_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (match_operand: 3 "const_int_operand")) (sign_extend: (plus:SWI (match_dup 1) - (match_operand:SWI 2 "x86_64_immediate_operand" ""))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (match_operand:SWI 2 "x86_64_immediate_operand" ","))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[3])" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, 
%0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (cond [(match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -9187,27 +9237,36 @@ (define_insn "*add3_cconly_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0") - (match_operand:SWI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,r,rm") + (match_operand:SWI 2 "" ",,re")) (match_dup 1))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r,r"))] "!(MEM_P (operands[1]) && MEM_P (operands[2]))" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "@add3_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (match_dup 1))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_peephole2 @@ -9252,55 +9311,74 @@ (define_insn "*addsi3_zext_cc_overflow_1" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,rm") + 
(match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")) (match_dup 1))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "add{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + TARGET_APX_NDD)" + "@ + add{l}\t{%2, %k0|%k0, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn "*add3_cconly_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0") - (match_operand:SWI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,r,rm") + (match_operand:SWI 2 "" ",,re")) (match_dup 2))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r,r"))] "!(MEM_P (operands[1]) && MEM_P (operands[2]))" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, %2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*add3_cc_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (match_dup 2))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "add{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %0|%0, %2} + add{}\t{%2, %1, %0|%0, %1, 
%2} + add{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*addsi3_zext_cc_overflow_2" [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")) (match_dup 2))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - "add{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + TARGET_APX_NDD)" + "@ + add{l}\t{%2, %k0|%k0, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2} + add{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn_and_split "*add3_doubleword_cc_overflow_1" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 056a323a647..c1049022f2a 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -2,20 +2,43 @@ /* { dg-options "-mapxf -march=x86-64 -O2" } */ /* { dg-final { scan-assembler-not "movl"} } */ -int foo (int *a) -{ - int b = *a - 1; - return b; -} +#define FOO(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo_##OP_NAME##_##TYPE (TYPE *a) \ +{ \ + TYPE b = *a OP 1; \ + return b; \ +} -int foo2 (int a, int b) -{ - int c = a + b; - return c; -} +#define FOO1(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo1_##OP_NAME##_##TYPE (TYPE a, TYPE b) \ +{ \ + TYPE c = a OP b; \ + return c; \ +} + +#define FOO2(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ +{ \ + TYPE c = *a OP b; \ + 
return c; \ +} + +FOO (char, add, +) +FOO1 (char, add, +) +FOO2 (char, add, +) +FOO (short, add, +) +FOO1 (short, add, +) +FOO2 (short, add, +) +FOO (int, add, +) +FOO1 (int, add, +) +FOO2 (int, add, +) +FOO (long, add, +) +FOO1 (long, add, +) +FOO2 (long, add, +) -int foo3 (int *a, int b) -{ - int c = *a + b; - return c; -}
From patchwork Tue Dec 5 02:29:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173677 From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 04/17] [APX NDD] Disable seg_prefixed memory usage for NDD add Date: Tue, 5 Dec 2023 10:29:35 +0800 Message-Id: <20231205022948.504790-5-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com>
Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784407596567140266 X-GMAIL-MSGID: 1784407596567140266 NDD uses the EVEX prefix, so when a segment prefix is also applied, the instruction could exceed its 15-byte limit, especially when adding immediates. This could happen when the "e" constraint accepts any UNSPEC_TPOFF/UNSPEC_NTPOFF constant, whose offset is added to the segment register and encoded using a segment prefix. Disable the use of those *POFF constants in NDD add alternatives with a new constraint. gcc/ChangeLog: * config/i386/constraints.md (je): New constraint. * config/i386/i386-protos.h (x86_poff_operand_p): New prototype. * config/i386/i386.cc (x86_poff_operand_p): New function to check any *POFF constant in operand. * config/i386/i386.md (*add_1): Split out je alternative for add. --- gcc/config/i386/constraints.md | 5 +++++ gcc/config/i386/i386-protos.h | 1 + gcc/config/i386/i386.cc | 25 +++++++++++++++++++++++++ gcc/config/i386/i386.md | 10 +++++----- 4 files changed, 36 insertions(+), 5 deletions(-) diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index cbee31fa40a..f4c3c3dd952 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -433,3 +433,8 @@ (define_address_constraint "jb" (define_register_constraint "jc" "TARGET_APX_EGPR && !TARGET_AVX ? 
GENERAL_GPR16 : GENERAL_REGS") + +(define_constraint "je" + "@internal Constant that does not allow any unspec global offsets" + (and (match_operand 0 "x86_64_immediate_operand") + (match_test "!x86_poff_operand_p (op)"))) diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index a9d0c568bba..7dfeb6af225 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -66,6 +66,7 @@ extern bool x86_extended_QIreg_mentioned_p (rtx_insn *); extern bool x86_extended_reg_mentioned_p (rtx); extern bool x86_extended_rex2reg_mentioned_p (rtx); extern bool x86_evex_reg_mentioned_p (rtx [], int); +extern bool x86_poff_operand_p (rtx); extern bool x86_maybe_negate_const_int (rtx *, machine_mode); extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 3efeed396c4..3e670330ef6 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -23341,6 +23341,31 @@ x86_evex_reg_mentioned_p (rtx operands[], int nops) return false; } +/* Return true when OPERAND contains any UNSPEC_*POFF related constant, + which could make APX_NDD instructions exceed the encoding length limit. */ +bool +x86_poff_operand_p (rtx operand) +{ + if (GET_CODE (operand) == CONST) + { + rtx op = XEXP (operand, 0); + if (GET_CODE (op) == PLUS) + op = XEXP (op, 0); + + if (GET_CODE (op) == UNSPEC) + { + int unspec = XINT (op, 1); + return (unspec == UNSPEC_NTPOFF + || unspec == UNSPEC_TPOFF + || unspec == UNSPEC_DTPOFF + || unspec == UNSPEC_GOTTPOFF + || unspec == UNSPEC_GOTNTPOFF + || unspec == UNSPEC_INDNTPOFF); + } + } + return false; +} + /* If profitable, negate (without causing overflow) integer constant of mode MODE at location LOC. Return true in this case. 
*/ bool diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 2a73f6dcaec..6b316e698bb 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6415,15 +6415,15 @@ (define_insn_and_split "*add3_doubleword_concat_zext" "split_double_mode (mode, &operands[0], 1, &operands[0], &operands[5]);") (define_insn "*add_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,r,r,r,r") (plus:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r") - (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,re,BM"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,rm,r,m,r") + (match_operand:SWI48 2 "x86_64_general_operand" "re,BM,0,le,r,e,je,BM"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" { - bool use_ndd = (which_alternative == 4 || which_alternative == 5); + bool use_ndd = (which_alternative >= 4); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -6454,7 +6454,7 @@ (define_insn "*add_1" : "add{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd") + [(set_attr "isa" "*,*,*,*,apx_ndd,apx_ndd,apx_ndd,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") From patchwork Tue Dec 5 02:29:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173690 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3176582vqy; Mon, 4 Dec 2023 18:41:38 -0800 (PST) X-Google-Smtp-Source: AGHT+IGKOr/hYFh8GUGCgk0YRP1sKHH7huNhCgBaF/2ZLBhtomgS465fMzNs3TAt9K6mq3dmj9HC X-Received: by 2002:a05:620a:63c5:b0:77d:55e9:a5cd with SMTP id pw5-20020a05620a63c500b0077d55e9a5cdmr584935qkn.21.1701744098047; Mon, 04 Dec 2023 18:41:38 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701744098; cv=pass; d=google.com; s=arc-20160816; 
b=h6vxcL8SaICRchytbnAmWfprRi7XTCT+vh2Aw4BXP7xci+BLjT7kyGG9jOAsYN/UU2 Eo2Val5lpT3GqTcHb+QCc6Qi2pdy4VFJdS4d0010SLujFD5fZJUUa+qSdjW/KSk8IhyZ vp2hJd2+QaBlt/wE5g4FCXbwePTKU+Xfrz6hinu4dHh/3WIvTkyyk1Vn9tEzJlTbaNzR 6FRuGravi9aFiu1BCl0L3FvobIleQYmbyl+++zjMG8x3f2xslvD1m27FcSj6imM7JBwv k2u6DmlKfiMnw++blGLrIM/IlgfU6sg7qaV4FzFVvnqlaMndkMlp9mhA1fb5JbUXNag4 32Xg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=ImuA35tFdS5hhYmU7XgLCN8j6KWLY57dXwmdo4jsNiw=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=IlooLHsRrQqFEWiEj6WpJt0bSNfIgdfddl8NH21aqlSC07Hbe2ovGfMiaC9BZsRT7g Rk3YMZ/qApIznwu65YjmZGx550DY6ntnYbj2laVJv//GRKqAWtJmu1I7NKm41ZZdaRXH Wc3R9AmsGB1vPfh7so97bRFuwHvbAMBBXFtZvlM3/4be7FlMlL2g2KtJyQ6yC54xkmY5 j8dufTNqZS9p2EMdt2GC1v7Wd/ON518yRI90G+aXBAzTRwyogKUKvyk805tfkLLCMrps h9Bsy3p8wuJ1LSA+3EbZAHiEGYcYWxctQGEDBE0SygFO8v+72LXOlK1+SGhXhWeU8vwy y1KA== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Fx33RXlb; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id h24-20020ae9ec18000000b00773fe581c18si10415755qkg.463.2023.12.04.18.41.37 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:41:38 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Fx33RXlb; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C6DC939C16B1 for ; Tue, 5 Dec 2023 02:33:26 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 31AF03888C5F for ; Tue, 5 Dec 2023 02:31:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 31AF03888C5F Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 31AF03888C5F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743484; cv=none; b=VLDqEMerI1RC3DPDk+XI9YFIT1hz5SfAkxsrA7iFgSyB2buG2xZxLC78MM2Q8wwzkIbNqrLIgetRuTjHzi+l6SUXPrQjRA9tQoDr4KdcqXjujiMNe+wn01PZMff/Pmg8hug/NFIoCIPXT7XngLAVDAKxAe++4evoo/y27ihnA3k= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743484; c=relaxed/simple; bh=p+72NHwUMuzPkTpF1PcwCI/iBRkFwZPJV21+AeQPLps=; 
h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=K1qg8J6xL+egp3eoDvGaMFKMqaSW3XDUVsG7AH8kWakhlvBEGSvJfoNhs/O/Ju7gaEfA/0Tzzrs6a0i1Sgber4khDLMqRyyy8A+wARHsncI714MrvtyfrKLsHj00stTeOjPaY0CLPn4zTxBZPBvoVMpgkLe+2qshAvo0CDZFTL0= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDR-0001Yf-CQ for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:19 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743477; x=1733279477; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p+72NHwUMuzPkTpF1PcwCI/iBRkFwZPJV21+AeQPLps=; b=Fx33RXlb8lytAteM3ORsypi7IcqbXHJZOC00fgnzYLXY17SHgrbO9SgO 8DBmvyHoA9J4CQoUDcrvEKGK7XSfLLHLu/07d3ARMvdJZnhTdmQ4wMgAL xwlOObb0FmIorRXkPcF8Dn78a3FLqZEDPFjSg5jsBDFilb18EixuJ8AqR TuVuNE4Blinc/JA2J8fQHGbHIOzLeg8+ob6t4GdAJv8O+TtEQaPvE71nv 0jMThbJ7sX/2O3QSeowR0e5rfqLQQEimfITolCZuN0iHz9ZC3KFpx6Jcr GtgqntBHk28sPS8mBgOwnxufe9tWowk5OLloIMGm3YHKzq7O5fHe00lBb A==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277809" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277809" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275534" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275534" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:51 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 424BE100568C; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong 
Lingling Subject: [PATCH 05/17] [APX NDD] Support APX NDD for adc insns Date: Tue, 5 Dec 2023 10:29:36 +0800 Message-Id: <20231205022948.504790-6-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784408019251844516 X-GMAIL-MSGID: 1784408019251844516 From: Kong Lingling Legacy adc patterns are commonly used for TImode add. When extending TImode add to the NDD version, operands[0] and operands[1] can be different, so an extra move should be emitted if those patterns are optimized when adding const0_rtx. 
NDD instructions automatically zero-extend the destination register to 64 bits, so zext patterns can adopt all NDD forms that have a memory source input. gcc/ChangeLog: * config/i386/i386.md (*add3_doubleword): Add ndd constraints, and move operands[1] to operands[0] when they are not equal. (*add3_doubleword_cc_overflow_1): Likewise. (*add3_doubleword_zext): Add ndd constraints. (*addv4_doubleword): Likewise. (*addv4_doubleword_1): Likewise. (addv4_overflow_1): Likewise. (*addv4_overflow_2): Likewise. (@add3_carry): Likewise. (*add3_carry_0): Likewise. (*addsi3_carry_zext): Likewise. (addcarry): Likewise. (addcarry_0): Likewise. (*addcarry_1): Likewise. (*add3_eq): Likewise. (*add3_ne): Likewise. (*addsi3_carry_zext_0): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-adc.c: New test. --- gcc/config/i386/i386.md | 191 ++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd-adc.c | 15 ++ 2 files changed, 134 insertions(+), 72 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-adc.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 6b316e698bb..358a3857f89 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6291,12 +6291,12 @@ (define_expand "add3" TARGET_APX_NDD); DONE;") (define_insn_and_split "*add3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (plus: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,r"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -6316,24 +6316,34 @@ 
(define_insn_and_split "*add3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + /* Under NDD op0 and op1 may not equal, do not delete insn then. */ + bool emit_insn_deleted_note_p = true; + if (!rtx_equal_p (operands[0], operands[1])) + { + emit_move_insn (operands[0], operands[1]); + emit_insn_deleted_note_p = false; + } if (operands[5] != const0_rtx) - ix86_expand_binary_operator (PLUS, mode, &operands[3]); + ix86_expand_binary_operator (PLUS, mode, &operands[3], + TARGET_APX_NDD); else if (!rtx_equal_p (operands[3], operands[4])) emit_move_insn (operands[3], operands[4]); - else + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); DONE; } -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*add3_doubleword_zext" - [(set (match_operand: 0 "nonimmediate_operand" "=r,o") + [(set (match_operand: 0 "nonimmediate_operand" "=r,o,r,r") (plus: (zero_extend: - (match_operand:DWIH 2 "nonimmediate_operand" "rm,r")) - (match_operand: 1 "nonimmediate_operand" "0,0"))) + (match_operand:DWIH 2 "nonimmediate_operand" "rm,r,rm,r")) + (match_operand: 1 "nonimmediate_operand" "0,0,r,m"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (UNKNOWN, mode, operands)" + "ix86_binary_operator_ok (UNKNOWN, mode, operands, + TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -6349,7 +6359,8 @@ (define_insn_and_split "*add3_doubleword_zext" (match_dup 4)) (const_int 0))) (clobber (reg:CC FLAGS_REG))])] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*add3_doubleword_concat" [(set (match_operand: 0 "register_operand" "=&r") @@ -7411,14 +7422,14 @@ (define_insn_and_split "*addv4_doubleword" (eq:CCO (plus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "%0,0")) + 
(match_operand: 1 "nonimmediate_operand" "%0,0,ro,r")) (sign_extend: - (match_operand: 2 "nonimmediate_operand" "r,o"))) + (match_operand: 2 "nonimmediate_operand" "r,o,r,o"))) (sign_extend: (plus: (match_dup 1) (match_dup 2))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel [(set (reg:CCC FLAGS_REG) @@ -7448,22 +7459,23 @@ (define_insn_and_split "*addv4_doubleword" (match_dup 5)))])] { split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn_and_split "*addv4_doubleword_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (plus: (sign_extend: - (match_operand: 1 "nonimmediate_operand" "%0")) - (match_operand: 3 "const_scalar_int_operand" "n")) + (match_operand: 1 "nonimmediate_operand" "%0,rm")) + (match_operand: 3 "const_scalar_int_operand" "n,n")) (sign_extend: (plus: (match_dup 1) - (match_operand: 2 "x86_64_hilo_general_operand" ""))))) - (set (match_operand: 0 "nonimmediate_operand" "=ro") + (match_operand: 2 "x86_64_hilo_general_operand" ","))))) + (set (match_operand: 0 "nonimmediate_operand" "=ro,r") (plus: (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_SCALAR_INT_P (operands[2]) && rtx_equal_p (operands[2], operands[3])" "#" @@ -7501,7 +7513,8 @@ (define_insn_and_split "*addv4_doubleword_1" operands[5])); DONE; } -}) +} +[(set_attr "isa" "*,apx_ndd")]) (define_insn "*addv4_overflow_1" [(set (reg:CCO FLAGS_REG) @@ -7511,9 +7524,9 @@ (define_insn "*addv4_overflow_1" (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" 
"%0,0"))) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r"))) (sign_extend: - (match_operand:SWI 2 "" "rWe,m"))) + (match_operand:SWI 2 "" "rWe,m,rWe,m"))) (sign_extend: (plus:SWI (plus:SWI @@ -7521,15 +7534,20 @@ (define_insn "*addv4_overflow_1" [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r,r,r") (plus:SWI (plus:SWI (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*addv4_overflow_2" @@ -7540,26 +7558,29 @@ (define_insn "*addv4_overflow_2" (match_operator: 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "%0"))) - (match_operand: 6 "const_int_operand" "n")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,rm"))) + (match_operand: 6 "const_int_operand" "n,n")) (sign_extend: (plus:SWI (plus:SWI (match_operator:SWI 5 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)]) (match_dup 1)) - (match_operand:SWI 2 "x86_64_immediate_operand" "e"))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=rm") + (match_operand:SWI 2 "x86_64_immediate_operand" "e,e"))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r") (plus:SWI (plus:SWI (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[6])" - 
"adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (if_then_else (match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8381,17 +8402,22 @@ (define_insn "*subsi_3_zext" ;; Add with carry and subtract with borrow (define_insn "@add3_carry" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (plus:SWI (plus:SWI (match_operator:SWI 4 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI 1 "nonimmediate_operand" "%0,0")) - (match_operand:SWI 2 "" ","))) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r")) + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8478,31 +8504,39 @@ (define_insn "*add3_carry_0r" (set_attr "mode" "")]) (define_insn "*addsi3_carry_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (plus:SI (plus:SI (match_operator:SI 3 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "%0")) - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (match_operand:SI 1 "nonimmediate_operand" "%0,r,rm")) + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)" - 
"adc{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands, + TARGET_APX_NDD)" + "@ + adc{l}\t{%2, %k0|%k0, %2} + adc{l}\t{%2, %1, %k0|%k0, %1, %2} + adc{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) (define_insn "*addsi3_carry_zext_0" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (plus:SI (match_operator:SI 2 "ix86_carry_flag_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 1 "register_operand" "0")))) + (match_operand:SI 1 "nonimmediate_operand" "0,rm")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "adc{l}\t{$0, %k0|%k0, 0}" - [(set_attr "type" "alu") + "@ + adc{l}\t{$0, %k0|%k0, 0} + adc{l}\t{$0, %1, %k0|%k0, %1, 0}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) @@ -8531,20 +8565,25 @@ (define_insn "addcarry" (plus:SWI48 (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0")) - (match_operand:SWI48 2 "nonimmediate_operand" "r,rm"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,rm,r")) + (match_operand:SWI48 2 "nonimmediate_operand" "r,rm,r,m"))) (plus: (zero_extend: (match_dup 2)) (match_operator: 4 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") (plus:SWI48 (plus:SWI48 (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands)" - "adc{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)" + "@ + adc{}\t{%2, 
%0|%0, %2} + adc{}\t{%2, %0|%0, %2} + adc{}\t{%2, %1, %0|%0, %1, %2} + adc{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") (set_attr "mode" "")]) @@ -8702,7 +8741,8 @@ (define_expand "addcarry_0" (match_dup 1))) (set (match_operand:SWI48 0 "nonimmediate_operand") (plus:SWI48 (match_dup 1) (match_dup 2)))])] - "ix86_binary_operator_ok (PLUS, mode, operands)") + "ix86_binary_operator_ok (PLUS, mode, operands, + TARGET_APX_NDD)") (define_insn "*addcarry_1" [(set (reg:CCC FLAGS_REG) @@ -8712,18 +8752,18 @@ (define_insn "*addcarry_1" (plus:SWI48 (match_operator:SWI48 5 "ix86_carry_flag_operator" [(match_operand 3 "flags_reg_operand") (const_int 0)]) - (match_operand:SWI48 1 "nonimmediate_operand" "%0")) - (match_operand:SWI48 2 "x86_64_immediate_operand" "e"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,rm")) + (match_operand:SWI48 2 "x86_64_immediate_operand" "e,e"))) (plus: (match_operand: 6 "const_scalar_int_operand") (match_operator: 4 "ix86_carry_flag_operator" [(match_dup 3) (const_int 0)])))) - (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm") + (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") (plus:SWI48 (plus:SWI48 (match_op_dup 5 [(match_dup 3) (const_int 0)]) (match_dup 1)) (match_dup 2)))] - "ix86_binary_operator_ok (PLUS, mode, operands) + "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD) && CONST_INT_P (operands[2]) /* Check that operands[6] is operands[2] zero extended from mode to mode. 
*/
@@ -8736,8 +8776,11 @@ (define_insn "*addcarry_1"
        && ((unsigned HOST_WIDE_INT) CONST_WIDE_INT_ELT (operands[6], 0)
            == UINTVAL (operands[2]))
        && CONST_WIDE_INT_ELT (operands[6], 1) == 0))"
-  "adc{}\t{%2, %0|%0, %2}"
-  [(set_attr "type" "alu")
+  "@
+   adc{}\t{%2, %0|%0, %2}
+   adc{}\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "alu")
    (set_attr "use_carry" "1")
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "")
@@ -9385,12 +9428,12 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1"
   [(set (reg:CCC FLAGS_REG)
         (compare:CCC
           (plus:
-            (match_operand: 1 "nonimmediate_operand" "%0,0")
-            (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))
+            (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r")
+            (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))
           (match_dup 1)))
-   (set (match_operand: 0 "nonimmediate_operand" "=ro,r")
+   (set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r")
         (plus: (match_dup 1) (match_dup 2)))]
-  "ix86_binary_operator_ok (PLUS, mode, operands)"
+  "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)"
   "#"
   "&& reload_completed"
   [(parallel [(set (reg:CCC FLAGS_REG)
@@ -9419,6 +9462,8 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1"
   split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]);
   if (operands[2] == const0_rtx)
     {
+      if (!rtx_equal_p (operands[0], operands[1]))
+        emit_move_insn (operands[0], operands[1]);
       emit_insn (gen_addcarry_0 (operands[3], operands[4], operands[5]));
       DONE;
     }
@@ -9427,7 +9472,8 @@ (define_insn_and_split "*add3_doubleword_cc_overflow_1"
                                         operands[5], mode);
   else
     operands[6] = gen_rtx_ZERO_EXTEND (mode, operands[5]);
-})
+}
+[(set_attr "isa" "*,*,apx_ndd,apx_ndd")])
 
 ;; x == 0 with zero flag test can be done also as x < 1U with carry flag
 ;; test, where the latter is preferrable if we have some carry consuming
@@ -9442,7 +9488,7 @@ (define_insn_and_split "*add3_eq"
             (match_operand:SWI 1 "nonimmediate_operand"))
           (match_operand:SWI 2 "")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (PLUS, mode, operands)
+  "ix86_binary_operator_ok (PLUS, mode, operands, TARGET_APX_NDD)
    && ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -9466,7 +9512,8 @@ (define_insn_and_split "*add3_ne"
   "CONST_INT_P (operands[2])
    && (mode != DImode
        || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000))
-   && ix86_binary_operator_ok (PLUS, mode, operands)
+   && ix86_binary_operator_ok (PLUS, mode, operands,
+                               TARGET_APX_NDD)
    && ix86_pre_reload_split ()"
   "#"
   "&& 1"
diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c b/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c
new file mode 100644
index 00000000000..9d5991457da
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd-adc.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { int128 && { ! ia32 } } } } */
+/* { dg-options "-mapxf -O2" } */
+
+#include "pr91681-1.c"
+// *addti3_doubleword
+// *addti3_doubleword_zext
+// *adddi3_cc_overflow_1
+// *adddi3_carry
+
+int foo3 (int *a, int b)
+{
+  int c = *a + b + (a > b); /* { dg-warning "comparison between pointer and integer" } */
+  return c;
+}
+/* { dg-final { scan-assembler-not "xor" } } */

From patchwork Tue Dec 5 02:29:37 2023
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 173679
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 06/17] [APX NDD] Support APX NDD for sub insns
Date: Tue, 5 Dec 2023 10:29:37 +0800
Message-Id: <20231205022948.504790-7-hongyu.wang@intel.com>
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>

From: Kong Lingling

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_fixup_binary_operands_no_copy):
	Add use_ndd parameter and pass it on to ix86_fixup_binary_operands.
	* config/i386/i386-protos.h (ix86_fixup_binary_operands_no_copy):
	Change declaration.
	* config/i386/i386.md (sub3): Add new alternatives for NDD
	and adjust output templates.
	(*sub_1): Likewise.
	(*sub_2): Likewise.
	(subv4): Likewise.
	(*subv4): Likewise.
	(subv4_1): Likewise.
	(usubv4): Likewise.
	(*sub_3): Likewise.
	(*subsi_1_zext): Likewise, and use nonimmediate_operand for
	operands[1] to accept memory input for NDD alternatives.
	(*subsi_2_zext): Likewise.
	(*subsi_3_zext): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add test for NDD sub.
---
 gcc/config/i386/i386-expand.cc          |   5 +-
 gcc/config/i386/i386-protos.h           |   2 +-
 gcc/config/i386/i386.md                 | 155 ++++++++++++++++--------
 gcc/testsuite/gcc.target/i386/apx-ndd.c |  13 ++
 4 files changed, 120 insertions(+), 55 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 3ecda989cf8..93ecde4b4a8 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -1326,9 +1326,10 @@ ix86_fixup_binary_operands (enum rtx_code code, machine_mode mode,
 
 void
 ix86_fixup_binary_operands_no_copy (enum rtx_code code,
-                                    machine_mode mode, rtx operands[])
+                                    machine_mode mode, rtx operands[],
+                                    bool use_ndd)
 {
-  rtx dst = ix86_fixup_binary_operands (code, mode, operands);
+  rtx dst = ix86_fixup_binary_operands (code, mode, operands, use_ndd);
   gcc_assert (dst == operands[0]);
 }
 
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 7dfeb6af225..481527872e8 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -111,7 +111,7 @@ extern void ix86_expand_vector_move_misalign (machine_mode, rtx[]);
 extern rtx ix86_fixup_binary_operands (enum rtx_code,
                                        machine_mode, rtx[], bool = false);
 extern void ix86_fixup_binary_operands_no_copy (enum rtx_code,
-                                                machine_mode, rtx[]);
+                                                machine_mode, rtx[], bool = false);
 extern void ix86_expand_binary_operator (enum rtx_code,
                                          machine_mode, rtx[], bool = false);
 extern void ix86_expand_vector_logical_operator (enum rtx_code,
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 358a3857f89..ea5377a0b38 100644 --- a/gcc/config/i386/i386.md +++
b/gcc/config/i386/i386.md @@ -7772,7 +7772,8 @@ (define_expand "sub3" (minus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand") (match_operand:SDWIM 2 "")))] "" - "ix86_expand_binary_operator (MINUS, mode, operands); DONE;") + "ix86_expand_binary_operator (MINUS, mode, operands, + TARGET_APX_NDD); DONE;") (define_insn_and_split "*sub3_doubleword" [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") @@ -7798,7 +7799,10 @@ (define_insn_and_split "*sub3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { - ix86_expand_binary_operator (MINUS, mode, &operands[3]); + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + ix86_expand_binary_operator (MINUS, mode, &operands[3], + TARGET_APX_NDD); DONE; } }) @@ -7827,25 +7831,36 @@ (define_insn_and_split "*sub3_doubleword_zext" "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);") (define_insn "*sub_1" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ","))) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (minus:SI 
(match_operand:SI 1 "nonimmediate_operand" "0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands, + TARGET_APX_NDD)" + "@ + sub{l}\t{%2, %k0|%k0, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) ;; Alternative 1 is needed to work around LRA limitation, see PR82524. @@ -7936,31 +7951,42 @@ (define_insn "*sub_2" [(set (reg FLAGS_REG) (compare (minus:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "*subsi_2_zext" [(set (reg FLAGS_REG) (compare - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (minus:SI (match_operand:SI 1 "nonimmediate_operand" "0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (minus:SI (match_dup 1) (match_dup 2))))] 
"TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, SImode, operands, + TARGET_APX_NDD)" + "@ + sub{l}\t{%2, %k0|%k0, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) (define_insn "*subqi_ext_0" @@ -8072,7 +8098,8 @@ (define_expand "subv4" (pc)))] "" { - ix86_fixup_binary_operands_no_copy (MINUS, mode, operands); + ix86_fixup_binary_operands_no_copy (MINUS, mode, operands, + TARGET_APX_NDD); if (CONST_SCALAR_INT_P (operands[2])) operands[4] = operands[2]; else @@ -8083,35 +8110,45 @@ (define_insn "*subv4" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0,0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r")) (sign_extend: - (match_operand:SWI 2 "" "We,m"))) + (match_operand:SWI 2 "" "We,m,rWe,m"))) (sign_extend: (minus:SWI (match_dup 1) (match_dup 2))))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_insn "subv4_1" [(set (reg:CCO FLAGS_REG) (eq:CCO (minus: (sign_extend: - (match_operand:SWI 1 "nonimmediate_operand" "0")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (match_operand: 3 "const_int_operand")) (sign_extend: (minus:SWI (match_dup 1) - (match_operand:SWI 2 "x86_64_immediate_operand" ""))))) - (set (match_operand:SWI 
0 "nonimmediate_operand" "=m") + (match_operand:SWI 2 "x86_64_immediate_operand" ","))))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (minus:SWI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (MINUS, mode, operands) + "ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD) && CONST_INT_P (operands[2]) && INTVAL (operands[2]) == INTVAL (operands[3])" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "") (set (attr "length_immediate") (cond [(match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)") @@ -8207,6 +8244,8 @@ (define_insn_and_split "*subv4_doubleword_1" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); emit_insn (gen_subv4_1 (operands[3], operands[4], operands[5], operands[5])); DONE; @@ -8288,18 +8327,25 @@ (define_expand "usubv4" (label_ref (match_operand 3)) (pc)))] "" - "ix86_fixup_binary_operands_no_copy (MINUS, mode, operands);") + "ix86_fixup_binary_operands_no_copy (MINUS, mode, operands, + TARGET_APX_NDD);") (define_insn "*sub_3" [(set (reg FLAGS_REG) - (compare (match_operand:SWI 1 "nonimmediate_operand" "0,0") - (match_operand:SWI 2 "" ","))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (compare (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r") + (match_operand:SWI 2 "" ",,r,"))) + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,i,r,r") (minus:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCmode) - && ix86_binary_operator_ok (MINUS, mode, operands)" - "sub{}\t{%2, %0|%0, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, mode, operands, + TARGET_APX_NDD)" + "@ + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %0|%0, %2} + sub{}\t{%2, %1, %0|%0, %1, %2} + sub{}\t{%2, %1, %0|%0, 
%1, %2}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "")]) (define_peephole2 @@ -8387,16 +8433,21 @@ (define_insn_and_split "*dec_cmov" (define_insn "*subsi_3_zext" [(set (reg FLAGS_REG) - (compare (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe"))) - (set (match_operand:DI 0 "register_operand" "=r") + (compare (match_operand:SI 1 "nonimmediate_operand" "0,r,rm") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re"))) + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (minus:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCmode) - && ix86_binary_operator_ok (MINUS, SImode, operands)" - "sub{l}\t{%2, %1|%1, %2}" - [(set_attr "type" "alu") + && ix86_binary_operator_ok (MINUS, SImode, operands, + TARGET_APX_NDD)" + "@ + sub{l}\t{%2, %1|%1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2} + sub{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,apx_ndd,apx_ndd") + (set_attr "type" "alu") (set_attr "mode" "SI")]) ;; Add with carry and subtract with borrow diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index c1049022f2a..0c7952ef018 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -42,3 +42,16 @@ FOO (long, add, +) FOO1 (long, add, +) FOO2 (long, add, +) +FOO (char, sub, -) +FOO1 (char, sub, -) +FOO (short, sub, -) +FOO1 (short, sub, -) +FOO (int, sub, -) +FOO1 (int, sub, -) +FOO (long, sub, -) +FOO1 (long, sub, -) +/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), 
%(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), %(?:|r|e)di, %(?:|r|e)a(?:x|l)" 4 } } */

From patchwork Tue Dec 5 02:29:38 2023
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 173693
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 07/17] [APX NDD] Support APX NDD for sbb insn
Date: Tue, 5 Dec 2023 10:29:38 +0800
Message-Id: <20231205022948.504790-8-hongyu.wang@intel.com>
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>

From: Kong Lingling

Similar to *add3_doubleword, operands[1] may not be equal to operands[0],
so an extra move is required.

gcc/ChangeLog:

	* config/i386/i386.md (*sub3_doubleword): Add new alternative for
	NDD, and emit a move when operands[0] is not equal to operands[1].
	(*sub3_doubleword_zext): Likewise.
	(*subv4_doubleword): Likewise.
	(*subv4_doubleword_1): Likewise.
	(*subv4_overflow_1): Add NDD alternatives and adjust output
	templates.
	(*subv4_overflow_2): Likewise.
	(@sub3_carry): Likewise.
	(*addsi3_carry_zext_0r): Likewise, and use nonimmediate_operand for
	operands[1] to accept memory input for NDD alternative.
	(*subsi3_carry_zext): Likewise.
	(subborrow): Pass TARGET_APX_NDD to ix86_binary_operator_ok.
	(subborrow_0): Likewise.
	(*sub3_eq): Likewise.
	(*sub3_ne): Likewise.
	(*sub3_eq_1): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd-sbb.c: New test.
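
For readers unfamiliar with the doubleword split, the extra move can be modelled in plain C. This is only a sketch and not GCC code: the `dword` type and the `sub_doubleword` name are hypothetical, and the 64-bit halves stand in for the low and high parts produced by split_double_mode. When the low half of the subtrahend is zero, no instruction is emitted for the low half, so with a destination that differs from the first source the low half has to be copied explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a double-word value as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} dword;

/* Model of dst = a - b where dst need not alias a, as with an NDD
   destination.  Illustration only, not GCC code. */
static void sub_doubleword(dword *dst, const dword *a, const dword *b)
{
    assert(dst && a && b);

    if (b->lo == 0) {
        /* No instruction is needed for the low half, so when the
           destination differs from the first source the low half must
           be copied explicitly (the "extra move"). */
        uint64_t hi = a->hi - b->hi;
        if (dst != a)
            dst->lo = a->lo;
        dst->hi = hi;
        return;
    }

    uint64_t lo = a->lo - b->lo;          /* models sub on the low half */
    uint64_t borrow = a->lo < b->lo;      /* carry flag left by that sub */
    uint64_t hi = a->hi - b->hi - borrow; /* models sbb on the high half */
    dst->lo = lo;
    dst->hi = hi;
}
```

Without the copy in the zero case, a caller passing a destination distinct from the first source would read back an uninitialised low half, which is the situation the added emit_move_insn calls guard against.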
---
 gcc/config/i386/i386.md                     | 160 ++++++++++++-------
 gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c |   6 +
 2 files changed, 107 insertions(+), 59 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ea5377a0b38..e2705ada31a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7776,12 +7776,13 @@ (define_expand "sub3"
                       TARGET_APX_NDD); DONE;")
 
 (define_insn_and_split "*sub3_doubleword"
-  [(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
+  [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r")
         (minus:
-          (match_operand: 1 "nonimmediate_operand" "0,0")
-          (match_operand: 2 "x86_64_hilo_general_operand" "r,o")))
+          (match_operand: 1 "nonimmediate_operand" "0,0,ro,r")
+          (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)"
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)"
   "#"
   "&& reload_completed"
   [(parallel [(set (reg:CC FLAGS_REG)
@@ -7805,16 +7806,18 @@ (define_insn_and_split "*sub3_doubleword"
                              TARGET_APX_NDD);
       DONE;
     }
-})
+}
+[(set_attr "isa" "*,*,apx_ndd,apx_ndd")])
 
 (define_insn_and_split "*sub3_doubleword_zext"
-  [(set (match_operand: 0 "nonimmediate_operand" "=r,o")
+  [(set (match_operand: 0 "nonimmediate_operand" "=r,o,r,r")
         (minus:
-          (match_operand: 1 "nonimmediate_operand" "0,0")
+          (match_operand: 1 "nonimmediate_operand" "0,0,r,o")
           (zero_extend:
-            (match_operand:DWIH 2 "nonimmediate_operand" "rm,r"))))
+            (match_operand:DWIH 2 "nonimmediate_operand" "rm,r,rm,r"))))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (UNKNOWN, mode, operands)"
+  "ix86_binary_operator_ok (UNKNOWN, mode, operands,
+                            TARGET_APX_NDD)"
   "#"
   "&& reload_completed"
   [(parallel [(set (reg:CC FLAGS_REG)
@@ -7828,7 +7831,8 @@ (define_insn_and_split "*sub3_doubleword_zext"
                  (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0)))
                (const_int 0)))
               (clobber (reg:CC FLAGS_REG))])]
-  "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);")
+  "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);"
+[(set_attr "isa" "*,*,apx_ndd,apx_ndd")])
 
 (define_insn "*sub_1"
   [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r")
@@ -8162,14 +8166,15 @@ (define_insn_and_split "*subv4_doubleword"
         (eq:CCO
           (minus:
             (sign_extend:
-              (match_operand: 1 "nonimmediate_operand" "0,0"))
+              (match_operand: 1 "nonimmediate_operand" "0,0,ro,r"))
             (sign_extend:
-              (match_operand: 2 "nonimmediate_operand" "r,o")))
+              (match_operand: 2 "nonimmediate_operand" "r,o,r,o")))
           (sign_extend:
             (minus: (match_dup 1) (match_dup 2)))))
-   (set (match_operand: 0 "nonimmediate_operand" "=ro,r")
+   (set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r")
         (minus: (match_dup 1) (match_dup 2)))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)"
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)"
   "#"
   "&& reload_completed"
   [(parallel [(set (reg:CC FLAGS_REG)
@@ -8197,22 +8202,24 @@ (define_insn_and_split "*subv4_doubleword"
                      (match_dup 5)))])]
 {
   split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]);
-})
+}
+[(set_attr "isa" "*,*,apx_ndd,apx_ndd")])
 
 (define_insn_and_split "*subv4_doubleword_1"
   [(set (reg:CCO FLAGS_REG)
         (eq:CCO
           (minus:
             (sign_extend:
-              (match_operand: 1 "nonimmediate_operand" "0"))
+              (match_operand: 1 "nonimmediate_operand" "0,ro"))
             (match_operand: 3 "const_scalar_int_operand"))
           (sign_extend:
             (minus:
               (match_dup 1)
-              (match_operand: 2 "x86_64_hilo_general_operand" "")))))
-   (set (match_operand: 0 "nonimmediate_operand" "=ro")
+              (match_operand: 2 "x86_64_hilo_general_operand" ",")))))
+   (set (match_operand: 0 "nonimmediate_operand" "=ro,r")
         (minus: (match_dup 1) (match_dup 2)))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)
    && CONST_SCALAR_INT_P (operands[2])
    && rtx_equal_p (operands[2], operands[3])"
   "#"
@@ -8250,7 +8257,8 @@ (define_insn_and_split "*subv4_doubleword_1"
                                     operands[5]));
       DONE;
     }
-})
+}
+[(set_attr "isa" "*,apx_ndd")])
 
 (define_insn "*subv4_overflow_1"
   [(set (reg:CCO FLAGS_REG)
@@ -8258,11 +8266,11 @@ (define_insn "*subv4_overflow_1"
           (minus:
             (minus:
               (sign_extend:
-                (match_operand:SWI 1 "nonimmediate_operand" "%0,0"))
+                (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r"))
               (match_operator: 4 "ix86_carry_flag_operator"
                 [(match_operand 3 "flags_reg_operand") (const_int 0)]))
             (sign_extend:
-              (match_operand:SWI 2 "" "rWe,m")))
+              (match_operand:SWI 2 "" "rWe,m,rWe,m")))
           (sign_extend:
             (minus:SWI
               (minus:SWI
@@ -8270,15 +8278,21 @@ (define_insn "*subv4_overflow_1"
                 (match_operator:SWI 5 "ix86_carry_flag_operator"
                   [(match_dup 3) (const_int 0)]))
               (match_dup 2)))))
-   (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r")
+   (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r,r,r")
         (minus:SWI
           (minus:SWI
             (match_dup 1)
             (match_op_dup 5 [(match_dup 3) (const_int 0)]))
           (match_dup 2)))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)"
-  "sbb{}\t{%2, %0|%0, %2}"
-  [(set_attr "type" "alu")
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)"
+  "@
+  sbb{}\t{%2, %0|%0, %2}
+  sbb{}\t{%2, %0|%0, %2}
+  sbb{}\t{%2, %1, %0|%0, %1, %2}
+  sbb{}\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,*,apx_ndd,apx_ndd")
+   (set_attr "type" "alu")
    (set_attr "mode" "")])
 
 (define_insn "*subv4_overflow_2"
@@ -8287,28 +8301,32 @@ (define_insn "*subv4_overflow_2"
           (minus:
             (minus:
               (sign_extend:
-                (match_operand:SWI 1 "nonimmediate_operand" "%0"))
+                (match_operand:SWI 1 "nonimmediate_operand" "%0,rm"))
               (match_operator: 4 "ix86_carry_flag_operator"
                 [(match_operand 3 "flags_reg_operand") (const_int 0)]))
-            (match_operand: 6 "const_int_operand" "n"))
+            (match_operand: 6 "const_int_operand" "n,n"))
           (sign_extend:
             (minus:SWI
               (minus:SWI
                 (match_dup 1)
                 (match_operator:SWI 5 "ix86_carry_flag_operator"
                   [(match_dup 3) (const_int 0)]))
-              (match_operand:SWI 2 "x86_64_immediate_operand" "e")))))
-   (set (match_operand:SWI 0 "nonimmediate_operand" "=rm")
+              (match_operand:SWI 2 "x86_64_immediate_operand" "e,e")))))
+   (set (match_operand:SWI 0 "nonimmediate_operand" "=rm,r")
         (minus:SWI
           (minus:SWI
             (match_dup 1)
             (match_op_dup 5 [(match_dup 3) (const_int 0)]))
           (match_dup 2)))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)
    && CONST_INT_P (operands[2])
    && INTVAL (operands[2]) == INTVAL (operands[6])"
-  "sbb{}\t{%2, %0|%0, %2}"
-  [(set_attr "type" "alu")
+  "@
+  sbb{}\t{%2, %0|%0, %2}
+  sbb{}\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "alu")
    (set_attr "mode" "")
    (set (attr "length_immediate")
      (if_then_else (match_test "IN_RANGE (INTVAL (operands[2]), -128, 127)")
@@ -8593,15 +8611,18 @@ (define_insn "*addsi3_carry_zext_0"
    (set_attr "mode" "SI")])
 
 (define_insn "*addsi3_carry_zext_0r"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
         (zero_extend:DI
           (plus:SI (match_operator:SI 2 "ix86_carry_flag_unset_operator"
                     [(reg FLAGS_REG) (const_int 0)])
-                   (match_operand:SI 1 "register_operand" "0"))))
+                   (match_operand:SI 1 "nonimmediate_operand" "0,rm"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT"
-  "sbb{l}\t{$-1, %k0|%k0, -1}"
-  [(set_attr "type" "alu")
+  "@
+  sbb{l}\t{$-1, %k0|%k0, -1}
+  sbb{l}\t{$-1, %1, %k0|%k0, %1, -1}"
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "alu")
    (set_attr "use_carry" "1")
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
@@ -8841,17 +8862,23 @@ (define_insn "*addcarry_1"
        (const_string "4")))])
 
 (define_insn "@sub3_carry"
-  [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,")
+  [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r")
         (minus:SWI
           (minus:SWI
-            (match_operand:SWI 1 "nonimmediate_operand" "0,0")
+            (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r")
             (match_operator:SWI 4 "ix86_carry_flag_operator"
              [(match_operand 3 "flags_reg_operand") (const_int 0)]))
-          (match_operand:SWI 2 "" ",")))
+          (match_operand:SWI 2 "" ",,r,")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)"
-  "sbb{}\t{%2, %0|%0, %2}"
-  [(set_attr "type" "alu")
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)"
+  "@
+  sbb{}\t{%2, %0|%0, %2}
+  sbb{}\t{%2, %0|%0, %2}
+  sbb{}\t{%2, %1, %0|%0, %1, %2}
+  sbb{}\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,*,apx_ndd,apx_ndd")
+   (set_attr "type" "alu")
    (set_attr "use_carry" "1")
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "")])
@@ -8938,18 +8965,23 @@ (define_insn "*sub3_carry_0r"
    (set_attr "mode" "")])
 
 (define_insn "*subsi3_carry_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r")
         (zero_extend:DI
           (minus:SI
             (minus:SI
-              (match_operand:SI 1 "register_operand" "0")
+              (match_operand:SI 1 "nonimmediate_operand" "0,r,rm")
               (match_operator:SI 3 "ix86_carry_flag_operator"
                [(reg FLAGS_REG) (const_int 0)]))
-            (match_operand:SI 2 "x86_64_general_operand" "rBMe"))))
+            (match_operand:SI 2 "x86_64_general_operand" "rBMe,rBMe,re"))))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)"
-  "sbb{l}\t{%2, %k0|%k0, %2}"
-  [(set_attr "type" "alu")
+  "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands,
+                                            TARGET_APX_NDD)"
+  "@
+  sbb{l}\t{%2, %k0|%k0, %2}
+  sbb{l}\t{%2, %1, %k0|%k0, %1, %2}
+  sbb{l}\t{%2, %1, %k0|%k0, %1, %2}"
+  [(set_attr "isa" "*,apx_ndd,apx_ndd")
+   (set_attr "type" "alu")
    (set_attr "use_carry" "1")
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
@@ -9034,21 +9066,27 @@ (define_insn "subborrow"
   [(set (reg:CCC FLAGS_REG)
         (compare:CCC
           (zero_extend:
-            (match_operand:SWI48 1 "nonimmediate_operand" "0,0"))
+            (match_operand:SWI48 1 "nonimmediate_operand" "0,0,r,rm"))
           (plus:
             (match_operator: 4 "ix86_carry_flag_operator"
               [(match_operand 3 "flags_reg_operand") (const_int 0)])
             (zero_extend:
-              (match_operand:SWI48 2 "nonimmediate_operand" "r,rm")))))
-   (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r")
+              (match_operand:SWI48 2 "nonimmediate_operand" "r,rm,rm,r")))))
+   (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r")
         (minus:SWI48 (minus:SWI48
                        (match_dup 1)
                        (match_operator:SWI48 5 "ix86_carry_flag_operator"
                          [(match_dup 3) (const_int 0)]))
                      (match_dup 2)))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)"
-  "sbb{}\t{%2, %0|%0, %2}"
-  [(set_attr "type" "alu")
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)"
+  "@
+  sbb{}\t{%2, %0|%0, %2}
+  sbb{}\t{%2, %0|%0, %2}
+  sbb{}\t{%2, %1, %0|%0, %1, %2}
+  sbb{}\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,*,apx_ndd,apx_ndd")
+   (set_attr "type" "alu")
    (set_attr "use_carry" "1")
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "")])
@@ -9209,7 +9247,8 @@ (define_expand "subborrow_0"
           (match_operand:SWI48 2 "")))
       (set (match_operand:SWI48 0 "register_operand")
            (minus:SWI48 (match_dup 1) (match_dup 2)))])]
-  "ix86_binary_operator_ok (MINUS, mode, operands)")
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)")
 
 (define_expand "uaddc5"
   [(match_operand:SWI48 0 "register_operand")
@@ -9634,7 +9673,8 @@ (define_insn_and_split "*sub3_eq"
             (const_int 0)))
           (match_operand:SWI 2 "")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (MINUS, mode, operands)
+  "ix86_binary_operator_ok (MINUS, mode, operands,
+                            TARGET_APX_NDD)
    && ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -9659,7 +9699,8 @@ (define_insn_and_split "*sub3_ne"
   "CONST_INT_P (operands[2])
    && (mode != DImode
        || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000))
-   && ix86_binary_operator_ok (MINUS, mode, operands)
+   && ix86_binary_operator_ok (MINUS, mode, operands,
+                               TARGET_APX_NDD)
    && ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -9688,7 +9729,8 @@ (define_insn_and_split "*sub3_eq_1"
   "CONST_INT_P (operands[2])
    && (mode != DImode
        || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000))
-   && ix86_binary_operator_ok (MINUS, mode, operands)
+   && ix86_binary_operator_ok (MINUS, mode, operands,
+                               TARGET_APX_NDD)
    && ix86_pre_reload_split ()"
   "#"
   "&& 1"
diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c b/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c
new file mode 100644
index 00000000000..662e3c607d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { int128 && { ! ia32 } } } } */
+/* { dg-options "-mapxf -O2" } */
+
+#include "pr91681-2.c"
+
+/* { dg-final { scan-assembler-times "sbbq\[^\n\r]*0, %rdi, %rdx" 1 } } */

From patchwork Tue Dec 5 02:29:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 173682
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 08/17] [APX NDD] Support APX NDD for neg insn
Date: Tue, 5 Dec 2023 10:29:39 +0800
Message-Id: <20231205022948.504790-9-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>

From: Kong Lingling

gcc/ChangeLog:

        * config/i386/i386-expand.cc (ix86_expand_unary_operator): Add use_ndd
        parameter and adjust for NDD.
        * config/i386/i386-protos.h: Add use_ndd parameter for
        ix86_unary_operator_ok and ix86_expand_unary_operator.
        * config/i386/i386.cc (ix86_unary_operator_ok): Add use_ndd parameter
        and adjust for NDD.
        * config/i386/i386.md (neg2): Add new constraint for NDD and adjust
        output template.
        (*neg_1): Likewise.
        (*neg2_doubleword): Likewise.
        (*neg_2): Likewise.
        (*neg_ccc_1): Likewise.
        (*neg_ccc_2): Likewise.
        (*negsi_1_zext): Likewise, and use nonimmediate_operand for
        operands[1] to accept memory input for NDD alternatives.
        (*negsi_2_zext): Likewise.

gcc/testsuite/ChangeLog:

        * gcc.target/i386/apx-ndd.c: Add neg test.
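
For readers who do not have the RTL helpers in their head, the operand rule that ix86_unary_operator_ok applies once use_ndd is threaded through can be sketched outside of GCC. This is only an illustrative model under stated assumptions: Operand, same_operand and unary_operator_ok are hypothetical names invented for the example (standing in for rtx, rtx_equal_p and the real predicate), and only the boolean condition is taken from the change to i386.cc.

```cpp
// Illustrative model only: "Operand" is a hypothetical stand-in for GCC's rtx,
// not a real GCC type.
struct Operand {
  bool is_mem;  // models MEM_P (op)
  int id;       // two operands count as "the same" when their ids match
};

// Hypothetical helper that models rtx_equal_p for this toy representation.
static bool same_operand (const Operand &a, const Operand &b)
{
  return a.is_mem == b.is_mem && a.id == b.id;
}

// Mirrors the condition in the patched ix86_unary_operator_ok:
// dst plays the role of operands[0], src the role of operands[1].
static bool unary_operator_ok (const Operand &dst, const Operand &src,
                               bool use_ndd = false)
{
  // A memory destination always requires src == dst.  A memory source
  // requires it only when the NDD form is not available.
  if ((dst.is_mem || (!use_ndd && src.is_mem)) && !same_operand (dst, src))
    return false;
  return true;
}
```

In this model a register destination with a memory source is accepted only when use_ndd is true, which corresponds to the new "=r" / "rm" alternatives tagged apx_ndd; a memory destination still has to match its source in both modes.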
--- gcc/config/i386/i386-expand.cc | 4 +- gcc/config/i386/i386-protos.h | 5 +- gcc/config/i386/i386.cc | 5 +- gcc/config/i386/i386.md | 77 ++++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 29 ++++++++++ 5 files changed, 87 insertions(+), 33 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 93ecde4b4a8..d4bbd33ce07 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -1494,7 +1494,7 @@ ix86_binary_operator_ok (enum rtx_code code, machine_mode mode, void ix86_expand_unary_operator (enum rtx_code code, machine_mode mode, - rtx operands[]) + rtx operands[], bool use_ndd) { bool matching_memory = false; rtx src, dst, op, clob; @@ -1513,7 +1513,7 @@ ix86_expand_unary_operator (enum rtx_code code, machine_mode mode, } /* When source operand is memory, destination must match. */ - if (MEM_P (src) && !matching_memory) + if (!use_ndd && MEM_P (src) && !matching_memory) src = force_reg (mode, src); /* Emit the instruction. 
*/ diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 481527872e8..fa952409729 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -127,7 +127,7 @@ extern bool ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high); extern bool ix86_dep_by_shift_count (const_rtx set_insn, const_rtx use_insn); extern bool ix86_agi_dependent (rtx_insn *set_insn, rtx_insn *use_insn); extern void ix86_expand_unary_operator (enum rtx_code, machine_mode, - rtx[]); + rtx[], bool = false); extern rtx ix86_build_const_vector (machine_mode, bool, rtx); extern rtx ix86_build_signbit_mask (machine_mode, bool, bool); extern HOST_WIDE_INT ix86_convert_const_vector_to_integer (rtx, @@ -147,7 +147,8 @@ extern void ix86_split_fp_absneg_operator (enum rtx_code, machine_mode, rtx[]); extern void ix86_expand_copysign (rtx []); extern void ix86_expand_xorsign (rtx []); -extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2]); +extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2], + bool = false); extern bool ix86_match_ccmode (rtx, machine_mode); extern bool ix86_match_ptest_ccmode (rtx); extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 3e670330ef6..a3b628d2f6d 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -16209,11 +16209,12 @@ ix86_dep_by_shift_count (const_rtx set_insn, const_rtx use_insn) bool ix86_unary_operator_ok (enum rtx_code, machine_mode, - rtx operands[2]) + rtx operands[2], + bool use_ndd) { /* If one of operands is memory, source and destination must match. */ if ((MEM_P (operands[0]) - || MEM_P (operands[1])) + || (!use_ndd && MEM_P (operands[1]))) && ! 
rtx_equal_p (operands[0], operands[1])) return false; return true; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index e2705ada31a..1a2fb116f01 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -13282,13 +13282,14 @@ (define_expand "neg2" [(set (match_operand:SDWIM 0 "nonimmediate_operand") (neg:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")))] "" - "ix86_expand_unary_operator (NEG, mode, operands); DONE;") + "ix86_expand_unary_operator (NEG, mode, operands, + TARGET_APX_NDD); DONE;") (define_insn_and_split "*neg2_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro") - (neg: (match_operand: 1 "nonimmediate_operand" "0"))) + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (neg: (match_operand: 1 "nonimmediate_operand" "0,ro"))) (clobber (reg:CC FLAGS_REG))] - "ix86_unary_operator_ok (NEG, mode, operands)" + "ix86_unary_operator_ok (NEG, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel @@ -13305,7 +13306,8 @@ (define_insn_and_split "*neg2_doubleword" [(set (match_dup 2) (neg:DWIH (match_dup 2))) (clobber (reg:CC FLAGS_REG))])] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);" + [(set_attr "isa" "*,apx_ndd")]) ;; Convert: ;; mov %esi, %edx @@ -13394,22 +13396,29 @@ (define_peephole2 (clobber (reg:CC FLAGS_REG))])]) (define_insn "*neg_1" - [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") - (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0"))) + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") + (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm"))) (clobber (reg:CC FLAGS_REG))] - "ix86_unary_operator_ok (NEG, mode, operands)" - "neg{}\t%0" + "ix86_unary_operator_ok (NEG, mode, operands, TARGET_APX_NDD)" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*negsi_1_zext" 
- [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI - (neg:SI (match_operand:SI 1 "register_operand" "0")))) + (neg:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_unary_operator_ok (NEG, SImode, operands)" - "neg{l}\t%k0" + "TARGET_64BIT && ix86_unary_operator_ok (NEG, SImode, operands, + TARGET_APX_NDD)" + "@ + neg{l}\t%k0 + neg{l}\t{%k1, %k0|%k0, %k1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) ;; Alternative 1 is needed to work around LRA limitation, see PR82524. @@ -13435,51 +13444,65 @@ (define_insn_and_split "*neg_1_slp" (define_insn "*neg_2" [(set (reg FLAGS_REG) (compare - (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0")) + (neg:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (neg:SWI (match_dup 1)))] "ix86_match_ccmode (insn, CCGOCmode) - && ix86_unary_operator_ok (NEG, mode, operands)" - "neg{}\t%0" + && ix86_unary_operator_ok (NEG, mode, operands, + TARGET_APX_NDD)" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*negsi_2_zext" [(set (reg FLAGS_REG) (compare - (neg:SI (match_operand:SI 1 "register_operand" "0")) + (neg:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (neg:SI (match_dup 1))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCGOCmode) - && ix86_unary_operator_ok (NEG, SImode, operands)" - "neg{l}\t%k0" + && ix86_unary_operator_ok (NEG, SImode, operands, + TARGET_APX_NDD)" + "@ + neg{l}\t%k0 + neg{l}\t{%1, %k0|%k0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" 
"SI")]) (define_insn "*neg_ccc_1" [(set (reg:CCC FLAGS_REG) (unspec:CCC - [(match_operand:SWI 1 "nonimmediate_operand" "0") + [(match_operand:SWI 1 "nonimmediate_operand" "0,rm") (const_int 0)] UNSPEC_CC_NE)) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (neg:SWI (match_dup 1)))] "" - "neg{}\t%0" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_insn "*neg_ccc_2" [(set (reg:CCC FLAGS_REG) (unspec:CCC - [(match_operand:SWI 1 "nonimmediate_operand" "0") + [(match_operand:SWI 1 "nonimmediate_operand" "0,rm") (const_int 0)] UNSPEC_CC_NE)) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "" - "neg{}\t%0" + "@ + neg{}\t%0 + neg{}\t{%1, %0|%0, %1}" [(set_attr "type" "negnot") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_expand "x86_neg_ccc" diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 0c7952ef018..c351f71265e 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -27,8 +27,25 @@ foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ { \ TYPE c = *a OP b; \ return c; \ +} + +#define F(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +f_##OP_NAME##_##TYPE (TYPE *a) \ +{ \ + TYPE b = OP*a; \ + return b; \ } +#define F1(TYPE, OP_NAME, OP) \ +TYPE \ +__attribute__ ((noipa)) \ +f1_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = OP a; \ + return b; \ +} FOO (char, add, +) FOO1 (char, add, +) FOO2 (char, add, +) @@ -50,8 +67,20 @@ FOO (int, sub, -) FOO1 (int, sub, -) FOO (long, sub, -) FOO1 (long, sub, -) + +F (char, neg, -) +F1 (char, neg, -) +F (short, neg, -) +F1 (short, neg, -) +F (int, neg, -) +F1 (int, neg, -) +F (long, neg, -) +F1 (long, neg, -) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { 
scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), %(?:|r|e)di, %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 } } */ +/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ From patchwork Tue Dec 5 02:29:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173686 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3175901vqy; Mon, 4 Dec 2023 18:39:10 -0800 (PST) X-Google-Smtp-Source: AGHT+IHYSEIqlMr6ljj+4T9J2bzLH7KFZyyV6c5MpNN2coCB5mThdGw+JsLZ0ElnoKtchF0J+IWB X-Received: by 2002:ac8:5986:0:b0:425:4054:bc59 with SMTP id e6-20020ac85986000000b004254054bc59mr870051qte.53.1701743950397; Mon, 04 Dec 2023 18:39:10 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701743950; cv=pass; d=google.com; s=arc-20160816; b=RDNLpRKl8Kim++UMrmPlfmHSdEum4DTrLc/YPlhTi2ZQ1zjymuoq0rUSvzaTthGpln /avNrWykpwJiqJpRINu8n6RUZYDhr0c5Bce/QU0R0Y6DojpfpLihG76XvMLZkm5Ynmxj xwI/FtAlBWYz6+5D4L45MkKd5V/1nPQtZjeeFNTNUTitWpewQ0x2iTEo/nawVgoTwQW/ U2hG4q5sZ4rVX0YTvFp8dbmiTfy6vpN98YjBsbBEHaAOeZIEH2hvRhD4B1Y22wJwoOt7 ZEUwo4XaTl9ENSzalygRkMze+qIzSrme0ZLjtRAje/Ax9iV4nQY06ZzcbdZ5J3x1VRh9 g89g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding 
:mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=oV5B0HV0/SK2Z6RL3dpLCJ1krsmEafhDZkmkzsts74c=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=zebGeluFieawXVe9Z5ID0pTs1+s9P00lu05wdEVImaxNkQFWD/ju6dyWRCD3huif58 bR39AckM76koAhH2A4BZnF/OuHF+spj3fAnWG7IcZyyEJ7lNzVhZVhqZKHvWRHaenqYw F2xTCkaFYy1rgIwNXRvLFCelxpPAhcgvlZPk2E0DLHniRuE9vK7/gb4GhIibJN8YcrQZ JHfusY+XZ+zDehIytHcB1OgsJJJf2d8T1+7qoF+fTm/fQ4+keOy7T39woogBwX58MrDY uWM1L0PeQAk9O+LMLPwmS4fX0QcI/tp6HvcYIfEgQPmavDQ0RSJ1yE3y/CxHJdELXkp2 MKcQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=gXy++cd+; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id fz15-20020a05622a5a8f00b0042545d7217esi6440007qtb.31.2023.12.04.18.39.09 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:39:10 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=gXy++cd+; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8C8573B61C34 for ; Tue, 5 Dec 2023 02:32:45 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 28D90388622E for ; Tue, 5 Dec 2023 02:31:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 28D90388622E Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 28D90388622E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743483; cv=none; b=HVyO5WPtcNtPFaV4XPJOAsquGceCMvaZU120tQ3btVq0Vw+oKoSBnN7bm+vG6wP/uQjqcwEtnOAPn93UrdV2QYsY6TgmZXYTw7ErFRz07fEv8GBAFH1XgWx5qlaxcQCIR2S9n+43jkqpoo0WKtpeAZaTM7dooi4VvYr6Q2IwYQs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743483; c=relaxed/simple; 
bh=HXVDf1NJsVpvqO/ZRl7Uo0ujSev80l4LXou/7TyPiPE=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=eSHYCjIjIx+QknLvTOKtdwg/eslOcvfI9jUfd9Yk2YJrafG//TQYMxWoJyw6Blm9KihwKqe9+2ZhIQYr41yySDm2+LHb2/5H/FCEEQIkae/9BnTe+E2+50C5Fo22q3KuxargYfNjaSzJ3CW/mvRtHLZpg9hc5/kksd6Hd1dDMvE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDI-0001Yf-RW for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:16 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743468; x=1733279468; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HXVDf1NJsVpvqO/ZRl7Uo0ujSev80l4LXou/7TyPiPE=; b=gXy++cd+V4drArpp3GbpQBrYH/ztOSrx1DWcMW+q0Nwhr6jaWmu/Agic 7UQgrytIAjedxBQhZgZz/A/LsnKdUvheTyDLSM3dttT55d56QauyEUqXF Tz5GedhhdM+xEzhkC7+DRAt06YdwYrpyW3978kdxcyICtIt05RCpjz0/Q FPElcRFB1aNdd1ATOPly35hBOPFpBXlz+bS6suzArYfrNArSxrunjbNoA 4UfbAsOvIAN9IdCdd5qgZ/ok6lyvgf+/92033jsqlWfg3x5IWbmAzrryj mWT6O583QhNC7gTJ9Q433CFgB18K4NTUIMN7NiBEpJF+Pgu8d7uYgj2MN A==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277798" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277798" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:54 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275514" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275514" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:51 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 4FD91100562D; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org 
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 09/17] [APX NDD] Support APX NDD for not insn
Date: Tue, 5 Dec 2023 10:29:40 +0800
Message-Id: <20231205022948.504790-10-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>
MIME-Version: 1.0

From: Kong Lingling

*one_cmplsi2_2_zext will be split into xor, so its NDD form will be added together with the xor NDD support.

gcc/ChangeLog:

* config/i386/i386.md (one_cmpl2): Add new constraints for NDD and adjust output template.
(*one_cmpl2_1): Likewise.
(*one_cmplqi2_1): Likewise.
(*one_cmpl2_doubleword): Likewise. (*one_cmpl2_2): Likewise. (*one_cmplsi2_1_zext): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add not test. --- gcc/config/i386/i386.md | 58 ++++++++++++++----------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 11 +++++ 2 files changed, 44 insertions(+), 25 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 1a2fb116f01..050779273a7 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14001,57 +14001,63 @@ (define_expand "one_cmpl2" [(set (match_operand:SDWIM 0 "nonimmediate_operand") (not:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand")))] "" - "ix86_expand_unary_operator (NOT, mode, operands); DONE;") + "ix86_expand_unary_operator (NOT, mode, operands, + TARGET_APX_NDD); DONE;") (define_insn_and_split "*one_cmpl2_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro") - (not: (match_operand: 1 "nonimmediate_operand" "0")))] - "ix86_unary_operator_ok (NOT, mode, operands)" + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + (not: (match_operand: 1 "nonimmediate_operand" "0,ro")))] + "ix86_unary_operator_ok (NOT, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(set (match_dup 0) (not:DWIH (match_dup 1))) (set (match_dup 2) (not:DWIH (match_dup 3)))] - "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);") + "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[2]);" + [(set_attr "isa" "*,apx_ndd")]) (define_insn "*one_cmpl2_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,?k") - (not:SWI248 (match_operand:SWI248 1 "nonimmediate_operand" "0,k")))] - "ix86_unary_operator_ok (NOT, mode, operands)" + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + (not:SWI248 (match_operand:SWI248 1 "nonimmediate_operand" "0,rm,k")))] + "ix86_unary_operator_ok (NOT, mode, operands, TARGET_APX_NDD)" "@ 
not{}\t%0 + not{}\t{%1, %0|%0, %1} #" - [(set_attr "isa" "*,") - (set_attr "type" "negnot,msklog") + [(set_attr "isa" "*,apx_ndd,") + (set_attr "type" "negnot,negnot,msklog") (set_attr "mode" "")]) (define_insn "*one_cmplsi2_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,?k") + [(set (match_operand:DI 0 "register_operand" "=r,r,?k") (zero_extend:DI - (not:SI (match_operand:SI 1 "register_operand" "0,k"))))] - "TARGET_64BIT && ix86_unary_operator_ok (NOT, SImode, operands)" + (not:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,k"))))] + "TARGET_64BIT && ix86_unary_operator_ok (NOT, SImode, operands, + TARGET_APX_NDD)" "@ not{l}\t%k0 + not{l}\t{%1, %k0|%k0, %1} #" - [(set_attr "isa" "x64,avx512bw_512") - (set_attr "type" "negnot,msklog") - (set_attr "mode" "SI,SI")]) + [(set_attr "isa" "x64,apx_ndd,avx512bw_512") + (set_attr "type" "negnot,negnot,msklog") + (set_attr "mode" "SI,SI,SI")]) (define_insn "*one_cmplqi2_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,?k") - (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,k")))] - "ix86_unary_operator_ok (NOT, QImode, operands)" + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,r,?k") + (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,rm,k")))] + "ix86_unary_operator_ok (NOT, QImode, operands, TARGET_APX_NDD)" "@ not{b}\t%0 not{l}\t%k0 + not{b}\t{%1, %0|%0, %1} #" - [(set_attr "isa" "*,*,avx512f") - (set_attr "type" "negnot,negnot,msklog") + [(set_attr "isa" "*,*,apx_ndd,avx512f") + (set_attr "type" "negnot,negnot,negnot,msklog") (set (attr "mode") (cond [(eq_attr "alternative" "1") (const_string "SI") - (and (eq_attr "alternative" "2") + (and (eq_attr "alternative" "3") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -14081,14 +14087,16 @@ (define_insn_and_split "*one_cmpl_1_slp" (define_insn "*one_cmpl2_2" [(set (reg FLAGS_REG) - (compare (not:SWI (match_operand:SWI 1 "nonimmediate_operand" "0")) + (compare (not:SWI (match_operand:SWI 1 "nonimmediate_operand" 
"0,rm")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (not:SWI (match_dup 1)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_unary_operator_ok (NOT, mode, operands)" + && ix86_unary_operator_ok (NOT, mode, operands, + TARGET_APX_NDD)" "#" [(set_attr "type" "alu1") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "")]) (define_split diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index c351f71265e..2bd551614c4 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -76,6 +76,15 @@ F (int, neg, -) F1 (int, neg, -) F (long, neg, -) F1 (long, neg, -) + +F (char, not, ~) +F1 (char, not, ~) +F (short, not, ~) +F1 (short, not, ~) +F (int, not, ~) +F1 (int, not, ~) +F (long, not, ~) +F1 (long, not, ~) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -84,3 +93,5 @@ F1 (long, neg, -) /* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 } } */ /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ From patchwork Tue Dec 5 02:29:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173689 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 
From: Hongyu Wang To:
gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling
Subject: [PATCH 10/17] [APX NDD] Support APX NDD for and insn
Date: Tue, 5 Dec 2023 10:29:41 +0800
Message-Id: <20231205022948.504790-11-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>
MIME-Version: 1.0

From: Kong Lingling

For the NDD form of the AND insn, three splitter fixes are needed after extending the legacy patterns.

1. APX NDD does not support the high QImode registers (ah, bh, ch, dh), so the optimization splitters that generate a highpart zero_extract for QImode must be prohibited for NDD patterns.

2. The legacy AND insn uses the r/qm/L constraint, and a post-reload splitter transforms it into a zero_extend move. For the NDD form of AND, the splitter is not strict enough: it assumes such an AND has a const_int operand matching constraint "L", but the NDD form of AND allows const_int with any QI values. Restrict the splitter condition to match the "L" constraint, which strictly matches the zero-extend semantics.

3. The legacy AND insn adopts the r/0/Z constraint, and a splitter tries to optimize such a form into a strict_lowpart QImode AND when the 7th bit is not set. But the splitter wrongly converts the non-zext NDD form of AND with a memory source; the strict_lowpart transform then matches alternative 1 of *_slp_1 and generates *movstrict_1, so the zext semantics are dropped. This could leave the highpart of dest uncleared and generate wrong code. Disable the splitter when NDD is adopted and operands[0] and operands[1] are not equal.

gcc/ChangeLog:

* config/i386/i386.md (and3): Add NDD alternatives and adjust output template.
(*anddi_1): Likewise.
(*and_1): Likewise.
(*andqi_1): Likewise.
(*andsi_1_zext): Likewise.
(*anddi_2): Likewise.
(*andsi_2_zext): Likewise.
(*andqi_2_maybe_si): Likewise.
(*and_2): Likewise.
(*and3_doubleword): Add NDD alternative, emit move for optimized case if operands[0] not equal to operands[1].
(define_split for QI highpart AND): Prohibit splitter to split NDD form AND insn to qi_ext_3.
(define_split for QI strict_lowpart optimization): Prohibit splitter to split NDD form AND insn to *3_1_slp.
(define_split for zero_extend and optimization): Prohibit splitter to split NDD form AND insn to zero_extend insn.

gcc/testsuite/ChangeLog:

* gcc.target/i386/apx-ndd.c: Add and test.
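
For illustration only, splitter fixes 2 and 3 can be modelled in plain C. This is a rough sketch and not GCC code: the function names is_zero_extend_mask, ndd_and and strict_low_part_and are hypothetical, and registers are modelled as plain integers. The first helper mirrors the idea that only the masks 0xff, 0xffff and 0xffffffff make an AND equivalent to a zero-extending move. The other two contrast a full-width NDD AND with a strict_low_part style write that touches only the low byte of the destination.

```c
#include <stdint.h>

/* Hypothetical model, not GCC code.  Returns 1 when MASK is one of the
   masks for which "dest = src & MASK" is exactly a zero-extension of the
   low part of src: 0xff (QImode), 0xffff (HImode), 0xffffffff (SImode).  */
static int
is_zero_extend_mask (uint64_t mask)
{
  return mask == 0xffu || mask == 0xffffu || mask == 0xffffffffu;
}

/* Model of an NDD-style AND: every bit of the result comes from
   SRC & MASK, regardless of what the destination held before.  */
static uint32_t
ndd_and (uint32_t src, uint32_t mask)
{
  return src & mask;
}

/* Model of a strict_low_part-style AND: only the low byte of the
   destination is written; the upper 24 bits keep the old value
   DEST_OLD.  */
static uint32_t
strict_low_part_and (uint32_t dest_old, uint32_t src, uint32_t mask)
{
  return (dest_old & ~0xffu) | (src & mask & 0xffu);
}
```

In this model, with src = 0x123456f8 and mask = 0xffffff7f, ndd_and gives 0x12345678. strict_low_part_and gives the same value only when dest_old equals src; with an unrelated dest_old such as 0xdeadbeef it gives 0xdeadbe78, which is the kind of stale highpart that the added !rtx_equal_p (operands[0], operands[1]) check is meant to rule out.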
--- gcc/config/i386/i386.md | 175 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 13 ++ 2 files changed, 127 insertions(+), 61 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 050779273a7..64944a1163d 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -11705,18 +11705,19 @@ (define_expand "and3" (operands[0], gen_lowpart (mode, operands[1]), mode, mode, 1)); else - ix86_expand_binary_operator (AND, mode, operands); + ix86_expand_binary_operator (AND, mode, operands, + TARGET_APX_NDD); DONE; }) (define_insn_and_split "*and3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (and: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(const_int:DWIH 0)] @@ -11728,39 +11729,53 @@ (define_insn_and_split "*and3_doubleword" if (operands[2] == const0_rtx) emit_move_insn (operands[0], const0_rtx); else if (operands[2] == constm1_rtx) - emit_insn_deleted_note_p = true; + { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + else + emit_insn_deleted_note_p = true; + } else - ix86_expand_binary_operator (AND, mode, &operands[0]); + ix86_expand_binary_operator (AND, mode, &operands[0], + TARGET_APX_NDD); if (operands[5] == const0_rtx) emit_move_insn (operands[3], const0_rtx); else if (operands[5] == constm1_rtx) { - if (emit_insn_deleted_note_p) + if (!rtx_equal_p (operands[3], operands[4])) + emit_move_insn (operands[3], operands[4]); + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); } else - ix86_expand_binary_operator (AND, 
mode, &operands[3]); + ix86_expand_binary_operator (AND, mode, &operands[3], + TARGET_APX_NDD); DONE; -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*anddi_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,?k") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,rm,r,r,r,r,?k") (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm,k") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,L,k"))) + (match_operand:DI 1 "nonimmediate_operand" "%0,r,0,0,rm,r,qm,k") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,Z,re,m,re,m,L,k"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands, + TARGET_APX_NDD)" "@ and{l}\t{%k2, %k0|%k0, %k2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} and{q}\t{%2, %0|%0, %2} and{q}\t{%2, %0|%0, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2} # #" - [(set_attr "isa" "x64,x64,x64,x64,avx512bw_512") - (set_attr "type" "alu,alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,*,0,*") + [(set_attr "isa" "x64,apx_ndd,x64,x64,apx_ndd,apx_ndd,x64,avx512bw_512") + (set_attr "type" "alu,alu,alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11768,7 +11783,7 @@ (define_insn "*anddi_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" "SI,DI,DI,SI,DI")]) + (set_attr "mode" "SI,SI,DI,DI,DI,DI,SI,DI")]) (define_insn_and_split "*anddi_1_btr" [(set (match_operand:DI 0 "nonimmediate_operand" "=rm") @@ -11823,36 +11838,45 @@ (define_split ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - 
(match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands, + TARGET_APX_NDD)" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*and_1" - [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,Ya,?k") - (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,qm,k") - (match_operand:SWI24 2 "" "r,,L,k"))) + [(set (match_operand:SWI24 0 "nonimmediate_operand" "=rm,r,r,r,Ya,?k") + (and:SWI24 (match_operand:SWI24 1 "nonimmediate_operand" "%0,0,rm,r,qm,k") + (match_operand:SWI24 2 "" "r,,r,,L,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, mode, operands)" + "ix86_binary_operator_ok (AND, mode, operands, TARGET_APX_NDD)" "@ and{}\t{%2, %0|%0, %2} and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2} # #" [(set (attr "isa") - (cond [(eq_attr "alternative" "3") + (cond [(eq_attr "alternative" "2,3") + (const_string "apx_ndd") + (eq_attr "alternative" "5") (if_then_else (eq_attr "mode" "SI") (const_string "avx512bw") (const_string "avx512f")) ] (const_string "*"))) - (set_attr "type" "alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,0,*") + (set_attr "type" "alu,alu,alu,alu,imovx,msklog") + (set_attr "length_immediate" "*,*,*,*,0,*") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -11860,24 +11884,27 @@ (define_insn "*and_1" (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" ",,SI,")]) + (set_attr "mode" ",,,,SI,")]) (define_insn "*andqi_1" - [(set 
(match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") - (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") + (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (AND, QImode, operands)" + "ix86_binary_operator_ok (AND, QImode, operands, TARGET_APX_NDD)" "@ and{b}\t{%2, %0|%0, %2} and{b}\t{%2, %0|%0, %2} and{l}\t{%k2, %k0|%k0, %k2} + and{b}\t{%2, %1, %0|%0, %1, %2} + and{b}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "type" "alu,alu,alu,msklog") + [(set_attr "type" "alu,alu,alu,alu,alu,msklog") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,*") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -11980,7 +12007,10 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!REG_P (operands[1]) - || REGNO (operands[0]) != REGNO (operands[1]))" + || REGNO (operands[0]) != REGNO (operands[1])) + && (UINTVAL (operands[2]) == GET_MODE_MASK (SImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (HImode) + || UINTVAL (operands[2]) == GET_MODE_MASK (QImode))" [(const_int 0)] { unsigned HOST_WIDE_INT ival = UINTVAL (operands[2]); @@ -12053,10 +12083,10 @@ (define_insn "*anddi_2" [(set (reg FLAGS_REG) (compare (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m")) + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,r,rm,r") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,Z,re,m")) (const_int 0))) - (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r") + (set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,r,r") (and:DI (match_dup 1) (match_dup 2)))] "TARGET_64BIT && 
ix86_match_ccmode @@ -12070,38 +12100,46 @@ (define_insn "*anddi_2" && (!CONST_INT_P (operands[2]) || val_signbit_known_set_p (SImode, INTVAL (operands[2])))) ? CCZmode : CCNOmode) - && ix86_binary_operator_ok (AND, DImode, operands)" + && ix86_binary_operator_ok (AND, DImode, operands, TARGET_APX_NDD)" "@ and{l}\t{%k2, %k0|%k0, %k2} and{q}\t{%2, %0|%0, %2} - and{q}\t{%2, %0|%0, %2}" + and{q}\t{%2, %0|%0, %2} + and{l}\t{%k2, %k1, %k0|%k0, %k1, %k2} + and{q}\t{%2, %1, %0|%0, %1, %2} + and{q}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") - (set_attr "mode" "SI,DI,DI")]) + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd,apx_ndd") + (set_attr "mode" "SI,DI,DI,SI,DI,DI")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*andsi_2_zext" [(set (reg FLAGS_REG) (compare (and:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, SImode, operands)" - "and{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (AND, SImode, operands, TARGET_APX_NDD)" + "@ + and{l}\t{%2, %k0|%k0, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2} + and{l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*andqi_2_maybe_si" [(set (reg FLAGS_REG) (compare (and:QI - (match_operand:QI 1 "nonimmediate_operand" "%0,0,0") - (match_operand:QI 2 "general_operand" "qn,m,n")) + (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r") + (match_operand:QI 2 "general_operand" "qn,m,n,rn,m")) (const_int 0))) - (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r") + (set 
(match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r") (and:QI (match_dup 1) (match_dup 2)))] - "ix86_binary_operator_ok (AND, QImode, operands) + "ix86_binary_operator_ok (AND, QImode, operands, TARGET_APX_NDD) && ix86_match_ccmode (insn, CONST_INT_P (operands[2]) && INTVAL (operands[2]) >= 0 ? CCNOmode : CCZmode)" @@ -12112,11 +12150,16 @@ (define_insn "*andqi_2_maybe_si" operands[2] = GEN_INT (INTVAL (operands[2]) & 0xff); return "and{l}\t{%2, %k0|%k0, %2}"; } + if (which_alternative > 2) + return "and{b}\t{%2, %1, %0|%0, %1, %2}"; return "and{b}\t{%2, %0|%0, %2}"; } [(set_attr "type" "alu") + (set_attr "isa" "*,*,*,apx_ndd,apx_ndd") (set (attr "mode") - (cond [(eq_attr "alternative" "2") + (cond [(eq_attr "alternative" "3,4") + (const_string "QI") + (eq_attr "alternative" "2") (const_string "SI") (and (match_test "optimize_insn_for_size_p ()") (and (match_operand 0 "ext_QIreg_operand") @@ -12133,15 +12176,21 @@ (define_insn "*andqi_2_maybe_si" (define_insn "*and_2" [(set (reg FLAGS_REG) (compare (and:SWI124 - (match_operand:SWI124 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI124 2 "" ",")) + (match_operand:SWI124 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI124 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI124 0 "nonimmediate_operand" "=m,,r,r") (and:SWI124 (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (AND, mode, operands)" - "and{}\t{%2, %0|%0, %2}" + && ix86_binary_operator_ok (AND, mode, operands, + TARGET_APX_NDD)" + "@ + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %0|%0, %2} + and{}\t{%2, %1, %0|%0, %1, %2} + and{}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,*,apx_ndd,apx_ndd") (set_attr "mode" "")]) (define_insn "*qi_ext_0" @@ -12387,6 +12436,7 @@ (define_insn_and_split "*qi_ext_3" ;; Don't do the splitting with memory operands, since it introduces risk ;; of memory mismatch stalls. 
We may want to do the splitting for optimizing ;; for size, but that can (should?) be handled by generic code instead. +;; Don't do the splitting for APX NDD as NDD does not support *h registers. (define_split [(set (match_operand:SWI248 0 "QIreg_operand") (and:SWI248 (match_operand:SWI248 1 "register_operand") @@ -12394,7 +12444,8 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(~INTVAL (operands[2]) & ~(255 << 8))" + && !(~INTVAL (operands[2]) & ~(255 << 8)) + && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))" [(parallel [(set (zero_extract:HI (match_dup 0) (const_int 8) @@ -12423,7 +12474,9 @@ (define_split "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) && !(~INTVAL (operands[2]) & ~255) - && !(INTVAL (operands[2]) & 128)" + && !(INTVAL (operands[2]) & 128) + && !(TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]))" [(parallel [(set (strict_low_part (match_dup 0)) (and:QI (match_dup 1) (match_dup 2))) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 2bd551614c4..be436d57bdf 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -85,6 +85,15 @@ F (int, not, ~) F1 (int, not, ~) F (long, not, ~) F1 (long, not, ~) + +FOO (char, and, &) +FOO1 (char, and, &) +FOO (short, and, &) +FOO1 (short, and, &) +FOO (int, and, &) +FOO1 (int, and, &) +FOO (long, and, &) +FOO1 (long, and, &) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -95,3 +104,7 @@ F1 (long, not, ~) /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, 
%(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ From patchwork Tue Dec 5 02:29:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173687 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3176152vqy; Mon, 4 Dec 2023 18:40:13 -0800 (PST) X-Google-Smtp-Source: AGHT+IFKKAjH7Up36TGVosvZcmACICmi4lwQ6Q1ww099HoD3dZCQTAjVqI34aiJ1uRtYK01Ezyp7 X-Received: by 2002:a05:6102:3912:b0:464:a099:58ab with SMTP id e18-20020a056102391200b00464a09958abmr1161890vsu.30.1701744013555; Mon, 04 Dec 2023 18:40:13 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701744013; cv=pass; d=google.com; s=arc-20160816; b=S4zo/iBkumpEfjWijkPwEnnQOMJyZxlqpH8EUiH5z+xMNrEDQ1rJ3bxXhH3FJ28mFy LHCT+8YgTCc8mAFAI1yDGeBW3Os3otM1pnvR1iA3vufDZtrwa8pt4VLKwmoasJOODSHI 88WEzhgctblJAF4ufAsbaqigExTxvnEV4UW+TSO36x5nNd3f5HM0lNr+F3L08BD7HjT0 ea/+5MOFNlz/iLXu65d5Q74cWemDWTIKxhWhsdWrx8SgCMM/o1FfDfrVQmtNrLRkIFu2 Tjv2kycDX/y3x7ZbZmqJ/etGT9eCfpuGQVl/6t8EuwNRuVbVuj+Xw5N78CpMdIngfqCR ziZg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; 
bh=nDG4rYszvqHFHMKUW0k3OoRJKuLZ7rohCU1Oz2sZMzo=; fh=yOaFOaAPN8zaff3oteejj7MB/HMAN2vMkEa18PGUfhc=; b=x8acfQlnPFcrxrsvGXf1xxvHiUTCFMpRCSKxf+H18RoO5UYXkAJbR73i/xHnF01ze8 XlFwNapYqY51TGiGxmACQkjn7CZgiL0dQ9uT+auA4cVF+/G3ZEfKYfAGVL4eIxESvROq Y6R7Mhu3lXBWvu10X+2CFyomCPauk2gab6cupfvEWKO9kRv28waG3gi4iusmAMSZTcaW gTc/W2EnBfSG8kJ2yNuII3+cHyGXKUMjnV9vXhSqpVYWAuSEbAfqqrH/wdBFk454jhZW emuzy60cfOeMn/1yVRq+cnhhT0TqeQn/BnUbz0NoKUWAK7CyamTINzLxvgA068xpk3lA 9kOQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=jSjp9GIk; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id d16-20020a0cf6d0000000b0067a92d9bc38si9368052qvo.44.2023.12.04.18.40.12 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:40:13 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=jSjp9GIk; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 591D33A485E9 for ; Tue, 5 Dec 2023 02:33:01 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from 
eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 000D23948A71 for ; Tue, 5 Dec 2023 02:31:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 000D23948A71 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 000D23948A71 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743516; cv=none; b=htWnLHIsikqkGbEVCC+dK8K/obTYQu2prL2UAgipEIeU/ZkpFaXXRVFtqxwSUt6oEEZKaFxzmlTdsHukxQvwF2AEYGn+GKL80zns2qi1212U3uO8uN9wt/vYBOn5lNzcmLUWslmN4toPOfxioZ3ftmAGllODimzCI1vB1fjj5is= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743516; c=relaxed/simple; bh=P+/WOEVsI4/oRjbN/Dm1C/clOkAymdDhCN4ifLrkTQg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=OLAOorFfjwmC8nv1NDseJJrpB6sTx4n6LbrauT8icoiMjc3lWpFGTG4gnHN68pdkfnInS5ck5NpblK4LdzrY5cgh07Us43myABDNFEkhrqRKhznJm5nUi27ZxzUjPQbIndW9CQCWeEwU/lDU3squyXztdiMwvEVwqAZKQehf43M= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDr-0001Yf-8H for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743504; x=1733279504; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=P+/WOEVsI4/oRjbN/Dm1C/clOkAymdDhCN4ifLrkTQg=; b=jSjp9GIk2VyD05hC19XSHsY8DpNHlQd0PnBzoIGeLppptFogIurYw42m mW41VCqMNm7JUtp+NUC81jbZjs4YXI/mSdB6xmHKXpEqzaZbYfocCQht0 XYd/DsMeVYUhxgIggCvDPj7GEoHDTG3y8kVw85Relil9K8Y9bMwAlpz4B CtR+YDHYsiZAARAgqZXCmwopUUkraF2NKgbHrkUy5hBVk/rMTySa1Bi0x 
WVYs0RAutHOUB9O658stIWQAcrfarBgftJ/zIeeFSk2ilKvOetmhoUSQ9 AJ42qg0f+BS3j2fXCwtrX4EEq7g51i+jW9nwxoF3nZnHIRNrTTVuVXxeG A==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277823" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277823" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275552" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275552" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:54 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 56214100562F; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com, Kong Lingling Subject: [PATCH 11/17] [APX NDD] Support APX NDD for or/xor insn Date: Tue, 5 Dec 2023 10:29:42 +0800 Message-Id: <20231205022948.504790-12-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, 
FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784407930656488497 X-GMAIL-MSGID: 1784407930656488497

From: Kong Lingling

Similar to the AND insn, two splitters need to be adjusted to prevent
misoptimization for NDD OR/XOR. Also adjust *one_cmplsi2_2_zext and its
corresponding splitter, which generates an xor insn.

gcc/ChangeLog:

	* config/i386/i386.md (3): Add new alternative for NDD and adjust
	output templates.
	(*_1): Likewise.
	(*qi_1): Likewise.
	(*notxor_1): Likewise.
	(*si_1_zext): Likewise.
	(*notxorqi_1): Likewise.
	(*_2): Likewise.
	(*si_2_zext): Likewise.
	(*si_2_zext_imm): Likewise.
	(*si_1_zext_imm): Likewise, and use nonimmediate_operand for
	operands[1] to accept memory input for NDD alternative.
	(*one_cmplsi2_2_zext): Likewise.
	(define_split for *one_cmplsi2_2_zext): Use nonimmediate_operand for
	operands[3].
	(*3_doubleword): Add NDD constraints, emit move for optimized case
	if operands[0] != operands[1] or operands[3] != operands[4].
	(define_split for QI highpart OR/XOR): Prohibit splitter to split NDD
	form OR/XOR insn to qi_ext_3.
	(define_split for QI strict_lowpart optimization): Prohibit splitter
	to split NDD form OR/XOR insn to *3_1_slp.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add or and xor test.
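
For illustration only, the sketch below shows C functions of the kind the new or/xor tests are assumed to exercise. The function names and bodies are hypothetical stand-ins; the actual FOO/FOO1 macro definitions in apx-ndd.c are not part of this patch. Compiled for a target with NDD enabled (assumed here to be selected with something like -mapxf on a compiler with APX support), each function is a candidate for a single three-operand or/xor.

```c
/* Hypothetical stand-ins for the functions the FOO/FOO1 invocations are
   assumed to generate; the names are made up for this sketch.  */

/* Memory source plus immediate 1.  With NDD this can become one insn of
   the form "or $1, (%rdi), %eax": the destination register is distinct
   from both sources, so no separate load or move is needed.  */
int
sketch_or_mem_imm (int *a)
{
  return *a | 1;
}

/* Two register sources.  With NDD the result can be written directly to
   the return register, e.g. "xor %esi, %edi, %eax".  */
int
sketch_xor_reg_reg (int a, int b)
{
  return a ^ b;
}

/* QImode variant, matching the byte-sized "orb ... %al" scan pattern.  */
char
sketch_or_char_mem_imm (char *a)
{
  return *a | 1;
}
```

Inspecting the assembly (for example with -O2 -S) is one way to check which form the compiler actually picks; the scan-assembler-times directives in the test do the same thing automatically.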
--- gcc/config/i386/i386.md | 186 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 26 ++++ 2 files changed, 143 insertions(+), 69 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 64944a1163d..62cd21ee3d4 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -12698,17 +12698,19 @@ (define_expand "3" && !x86_64_hilo_general_operand (operands[2], mode)) operands[2] = force_reg (mode, operands[2]); - ix86_expand_binary_operator (, mode, operands); + ix86_expand_binary_operator (, mode, operands, + TARGET_APX_NDD); DONE; }) (define_insn_and_split "*3_doubleword" - [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") + [(set (match_operand: 0 "nonimmediate_operand" "=ro,r,r,r") (any_or: - (match_operand: 1 "nonimmediate_operand" "%0,0") - (match_operand: 2 "x86_64_hilo_general_operand" "r,o"))) + (match_operand: 1 "nonimmediate_operand" "%0,0,ro,r") + (match_operand: 2 "x86_64_hilo_general_operand" "r,o,r,o"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" "#" "&& reload_completed" [(const_int:DWIH 0)] @@ -12720,20 +12722,29 @@ (define_insn_and_split "*3_doubleword" split_double_mode (mode, &operands[0], 3, &operands[0], &operands[3]); if (operands[2] == const0_rtx) - emit_insn_deleted_note_p = true; + { + if (!rtx_equal_p (operands[0], operands[1])) + emit_move_insn (operands[0], operands[1]); + else + emit_insn_deleted_note_p = true; + } else if (operands[2] == constm1_rtx) { if ( == IOR) emit_move_insn (operands[0], constm1_rtx); else - ix86_expand_unary_operator (NOT, mode, &operands[0]); + ix86_expand_unary_operator (NOT, mode, &operands[0], + TARGET_APX_NDD); } else - ix86_expand_binary_operator (, mode, &operands[0]); + ix86_expand_binary_operator (, mode, &operands[0], + TARGET_APX_NDD); if (operands[5] == const0_rtx) { - if (emit_insn_deleted_note_p) + if (!rtx_equal_p (operands[3], operands[4])) + 
emit_move_insn (operands[3], operands[4]); + else if (emit_insn_deleted_note_p) emit_note (NOTE_INSN_DELETED); } else if (operands[5] == constm1_rtx) @@ -12741,37 +12752,43 @@ (define_insn_and_split "*3_doubleword" if ( == IOR) emit_move_insn (operands[3], constm1_rtx); else - ix86_expand_unary_operator (NOT, mode, &operands[3]); + ix86_expand_unary_operator (NOT, mode, &operands[3], + TARGET_APX_NDD); } else - ix86_expand_binary_operator (, mode, &operands[3]); + ix86_expand_binary_operator (, mode, &operands[3], + TARGET_APX_NDD); DONE; -}) +} +[(set_attr "isa" "*,*,apx_ndd,apx_ndd")]) (define_insn "*_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,r,r,?k") (any_or:SWI248 - (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,k") - (match_operand:SWI248 2 "" "r,,k"))) + (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,rm,r,k") + (match_operand:SWI248 2 "" "r,,r,,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" "@ {}\t{%2, %0|%0, %2} {}\t{%2, %0|%0, %2} + {}\t{%2, %1, %0|%0, %1, %2} + {}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "isa" "*,*,") - (set_attr "type" "alu, alu, msklog") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd,") + (set_attr "type" "alu, alu, alu, alu, msklog") (set_attr "mode" "")]) (define_insn_and_split "*notxor_1" - [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm,r,r,r,?k") (not:SWI248 (xor:SWI248 - (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,k") - (match_operand:SWI248 2 "" "r,,k")))) + (match_operand:SWI248 1 "nonimmediate_operand" "%0,0,rm,r,k") + (match_operand:SWI248 2 "" "r,,r,,k")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (XOR, mode, operands)" + "ix86_binary_operator_ok (XOR, mode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel @@ -12787,8 
+12804,8 @@ (define_insn_and_split "*notxor_1" DONE; } } - [(set_attr "isa" "*,*,") - (set_attr "type" "alu, alu, msklog") + [(set_attr "isa" "*,*,apx_ndd,apx_ndd,") + (set_attr "type" "alu, alu, alu, alu, msklog") (set_attr "mode" "")]) (define_insn_and_split "*iordi_1_bts" @@ -12876,44 +12893,55 @@ (define_insn_and_split "*xor2andn" ;; See comment for addsi_1_zext why we do use nonimmediate_operand (define_insn "*si_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")))) + (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*si_1_zext_imm" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (any_or:DI - (zero_extend:DI (match_operand:SI 1 "register_operand" "%0")) - (match_operand:DI 2 "x86_64_zext_immediate_operand" "Z"))) + (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "%0,rm")) + (match_operand:DI 2 "x86_64_zext_immediate_operand" "Z,Z"))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*qi_1" - [(set 
(match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") - (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") + (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, QImode, operands)" + "ix86_binary_operator_ok (, QImode, operands, TARGET_APX_NDD)" "@ {b}\t{%2, %0|%0, %2} {b}\t{%2, %0|%0, %2} {l}\t{%k2, %k0|%k0, %k2} + {b}\t{%2, %1, %0|%0, %1, %2} + {b}\t{%2, %1, %0|%0, %1, %2} #" - [(set_attr "isa" "*,*,*,avx512f") - (set_attr "type" "alu,alu,alu,msklog") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd,avx512f") + (set_attr "type" "alu,alu,alu,alu,alu,msklog") (set (attr "mode") (cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -12925,12 +12953,12 @@ (define_insn "*qi_1" (symbol_ref "true")))]) (define_insn_and_split "*notxorqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,?k") + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,r,r,?k") (not:QI - (xor:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,m,rn,k")))) + (xor:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,rm,r,k") + (match_operand:QI 2 "general_operand" "qn,m,rn,rn,m,k")))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (XOR, QImode, operands)" + "ix86_binary_operator_ok (XOR, QImode, operands, TARGET_APX_NDD)" "#" "&& reload_completed" [(parallel @@ -12946,12 +12974,12 @@ (define_insn_and_split "*notxorqi_1" DONE; } } - [(set_attr "isa" "*,*,*,avx512f") - (set_attr "type" "alu,alu,alu,msklog") + [(set_attr "isa" "*,*,*,apx_ndd,apx_ndd,avx512f") + (set_attr "type" "alu,alu,alu,alu,alu,msklog") (set (attr "mode") 
(cond [(eq_attr "alternative" "2") (const_string "SI") - (and (eq_attr "alternative" "3") + (and (eq_attr "alternative" "5") (match_test "!TARGET_AVX512DQ")) (const_string "HI") ] @@ -12999,44 +13027,59 @@ (define_split (define_insn "*_2" [(set (reg FLAGS_REG) (compare (any_or:SWI - (match_operand:SWI 1 "nonimmediate_operand" "%0,0") - (match_operand:SWI 2 "" ",")) + (match_operand:SWI 1 "nonimmediate_operand" "%0,0,rm,r") + (match_operand:SWI 2 "" ",,r,")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m,") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,,r,r") (any_or:SWI (match_dup 1) (match_dup 2)))] "ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, mode, operands)" - "{}\t{%2, %0|%0, %2}" + && ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" + "@ + {}\t{%2, %0|%0, %2} + {}\t{%2, %0|%0, %2} + {}\t{%2, %1, %0|%0, %1, %2} + {}\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,*,apx_ndd,apx_ndd") (set_attr "mode" "")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand ;; ??? Special case for immediate operand is missing - it is tricky. 
(define_insn "*si_2_zext" [(set (reg FLAGS_REG) - (compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_general_operand" "rBMe")) + (compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0,rm,r") + (match_operand:SI 2 "x86_64_general_operand" "rBMe,re,BM")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI (any_or:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*si_2_zext_imm" [(set (reg FLAGS_REG) (compare (any_or:SI - (match_operand:SI 1 "nonimmediate_operand" "%0") - (match_operand:SI 2 "x86_64_zext_immediate_operand" "Z")) + (match_operand:SI 1 "nonimmediate_operand" "%0,rm") + (match_operand:SI 2 "x86_64_zext_immediate_operand" "Z,Z")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (any_or:DI (zero_extend:DI (match_dup 1)) (match_dup 2)))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_binary_operator_ok (, SImode, operands)" - "{l}\t{%2, %k0|%k0, %2}" + && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" + "@ + {l}\t{%2, %k0|%k0, %2} + {l}\t{%2, %1, %k0|%k0, %1, %2}" [(set_attr "type" "alu") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_insn "*_3" @@ -13057,6 +13100,7 @@ (define_insn "*_3" ;; Don't do the splitting with memory operands, since it introduces risk ;; of memory mismatch stalls. We may want to do the splitting for optimizing ;; for size, but that can (should?) be handled by generic code instead. 
+;; Don't do the splitting for APX NDD as NDD does not support *h registers. (define_split [(set (match_operand:SWI248 0 "QIreg_operand") (any_or:SWI248 (match_operand:SWI248 1 "register_operand") @@ -13064,7 +13108,8 @@ (define_split (clobber (reg:CC FLAGS_REG))] "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(INTVAL (operands[2]) & ~(255 << 8))" + && !(INTVAL (operands[2]) & ~(255 << 8)) + && !(TARGET_APX_NDD && REGNO (operands[0]) != REGNO (operands[1]))" [(parallel [(set (zero_extract:HI (match_dup 0) (const_int 8) @@ -13102,7 +13147,9 @@ (define_split "reload_completed && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) && !(INTVAL (operands[2]) & ~255) - && (INTVAL (operands[2]) & 128)" + && (INTVAL (operands[2]) & 128) + && !(TARGET_APX_NDD + && !rtx_equal_p (operands[0], operands[1]))" [(parallel [(set (strict_low_part (match_dup 0)) (any_or:QI (match_dup 1) (match_dup 2))) @@ -14168,20 +14215,21 @@ (define_split (define_insn "*one_cmplsi2_2_zext" [(set (reg FLAGS_REG) - (compare (not:SI (match_operand:SI 1 "register_operand" "0")) + (compare (not:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (not:SI (match_dup 1))))] "TARGET_64BIT && ix86_match_ccmode (insn, CCNOmode) - && ix86_unary_operator_ok (NOT, SImode, operands)" + && ix86_unary_operator_ok (NOT, SImode, operands, TARGET_APX_NDD)" "#" [(set_attr "type" "alu1") + (set_attr "isa" "*,apx_ndd") (set_attr "mode" "SI")]) (define_split [(set (match_operand 0 "flags_reg_operand") (match_operator 2 "compare_operator" - [(not:SI (match_operand:SI 3 "register_operand")) + [(not:SI (match_operand:SI 3 "nonimmediate_operand")) (const_int 0)])) (set (match_operand:DI 1 "register_operand") (zero_extend:DI (not:SI (match_dup 3))))] diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c 
b/gcc/testsuite/gcc.target/i386/apx-ndd.c index be436d57bdf..d97648c876d 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -94,6 +94,24 @@ FOO (int, and, &) FOO1 (int, and, &) FOO (long, and, &) FOO1 (long, and, &) + +FOO (char, or, |) +FOO1 (char, or, |) +FOO (short, or, |) +FOO1 (short, or, |) +FOO (int, or, |) +FOO1 (int, or, |) +FOO (long, or, |) +FOO1 (long, or, |) + +FOO (char, xor, ^) +FOO1 (char, xor, ^) +FOO (short, xor, ^) +FOO1 (short, xor, ^) +FOO (int, xor, ^) +FOO1 (int, xor, ^) +FOO (long, xor, ^) +FOO1 (long, xor, ^) /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -108,3 +126,11 @@ FOO1 (long, and, &) /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "orb\[^\n\r]*1, \\(%rdi\\), %al" 2} } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 6 } } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "or(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "xorb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */ +/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ +/* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times 
"xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ From patchwork Tue Dec 5 02:29:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173695 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3177123vqy; Mon, 4 Dec 2023 18:43:27 -0800 (PST) X-Google-Smtp-Source: AGHT+IFC5xXbzGxEXgQ/wJy8gwAC9bsj21kGbPilJJuX2DgWQfmLrssbai3DBGYpFBoRLcFs4zD4 X-Received: by 2002:a25:86cb:0:b0:db7:dacf:6218 with SMTP id y11-20020a2586cb000000b00db7dacf6218mr3966617ybm.106.1701744207143; Mon, 04 Dec 2023 18:43:27 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701744207; cv=pass; d=google.com; s=arc-20160816; b=WcFuVlnJgJHunWmKb9onF0Lqprn1bnn9TzOqmS2bCYLwcOMu4JDVaVb16PoejpzN5S 9x1OnRUPUZz7Kj8NRuoWivhl2Puwby3SlwKXwYZkyzH/0HkEQEbLeuP4GBn3yHyAaO+s DRftWVp9vTtsBnVsZdb8I7O9o6NtLLF/n89Gk7FaYZgnR2kMbN5Jq86shyqU1w7TViu2 d41uog1UBVT89T0q71F/UUVwdBowtE176zQ4cGtnkyWEUg6ieeg8uWlSkWU3A/76TWHn WHYpEf8a6M00qpFNqXW00BaRL60wGX9Gr5F6CKlipZ4raY1hlDf4DTZO5RibTNEeR8ox exhw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=kCaQlcshE3zEdbqBJ9AlGZ/ZaqX/WZfU+SsqXUP6zKg=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=jPXS6qaaKBdiN4tiyZAAkshhof/JNVUkSJVzKLzWbPHWdvZOvikR4PqmbtxAmw55ui 7/wAiQlpWEc9TsrQ48fKg+QDiVESomIHGiw/85qhK+k8tQJl4DQfe8RKT4SifRGWdn8u b98YdHuUK4KQr/HyiZ7D7bbHstPt7qL0JvAFS/vXrn6Pr7XnmDKffF1hrea/xaD65XiD h8+/wcmLVNu2X/xghz5AWnf+JE67PAu87ZAcwuDZ04I0NtBTHoXvWR5MkKFdNoEknmL6 +i4cqrdnDGxxCJFU4jnRxkRt6Y13lbp/sTjo+B5Cm6gSlgYtILLqZs9KJn7cUExVHMtP DiEg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com 
header.s=Intel header.b=JHE2dBTH; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id pv19-20020ad45493000000b0067a3aea089csi8183699qvb.131.2023.12.04.18.43.26 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:43:27 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=JHE2dBTH; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5AE7A393E24F for ; Tue, 5 Dec 2023 02:34:21 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 5EC2F394881A for ; Tue, 5 Dec 2023 02:31:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5EC2F394881A Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 5EC2F394881A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743516; cv=none; 
b=OdbF+oWRKjlWLwcJOShxGpSBbU1HDEJecICfauZTIltENrzwjYCUPXkKGehoewbkZ7yTAiIUJhPOf1cukjfh40+raO3PCOklHB5hwXmb7f8Ezg6F9Y9Ikm9PReTISUCq+rMszeLE1EgIvqgnzpthD3imdsR+UAkYfOs+SCb7wvE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743516; c=relaxed/simple; bh=WZzdNXbpHAeFaEBEcmBXY2cdEluFcaiZm77u6CUAJEg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Lml8oPyiUHRMSfFwSMgzeNowNkOJsfZMsejKg8B3p+pA9fUrTzmNLLVeGF+OHa6Yu8E5iarwKWLGtGHNHDHRj4eZ73kZ30j/dK49jUcF/845i4QrbQVxvbtfUB9MP8RUxEfhL3MI/7gth8wauC3HjqjPb8F2msxkSL6VKVTy7Hs= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDq-0001YX-N7 for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:44 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743503; x=1733279503; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WZzdNXbpHAeFaEBEcmBXY2cdEluFcaiZm77u6CUAJEg=; b=JHE2dBTHCCKvk4R4Gy6aSM3ULi6pBWyPrJJDQJXKV2p5XNsvxi8cs70x PkOw96t3URtzKuvlaYXsqoSwxHhGnho95L+7RT9lLTHvfJRx+ASfP72lp evmHfsp24mpLPi7uqNjranffJjr0GjePoPL8nNl9XvA1y+EDYv4TClO0S hs5oBjNbuqJ369rguMT0/5broS6meUEPwFM2FmbPgOdhsKXWK65oITv+3 DQYDnW9muLNc7iJ2C5s3lEwB2/HiXVFcTjdSk/3YgE6O1BBIPoiHbdoHt j8rRQ3Kmwnns91IXENh7tcW9OUKRgVsTwyVdBuJd6ugh0EMJQBkuzM6hY w==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277821" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277821" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275554" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275554" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com 
with ESMTP; 04 Dec 2023 18:29:54 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 59C0D1005630; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 12/17] [APX NDD] Support APX NDD for left shift insns Date: Tue, 5 Dec 2023 10:29:43 +0800 Message-Id: <20231205022948.504790-13-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784408133807924535 X-GMAIL-MSGID: 1784408133807924535 For left shift, there is an 
optimization, TARGET_DOUBLE_WITH_ADD, that turns shl by 1 into add. Since the
NDD form of add cannot take two memory source operands, the source would
have to be a register, so we currently keep using the NDD form of shift
instead of add. The TARGET_SHIFT1 optimization drops the constant 1 to get a
shorter opcode, but for NDD the assembler selects the shorter encoding
automatically whether or not $1 is present, so NDD is not involved in it.

The doubleword insns for left shift call ix86_expand_ashl, which assumes
that all shift-related patterns have the same operands[0] and operands[1].
These patterns will be supported in a standalone patch.

gcc/ChangeLog:

	* config/i386/i386.md (*ashl3_1): Extend with new
	alternatives to support NDD, limit the new alternative to
	generate sal only, and adjust output template for NDD.
	(*ashlsi3_1_zext): Likewise.
	(*ashlhi3_1): Likewise.
	(*ashlqi3_1): Likewise.
	(*ashl3_cmp): Likewise.
	(*ashlsi3_cmp_zext): Likewise, and use nonimmediate_operand for
	operands[1] to accept memory input for NDD alternative.
	(*ashl3_cconly): Likewise.
	(*ashl3_doubleword_highpart): Adjust codegen for NDD.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add tests for sal.
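
As a rough model of the *ashl3_doubleword_highpart adjustment, the sketch below mirrors the split logic in plain C for a 64-bit word: when a word-sized value is shifted left by at least the word size, the high word receives the source shifted by the remaining bits and the low word is cleared. The type and function names are hypothetical and not taken from GCC; the comments only relate each step to the calls visible in the diff.

```c
#include <stdint.h>

/* Hypothetical two-word result, standing in for the split halves
   operands[0] (low) and operands[3] (high).  */
typedef struct
{
  uint64_t lo;
  uint64_t hi;
} dword_sketch;

/* Model of shifting a zero- or sign-extended 64-bit value left by COUNT,
   where COUNT is assumed to be in [64, 127].  */
dword_sketch
sketch_ashl_highpart (uint64_t src, unsigned count)
{
  dword_sketch r;
  unsigned bits = count - 64;   /* INTVAL (operands[2]) minus word bits */

  if (bits == 0)
    r.hi = src;                 /* plain move, no shift needed */
  else
    r.hi = src << bits;         /* with NDD: shift src straight into hi,
                                   without first copying src to hi */

  r.lo = 0;                     /* corresponds to ix86_expand_clear */
  return r;
}
```

Without NDD the split first copies the source into the high word and then shifts it in place; with NDD the copy can be skipped because the shift has a separate destination.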
--- gcc/config/i386/i386.md | 172 ++++++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 22 +++ 2 files changed, 136 insertions(+), 58 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 62cd21ee3d4..43be1364bff 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14467,10 +14467,19 @@ (define_insn_and_split "*ashl3_doubleword_highpart" { split_double_mode (mode, &operands[0], 1, &operands[0], &operands[3]); int bits = INTVAL (operands[2]) - ( * BITS_PER_UNIT); - if (!rtx_equal_p (operands[3], operands[1])) - emit_move_insn (operands[3], operands[1]); - if (bits > 0) - emit_insn (gen_ashl3 (operands[3], operands[3], GEN_INT (bits))); + bool op_equal_p = rtx_equal_p (operands[3], operands[1]); + if (bits == 0) + { + if (!op_equal_p) + emit_move_insn (operands[3], operands[1]); + } + else + { + if (!op_equal_p && !TARGET_APX_NDD) + emit_move_insn (operands[3], operands[1]); + rtx op_tmp = TARGET_APX_NDD ? operands[1] : operands[3]; + emit_insn (gen_ashl3 (operands[3], op_tmp, GEN_INT (bits))); + } ix86_expand_clear (operands[0]); DONE; }) @@ -14777,12 +14786,14 @@ (define_insn "*bmi2_ashl3_1" (set_attr "mode" "")]) (define_insn "*ashl3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,?k") - (ashift:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,l,rm,k") - (match_operand:QI 2 "nonmemory_operand" "c,M,r,"))) + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,?k,r") + (ashift:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,l,rm,k,rm") + (match_operand:QI 2 "nonmemory_operand" "c,M,r,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, mode, operands)" + "ix86_binary_operator_ok (ASHIFT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 4); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14797,18 +14808,25 @@ (define_insn "*ashl3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || 
optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + /* For NDD form instructions related to TARGET_SHIFT1, the $1 + immediate do not need to be omitted as assembler will map it + to use shorter encoding. */ + && !use_ndd) return "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,bmi2,") + [(set_attr "isa" "*,*,bmi2,,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "ishiftx") + (eq_attr "alternative" "4") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -14850,13 +14868,15 @@ (define_insn "*bmi2_ashlsi3_1_zext" (set_attr "mode" "SI")]) (define_insn "*ashlsi3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r") (zero_extend:DI - (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,l,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,M,r")))) + (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,l,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,M,r,cI")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (ASHIFT, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (ASHIFT, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14869,18 +14889,22 @@ (define_insn "*ashlsi3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{l}\t%k0"; else - return "sal{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"sal{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sal{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,*,bmi2") + [(set_attr "isa" "*,*,bmi2,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "ishiftx") + (eq_attr "alternative" "3") + (const_string "ishift") (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 2 "const1_operand")) (const_string "alu") @@ -14910,12 +14934,14 @@ (define_split "operands[2] = gen_lowpart (SImode, operands[2]);") (define_insn "*ashlhi3_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,Yp,?k") - (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,l,k") - (match_operand:QI 2 "nonmemory_operand" "cI,M,Ww"))) + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,Yp,?k,r") + (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,l,k,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,M,Ww,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, HImode, operands)" + "ix86_binary_operator_ok (ASHIFT, HImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14928,18 +14954,22 @@ (define_insn "*ashlhi3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{w}\t%0"; else - return "sal{w}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sal{w}\t{%2, %1, %0|%0, %1, %2}" + : "sal{w}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,*,avx512f") + [(set_attr "isa" "*,*,avx512f,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "1") (const_string "lea") (eq_attr "alternative" "2") (const_string "msklog") + (eq_attr "alternative" "3") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -14955,15 +14985,17 @@ (define_insn "*ashlhi3_1" (match_test "optimize_function_for_size_p (cfun)"))))) (const_string "0") (const_string "*"))) - (set_attr "mode" "HI,SI,HI")]) + (set_attr "mode" "HI,SI,HI,HI")]) (define_insn "*ashlqi3_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,Yp,?k") - (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,l,k") - (match_operand:QI 2 "nonmemory_operand" "cI,cI,M,Wb"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,Yp,?k,r") + (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,l,k,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,cI,M,Wb,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFT, QImode, operands)" + "ix86_binary_operator_ok (ASHIFT, QImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 4); switch (get_attr_type (insn)) { case TYPE_LEA: @@ -14979,7 +15011,8 @@ (define_insn "*ashlqi3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) { if (get_attr_mode (insn) == MODE_SI) return "sal{l}\t%k0"; @@ -14991,16 +15024,19 @@ (define_insn "*ashlqi3_1" if (get_attr_mode (insn) == MODE_SI) return "sal{l}\t{%2, %k0|%k0, %2}"; else - return "sal{b}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sal{b}\t{%2, %1, %0|%0, %1, %2}" + : "sal{b}\t{%2, %0|%0, %2}"; } } } - [(set_attr "isa" "*,*,*,avx512dq") + [(set_attr "isa" "*,*,*,avx512dq,apx_ndd") (set (attr "type") (cond [(eq_attr "alternative" "2") (const_string "lea") (eq_attr "alternative" "3") (const_string "msklog") + (eq_attr "alternative" "4") + (const_string "ishift") (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) @@ -15016,10 +15052,10 @@ (define_insn "*ashlqi3_1" (match_test "optimize_function_for_size_p (cfun)"))))) (const_string "0") (const_string "*"))) - (set_attr "mode" "QI,SI,SI,QI") + (set_attr "mode" "QI,SI,SI,QI,QI") ;; Potential partial reg stall on alternative 1. (set (attr "preferred_for_speed") - (cond [(eq_attr "alternative" "1") + (cond [(eq_attr "alternative" "1,4") (symbol_ref "!TARGET_PARTIAL_REG_STALL")] (symbol_ref "true")))]) @@ -15114,10 +15150,10 @@ (define_split (define_insn "*ashl3_cmp" [(set (reg FLAGS_REG) (compare - (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "" "")) + (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (ashift:SWI (match_dup 1) (match_dup 2)))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL @@ -15125,8 +15161,10 @@ (define_insn "*ashl3_cmp" && (TARGET_SHIFT1 || (TARGET_DOUBLE_WITH_ADD && REG_P (operands[0]))))) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (ASHIFT, mode, operands)" + && ix86_binary_operator_ok (ASHIFT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_ALU: @@ -15135,14 +15173,19 @@ (define_insn "*ashl3_cmp" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || 
optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") - (cond [(and (and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) (const_string "alu") @@ -15162,10 +15205,10 @@ (define_insn "*ashl3_cmp" (define_insn "*ashlsi3_cmp_zext" [(set (reg FLAGS_REG) (compare - (ashift:SI (match_operand:SI 1 "register_operand" "0") + (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") (match_operand:QI 2 "const_1_to_31_operand")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (ashift:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && (optimize_function_for_size_p (cfun) @@ -15174,8 +15217,10 @@ (define_insn "*ashlsi3_cmp_zext" && (TARGET_SHIFT1 || TARGET_DOUBLE_WITH_ADD))) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (ASHIFT, SImode, operands)" + && ix86_binary_operator_ok (ASHIFT, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_ALU: @@ -15184,14 +15229,19 @@ (define_insn "*ashlsi3_cmp_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{l}\t%k0"; else - return "sal{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"sal{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "sal{l}\t{%2, %k0|%k0, %2}"; } } - [(set (attr "type") - (cond [(and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 2 "const1_operand")) (const_string "alu") ] @@ -15210,10 +15260,10 @@ (define_insn "*ashlsi3_cmp_zext" (define_insn "*ashl3_cconly" [(set (reg FLAGS_REG) (compare - (ashift:SWI (match_operand:SWI 1 "register_operand" "0") - (match_operand:QI 2 "" "")) + (ashift:SWI (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx @@ -15221,22 +15271,28 @@ (define_insn "*ashl3_cconly" || TARGET_DOUBLE_WITH_ADD))) && ix86_match_ccmode (insn, CCGOCmode)" { + bool use_ndd = (which_alternative == 1); switch (get_attr_type (insn)) { case TYPE_ALU: gcc_assert (operands[2] == const1_rtx); return "add{}\t%0, %0"; - default: + default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sal{}\t%0"; else - return "sal{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sal{}\t{%2, %1, %0|%0, %1, %2}" + : "sal{}\t{%2, %0|%0, %2}"; } } - [(set (attr "type") - (cond [(and (and (match_test "TARGET_DOUBLE_WITH_ADD") + [(set_attr "isa" "*,apx_ndd") + (set (attr "type") + (cond [(eq_attr "alternative" "1") + (const_string "ishift") + (and (and (match_test "TARGET_DOUBLE_WITH_ADD") (match_operand 0 "register_operand")) (match_operand 2 "const1_operand")) (const_string "alu") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index d97648c876d..9951fb00a4c 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -29,6 +29,16 @@ foo2_##OP_NAME##_##TYPE (TYPE *a, TYPE b) \ return c; \ } +#define FOO3(TYPE, OP_NAME, OP, IMM) \ +TYPE \ +__attribute__ ((noipa)) \ +foo3_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = a OP IMM; \ + return b; \ +} + + #define F(TYPE, OP_NAME, OP) \ TYPE \ __attribute__ ((noipa)) \ @@ -112,6 +122,16 @@ FOO (int, xor, ^) FOO1 (int, xor, ^) FOO (long, xor, ^) FOO1 (long, xor, ^) + +FOO (char, shl, <<) +FOO3 (char, shl, <<, 7) +FOO (short, shl, <<) +FOO3 (short, shl, <<, 7) +FOO (int, shl, <<) +FOO3 (int, shl, <<, 7) +FOO (long, shl, <<) +FOO3 (long, shl, <<, 7) + /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -134,3 +154,5 @@ FOO1 (long, xor, ^) /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 3 } } */ /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)di, %(?:|r|e)si, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ +/* { dg-final { scan-assembler-times "sal(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } 
*/ +/* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */
From patchwork Tue Dec 5 02:29:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 173706
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 13/17] [APX NDD] Support APX NDD for right shift insns
Date: Tue, 5 Dec 2023 10:29:44 +0800
Message-Id: <20231205022948.504790-14-hongyu.wang@intel.com>
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>

As with left shift, the right shift patterns do not need to omit $1 for the NDD form.

gcc/ChangeLog:

	* config/i386/i386.md (ashr3_cvt): Extend with new alternatives to support NDD, and adjust output templates.
	(*ashr3_1): Likewise for SI/DI mode.
	(*lshr3_1): Likewise.
	(*si3_1_zext): Likewise.
	(*ashr3_1): Likewise for QI/HI mode.
	(*lshrqi3_1): Likewise.
	(*lshrhi3_1): Likewise.
	(3_cmp): Likewise.
	(*3_cconly): Likewise.
	(*ashrsi3_cvt_zext): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative.
	(*highpartdisi2): Likewise.
	(*si3_cmp_zext): Likewise.
	(3_carry): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add l/ashiftrt tests.
--- gcc/config/i386/i386.md | 232 +++++++++++++++--------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 24 +++ 2 files changed, 166 insertions(+), 90 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 43be1364bff..8bec8a63ba9 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -15803,39 +15803,45 @@ (define_mode_attr cvt_mnemonic [(SI "{cltd|cdq}") (DI "{cqto|cqo}")]) (define_insn "ashr3_cvt" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "*a,0") + (match_operand:SWI48 1 "nonimmediate_operand" "*a,0,rm") (match_operand:QI 2 "const_int_operand"))) (clobber (reg:CC FLAGS_REG))] "INTVAL (operands[2]) == GET_MODE_BITSIZE (mode)-1 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + && ix86_binary_operator_ok (ASHIFTRT, mode, operands, + TARGET_APX_NDD)" "@ - sar{}\t{%2, %0|%0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr 
"modrm" "0,1") + sar{}\t{%2, %0|%0, %2} + sar{}\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "")]) (define_insn "*ashrsi3_cvt_zext" - [(set (match_operand:DI 0 "register_operand" "=*d,r") + [(set (match_operand:DI 0 "register_operand" "=*d,r,r") (zero_extend:DI - (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0") + (ashiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "*a,0,rm") (match_operand:QI 2 "const_int_operand")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && INTVAL (operands[2]) == 31 && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun)) - && ix86_binary_operator_ok (ASHIFTRT, SImode, operands)" + && ix86_binary_operator_ok (ASHIFTRT, SImode, operands, + TARGET_APX_NDD)" "@ {cltd|cdq} - sar{l}\t{%2, %k0|%k0, %2}" - [(set_attr "type" "imovx,ishift") - (set_attr "prefix_0f" "0,*") - (set_attr "length_immediate" "0,*") - (set_attr "modrm" "0,1") + sar{l}\t{%2, %k0|%k0, %2} + sar{l}\t{%2, %1, %k0|%k0, %1, %2}" + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "imovx,ishift,ishift") + (set_attr "prefix_0f" "0,*,*") + (set_attr "length_immediate" "0,*,*") + (set_attr "modrm" "0,1,1") (set_attr "mode" "SI")]) (define_expand "@x86_shift_adj_3" @@ -15877,13 +15883,15 @@ (define_insn "*bmi2_3_1" (set_attr "mode" "")]) (define_insn "*ashr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r") (ashiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "c,r"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = 
(which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15891,14 +15899,16 @@ (define_insn "*ashr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return use_ndd ? "sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -15911,8 +15921,8 @@ (define_insn "*ashr3_1" ;; Specialization of *lshr3_1 below, extracting the SImode ;; highpart of a DI to be extracted, but allowing it to be clobbered. (define_insn_and_split "*highpartdisi2" - [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k") 0) - (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k") + [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k,r") 0) + (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "0,0,k,rm") (const_int 32))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" @@ -15931,16 +15941,20 @@ (define_insn_and_split "*highpartdisi2" DONE; } operands[0] = gen_rtx_REG (DImode, REGNO (operands[0])); -}) +} +[(set_attr "isa" "*,*,*,apx_ndd")]) + (define_insn "*lshr3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k,r") (lshiftrt:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k") - (match_operand:QI 2 "nonmemory_operand" "c,r,"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k,rm") + (match_operand:QI 2 "nonmemory_operand" "c,r,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, mode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = 
(which_alternative == 3); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -15949,14 +15963,16 @@ (define_insn "*lshr3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "shr{}\t%0"; else - return "shr{}\t{%2, %0|%0, %2}"; + return use_ndd ? "shr{}\t{%2, %1, %0|%0, %1, %2}" + : "shr{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2,") - (set_attr "type" "ishift,ishiftx,msklog") + [(set_attr "isa" "*,bmi2,,apx_ndd") + (set_attr "type" "ishift,ishiftx,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -15989,13 +16005,15 @@ (define_insn "*bmi2_si3_1_zext" (set_attr "mode" "SI")]) (define_insn "*si3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,r")))) + (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,r,cI")))) (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" + "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFTX: @@ -16003,14 +16021,16 @@ (define_insn "*si3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "ishift,ishiftx") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "ishift,ishiftx,ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16033,20 +16053,25 @@ (define_split "operands[2] = gen_lowpart (SImode, operands[2]);") (define_insn "*ashr3_1" - [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m") + [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m, r") (ashiftrt:SWI12 - (match_operand:SWI12 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + (match_operand:SWI12 1 "nonimmediate_operand" "0, rm") + (match_operand:QI 2 "nonmemory_operand" "c, c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (ASHIFTRT, mode, operands)" + "ix86_binary_operator_ok (ASHIFTRT, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = which_alternative == 1; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "sar{}\t%0"; else - return "sar{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"sar{}\t{%2, %1, %0|%0, %1, %2}" + : "sar{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16057,29 +16082,33 @@ (define_insn "*ashr3_1" (set_attr "mode" "")]) (define_insn "*lshrqi3_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k") + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k,r") (lshiftrt:QI - (match_operand:QI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI,Wb"))) + (match_operand:QI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI,Wb,cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, QImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, QImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "shr{b}\t%0"; else - return "shr{b}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"shr{b}\t{%2, %1, %0|%0, %1, %2}" + : "shr{b}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*,avx512dq") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*,avx512dq,apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -16091,29 +16120,33 @@ (define_insn "*lshrqi3_1" (set_attr "mode" "QI")]) (define_insn "*lshrhi3_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k") + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k, r") (lshiftrt:HI - (match_operand:HI 1 "nonimmediate_operand" "0, k") - (match_operand:QI 2 "nonmemory_operand" "cI, Ww"))) + (match_operand:HI 1 "nonimmediate_operand" "0, k, rm") + (match_operand:QI 2 "nonmemory_operand" "cI, Ww, cI"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (LSHIFTRT, HImode, operands)" + "ix86_binary_operator_ok (LSHIFTRT, HImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ISHIFT: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "shr{w}\t%0"; else - return "shr{w}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"shr{w}\t{%2, %1, %0|%0, %1, %2}" + : "shr{w}\t{%2, %0|%0, %2}"; case TYPE_MSKLOG: return "#"; default: gcc_unreachable (); } } - [(set_attr "isa" "*, avx512f") - (set_attr "type" "ishift,msklog") + [(set_attr "isa" "*, avx512f, apx_ndd") + (set_attr "type" "ishift,msklog,ishift") (set (attr "length_immediate") (if_then_else (and (and (match_operand 2 "const1_operand") @@ -16166,25 +16199,30 @@ (define_insn "*3_cmp" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r") (any_shiftrt:SWI (match_dup 1) (match_dup 2)))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, mode, operands)" + && ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = which_alternative == 1; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16197,10 +16235,10 @@ (define_insn "*3_cmp" (define_insn "*si3_cmp_zext" [(set (reg FLAGS_REG) (compare - (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0") + (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") (match_operand:QI 2 "const_1_to_31_operand")) (const_int 0))) - (set (match_operand:DI 0 "register_operand" "=r") + (set (match_operand:DI 0 "register_operand" "=r,r") (zero_extend:DI (any_shiftrt:SI (match_dup 1) (match_dup 2))))] "TARGET_64BIT && (optimize_function_for_size_p (cfun) @@ -16208,15 +16246,20 @@ (define_insn "*si3_cmp_zext" || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode) - && ix86_binary_operator_ok (, SImode, operands)" + && ix86_binary_operator_ok (, SImode, operands, + TARGET_APX_NDD)" { + bool use_ndd = which_alternative == 1; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16230,23 +16273,28 @@ (define_insn "*3_cconly" [(set (reg FLAGS_REG) (compare (any_shiftrt:SWI - (match_operand:SWI 1 "register_operand" "0") - (match_operand:QI 2 "" "")) + (match_operand:SWI 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "" ",")) (const_int 0))) - (clobber (match_scratch:SWI 0 "="))] + (clobber (match_scratch:SWI 0 "=,r"))] "(optimize_function_for_size_p (cfun) || !TARGET_PARTIAL_FLAG_REG_STALL || (operands[2] == const1_rtx && TARGET_SHIFT1)) && ix86_match_ccmode (insn, CCGOCmode)" { + bool use_ndd = which_alternative == 1; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd + ? "{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "ishift") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16850,18 +16898,22 @@ (define_insn "rcrdi2" ;; Versions of sar and shr that set the carry flag. 
(define_insn "3_carry" [(set (reg:CCC FLAGS_REG) - (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0") + (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") (const_int 1)) (const_int 0)] UNSPEC_CC_NE)) - (set (match_operand:SWI48 0 "register_operand" "=r") + (set (match_operand:SWI48 0 "register_operand" "=r,r") (any_shiftrt:SWI48 (match_dup 1) (const_int 1)))] "" { - if (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + bool use_ndd = which_alternative == 1; + if ((TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; - return "{}\t{1, %0|%0, 1}"; + return use_ndd ? "{}\t{$1, %1, %0|%0, %1, 1}" + : "{}\t{$1, %0|%0, 1}"; } - [(set_attr "type" "ishift1") + [(set_attr "isa" "*, apx_ndd") + (set_attr "type" "ishift1") (set (attr "length_immediate") (if_then_else (ior (match_test "TARGET_SHIFT1") diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 9951fb00a4c..239c427514a 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -2,6 +2,8 @@ /* { dg-options "-mapxf -march=x86-64 -O2" } */ /* { dg-final { scan-assembler-not "movl"} } */ +#include <stdint.h> + #define FOO(TYPE, OP_NAME, OP) \ TYPE \ __attribute__ ((noipa)) \ @@ -132,6 +134,24 @@ FOO3 (int, shl, <<, 7) FOO (long, shl, <<) FOO3 (long, shl, <<, 7) +FOO (char, sar, >>) +FOO3 (char, sar, >>, 7) +FOO (short, sar, >>) +FOO3 (short, sar, >>, 7) +FOO (int, sar, >>) +FOO3 (int, sar, >>, 7) +FOO (long, sar, >>) +FOO3 (long, sar, >>, 7) + +FOO (uint8_t, shr, >>) +FOO3 (uint8_t, shr, >>, 7) +FOO (uint16_t, shr, >>) +FOO3 (uint16_t, shr, >>, 7) +FOO (uint32_t, shr, >>) +FOO3 (uint32_t, shr, >>, 7) +FOO (uint64_t, shr, >>) +FOO3 (uint64_t, shr, >>, 7) + /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } 
*/ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -156,3 +176,7 @@ FOO3 (long, shl, <<, 7) /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */ /* { dg-final { scan-assembler-times "sal(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ From patchwork Tue Dec 5 02:29:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173694 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3176988vqy; Mon, 4 Dec 2023 18:42:58 -0800 (PST) X-Google-Smtp-Source: AGHT+IHYDFcQxDE+5CsSUdKBsZO18XUCxwb0j8TvM8b7FR1RU/LwSiEXXLgnYTmuxWHku6V46XcD X-Received: by 2002:a05:622a:3c8:b0:425:4043:96f9 with SMTP id k8-20020a05622a03c800b00425404396f9mr734316qtx.134.1701744178808; Mon, 04 Dec 2023 18:42:58 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701744178; cv=pass; d=google.com; s=arc-20160816; b=hEjU+uf+mmOfMyk48bXxC4Zg9EAjwFNuY3m+rDp/1VdKhJ57msqXXK6ue3ZLonwh7+ CNb+6xWANJ3216GTmkf1DEGMZL9R7ad7BwiDanasi9A1fkMfYBN0DZIsQuyxXQg0cSkc eocO9ZBOSUxplWwx0Y5S8/w6bQyA6DfyEzdsbyBkihTVPa8X2hG30H/fk79GADtE0izA jWCM65zMAEKLQAp2fW0M124tmejO7QK0u6e8uUCA86WbaT5koLd6NeoJswfgPrakPf2J zGo5S8VKNMucNk6GGW2Hk3b6P/m5sQ3L1n1YsvCQkBXqcPzKdiCAL0lBCRH/p0uWyTld A0CA== ARC-Message-Signature: i=2; a=rsa-sha256; 
c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=bOl8SGaChoaN3fXOmMLmyrE/uCRhX+IqWV4IqTwDdQ4=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=fsG8ODjfwF/GbgaN8ajczftCQteJxwnibHitOXGQK9xkWQXlnkdFkBM9zLFT2t5joP m485rFeVN21BBFjsyRVaozQAU2tYvDPO9DWvqSMSDX00BAEXQCBpkRw1knTtU1+jeOAw e+K/pcxfgNDfUUQhDri2yVR45f/QG+vW6bxb1xCgf583UhouAHQC1Wr8ZPcgYJhsjbBN B7TV+7Nt5N+UME1ohGaGIx9cVZwDWZZ9V0ZIjpm2pme1WrWmgi8rJLEchQ0ZJvfJIQ2j KPUVDCf2qfhyynnjbvE2GP5gRmtt5PHr8FcQlmr+daCkM9jJsZi9zQv5cs6+vdiQRD0U ULsg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=F9UIn7YA; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id 15-20020a05621420cf00b0067ab0caa43bsi6955134qve.150.2023.12.04.18.42.58 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:42:58 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=F9UIn7YA; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BB0BC3A6C984 for ; Tue, 5 Dec 2023 02:34:07 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id DFAD23888C77 for ; Tue, 5 Dec 2023 02:31:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DFAD23888C77 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DFAD23888C77 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743489; cv=none; b=SaeOhMapkYxAlTf4175nhBeDa21+wnvLxAAJnUSrg3n4TLq8x1nZP0VnViAxMta7Zd7VbUnx3f6n8rc8x+8H1hON/CRkHLAlsS8YCu1MQROzL/NqwQi5Io+n3Z4xIYUv8rEI9V0aIucBove7D8HiAcqjJT79IVh2nYRuussC2wE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743489; c=relaxed/simple; bh=tiVSZ4PO+CVHKBJyoTsXQrqyOks4cKkDdgLG9Zb8NZs=; 
h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=jWwUmUYZuoY2epq7bN4hXb3Elt3krXy4S/3sjCF0nL+a9A1pbVrm5OkMDs9uDoPyW1xRKsjFzLE0FjBA5LvedR44cxVYCWoX75p2AO3/Js+MgAScZEPRHacH53P+wn8uAY4WvRZLYknHQ6o4BNrczzoB4ijrbGKQaBlyigIWYZ4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDS-0001YN-HB for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:20 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743478; x=1733279478; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tiVSZ4PO+CVHKBJyoTsXQrqyOks4cKkDdgLG9Zb8NZs=; b=F9UIn7YA1rxTGNsKwvtsY3bUUNsYLzOCpDzCmlvdI32fW8Oac5If0Gik wjOzBFtJiKxB0gOvAl2i7yUBrAxzmbq5FiyW0WKnGzItK3Ts7z6gvOMqn 2pDG8VUjmP5NskLjmSM1y3+QJKJX13nRkPYAEMMGcKfJUNXoRMF2VTFdx i+ASxnTjyL4LN9d5HSCF+II9PAjy5/uAirk2Smxy+TkzjcsNPeqmmCHXu 2VihXULX2xxPnEz8wqeEoJdIEOf6syCStYKUQZ3uTauT8sFwKNwcfkjhJ 6hHPOx10u8JDjWXXTQgmsT8zJ9/VEqJ11gDI8lio1ARsdjyMvqUi2F+Ez Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277812" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277812" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:56 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275538" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275538" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:54 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 60A631007801; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com 
Subject: [PATCH 14/17] [APX NDD] Support APX NDD for rotate insns Date: Tue, 5 Dec 2023 10:29:45 +0800 Message-Id: <20231205022948.504790-15-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784408104379005679 X-GMAIL-MSGID: 1784408104379005679 gcc/ChangeLog: * config/i386/i386.md (*3_1): Extend with a new alternative to support NDD for SI/DI rotate, and adjust output template. (*si3_1_zext): Likewise. (*3_1): Likewise for QI/HI modes. (rcrsi2): Likewise, and use nonimmediate_operand for operands[1] to accept memory input for NDD alternative. (rcrdi2): Likewise. 
gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Add test for left/right rotate. --- gcc/config/i386/i386.md | 79 +++++++++++++++---------- gcc/testsuite/gcc.target/i386/apx-ndd.c | 20 +++++++ 2 files changed, 69 insertions(+), 30 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 8bec8a63ba9..6398f544a17 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -16662,13 +16662,15 @@ (define_insn "*bmi2_rorx3_1" (set_attr "mode" "")]) (define_insn "*3_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r") (any_rotate:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "c,"))) + (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "c,,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ROTATEX: @@ -16676,14 +16678,16 @@ (define_insn "*3_1" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd ? 
"{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "rotate,rotatex") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "rotate,rotatex,rotate") (set (attr "preferred_for_size") (cond [(eq_attr "alternative" "0") (symbol_ref "true")] @@ -16733,13 +16737,14 @@ (define_insn "*bmi2_rorxsi3_1_zext" (set_attr "mode" "SI")]) (define_insn "*si3_1_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r") (zero_extend:DI - (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") - (match_operand:QI 2 "nonmemory_operand" "cI,I")))) + (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm") + (match_operand:QI 2 "nonmemory_operand" "cI,I,cI")))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)" { + bool use_ndd = (which_alternative == 2); switch (get_attr_type (insn)) { case TYPE_ROTATEX: @@ -16747,14 +16752,16 @@ (define_insn "*si3_1_zext" default: if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{l}\t%k0"; else - return "{l}\t{%2, %k0|%k0, %2}"; + return use_ndd ? 
"{l}\t{%2, %1, %k0|%k0, %1, %2}" + : "{l}\t{%2, %k0|%k0, %2}"; } } - [(set_attr "isa" "*,bmi2") - (set_attr "type" "rotate,rotatex") + [(set_attr "isa" "*,bmi2,apx_ndd") + (set_attr "type" "rotate,rotatex,rotate") (set (attr "preferred_for_size") (cond [(eq_attr "alternative" "0") (symbol_ref "true")] @@ -16798,19 +16805,25 @@ (define_split (zero_extend:DI (rotatert:SI (match_dup 1) (match_dup 2))))]) (define_insn "*3_1" - [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m") - (any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0") - (match_operand:QI 2 "nonmemory_operand" "c"))) + [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m,r") + (any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0,rm") + (match_operand:QI 2 "nonmemory_operand" "c,c"))) (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (, mode, operands)" + "ix86_binary_operator_ok (, mode, operands, + TARGET_APX_NDD)" { + bool use_ndd = which_alternative == 1; if (operands[2] == const1_rtx - && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)) + && !use_ndd) return "{}\t%0"; else - return "{}\t{%2, %0|%0, %2}"; + return use_ndd + ? 
"{}\t{%2, %1, %0|%0, %1, %2}" + : "{}\t{%2, %0|%0, %2}"; } - [(set_attr "type" "rotate") + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "rotate") (set (attr "length_immediate") (if_then_else (and (match_operand 2 "const1_operand") @@ -16867,31 +16880,37 @@ (define_split ;; Rotations through carry flag (define_insn "rcrsi2" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,r") (plus:SI - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0") + (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm") (const_int 1)) (ashift:SI (ltu:SI (reg:CCC FLAGS_REG) (const_int 0)) (const_int 31)))) (clobber (reg:CC FLAGS_REG))] "" - "rcr{l}\t%0" - [(set_attr "type" "ishift1") + "@ + rcr{l}\t%0 + rcr{l}\t{%1, %0|%0, %1}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift1") (set_attr "memory" "none") (set_attr "length_immediate" "0") (set_attr "mode" "SI")]) (define_insn "rcrdi2" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,r") (plus:DI - (lshiftrt:DI (match_operand:DI 1 "register_operand" "0") + (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "0,rm") (const_int 1)) (ashift:DI (ltu:DI (reg:CCC FLAGS_REG) (const_int 0)) (const_int 63)))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT" - "rcr{q}\t%0" - [(set_attr "type" "ishift1") + "@ + rcr{q}\t%0 + rcr{q}\t{%1, %0|%0, %1}" + [(set_attr "isa" "*,apx_ndd") + (set_attr "type" "ishift1") (set_attr "length_immediate" "0") (set_attr "mode" "DI")]) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c index 239c427514a..b215f66d3e2 100644 --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c @@ -40,6 +40,14 @@ foo3_##OP_NAME##_##TYPE (TYPE a) \ return b; \ } +#define FOO4(TYPE, OP_NAME, OP1, OP2, IMM1) \ +TYPE \ +__attribute__ ((noipa)) \ +foo4_##OP_NAME##_##TYPE (TYPE a) \ +{ \ + TYPE b = (a OP1 IMM1 | a OP2 (8 * 
sizeof(TYPE) - IMM1)); \ + return b; \ +} #define F(TYPE, OP_NAME, OP) \ TYPE \ @@ -152,6 +160,16 @@ FOO3 (uint32_t, shr, >>, 7) FOO (uint64_t, shr, >>) FOO3 (uint64_t, shr, >>, 7) +FOO4 (uint8_t, ror, >>, <<, 1) +FOO4 (uint16_t, ror, >>, <<, 1) +FOO4 (uint32_t, ror, >>, <<, 1) +FOO4 (uint64_t, ror, >>, <<, 1) + +FOO4 (uint8_t, rol, <<, >>, 1) +FOO4 (uint16_t, rol, <<, >>, 1) +FOO4 (uint32_t, rol, <<, >>, 1) +FOO4 (uint64_t, rol, <<, >>, 1) + /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */ /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ @@ -180,3 +198,5 @@ FOO3 (uint64_t, shr, >>, 7) /* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */ /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "ror(?:b|l|w|q)\[^\n\r]*1, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ +/* { dg-final { scan-assembler-times "rol(?:b|l|w|q)\[^\n\r]*1, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */ From patchwork Tue Dec 5 02:29:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173703 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3177395vqy; Mon, 4 Dec 2023 18:44:22 -0800 (PST) X-Google-Smtp-Source: AGHT+IEr7c1dfXWEfhzxB0hcxn4eP64g8d8goawhnC/sZ45VdtF8KEnL2VUkCD8Q31ndkNq4UJGs X-Received: by 2002:a05:620a:2231:b0:77e:fba3:58f2 with SMTP id n17-20020a05620a223100b0077efba358f2mr524017qkh.131.1701744262046; Mon, 04 Dec 2023 18:44:22 -0800 (PST) 
ARC-Seal: i=2; a=rsa-sha256; t=1701744262; cv=pass; d=google.com; s=arc-20160816; b=XmdxIjHwcjEd28uDjJpTqNETfElE1gUbm42GDgFY99vhdUFqW5ZMMFcYU+W+fyfkV3 ymYYc8d0LRHuRWYvXYt2L7w8V3ft0qcwFTodb0qyO2j0HKsXYyH1WBXLUSbGrPYEt0sC zjsz2KKhObN+Uc7SLMv32uZtq8e7k3KYnbuhWTOOv+2XF+Od5Sq5CE/B1HlVWYVrvjF0 vRR8Dy3b5x3Q5Ivf0Qi5fNE7kksvuerRbGdpHO9J+gjbLXKdbxWsP50AAouZXqUoPjOq pN1642SyKPL07A5BlMFtS67Qp5zwuB9key9blf1PHuNN0k0LIk4qUYIWVMmsJPNB/xkW zW1Q== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=b79jMvUqW4fzGEA/TayTzE8KM8WT8iPFxrZ3XSofKgw=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=0H/OFsM0fLK3MGv3kfpB547aduUyHvZYINXsQooS33AMNR9KypeRJBUsJRklYP7Mzv xtJ6nD5BmdAstOHPklT1J/7np8d+phPmCcr7xZmIM/IhQIYnSO97YaqU8luPYDOwIJUI D4pjOuOVLx+nL/gJaUIL2yo/lI40M1SNL/OiVjyjDTve9sK8Dz5A8SeRPTpL58Kou9pm yP5P64zTpb5yLyydFtaTKX5Wo6bl8xcrojBsTr1HruO3f2fRiXvg5roKvwX5YycrXODO j59mhWvMoT0nB8RNDj8qFZdq5xC29xSROwEyGBrf0mOn17kwH76muHT3Pu2ZDy4U/7D6 8SCg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=FPOe42zx; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id t4-20020a05620a450400b0077d58e69edfsi11718681qkp.233.2023.12.04.18.44.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:44:22 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=FPOe42zx; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EB11B382F93E for ; Tue, 5 Dec 2023 02:34:42 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 090C13882AC8 for ; Tue, 5 Dec 2023 02:31:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 090C13882AC8 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 090C13882AC8 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743503; cv=none; b=UTDudvxmBlVePHhBPqkDpuoN2fN/36ALDc6YHtmGQQWiyD7QV5il4KPvUanqNi/rMcs8UeAqdjVEZz5A/4oF0sxAlvAEszR1bFamUSpU/Qp0Q8zCrBVD5jgMxLW9JeQygHMe3CVXVW2XqOJ2YvHbNMJBfFsbOBMBozXo0d/QeR8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743503; c=relaxed/simple; bh=fN1I2pg2vwrYLLAvpmDXedP3UFhmAvX4nfoChV5QruY=; 
h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=M19rL5hcum5Tq1PXbTcF3ihsyP4l2oQGZuBlo0WOOZHaA/2xqy2cX+8jiffmtV/DB162m2AphSYALG4kebT07mTCaBnmw9/XZZp1ftotNm0HTfzV8whZ1x+5sfKIUEsBMIMl9k831zGXFUQHrLGMX8zdQ5umrcz7xIDkI1orlzg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDe-0001Yf-CP for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:32 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743490; x=1733279490; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fN1I2pg2vwrYLLAvpmDXedP3UFhmAvX4nfoChV5QruY=; b=FPOe42zxWKD5xAJkP2TmtAOE4e9QsR56xICm2Jq3YquUSDQvvmICn7Vs 9nuYbJj5WbmRFbtCf5+6JF2Roc8eJ2B+aslGqS9xL1MTlouf2bCLaw2Td Rp2y+QPbTyr81Z62N93dXwsJ7qSZCz1o5djaW+7eQ2ixuYdQwkclG8jyP MTyFptxyDjlUAlW8KAWhbQiy/d5tvlEEdVbznWsbtcWePevB1fKCLwrdo YLsFQhkZKkcTfs98IqVsd65GY9W9F8q2s6ElGqO3xdhoa+fcPz/n2teIz DQmrm38aHJUpyyfZrAus/K92EU5Dm1wkBYpOsqcfNtyM6gAWgWqX+oAAc Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277816" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277816" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:57 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275547" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275547" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:54 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 6347F1007802; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com 
Subject: [PATCH 15/17] [APX NDD] Support APX NDD for shld/shrd insns Date: Tue, 5 Dec 2023 10:29:46 +0800 Message-Id: <20231205022948.504790-16-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784408190886367650 X-GMAIL-MSGID: 1784408190886367650 For shld/shrd insns, the existing patterns use match_dup 0 as the shift source and +r*m as its constraint. To support NDD, add new define_insns that handle the NDD form, which takes an extra input operand and requires the destination operand to be in a register. gcc/ChangeLog: * config/i386/i386.md (x86_64_shld_ndd): New define_insn. (x86_64_shld_ndd_1): Likewise. 
(*x86_64_shld_ndd_2): Likewise. (x86_shld_ndd): Likewise. (x86_shld_ndd_1): Likewise. (*x86_shld_ndd_2): Likewise. (x86_64_shrd_ndd): Likewise. (x86_64_shrd_ndd_1): Likewise. (*x86_64_shrd_ndd_2): Likewise. (x86_shrd_ndd): Likewise. (x86_shrd_ndd_1): Likewise. (*x86_shrd_ndd_2): Likewise. (*x86_64_shld_shrd_1_nozext): Adjust codegen under TARGET_APX_NDD. (*x86_shld_shrd_1_nozext): Likewise. (*x86_64_shrd_shld_1_nozext): Likewise. (*x86_shrd_shld_1_nozext): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-shld-shrd.c: New test. --- gcc/config/i386/i386.md | 322 +++++++++++++++++- .../gcc.target/i386/apx-ndd-shld-shrd.c | 24 ++ 2 files changed, 344 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 6398f544a17..0af7e82deee 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14505,6 +14505,23 @@ (define_insn "x86_64_shld" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shld_ndd" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Jc") + (const_int 63))) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (minus:QI (const_int 64) + (and:QI (match_dup 3) (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shld{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI")]) + (define_insn "x86_64_shld_1" [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m") (ior:DI (ashift:DI (match_dup 0) @@ -14526,6 +14543,24 @@ (define_insn "x86_64_shld_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shld_ndd_1" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (ashift:DI (match_operand:DI 1 
"nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_63_operand")) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_255_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && INTVAL (operands[4]) == 64 - INTVAL (operands[3])" + "shld{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI") + (set_attr "length_immediate" "1")]) + + (define_insn_and_split "*x86_64_shld_shrd_1_nozext" [(set (match_operand:DI 0 "nonimmediate_operand") (ior:DI (ashift:DI (match_operand:DI 4 "nonimmediate_operand") @@ -14551,6 +14586,23 @@ (define_insn_and_split "*x86_64_shld_shrd_1_nozext" operands[4] = force_reg (DImode, operands[4]); emit_insn (gen_x86_64_shrd_1 (operands[0], operands[4], operands[3], operands[2])); } + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (DImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (DImode, operands[1]); + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } else { operands[1] = force_reg (DImode, operands[1]); @@ -14583,6 +14635,33 @@ (define_insn_and_split "*x86_64_shld_2" (const_int 63)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_64_shld_ndd_2" + [(set (match_operand:DI 0 "nonimmediate_operand") + (ior:DI (ashift:DI (match_operand:DI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (lshiftrt:DI (match_operand:DI 2 "register_operand") + (minus:QI (const_int 64) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:DI (ashift:DI 
(match_dup 1) + (and:QI (match_dup 3) (const_int 63))) + (subreg:DI + (lshiftrt:TI + (zero_extend:TI (match_dup 2)) + (minus:QI (const_int 64) + (and:QI (match_dup 3) + (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (DImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_insn "x86_shld" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (ashift:SI (match_dup 0) @@ -14605,6 +14684,24 @@ (define_insn "x86_shld" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shld_ndd" + [(set (match_operand:SI 0 "nonimmediate_operand" "=r") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Ic") + (const_int 31))) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (minus:QI (const_int 32) + (and:QI (match_dup 3) (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shld{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "SI")]) + + (define_insn "x86_shld_1" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (ashift:SI (match_dup 0) @@ -14626,6 +14723,24 @@ (define_insn "x86_shld_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shld_ndd_1" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_31_operand")) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_63_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && INTVAL (operands[4]) == 32 - INTVAL (operands[3])" + "shld{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "SI")]) + + 
(define_insn_and_split "*x86_shld_shrd_1_nozext" [(set (match_operand:SI 0 "nonimmediate_operand") (ior:SI (ashift:SI (match_operand:SI 4 "nonimmediate_operand") @@ -14650,7 +14765,24 @@ (define_insn_and_split "*x86_shld_shrd_1_nozext" operands[4] = force_reg (SImode, operands[4]); emit_insn (gen_x86_shrd_1 (operands[0], operands[4], operands[3], operands[2])); } - else + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (SImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } + else { operands[1] = force_reg (SImode, operands[1]); rtx tmp = gen_reg_rtx (SImode); @@ -14682,6 +14814,33 @@ (define_insn_and_split "*x86_shld_2" (const_int 31)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_shld_ndd_2" + [(set (match_operand:SI 0 "nonimmediate_operand") + (ior:SI (ashift:SI (match_operand:SI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (lshiftrt:SI (match_operand:SI 2 "register_operand") + (minus:QI (const_int 32) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:SI (ashift:SI (match_dup 1) + (and:QI (match_dup 3) (const_int 31))) + (subreg:SI + (lshiftrt:DI + (zero_extend:DI (match_dup 2)) + (minus:QI (const_int 32) + (and:QI (match_dup 3) + (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (SImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_expand "@x86_shift_adj_1" [(set (reg:CCZ FLAGS_REG) (compare:CCZ (and:QI (match_operand:QI 2 
"register_operand") @@ -15621,6 +15780,24 @@ (define_insn "x86_64_shrd" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shrd_ndd" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Jc") + (const_int 63))) + (subreg:DI + (ashift:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (minus:QI (const_int 64) + (and:QI (match_dup 3) (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shrd{q}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "DI")]) + + (define_insn "x86_64_shrd_1" [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m") (ior:DI (lshiftrt:DI (match_dup 0) @@ -15642,6 +15819,24 @@ (define_insn "x86_64_shrd_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_64_shrd_ndd_1" + [(set (match_operand:DI 0 "register_operand" "=r") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_63_operand")) + (subreg:DI + (ashift:TI + (zero_extend:TI + (match_operand:DI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_255_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && INTVAL (operands[4]) == 64 - INTVAL (operands[3])" + "shrd{q}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "DI")]) + + (define_insn_and_split "*x86_64_shrd_shld_1_nozext" [(set (match_operand:DI 0 "nonimmediate_operand") (ior:DI (lshiftrt:DI (match_operand:DI 4 "nonimmediate_operand") @@ -15667,6 +15862,23 @@ (define_insn_and_split "*x86_64_shrd_shld_1_nozext" operands[4] = force_reg (DImode, operands[4]); emit_insn (gen_x86_64_shld_1 (operands[0], operands[4], operands[3], operands[2])); } + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (DImode); + if 
(MEM_P (operands[4])) + { + operands[1] = force_reg (DImode, operands[1]); + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_64_shld_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_64_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } else { operands[1] = force_reg (DImode, operands[1]); @@ -15699,6 +15911,33 @@ (define_insn_and_split "*x86_64_shrd_2" (const_int 63)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_64_shrd_ndd_2" + [(set (match_operand:DI 0 "nonimmediate_operand") + (ior:DI (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (ashift:DI (match_operand:DI 2 "register_operand") + (minus:QI (const_int 64) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:DI (lshiftrt:DI (match_dup 1) + (and:QI (match_dup 3) (const_int 63))) + (subreg:DI + (ashift:TI + (zero_extend:TI (match_dup 2)) + (minus:QI (const_int 64) + (and:QI (match_dup 3) + (const_int 63)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (DImode); + emit_move_insn (operands[4], operands[0]); +}) + (define_insn "x86_shrd" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (lshiftrt:SI (match_dup 0) @@ -15721,6 +15960,23 @@ (define_insn "x86_shrd" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shrd_ndd" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (and:QI (match_operand:QI 3 "nonmemory_operand" "Ic") + (const_int 31))) + (subreg:SI + (ashift:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + 
(minus:QI (const_int 32) + (and:QI (match_dup 3) (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD" + "shrd{l}\t{%s3%2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "mode" "SI")]) + (define_insn "x86_shrd_1" [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m") (ior:SI (lshiftrt:SI (match_dup 0) @@ -15742,6 +15998,24 @@ (define_insn "x86_shrd_1" (set_attr "amdfam10_decode" "vector") (set_attr "bdver1_decode" "vector")]) +(define_insn "x86_shrd_ndd_1" + [(set (match_operand:SI 0 "register_operand" "=r") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "rm") + (match_operand:QI 3 "const_0_to_31_operand")) + (subreg:SI + (ashift:DI + (zero_extend:DI + (match_operand:SI 2 "register_operand" "r")) + (match_operand:QI 4 "const_0_to_63_operand")) 0))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && (INTVAL (operands[4]) == 32 - INTVAL (operands[3]))" + "shrd{l}\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "type" "ishift") + (set_attr "length_immediate" "1") + (set_attr "mode" "SI")]) + + (define_insn_and_split "*x86_shrd_shld_1_nozext" [(set (match_operand:SI 0 "nonimmediate_operand") (ior:SI (lshiftrt:SI (match_operand:SI 4 "nonimmediate_operand") @@ -15766,7 +16040,24 @@ (define_insn_and_split "*x86_shrd_shld_1_nozext" operands[4] = force_reg (SImode, operands[4]); emit_insn (gen_x86_shld_1 (operands[0], operands[4], operands[3], operands[2])); } - else + else if (TARGET_APX_NDD) + { + rtx tmp = gen_reg_rtx (SImode); + if (MEM_P (operands[4])) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + } + else if (MEM_P (operands[1])) + emit_insn (gen_x86_shld_ndd_1 (tmp, operands[1], operands[4], + operands[3], operands[2])); + else + emit_insn (gen_x86_shrd_ndd_1 (tmp, operands[4], operands[1], + operands[2], operands[3])); + emit_move_insn (operands[0], tmp); + } + else { operands[1] = force_reg (SImode, 
operands[1]); rtx tmp = gen_reg_rtx (SImode); @@ -15798,6 +16089,33 @@ (define_insn_and_split "*x86_shrd_2" (const_int 31)))) 0))) (clobber (reg:CC FLAGS_REG))])]) +(define_insn_and_split "*x86_shrd_ndd_2" + [(set (match_operand:SI 0 "nonimmediate_operand") + (ior:SI (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand") + (match_operand:QI 3 "nonmemory_operand")) + (ashift:SI (match_operand:SI 2 "register_operand") + (minus:QI (const_int 32) (match_dup 3))))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_APX_NDD + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(parallel [(set (match_dup 4) + (ior:SI (lshiftrt:SI (match_dup 1) + (and:QI (match_dup 3) (const_int 31))) + (subreg:SI + (ashift:DI + (zero_extend:DI (match_dup 2)) + (minus:QI (const_int 32) + (and:QI (match_dup 3) + (const_int 31)))) 0))) + (clobber (reg:CC FLAGS_REG)) + (set (match_dup 0) (match_dup 4))])] +{ + operands[4] = gen_reg_rtx (SImode); + emit_move_insn (operands[4], operands[0]); +}) + ;; Base name for insn mnemonic. (define_mode_attr cvt_mnemonic [(SI "{cltd|cdq}") (DI "{cqto|cqo}")]) diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c b/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c new file mode 100644 index 00000000000..87068ea31aa --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c @@ -0,0 +1,24 @@ +/* { dg-do compile { target { ! 
ia32 } } } */ +/* { dg-options "-O2 -Wno-shift-count-overflow -m64 -mapxf" } */ +/* { dg-final { scan-assembler-times {(?n)shld[ql]?[\t ]*\$2} 4 } } */ +/* { dg-final { scan-assembler-times {(?n)shrd[ql]?[\t ]*\$2} 4 } } */ + +typedef unsigned long u64; +typedef unsigned int u32; + +long a; +int c; +const char n = 2; + +long test64r (long e) { long t = ((u64)a >> n) | (e << (64 - n)); return t;} +long test64l (u64 e) { long t = (a << n) | (e >> (64 - n)); return t;} +int test32r (int f) { int t = ((u32)c >> n) | (f << (32 - n)); return t; } +int test32l (u32 f) { int t = (c << n) | (f >> (32 - n)); return t; } + +u64 ua; +u32 uc; + +u64 testu64l (u64 ue) { u64 ut = (ua << n) | (ue >> (64 - n)); return ut; } +u64 testu64r (u64 ue) { u64 ut = (ua >> n) | (ue << (64 - n)); return ut; } +u32 testu32l (u32 uf) { u32 ut = (uc << n) | (uf >> (32 - n)); return ut; } +u32 testu32r (u32 uf) { u32 ut = (uc >> n) | (uf << (32 - n)); return ut; } From patchwork Tue Dec 5 02:29:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173680 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3175500vqy; Mon, 4 Dec 2023 18:37:42 -0800 (PST) X-Google-Smtp-Source: AGHT+IHbn4GM/5zR9howFqj59WqJB1YejtnU0saH6O9+KSksKSDcD7F0VCDEBWS+UxInLMrl313I X-Received: by 2002:ac8:5841:0:b0:423:77ae:f96e with SMTP id h1-20020ac85841000000b0042377aef96emr691403qth.10.1701743862162; Mon, 04 Dec 2023 18:37:42 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701743862; cv=pass; d=google.com; s=arc-20160816; b=hFScdDbbcB4IOVyNejCzReot+JtVc91a3r4eWATljQT+OkSl/S0nNxhDziqeNBWAVm iVo8RqSKVjU6O7rOkttjKV/0KWr4++GygKS1XL0nwo760pS6TO3/Yb0JF9bjtJqc8WUt H97IJLOXS8SgdKhZ30anCBcHaJpe02iQjBO/jqRz7wUNJoN4suwn0mUApDhJIZWxLKhx Lp/5gEldjSwKx0Qd3rpFwgG17tcudNP0UgJ0496hWSsKkk8Gh/bdl8aGR2x6UyrvlcAf 1ZZvi/rMYK6v8x4aBy68LliOfr0MVEny1KZdAyHyn32GvwN1D+gG4yI51asAiNxGQqtF q14Q== 
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=9FOvNwZpmtq9EgQjHMOtumH8M9p00IfJtWwOoPfu8ew=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=BPD6hgSNd7ap66Vwk48HXPCQO+dCaL0bDBoEG6eMwodXvdX7otKPg4w6uAfxEDBusx cvlufuwMiIkaj9Rn2NeTaeFN16Rf4gfKIMiAu24uv0FpR8U7ISmRvwEUqo21YZRtwCrU v/yrpji9gTHYviNq3m+LH0MCLCFk5IxQs7wFfMLIIi489CuxoFFGnP8lpmN4ZNcLJITf xfhQLY01X+Ku4sDDA0ow44zh5tqs4iezIee0LDy7vy54zOwF3INvvKmKHkyAwAUL0uP8 CTjaG8hnC30yCDnJ5n6VK8LiuaBZu1LfR8hyf7J5PLf5UkRHgpEfK9WE8OsHGZOO/XUz GIyA== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=EEeKNCe9; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id c14-20020ac8660e000000b0042371ae173csi9930584qtp.727.2023.12.04.18.37.41 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:37:42 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=EEeKNCe9; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 840B23843B6C for ; Tue, 5 Dec 2023 02:32:28 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 89A3838983AA for ; Tue, 5 Dec 2023 02:31:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 89A3838983AA Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 89A3838983AA Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743503; cv=none; b=EZdIR+ZPMUULuhB/KGunkw7eIuSKTroQCVVVsNdhdwt7iUM01hyzPx2uu/hY4e8YLv5o7VAJdyJZGNNr1tBfVCqEFSMsUQul7DTI3kDG7fNQnzHyzgaIlC3YFHT585C8W49wAUdAPYxDldmWZAaa5PlfgIilqEmYyOdf3p9x/1Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701743503; c=relaxed/simple; bh=6dOyIoxG1d/XJ1Ocnp9aSg24gK9WiaJakovZRy8bIdg=; 
h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=DxPRh+Ir8rM1S4NMjEzVQlm+umNsadGElzgUeKR5YFhlQ849Vq78RK7X+5kakfX66WJmtGtxPvvtxB5SzkdhiB6wxpb5/89hlhpcl7uM0f2NMZmyuV4tNGL6UzJwjsDa0NRafwkvk8qGfI6py/9qyRSEmk1E1yITdgRB1hPiLcw= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mgamail.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rALDe-0001YX-Ai for gcc-patches@gcc.gnu.org; Mon, 04 Dec 2023 21:31:32 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701743490; x=1733279490; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6dOyIoxG1d/XJ1Ocnp9aSg24gK9WiaJakovZRy8bIdg=; b=EEeKNCe9i+2uetY4VzoudjL0v5cAo/dHNMrgjoLpyAagCll3guIr5iqB WZCYpjUkW2A09UBPtak3cLKQY17/9TSanRfimLqG+xlzdbC79rx5Qo+sn +NwgJmI2neWoNP5rhezlWZ/O+g+bOIo0z05JItcC0AJEB93XYvB+eXFV+ WeB9DrDKGsmbJsCNwUWhI0Zps0vRVQtTBjnP2WhJRFQAOAoq/MiFlLbvJ 66L7o+fM11lnYhSMZrQZ5Ttuww0EEYx4KF18uUeIIhFkSKfsxIy4vV7cV nIhEILB837866dzfOLeODnwv9j9ks7CYJUrLwJ4vF9PIS938Dsgrrx4EL A==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="373277814" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="373277814" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 18:29:56 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="841275540" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="841275540" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2023 18:29:54 -0800 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 669A2100780D; Tue, 5 Dec 2023 10:29:48 +0800 (CST) From: Hongyu Wang To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com 
Subject: [PATCH 16/17] [APX NDD] Support APX NDD for cmove insns Date: Tue, 5 Dec 2023 10:29:47 +0800 Message-Id: <20231205022948.504790-17-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com> References: <20231205022948.504790-1-hongyu.wang@intel.com> MIME-Version: 1.0 Received-SPF: softfail client-ip=192.55.52.136; envelope-from=wwwhhhyyy333@gmail.com; helo=mgamail.intel.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_PASS, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784407771635556945 X-GMAIL-MSGID: 1784407771635556945 gcc/ChangeLog: * config/i386/i386.md (*movcc_noc): Extend with new constraints to support NDD. (*movsicc_noc_zext): Likewise. (*movsicc_noc_zext_1): Likewise. (*movqicc_noc): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd-cmov.c: New test. 
--- gcc/config/i386/i386.md | 48 ++++++++++++-------- gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c | 16 +++++++ 2 files changed, 45 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 0af7e82deee..853f53c2bb9 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -24412,47 +24412,56 @@ (define_split (neg:SWI (ltu:SWI (reg:CCC FLAGS_REG) (const_int 0))))]) (define_insn "*movcc_noc" - [(set (match_operand:SWI248 0 "register_operand" "=r,r") + [(set (match_operand:SWI248 0 "register_operand" "=r,r,r,r") (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SWI248 2 "nonimmediate_operand" "rm,0") - (match_operand:SWI248 3 "nonimmediate_operand" "0,rm")))] + (match_operand:SWI248 2 "nonimmediate_operand" "rm,0,rm,r") + (match_operand:SWI248 3 "nonimmediate_operand" "0,rm,r,rm")))] "TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %0|%0, %2} - cmov%O2%c1\t{%3, %0|%0, %3}" - [(set_attr "type" "icmov") + cmov%O2%c1\t{%3, %0|%0, %3} + cmov%O2%C1\t{%2, %3, %0|%0, %3, %2} + cmov%O2%c1\t{%3, %2, %0|%0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "")]) (define_insn "*movsicc_noc_zext" - [(set (match_operand:DI 0 "register_operand" "=r,r") + [(set (match_operand:DI 0 "register_operand" "=r,r,r,r") (if_then_else:DI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) (zero_extend:DI - (match_operand:SI 2 "nonimmediate_operand" "rm,0")) + (match_operand:SI 2 "nonimmediate_operand" "rm,0,rm,r")) (zero_extend:DI - (match_operand:SI 3 "nonimmediate_operand" "0,rm"))))] + (match_operand:SI 3 "nonimmediate_operand" "0,rm,r,rm"))))] "TARGET_64BIT && TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %k0|%k0, %2} - cmov%O2%c1\t{%3, %k0|%k0, %3}" - [(set_attr 
"type" "icmov") + cmov%O2%c1\t{%3, %k0|%k0, %3} + cmov%O2%C1\t{%2, %3, %k0|%k0, %3, %2} + cmov%O2%c1\t{%3, %2, %k0|%k0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "SI")]) (define_insn "*movsicc_noc_zext_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,r") (zero_extend:DI (if_then_else:SI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:SI 2 "nonimmediate_operand" "rm,0") - (match_operand:SI 3 "nonimmediate_operand" "0,rm"))))] + (match_operand:SI 2 "nonimmediate_operand" "rm,0,rm,r") + (match_operand:SI 3 "nonimmediate_operand" "0,rm,r,rm"))))] "TARGET_64BIT && TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" "@ cmov%O2%C1\t{%2, %k0|%k0, %2} - cmov%O2%c1\t{%3, %k0|%k0, %3}" - [(set_attr "type" "icmov") + cmov%O2%c1\t{%3, %k0|%k0, %3} + cmov%O2%C1\t{%2, %3, %k0|%k0, %3, %2} + cmov%O2%c1\t{%3, %2, %k0|%k0, %2, %3}" + [(set_attr "isa" "*,*,apx_ndd,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "SI")]) @@ -24477,14 +24486,15 @@ (define_split }) (define_insn "*movqicc_noc" - [(set (match_operand:QI 0 "register_operand" "=r,r") + [(set (match_operand:QI 0 "register_operand" "=r,r,r") (if_then_else:QI (match_operator 1 "ix86_comparison_operator" [(reg FLAGS_REG) (const_int 0)]) - (match_operand:QI 2 "register_operand" "r,0") - (match_operand:QI 3 "register_operand" "0,r")))] + (match_operand:QI 2 "register_operand" "r,0,r") + (match_operand:QI 3 "register_operand" "0,r,r")))] "TARGET_CMOVE && !TARGET_PARTIAL_REG_STALL" "#" - [(set_attr "type" "icmov") + [(set_attr "isa" "*,*,apx_ndd") + (set_attr "type" "icmov") (set_attr "mode" "QI")]) (define_split diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c b/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c new file mode 100644 index 00000000000..459dc965342 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c @@ -0,0 +1,16 
@@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -m64 -mapxf" } */ +/* { dg-final { scan-assembler-times "cmove\[^\n\r]*, %eax" 1 } } */ +/* { dg-final { scan-assembler-times "cmovge\[^\n\r]*, %eax" 1 } } */ + +unsigned int c[4]; + +unsigned long long foo1 (int a, unsigned int b) +{ + return a ? b : c[1]; +} + +unsigned int foo3 (int a, int b, unsigned int c, unsigned int d) +{ + return a < b ? c : d; +} From patchwork Tue Dec 5 02:29:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 173691 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3176678vqy; Mon, 4 Dec 2023 18:41:59 -0800 (PST) X-Google-Smtp-Source: AGHT+IH9yy1BEd0xl+8NiBxtwDbXitp2kC6hfzjMR9jVbjfxQY+PvapQ6N5Uvfn56sst5T9h9xfn X-Received: by 2002:a05:622a:180d:b0:425:4043:18ab with SMTP id t13-20020a05622a180d00b00425404318abmr734310qtc.94.1701744119136; Mon, 04 Dec 2023 18:41:59 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701744119; cv=pass; d=google.com; s=arc-20160816; b=mB7K2LjDK5qFavZRzoqy1DGSvbib5IT7pbJApxbebO5l4l0dnSPu9Ko5Sx0T7FZP/z SHybzNBDPUyAe5BcWabGs6LaqYKTepSgxnBOSYu9o0jEn4rEUoHb1yexUGtccwWP+yqV wV0rDJI61XV/dwqfOH0h2Z8UFNwelovsHCIM8QCcCPEIF2R/vCtcL4OhV16gqCry6ZA5 RHTx2CmJmHIFZ6XC2FJwug0tBKajhDlSq9WBCcyWLKxd8YOaFhkdZ4b34Kji4NHYtgDe HVynx8vMqI/wG9w9EPYwuvgyYzjbQrySrVZZ2WM9JqPDqYMLuo9ENt3RSIjM69T4THAg OlVw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=ncRz2VySLMpiiW1btngQ5xBQDo/Ly8x/tin+gWZT4pY=; fh=n8eNxIWSYJwy/CU3QSXzDvE/zeEoomCGojuOcYEQEyQ=; b=dkm6Ai2s/AXP5gJtRaeoNfqV0GVG8P4k2x6vP8wxHdvc0c6jF2rPu69jX7Iwp1y10S 
tpxRHwu+7TiGmwMwTWusty3U4789bwvF7pnWezajlyBx53tXyBCgARa5Hn40xlXEoXfN 3F2cC9itexC8vgzShRNH7kFvhUKTIVQE+GG9wXvKtCoj+vxwHhtXT2edpZo7eBfedhhx OHBbGVCTYiEkSV4Me2g7pN/sOnnZXJpSitWu01hxTkYpJ2B3q/awAriVDXoUNI/3BwRl T3uUyqRXtdrFYljeJ6/s0Fk/KNr6w/sgv9Ycy48mj8z66v0gC3f7apYQqrlMOqa7XnO3 /JHA== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=C1EDh8Qy; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id bw25-20020a05622a099900b00423807a611dsi11896541qtb.424.2023.12.04.18.41.59 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 18:41:59 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=C1EDh8Qy; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D93A03ABA6A6 for ; Tue, 5 Dec 2023 02:33:33 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 556C13898C67 for ; Tue, 5 Dec 2023 02:31:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 556C13898C67 Authentication-Results: 
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 17/17] [APX NDD] Support TImode shift for NDD
Date: Tue, 5 Dec 2023 10:29:48 +0800
Message-Id: <20231205022948.504790-18-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>
MIME-Version: 1.0

TImode shifts are split by splitter functions that assume operands[0]
and operands[1] are the same.  For the NDD alternative this assumption
may not hold, so add split functions for NDD that emit the NDD form
instructions, and omit the handling of the !64bit target split.

Although the NDD form allows a memory source, the post-reload splitter
has no extra register available to accept an NDD form shift, especially
shld/shrd.  So only accept the register alternative for the shift
source under NDD.

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_split_ashl_ndd): New
	function to split NDD form lshift.
	(ix86_split_rshift_ndd): Likewise for l/ashiftrt.
	* config/i386/i386-protos.h (ix86_split_ashl_ndd): New
	prototype.
	(ix86_split_rshift_ndd): Likewise.
	* config/i386/i386.md (ashl3_doubleword): Add NDD
	alternative, call ndd split function when operands[0]
	not equal to operands[1].
	(define_split for doubleword lshift): Likewise.
	(define_peephole for doubleword lshift): Likewise.
	(3_doubleword): Likewise for l/ashiftrt.
	(define_split for doubleword l/ashiftrt): Likewise.
	(define_peephole for doubleword l/ashiftrt): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd-ti-shift.c: New test.
---
 gcc/config/i386/i386-expand.cc                | 136 ++++++++++++++++++
 gcc/config/i386/i386-protos.h                 |   2 +
 gcc/config/i386/i386.md                       |  56 ++++++--
 .../gcc.target/i386/apx-ndd-ti-shift.c        |  91 ++++++++++++
 4 files changed, 273 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index d4bbd33ce07..a53d69d5400 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -6678,6 +6678,142 @@ ix86_split_lshr (rtx *operands, rtx scratch, machine_mode mode)
     }
 }
 
+/* Helper function to split TImode ashl under NDD.  */
+void
+ix86_split_ashl_ndd (rtx *operands, rtx scratch)
+{
+  gcc_assert (TARGET_APX_NDD);
+  int half_width = GET_MODE_BITSIZE (TImode) >> 1;
+
+  rtx low[2], high[2];
+  int count;
+
+  split_double_mode (TImode, operands, 2, low, high);
+  if (CONST_INT_P (operands[2]))
+    {
+      count = INTVAL (operands[2]) & (GET_MODE_BITSIZE (TImode) - 1);
+
+      if (count >= half_width)
+        {
+          count = count - half_width;
+          if (count == 0)
+            {
+              if (!rtx_equal_p (high[0], low[1]))
+                emit_move_insn (high[0], low[1]);
+            }
+          else if (count == 1)
+            emit_insn (gen_adddi3 (high[0], low[1], low[1]));
+          else
+            emit_insn (gen_ashldi3 (high[0], low[1], GEN_INT (count)));
+
+          ix86_expand_clear (low[0]);
+        }
+      else if (count == 1)
+        {
+          rtx x3 = gen_rtx_REG (CCCmode, FLAGS_REG);
+          rtx x4 = gen_rtx_LTU (TImode, x3, const0_rtx);
+          emit_insn (gen_add3_cc_overflow_1 (DImode, low[0],
+                                             low[1], low[1]));
+          emit_insn (gen_add3_carry (DImode, high[0], high[1], high[1],
+                                     x3, x4));
+        }
+      else
+        {
+          emit_insn (gen_x86_64_shld_ndd (high[0], high[1], low[1],
+                                          GEN_INT (count)));
+          emit_insn (gen_ashldi3 (low[0], low[1], GEN_INT (count)));
+        }
+    }
+  else
+    {
+      emit_insn (gen_x86_64_shld_ndd (high[0], high[1], low[1],
+                                      operands[2]));
+      emit_insn (gen_ashldi3 (low[0], low[1], operands[2]));
+      if (TARGET_CMOVE && scratch)
+        {
+          ix86_expand_clear (scratch);
+          emit_insn (gen_x86_shift_adj_1
+                     (DImode, high[0], low[0], operands[2], scratch));
+        }
+      else
+        emit_insn (gen_x86_shift_adj_2 (DImode, high[0], low[0], operands[2]));
+    }
+}
+
+/* Helper function to split TImode l/ashr under NDD.  */
+void
+ix86_split_rshift_ndd (enum rtx_code code, rtx *operands, rtx scratch)
+{
+  gcc_assert (TARGET_APX_NDD);
+  int half_width = GET_MODE_BITSIZE (TImode) >> 1;
+  bool ashr_p = code == ASHIFTRT;
+  rtx (*gen_shr)(rtx, rtx, rtx) = ashr_p ? gen_ashrdi3
+                                         : gen_lshrdi3;
+
+  rtx low[2], high[2];
+  int count;
+
+  split_double_mode (TImode, operands, 2, low, high);
+  if (CONST_INT_P (operands[2]))
+    {
+      count = INTVAL (operands[2]) & (GET_MODE_BITSIZE (TImode) - 1);
+
+      if (ashr_p && (count == GET_MODE_BITSIZE (TImode) - 1))
+        {
+          emit_insn (gen_shr (high[0], high[1],
+                              GEN_INT (half_width - 1)));
+          emit_move_insn (low[0], high[0]);
+        }
+      else if (count >= half_width)
+        {
+          if (ashr_p)
+            emit_insn (gen_shr (high[0], high[1],
+                                GEN_INT (half_width - 1)));
+          else
+            ix86_expand_clear (high[0]);
+
+          if (count > half_width)
+            emit_insn (gen_shr (low[0], high[1],
+                                GEN_INT (count - half_width)));
+          else
+            emit_move_insn (low[0], high[1]);
+        }
+      else
+        {
+          emit_insn (gen_x86_64_shrd_ndd (low[0], low[1], high[1],
+                                          GEN_INT (count)));
+          emit_insn (gen_shr (high[0], high[1], GEN_INT (count)));
+        }
+    }
+  else
+    {
+      emit_insn (gen_x86_64_shrd_ndd (low[0], low[1], high[1],
+                                      operands[2]));
+      emit_insn (gen_shr (high[0], high[1], operands[2]));
+
+      if (TARGET_CMOVE && scratch)
+        {
+          if (ashr_p)
+            {
+              emit_move_insn (scratch, high[0]);
+              emit_insn (gen_shr (scratch, scratch,
+                                  GEN_INT (half_width - 1)));
+            }
+          else
+            ix86_expand_clear (scratch);
+
+          emit_insn (gen_x86_shift_adj_1
+                     (DImode, low[0], high[0], operands[2], scratch));
+        }
+      else if (ashr_p)
+        emit_insn (gen_x86_shift_adj_3
+                   (DImode, low[0], high[0], operands[2]));
+      else
+        emit_insn (gen_x86_shift_adj_2
+                   (DImode, low[0], high[0], operands[2]));
+    }
+}
+
 /* Expand move of V1TI mode register X to a new TI mode register.  */
 static rtx
 ix86_expand_v1ti_to_ti (rtx x)
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index fa952409729..56349064a6c 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -174,8 +174,10 @@ extern void x86_initialize_trampoline (rtx, rtx, rtx);
 extern rtx ix86_zero_extend_to_Pmode (rtx);
 extern void ix86_split_long_move (rtx[]);
 extern void ix86_split_ashl (rtx *, rtx, machine_mode);
+extern void ix86_split_ashl_ndd (rtx *, rtx);
 extern void ix86_split_ashr (rtx *, rtx, machine_mode);
 extern void ix86_split_lshr (rtx *, rtx, machine_mode);
+extern void ix86_split_rshift_ndd (enum rtx_code, rtx *, rtx);
 extern void ix86_expand_v1ti_shift (enum rtx_code, rtx[]);
 extern void ix86_expand_v1ti_rotate (enum rtx_code, rtx[]);
 extern void ix86_expand_v1ti_ashiftrt (rtx[]);
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 853f53c2bb9..331dda89b29 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -14420,13 +14420,14 @@ (define_insn_and_split "*ashl3_doubleword_mask_1"
 })
 
 (define_insn "ashl3_doubleword"
-  [(set (match_operand:DWI 0 "register_operand" "=&r")
-        (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n")
-                    (match_operand:QI 2 "nonmemory_operand" "c")))
+  [(set (match_operand:DWI 0 "register_operand" "=&r,r")
+        (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n,r")
+                    (match_operand:QI 2 "nonmemory_operand" "c,c")))
    (clobber (reg:CC FLAGS_REG))]
   ""
   "#"
-  [(set_attr "type" "multi")])
+  [(set_attr "type" "multi")
+   (set_attr "isa" "*,apx_ndd")])
 
 (define_split
   [(set (match_operand:DWI 0 "register_operand")
@@ -14435,7 +14436,15 @@ (define_split
    (clobber (reg:CC FLAGS_REG))]
   "epilogue_completed"
   [(const_int 0)]
-  "ix86_split_ashl (operands, NULL_RTX, mode); DONE;")
+{
+  if (TARGET_APX_NDD
+      && !rtx_equal_p (operands[0], operands[1])
+      && REG_P (operands[1]))
+    ix86_split_ashl_ndd (operands, NULL_RTX);
+  else
+    ix86_split_ashl (operands, NULL_RTX, mode);
+  DONE;
+})
 
 ;; By default we don't ask for a scratch register, because when DWImode
 ;; values are manipulated, registers are already at a premium.  But if
@@ -14451,7 +14460,15 @@ (define_peephole2
    (match_dup 3)]
   "TARGET_CMOVE"
   [(const_int 0)]
-  "ix86_split_ashl (operands, operands[3], mode); DONE;")
+{
+  if (TARGET_APX_NDD
+      && !rtx_equal_p (operands[0], operands[1])
+      && (REG_P (operands[1])))
+    ix86_split_ashl_ndd (operands, operands[3]);
+  else
+    ix86_split_ashl (operands, operands[3], mode);
+  DONE;
+})
 
 (define_insn_and_split "*ashl3_doubleword_highpart"
   [(set (match_operand: 0 "register_operand" "=r")
@@ -15708,16 +15725,24 @@ (define_insn_and_split "*3_doubleword_mask_1"
 })
 
 (define_insn_and_split "3_doubleword"
-  [(set (match_operand:DWI 0 "register_operand" "=&r")
-        (any_shiftrt:DWI (match_operand:DWI 1 "register_operand" "0")
-                         (match_operand:QI 2 "nonmemory_operand" "c")))
+  [(set (match_operand:DWI 0 "register_operand" "=&r,r")
+        (any_shiftrt:DWI (match_operand:DWI 1 "register_operand" "0,r")
+                         (match_operand:QI 2 "nonmemory_operand" "c,c")))
    (clobber (reg:CC FLAGS_REG))]
   ""
   "#"
   "epilogue_completed"
   [(const_int 0)]
-  "ix86_split_ (operands, NULL_RTX, mode); DONE;"
-  [(set_attr "type" "multi")])
+{
+  if (TARGET_APX_NDD
+      && !rtx_equal_p (operands[0], operands[1]))
+    ix86_split_rshift_ndd (, operands, NULL_RTX);
+  else
+    ix86_split_ (operands, NULL_RTX, mode);
+  DONE;
+}
+  [(set_attr "type" "multi")
+   (set_attr "isa" "*,apx_ndd")])
 
 ;; By default we don't ask for a scratch register, because when DWImode
 ;; values are manipulated, registers are already at a premium.  But if
@@ -15733,7 +15758,14 @@ (define_peephole2
    (match_dup 3)]
   "TARGET_CMOVE"
   [(const_int 0)]
-  "ix86_split_ (operands, operands[3], mode); DONE;")
+{
+  if (TARGET_APX_NDD
+      && !rtx_equal_p (operands[0], operands[1]))
+    ix86_split_rshift_ndd (, operands, operands[3]);
+  else
+    ix86_split_ (operands, operands[3], mode);
+  DONE;
+})
 
 ;; Split truncations of double word right shifts into x86_shrd_1.
 (define_insn_and_split "3_doubleword_lowpart"
diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c b/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c
new file mode 100644
index 00000000000..0489712b7f6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c
@@ -0,0 +1,91 @@
+/* { dg-do run { target { int128 && { ! ia32 } } } } */
+/* { dg-require-effective-target apxf } */
+/* { dg-options "-O2" } */
+
+#include 
+
+#define APX_TARGET __attribute__((noinline, target("apxf")))
+#define NO_APX __attribute__((noinline, target("no-apxf")))
+typedef __uint128_t u128;
+typedef __int128 i128;
+
+#define TI_SHIFT_FUNC(TYPE, op, name) \
+APX_TARGET \
+TYPE apx_##name##TYPE (TYPE a, char b) \
+{ \
+  return a op b; \
+} \
+TYPE noapx_##name##TYPE (TYPE a, char b) \
+{ \
+  return a op b; \
+} \
+
+#define TI_SHIFT_FUNC_CONST(TYPE, i, op, name) \
+APX_TARGET \
+TYPE apx_##name##TYPE##_const (TYPE a) \
+{ \
+  return a op i; \
+} \
+NO_APX \
+TYPE noapx_##name##TYPE##_const (TYPE a) \
+{ \
+  return a op i; \
+}
+
+#define TI_SHIFT_TEST(TYPE, name, val) \
+{\
+  if (apx_##name##TYPE (val, b) != noapx_##name##TYPE (val, b)) \
+    abort (); \
+}
+
+#define TI_SHIFT_CONST_TEST(TYPE, name, val) \
+{\
+  if (apx_##name##1##TYPE##_const (val) \
+      != noapx_##name##1##TYPE##_const (val)) \
+    abort (); \
+  if (apx_##name##2##TYPE##_const (val) \
+      != noapx_##name##2##TYPE##_const (val)) \
+    abort (); \
+  if (apx_##name##3##TYPE##_const (val) \
+      != noapx_##name##3##TYPE##_const (val)) \
+    abort (); \
+  if (apx_##name##4##TYPE##_const (val) \
+      != noapx_##name##4##TYPE##_const (val)) \
+    abort (); \
+}
+
+TI_SHIFT_FUNC(i128, <<, ashl)
+TI_SHIFT_FUNC(i128, >>, ashr)
+TI_SHIFT_FUNC(u128, >>, lshr)
+
+TI_SHIFT_FUNC_CONST(i128, 1, <<, ashl1)
+TI_SHIFT_FUNC_CONST(i128, 65, <<, ashl2)
+TI_SHIFT_FUNC_CONST(i128, 64, <<, ashl3)
+TI_SHIFT_FUNC_CONST(i128, 87, <<, ashl4)
+TI_SHIFT_FUNC_CONST(i128, 127, >>, ashr1)
+TI_SHIFT_FUNC_CONST(i128, 87, >>, ashr2)
+TI_SHIFT_FUNC_CONST(i128, 27, >>, ashr3)
+TI_SHIFT_FUNC_CONST(i128, 64, >>, ashr4)
+TI_SHIFT_FUNC_CONST(u128, 127, >>, lshr1)
+TI_SHIFT_FUNC_CONST(u128, 87, >>, lshr2)
+TI_SHIFT_FUNC_CONST(u128, 27, >>, lshr3)
+TI_SHIFT_FUNC_CONST(u128, 64, >>, lshr4)
+
+int main (void)
+{
+  if (!__builtin_cpu_supports ("apxf"))
+    return 0;
+
+  u128 ival = 0x123456788765432FLL;
+  u128 uval = 0xF234567887654321ULL;
+  char b = 28;
+
+  TI_SHIFT_TEST(i128, ashl, ival)
+  TI_SHIFT_TEST(i128, ashr, ival)
+  TI_SHIFT_TEST(u128, lshr, uval)
+  TI_SHIFT_CONST_TEST(i128, ashl, ival)
+  TI_SHIFT_CONST_TEST(i128, ashr, ival)
+  TI_SHIFT_CONST_TEST(u128, lshr, uval)
+
+  return 0;
+}
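
Reviewer note: the constant-count path of ix86_split_ashl_ndd can be
modelled in plain C to check the half-word arithmetic.  The sketch below
is not part of the patch; the type `halves` and the functions
`model_ashl` and `model_ashl_matches` are hypothetical names introduced
only for illustration.  It assumes a compiler that provides
`unsigned __int128`, folds the count == 1 add/adc case into the generic
shld-like case, and treats count == 0 separately because a 64-bit shift
by 64 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a TImode value as two DImode halves.  */
typedef struct { uint64_t lo, hi; } halves;

/* Left shift by a constant count into a fresh destination, leaving the
   source halves untouched (the non-destructive, NDD-like form).  */
static halves
model_ashl (halves src, unsigned count)
{
  halves dst;
  count &= 127;                 /* Mask the count to the TImode width.  */
  if (count >= 64)
    {
      /* High half is built from the low source half; low half is cleared.  */
      dst.hi = src.lo << (count - 64);
      dst.lo = 0;
    }
  else if (count == 0)
    dst = src;                  /* Avoid the undefined shift by 64 below.  */
  else
    {
      /* shld-like step: the high half receives the bits shifted out of
         the low half, then the low half is shifted on its own.  */
      dst.hi = (src.hi << count) | (src.lo >> (64 - count));
      dst.lo = src.lo << count;
    }
  return dst;
}

/* Return 1 if the model agrees with the native 128-bit shift.  */
static int
model_ashl_matches (unsigned __int128 v, unsigned count)
{
  halves src = { (uint64_t) v, (uint64_t) (v >> 64) };
  halves dst = model_ashl (src, count);
  unsigned __int128 ref = v << (count & 127);
  return dst.lo == (uint64_t) ref && dst.hi == (uint64_t) (ref >> 64);
}
```

Calling model_ashl_matches with the constants used in the new test
(1, 64, 65, 87) exercises each branch of the model; it says nothing
about the generated instructions themselves, which still need the
dg-do run test above on APX hardware or an emulator.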