From patchwork Wed Nov 15 09:47:02 2023
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 165244
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 13/16] [APX NDD] Support APX NDD for right shift insns
Date: Wed, 15 Nov 2023 17:47:02 +0800
Message-Id: <20231115094705.3976553-14-hongyu.wang@intel.com>
In-Reply-To: <20231115094705.3976553-1-hongyu.wang@intel.com>
References: <20231115094705.3976553-1-hongyu.wang@intel.com>

As with the left-shift patterns, the right-shift patterns should emit an
explicit $1 in the NDD form when operands[1] is CX_REG.

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_can_use_ndd_p): Add ASHIFTRT
	and LSHIFTRT.
	* config/i386/i386.md (ashr3_cvt): Extend with new
	alternatives to support NDD, and adjust output templates.
	(*ashrsi3_cvt_zext): Likewise.
	(*ashr3_1): Likewise for SI/DI mode.
	(*highpartdisi2): Likewise.
	(*lshr3_1): Likewise.
	(*si3_1_zext): Likewise.
	(*ashr3_1): Likewise for QI/HI mode.
	(*lshrqi3_1): Likewise.
	(*lshrhi3_1): Likewise.
	(3_cmp): Likewise.
	(*si3_cmp_zext): Likewise.
	(*3_cconly): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add l/ashiftrt tests.
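Reviewer note: the template choice described above can be modelled in
plain C. This is only a minimal sketch. The function name, its boolean
parameters and the simplified template strings are hypothetical and are
not taken from i386.md. Here prefer_short_form stands in for
"TARGET_SHIFT1 || optimize_function_for_size_p (cfun)", use_ndd for
"the NDD alternative was selected", and src_is_cx for "operands[1] is
CX_REG".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the template choice in the patched right-shift
   patterns.  The strings are simplified AT&T-style templates, not the
   exact ones used in i386.md.  */
static const char *
pick_sar_template (bool count_is_one, bool prefer_short_form,
                   bool use_ndd, bool src_is_cx)
{
  /* The implicit-one form is skipped when the NDD alternative is
     selected and the source register (operands[1]) is CX.  */
  if (count_is_one && prefer_short_form && !(use_ndd && src_is_cx))
    return use_ndd ? "sar\t%1, %0" : "sar\t%0";

  /* Otherwise the count (operands[2]) is printed explicitly.  */
  return use_ndd ? "sar\t%2, %1, %0" : "sar\t%2, %0";
}

/* Self-check covering the legacy form, the NDD form, and the CX case.  */
static void
check_sar_templates (void)
{
  assert (strcmp (pick_sar_template (true, true, false, false),
                  "sar\t%0") == 0);
  assert (strcmp (pick_sar_template (true, true, true, false),
                  "sar\t%1, %0") == 0);
  assert (strcmp (pick_sar_template (true, true, true, true),
                  "sar\t%2, %1, %0") == 0);
  assert (strcmp (pick_sar_template (false, true, false, false),
                  "sar\t%2, %0") == 0);
}
```

Calling check_sar_templates () from a test driver exercises all four
cases. The third assertion is the one the commit message is about: a
shift by one with CX as the source still gets the explicit count.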
---
 gcc/config/i386/i386-expand.cc          |   2 +
 gcc/config/i386/i386.md                 | 265 +++++++++++++++---------
 gcc/testsuite/gcc.target/i386/apx-ndd.c |  24 +++
 3 files changed, 191 insertions(+), 100 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 7e3080482a6..8e040346fbb 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -1277,6 +1277,8 @@ bool ix86_can_use_ndd_p (enum rtx_code code)
     case IOR:
     case XOR:
     case ASHIFT:
+    case ASHIFTRT:
+    case LSHIFTRT:
       return true;
     default:
       return false;
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a0e81545f17..3ff333d4a41 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -15490,39 +15490,45 @@ (define_mode_attr cvt_mnemonic
   [(SI "{cltd|cdq}") (DI "{cqto|cqo}")])

 (define_insn "ashr3_cvt"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm")
+  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=*d,rm,r")
         (ashiftrt:SWI48
-          (match_operand:SWI48 1 "nonimmediate_operand" "*a,0")
+          (match_operand:SWI48 1 "nonimmediate_operand" "*a,0,rm")
           (match_operand:QI 2 "const_int_operand")))
    (clobber (reg:CC FLAGS_REG))]
   "INTVAL (operands[2]) == GET_MODE_BITSIZE (mode)-1
    && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun))
-   && ix86_binary_operator_ok (ASHIFTRT, mode, operands)"
+   && ix86_binary_operator_ok (ASHIFTRT, mode, operands,
+                               ix86_can_use_ndd_p (ASHIFTRT))"
   "@
-   sar{}\t{%2, %0|%0, %2}"
-  [(set_attr "type" "imovx,ishift")
-   (set_attr "prefix_0f" "0,*")
-   (set_attr "length_immediate" "0,*")
-   (set_attr "modrm" "0,1")
+   sar{}\t{%2, %0|%0, %2}
+   sar{}\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,*,apx_ndd")
+   (set_attr "type" "imovx,ishift,ishift")
+   (set_attr "prefix_0f" "0,*,*")
+   (set_attr "length_immediate" "0,*,*")
+   (set_attr "modrm" "0,1,1")
    (set_attr "mode" "")])

 (define_insn "*ashrsi3_cvt_zext"
-  [(set (match_operand:DI 0 "register_operand" "=*d,r")
+  [(set (match_operand:DI 0 "register_operand" "=*d,r,r")
         (zero_extend:DI
-          (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0")
+          (ashiftrt:SI (match_operand:SI 1 "register_operand" "*a,0,r")
                        (match_operand:QI 2 "const_int_operand"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && INTVAL (operands[2]) == 31
    && (TARGET_USE_CLTD || optimize_function_for_size_p (cfun))
-   && ix86_binary_operator_ok (ASHIFTRT, SImode, operands)"
+   && ix86_binary_operator_ok (ASHIFTRT, SImode, operands,
+                               ix86_can_use_ndd_p (ASHIFTRT))"
   "@
    {cltd|cdq}
-   sar{l}\t{%2, %k0|%k0, %2}"
-  [(set_attr "type" "imovx,ishift")
-   (set_attr "prefix_0f" "0,*")
-   (set_attr "length_immediate" "0,*")
-   (set_attr "modrm" "0,1")
+   sar{l}\t{%2, %k0|%k0, %2}
+   sar{l}\t{%2, %1, %k0|%k0, %1, %2}"
+  [(set_attr "isa" "*,*,apx_ndd")
+   (set_attr "type" "imovx,ishift,ishift")
+   (set_attr "prefix_0f" "0,*,*")
+   (set_attr "length_immediate" "0,*,*")
+   (set_attr "modrm" "0,1,1")
    (set_attr "mode" "SI")])

 (define_expand "@x86_shift_adj_3"
@@ -15564,13 +15570,15 @@ (define_insn "*bmi2_3_1"
    (set_attr "mode" "")])

 (define_insn "*ashr3_1"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r")
+  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r")
         (ashiftrt:SWI48
-          (match_operand:SWI48 1 "nonimmediate_operand" "0,rm")
-          (match_operand:QI 2 "nonmemory_operand" "c,r")))
+          (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm")
+          (match_operand:QI 2 "nonmemory_operand" "c,r,c")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (ASHIFTRT, mode, operands)"
+  "ix86_binary_operator_ok (ASHIFTRT, mode, operands,
+                            ix86_can_use_ndd_p (ASHIFTRT))"
 {
+  bool use_ndd = (which_alternative == 2);
   switch (get_attr_type (insn))
     {
     case TYPE_ISHIFTX:
@@ -15578,14 +15586,18 @@ (define_insn "*ashr3_1"

     default:
       if (operands[2] == const1_rtx
-          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-        return "sar{}\t%0";
+          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+          && !(use_ndd && REG_P (operands[1])
+               && REGNO (operands[1]) == CX_REG))
+        return use_ndd ? "sar{}\t{%1, %0|%0, %1}"
+                       : "sar{}\t%0";
       else
-        return "sar{}\t{%2, %0|%0, %2}";
+        return use_ndd ? "sar{}\t{%2, %1, %0|%0, %1, %2}"
+                       : "sar{}\t{%2, %0|%0, %2}";
     }
 }
-  [(set_attr "isa" "*,bmi2")
-   (set_attr "type" "ishift,ishiftx")
+  [(set_attr "isa" "*,bmi2,apx_ndd")
+   (set_attr "type" "ishift,ishiftx,ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (match_operand 2 "const1_operand")
@@ -15598,8 +15610,8 @@ (define_insn "*ashr3_1"
 ;; Specialization of *lshr3_1 below, extracting the SImode
 ;; highpart of a DI to be extracted, but allowing it to be clobbered.
 (define_insn_and_split "*highpartdisi2"
-  [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k") 0)
-        (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k")
+  [(set (subreg:DI (match_operand:SI 0 "register_operand" "=r,x,?k,r") 0)
+        (lshiftrt:DI (match_operand:DI 1 "register_operand" "0,0,k,r")
                      (const_int 32)))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT"
@@ -15618,16 +15630,20 @@ (define_insn_and_split "*highpartdisi2"
       DONE;
     }
   operands[0] = gen_rtx_REG (DImode, REGNO (operands[0]));
-})
+}
+[(set_attr "isa" "*,*,*,apx_ndd")])
+
 (define_insn "*lshr3_1"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k")
+  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,?k,r")
         (lshiftrt:SWI48
-          (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k")
-          (match_operand:QI 2 "nonmemory_operand" "c,r,")))
+          (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,k,rm")
+          (match_operand:QI 2 "nonmemory_operand" "c,r,,c")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (LSHIFTRT, mode, operands)"
+  "ix86_binary_operator_ok (LSHIFTRT, mode, operands,
+                            ix86_can_use_ndd_p (LSHIFTRT))"
 {
+  bool use_ndd = (which_alternative == 3);
   switch (get_attr_type (insn))
     {
     case TYPE_ISHIFTX:
@@ -15636,14 +15652,18 @@ (define_insn "*lshr3_1"

     default:
       if (operands[2] == const1_rtx
-          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-        return "shr{}\t%0";
+          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+          && !(use_ndd && REG_P (operands[1])
+               && REGNO (operands[1]) == CX_REG))
+        return use_ndd ? "shr{}\t{%1, %0|%0, %1}"
+                       : "shr{}\t%0";
       else
-        return "shr{}\t{%2, %0|%0, %2}";
+        return use_ndd ? "shr{}\t{%2, %1, %0|%0, %1, %2}"
+                       : "shr{}\t{%2, %0|%0, %2}";
     }
 }
-  [(set_attr "isa" "*,bmi2,")
-   (set_attr "type" "ishift,ishiftx,msklog")
+  [(set_attr "isa" "*,bmi2,,apx_ndd")
+   (set_attr "type" "ishift,ishiftx,msklog,ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (and (match_operand 2 "const1_operand")
@@ -15676,13 +15696,15 @@ (define_insn "*bmi2_si3_1_zext"
    (set_attr "mode" "SI")])

 (define_insn "*si3_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r,r")
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r")
         (zero_extend:DI
-         (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")
-                         (match_operand:QI 2 "nonmemory_operand" "cI,r"))))
+         (any_shiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm")
+                         (match_operand:QI 2 "nonmemory_operand" "cI,r,cI"))))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)"
+  "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands,
+                                            ix86_can_use_ndd_p ())"
 {
+  bool use_ndd = (which_alternative == 2);
   switch (get_attr_type (insn))
     {
     case TYPE_ISHIFTX:
@@ -15690,14 +15712,18 @@ (define_insn "*si3_1_zext"

     default:
       if (operands[2] == const1_rtx
-          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-        return "{l}\t%k0";
+          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+          && !(use_ndd && REG_P (operands[1])
+               && REGNO (operands[1]) == CX_REG))
+        return use_ndd ? "{l}\t{%1, %k0|%k0, %1}"
+                       : "{l}\t%k0";
       else
-        return "{l}\t{%2, %k0|%k0, %2}";
+        return use_ndd ? "{l}\t{%2, %1, %k0|%k0, %1, %2}"
+                       : "{l}\t{%2, %k0|%k0, %2}";
     }
 }
-  [(set_attr "isa" "*,bmi2")
-   (set_attr "type" "ishift,ishiftx")
+  [(set_attr "isa" "*,bmi2,apx_ndd")
+   (set_attr "type" "ishift,ishiftx,ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (match_operand 2 "const1_operand")
@@ -15720,20 +15746,26 @@ (define_split
   "operands[2] = gen_lowpart (SImode, operands[2]);")

 (define_insn "*ashr3_1"
-  [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m")
+  [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m, r")
         (ashiftrt:SWI12
-          (match_operand:SWI12 1 "nonimmediate_operand" "0")
-          (match_operand:QI 2 "nonmemory_operand" "c")))
+          (match_operand:SWI12 1 "nonimmediate_operand" "0, rm")
+          (match_operand:QI 2 "nonmemory_operand" "c, c")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (ASHIFTRT, mode, operands)"
+  "ix86_binary_operator_ok (ASHIFTRT, mode, operands,
+                            ix86_can_use_ndd_p (ASHIFTRT))"
 {
   if (operands[2] == const1_rtx
-      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-    return "sar{}\t%0";
+      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+      && !(which_alternative && REG_P (operands[1])
+           && REGNO (operands[1]) == CX_REG))
+    return which_alternative ? "sar{}\t{%1, %0|%0, %1}"
+                             : "sar{}\t%0";
   else
-    return "sar{}\t{%2, %0|%0, %2}";
+    return which_alternative ? "sar{}\t{%2, %1, %0|%0, %1, %2}"
+                             : "sar{}\t{%2, %0|%0, %2}";
 }
-  [(set_attr "type" "ishift")
+  [(set_attr "isa" "*, apx_ndd")
+   (set_attr "type" "ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (match_operand 2 "const1_operand")
@@ -15744,29 +15776,35 @@ (define_insn "*ashr3_1"
    (set_attr "mode" "")])

 (define_insn "*lshrqi3_1"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k")
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,?k,r")
         (lshiftrt:QI
-          (match_operand:QI 1 "nonimmediate_operand" "0, k")
-          (match_operand:QI 2 "nonmemory_operand" "cI,Wb")))
+          (match_operand:QI 1 "nonimmediate_operand" "0, k, rm")
+          (match_operand:QI 2 "nonmemory_operand" "cI,Wb,cI")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (LSHIFTRT, QImode, operands)"
+  "ix86_binary_operator_ok (LSHIFTRT, QImode, operands,
+                            ix86_can_use_ndd_p (LSHIFTRT))"
 {
+  bool use_ndd = (which_alternative == 2);
   switch (get_attr_type (insn))
     {
     case TYPE_ISHIFT:
       if (operands[2] == const1_rtx
-          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-        return "shr{b}\t%0";
+          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+          && !(use_ndd && REG_P (operands[1])
+               && REGNO (operands[1]) == CX_REG))
+        return use_ndd ? "shr{b}\t{%1, %0|%0, %1}"
+                       : "shr{b}\t%0";
       else
-        return "shr{b}\t{%2, %0|%0, %2}";
+        return use_ndd ? "shr{b}\t{%2, %1, %0|%0, %1, %2}"
+                       : "shr{b}\t{%2, %0|%0, %2}";
     case TYPE_MSKLOG:
       return "#";
     default:
       gcc_unreachable ();
     }
 }
-  [(set_attr "isa" "*,avx512dq")
-   (set_attr "type" "ishift,msklog")
+  [(set_attr "isa" "*,avx512dq,apx_ndd")
+   (set_attr "type" "ishift,msklog,ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (and (match_operand 2 "const1_operand")
@@ -15778,29 +15816,35 @@ (define_insn "*lshrqi3_1"
    (set_attr "mode" "QI")])

 (define_insn "*lshrhi3_1"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k")
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm, ?k, r")
         (lshiftrt:HI
-          (match_operand:HI 1 "nonimmediate_operand" "0, k")
-          (match_operand:QI 2 "nonmemory_operand" "cI, Ww")))
+          (match_operand:HI 1 "nonimmediate_operand" "0, k, rm")
+          (match_operand:QI 2 "nonmemory_operand" "cI, Ww, cI")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (LSHIFTRT, HImode, operands)"
+  "ix86_binary_operator_ok (LSHIFTRT, HImode, operands,
+                            ix86_can_use_ndd_p (LSHIFTRT))"
 {
+  bool use_ndd = (which_alternative == 2);
   switch (get_attr_type (insn))
     {
     case TYPE_ISHIFT:
       if (operands[2] == const1_rtx
-          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-        return "shr{w}\t%0";
+          && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+          && !(use_ndd && REG_P (operands[1])
+               && REGNO (operands[1]) == CX_REG))
+        return use_ndd ? "shr{w}\t{%1, %0|%0, %1}"
+                       : "shr{w}\t%0";
       else
-        return "shr{w}\t{%2, %0|%0, %2}";
+        return use_ndd ? "shr{w}\t{%2, %1, %0|%0, %1, %2}"
+                       : "shr{w}\t{%2, %0|%0, %2}";
     case TYPE_MSKLOG:
       return "#";
     default:
       gcc_unreachable ();
     }
 }
-  [(set_attr "isa" "*, avx512f")
-   (set_attr "type" "ishift,msklog")
+  [(set_attr "isa" "*, avx512f, apx_ndd")
+   (set_attr "type" "ishift,msklog,ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (and (match_operand 2 "const1_operand")
@@ -15853,25 +15897,31 @@ (define_insn "*3_cmp"
   [(set (reg FLAGS_REG)
         (compare
           (any_shiftrt:SWI
-            (match_operand:SWI 1 "nonimmediate_operand" "0")
-            (match_operand:QI 2 "" ""))
+            (match_operand:SWI 1 "nonimmediate_operand" "0,rm")
+            (match_operand:QI 2 "" ","))
           (const_int 0)))
-   (set (match_operand:SWI 0 "nonimmediate_operand" "=m")
+   (set (match_operand:SWI 0 "nonimmediate_operand" "=m,r")
         (any_shiftrt:SWI (match_dup 1) (match_dup 2)))]
   "(optimize_function_for_size_p (cfun)
     || !TARGET_PARTIAL_FLAG_REG_STALL
     || (operands[2] == const1_rtx
         && TARGET_SHIFT1))
    && ix86_match_ccmode (insn, CCGOCmode)
-   && ix86_binary_operator_ok (, mode, operands)"
+   && ix86_binary_operator_ok (, mode, operands,
+                               ix86_can_use_ndd_p ())"
 {
   if (operands[2] == const1_rtx
-      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-    return "{}\t%0";
+      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+      && !(which_alternative && REG_P (operands[1])
+           && REGNO (operands[1]) == CX_REG))
+    return which_alternative ? "{}\t{%1, %0|%0, %1}"
+                             : "{}\t%0";
   else
-    return "{}\t{%2, %0|%0, %2}";
+    return which_alternative ? "{}\t{%2, %1, %0|%0, %1, %2}"
+                             : "{}\t{%2, %0|%0, %2}";
 }
-  [(set_attr "type" "ishift")
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (match_operand 2 "const1_operand")
@@ -15884,10 +15934,10 @@ (define_insn "*3_cmp"
 (define_insn "*si3_cmp_zext"
   [(set (reg FLAGS_REG)
         (compare
-          (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0")
+          (any_shiftrt:SI (match_operand:SI 1 "register_operand" "0,r")
                           (match_operand:QI 2 "const_1_to_31_operand"))
           (const_int 0)))
-   (set (match_operand:DI 0 "register_operand" "=r")
+   (set (match_operand:DI 0 "register_operand" "=r,r")
         (zero_extend:DI (any_shiftrt:SI (match_dup 1) (match_dup 2))))]
   "TARGET_64BIT
    && (optimize_function_for_size_p (cfun)
@@ -15895,15 +15945,20 @@ (define_insn "*si3_cmp_zext"
        || (operands[2] == const1_rtx
            && TARGET_SHIFT1))
    && ix86_match_ccmode (insn, CCGOCmode)
-   && ix86_binary_operator_ok (, SImode, operands)"
+   && ix86_binary_operator_ok (, SImode, operands,
+                               ix86_can_use_ndd_p ())"
 {
   if (operands[2] == const1_rtx
-      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-    return "{l}\t%k0";
+      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+      && !(which_alternative && REGNO (operands[1]) == CX_REG))
+    return which_alternative ? "{l}\t{%1, %k0|%k0, %1}"
+                             : "{l}\t%k0";
   else
-    return "{l}\t{%2, %k0|%k0, %2}";
+    return which_alternative ? "{l}\t{%2, %1, %k0|%k0, %1, %2}"
+                             : "{l}\t{%2, %k0|%k0, %2}";
 }
-  [(set_attr "type" "ishift")
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (match_operand 2 "const1_operand")
@@ -15917,10 +15972,10 @@ (define_insn "*3_cconly"
   [(set (reg FLAGS_REG)
         (compare
           (any_shiftrt:SWI
-            (match_operand:SWI 1 "register_operand" "0")
-            (match_operand:QI 2 "" ""))
+            (match_operand:SWI 1 "register_operand" "0,r")
+            (match_operand:QI 2 "" ","))
           (const_int 0)))
-   (clobber (match_scratch:SWI 0 "="))]
+   (clobber (match_scratch:SWI 0 "=,r"))]
   "(optimize_function_for_size_p (cfun)
     || !TARGET_PARTIAL_FLAG_REG_STALL
     || (operands[2] == const1_rtx
@@ -15928,12 +15983,18 @@ (define_insn "*3_cconly"
    && ix86_match_ccmode (insn, CCGOCmode)"
 {
   if (operands[2] == const1_rtx
-      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
-    return "{}\t%0";
+      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+      && !(which_alternative && REGNO (operands[1]) == CX_REG))
+    return which_alternative
+           ? "{}\t{%1, %0|%0, %1}"
+           : "{}\t%0";
   else
-    return "{}\t{%2, %0|%0, %2}";
+    return which_alternative
+           ? "{}\t{%2, %1, %0|%0, %1, %2}"
+           : "{}\t{%2, %0|%0, %2}";
 }
-  [(set_attr "type" "ishift")
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "ishift")
    (set (attr "length_immediate")
      (if_then_else
        (and (match_operand 2 "const1_operand")
@@ -16537,18 +16598,22 @@ (define_insn "rcrdi2"
 ;; Versions of sar and shr that set the carry flag.
 (define_insn "3_carry"
   [(set (reg:CCC FLAGS_REG)
-        (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0")
+        (unspec:CCC [(and:SWI48 (match_operand:SWI48 1 "register_operand" "0,r")
                                 (const_int 1))
                      (const_int 0)] UNSPEC_CC_NE))
-   (set (match_operand:SWI48 0 "register_operand" "=r")
+   (set (match_operand:SWI48 0 "register_operand" "=r,r")
         (any_shiftrt:SWI48 (match_dup 1) (const_int 1)))]
   ""
 {
-  if (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
-    return "{}\t%0";
-  return "{}\t{1, %0|%0, 1}";
+  if ((TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+      && !(which_alternative && REGNO (operands[1]) == CX_REG))
+    return which_alternative ? "{}\t{%1, %0|%0, %1}"
+                             : "{}\t%0";
+  return which_alternative ? "{}\t{$1, %1, %0|%0, %1, 1}"
+                           : "{}\t{$1, %0|%0, 1}";
 }
-  [(set_attr "type" "ishift1")
+  [(set_attr "isa" "*, apx_ndd")
+   (set_attr "type" "ishift1")
    (set (attr "length_immediate")
      (if_then_else
        (ior (match_test "TARGET_SHIFT1")
diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c
index 481ec8b00a8..28c0df72988 100644
--- a/gcc/testsuite/gcc.target/i386/apx-ndd.c
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c
@@ -2,6 +2,8 @@
 /* { dg-options "-mapxf -O2" } */
 /* { dg-final { scan-assembler-not "movl"} } */

+#include
+
 #define FOO(TYPE, OP_NAME, OP) \
 TYPE \
 __attribute__ ((noipa)) \
@@ -132,6 +134,24 @@ FOO3 (int, shl, <<, 7)
 FOO (long, shl, <<)
 FOO3 (long, shl, <<, 7)

+FOO (char, sar, >>)
+FOO3 (char, sar, >>, 7)
+FOO (short, sar, >>)
+FOO3 (short, sar, >>, 7)
+FOO (int, sar, >>)
+FOO3 (int, sar, >>, 7)
+FOO (long, sar, >>)
+FOO3 (long, sar, >>, 7)
+
+FOO (uint8_t, shr, >>)
+FOO3 (uint8_t, shr, >>, 7)
+FOO (uint16_t, shr, >>)
+FOO3 (uint16_t, shr, >>, 7)
+FOO (uint32_t, shr, >>)
+FOO3 (uint32_t, shr, >>, 7)
+FOO (uint64_t, shr, >>)
+FOO3 (uint64_t, shr, >>, 7)
+
 /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "add(?:l|w|q)\[^\n\r]%(?:|r|e)si, \\(%rdi\\), %(?:|r|e)ax" 4 } } */
@@ -156,3 +176,7 @@ FOO3 (long, shl, <<, 7)
 /* { dg-final { scan-assembler-times "xor(?:l|w|q)\[^\n\r]%(?:|r|e)si, %(?:|r|e)di, %(?:|r|e)ax" 2 } } */
 /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "sal(?:l|w|q)\[^\n\r]*7, %(?:|r|e)di, %(?:|r|e)ax" 4 } } */
+/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */
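
Reviewer note: the new tests use signed types for the "sar" cases and
unsignedN_t types for the "shr" cases because the type picks the kind of
right shift in C. The sketch below illustrates that distinction only.
SHIFT_IMM and the function names are hypothetical and much simpler than
the FOO/FOO3 macros in apx-ndd.c, and "signed char" is spelled out where
the test uses plain "char". Right-shifting a negative signed value is
implementation-defined in ISO C; the checks assume the arithmetic
(sign-propagating) behaviour that GCC documents.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the test's FOO3 macro:
   defines NAME as a function shifting its argument right by IMM.  */
#define SHIFT_IMM(TYPE, NAME, IMM) \
  static TYPE NAME (TYPE a) { return a >> IMM; }

SHIFT_IMM (signed char, sar_char, 7)   /* signed: arithmetic shift */
SHIFT_IMM (int, sar_int, 7)
SHIFT_IMM (uint8_t, shr_u8, 7)         /* unsigned: logical shift */
SHIFT_IMM (uint32_t, shr_u32, 7)

/* Checks that signed shifts propagate the sign bit (assumed, see the
   note above) and unsigned shifts fill with zeros.  */
static void
check_right_shifts (void)
{
  assert (sar_char (-128) == -1);
  assert (sar_int (-256) == -2);
  assert (shr_u8 (0x80) == 1);
  assert (shr_u32 (0xFFFFFF00u) == 0x01FFFFFEu);
}
```

Calling check_right_shifts () from a small driver runs the four checks.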