From patchwork Tue Dec 5 02:29:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 173694
From: Hongyu Wang
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com
Subject: [PATCH 14/17] [APX NDD] Support APX NDD for rotate insns
Date: Tue, 5 Dec 2023 10:29:45 +0800
Message-Id: <20231205022948.504790-15-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20231205022948.504790-1-hongyu.wang@intel.com>
References: <20231205022948.504790-1-hongyu.wang@intel.com>

gcc/ChangeLog:

	* config/i386/i386.md (*3_1): Extend with a new alternative to
	support NDD for SI/DI rotate, and adjust output template.
	(*si3_1_zext): Likewise.
	(*3_1): Likewise for QI/HI modes.
	(rcrsi2): Likewise, and use nonimmediate_operand for operands[1]
	to accept memory input for NDD alternative.
	(rcrdi2): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-ndd.c: Add test for left/right rotate.
---
 gcc/config/i386/i386.md                 | 79 +++++++++++++++----------
 gcc/testsuite/gcc.target/i386/apx-ndd.c | 20 +++++++
 2 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8bec8a63ba9..6398f544a17 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -16662,13 +16662,15 @@ (define_insn "*bmi2_rorx3_1"
    (set_attr "mode" "")])
 
 (define_insn "*3_1"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r")
+  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r")
 	(any_rotate:SWI48
-	  (match_operand:SWI48 1 "nonimmediate_operand" "0,rm")
-	  (match_operand:QI 2 "nonmemory_operand" "c,")))
+	  (match_operand:SWI48 1 "nonimmediate_operand" "0,rm,rm")
+	  (match_operand:QI 2 "nonmemory_operand" "c,,c")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (, mode, operands)"
+  "ix86_binary_operator_ok (, mode, operands,
+			    TARGET_APX_NDD)"
 {
+  bool use_ndd = (which_alternative == 2);
   switch (get_attr_type (insn))
     {
     case TYPE_ROTATEX:
@@ -16676,14 +16678,16 @@ (define_insn "*3_1"
 
     default:
       if (operands[2] == const1_rtx
-	  && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
+	  && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+	  && !use_ndd)
 	return "{}\t%0";
       else
-	return "{}\t{%2, %0|%0, %2}";
+	return use_ndd ? "{}\t{%2, %1, %0|%0, %1, %2}"
+		       : "{}\t{%2, %0|%0, %2}";
     }
 }
-  [(set_attr "isa" "*,bmi2")
-   (set_attr "type" "rotate,rotatex")
+  [(set_attr "isa" "*,bmi2,apx_ndd")
+   (set_attr "type" "rotate,rotatex,rotate")
    (set (attr "preferred_for_size")
      (cond [(eq_attr "alternative" "0")
 	      (symbol_ref "true")]
@@ -16733,13 +16737,14 @@ (define_insn "*bmi2_rorxsi3_1_zext"
    (set_attr "mode" "SI")])
 
 (define_insn "*si3_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r,r")
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
-	  (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")
-			 (match_operand:QI 2 "nonmemory_operand" "cI,I"))))
+	  (any_rotate:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm,rm")
+			 (match_operand:QI 2 "nonmemory_operand" "cI,I,cI"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (, SImode, operands)"
 {
+  bool use_ndd = (which_alternative == 2);
   switch (get_attr_type (insn))
     {
     case TYPE_ROTATEX:
@@ -16747,14 +16752,16 @@ (define_insn "*si3_1_zext"
 
     default:
       if (operands[2] == const1_rtx
-	  && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
+	  && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+	  && !use_ndd)
 	return "{l}\t%k0";
       else
-	return "{l}\t{%2, %k0|%k0, %2}";
+	return use_ndd ? "{l}\t{%2, %1, %k0|%k0, %1, %2}"
+		       : "{l}\t{%2, %k0|%k0, %2}";
     }
 }
-  [(set_attr "isa" "*,bmi2")
-   (set_attr "type" "rotate,rotatex")
+  [(set_attr "isa" "*,bmi2,apx_ndd")
+   (set_attr "type" "rotate,rotatex,rotate")
    (set (attr "preferred_for_size")
      (cond [(eq_attr "alternative" "0")
 	      (symbol_ref "true")]
@@ -16798,19 +16805,25 @@ (define_split
 	(zero_extend:DI (rotatert:SI (match_dup 1) (match_dup 2))))])
 
 (define_insn "*3_1"
-  [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m")
-	(any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0")
-			  (match_operand:QI 2 "nonmemory_operand" "c")))
+  [(set (match_operand:SWI12 0 "nonimmediate_operand" "=m,r")
+	(any_rotate:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "0,rm")
+			  (match_operand:QI 2 "nonmemory_operand" "c,c")))
    (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (, mode, operands)"
+  "ix86_binary_operator_ok (, mode, operands,
+			    TARGET_APX_NDD)"
 {
+  bool use_ndd = which_alternative == 1;
   if (operands[2] == const1_rtx
-      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun)))
+      && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))
+      && !use_ndd)
     return "{}\t%0";
   else
-    return "{}\t{%2, %0|%0, %2}";
+    return use_ndd
+	   ? "{}\t{%2, %1, %0|%0, %1, %2}"
+	   : "{}\t{%2, %0|%0, %2}";
 }
-  [(set_attr "type" "rotate")
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "rotate")
    (set (attr "length_immediate")
      (if_then_else
        (and (match_operand 2 "const1_operand")
@@ -16867,31 +16880,37 @@ (define_split
 
 ;; Rotations through carry flag
 (define_insn "rcrsi2"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
 	(plus:SI
-	  (lshiftrt:SI (match_operand:SI 1 "register_operand" "0")
+	  (lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,rm")
 		       (const_int 1))
 	  (ashift:SI (ltu:SI (reg:CCC FLAGS_REG) (const_int 0))
 		     (const_int 31))))
    (clobber (reg:CC FLAGS_REG))]
   ""
-  "rcr{l}\t%0"
-  [(set_attr "type" "ishift1")
+  "@
+   rcr{l}\t%0
+   rcr{l}\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "ishift1")
    (set_attr "memory" "none")
    (set_attr "length_immediate" "0")
    (set_attr "mode" "SI")])
 
 (define_insn "rcrdi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
 	(plus:DI
-	  (lshiftrt:DI (match_operand:DI 1 "register_operand" "0")
+	  (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "0,rm")
 		       (const_int 1))
 	  (ashift:DI (ltu:DI (reg:CCC FLAGS_REG) (const_int 0))
 		     (const_int 63))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT"
-  "rcr{q}\t%0"
-  [(set_attr "type" "ishift1")
+  "@
+   rcr{q}\t%0
+   rcr{q}\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "*,apx_ndd")
+   (set_attr "type" "ishift1")
    (set_attr "length_immediate" "0")
    (set_attr "mode" "DI")])
 
diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c b/gcc/testsuite/gcc.target/i386/apx-ndd.c
index 239c427514a..b215f66d3e2 100644
--- a/gcc/testsuite/gcc.target/i386/apx-ndd.c
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c
@@ -40,6 +40,14 @@ foo3_##OP_NAME##_##TYPE (TYPE a) \
   return b; \
 }
 
+#define FOO4(TYPE, OP_NAME, OP1, OP2, IMM1) \
+TYPE \
+__attribute__ ((noipa)) \
+foo4_##OP_NAME##_##TYPE (TYPE a) \
+{ \
+  TYPE b = (a OP1 IMM1 | a OP2 (8 * sizeof(TYPE) - IMM1)); \
+  return b; \
+}
 
 #define F(TYPE, OP_NAME, OP) \
 TYPE \
@@ -152,6 +160,16 @@ FOO3 (uint32_t, shr, >>, 7)
 FOO (uint64_t, shr, >>)
 FOO3 (uint64_t, shr, >>, 7)
 
+FOO4 (uint8_t, ror, >>, <<, 1)
+FOO4 (uint16_t, ror, >>, <<, 1)
+FOO4 (uint32_t, ror, >>, <<, 1)
+FOO4 (uint64_t, ror, >>, <<, 1)
+
+FOO4 (uint8_t, rol, <<, >>, 1)
+FOO4 (uint16_t, rol, <<, >>, 1)
+FOO4 (uint32_t, rol, <<, >>, 1)
+FOO4 (uint64_t, rol, <<, >>, 1)
+
 /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */
 /* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
@@ -180,3 +198,5 @@ FOO3 (uint64_t, shr, >>, 7)
 /* { dg-final { scan-assembler-times "sar(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "shr(?:b|l|w|q)\[^\n\r]*7, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "ror(?:b|l|w|q)\[^\n\r]*1, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "rol(?:b|l|w|q)\[^\n\r]*1, %(?:|r|e)di(?:|l), %(?:|r|e)a(?:x|l)" 4 } } */
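
Note for reviewers (not part of the patch): the FOO4 tests rely on the compiler recognising a shift-or pair as a rotate by a constant. Below is a minimal standalone sketch of that idiom for checking the expected values by hand. The ROT macro and the rot_* function names are hypothetical and only mirror the shape of FOO4. The noipa attribute is dropped and a cast to TYPE is added, since the sub-int types are promoted to int before the shifts.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the shape of FOO4 from the patch:
   rotate by a constant IMM1 using two shifts OR-ed together.
   Assumes 0 < IMM1 < 8 * sizeof (TYPE).  */
#define ROT(TYPE, OP_NAME, OP1, OP2, IMM1)				\
static inline TYPE							\
rot_##OP_NAME##_##TYPE (TYPE a)						\
{									\
  /* The cast truncates the int-promoted result for 8/16-bit types.  */	\
  return (TYPE) (a OP1 IMM1 | a OP2 (8 * sizeof (TYPE) - IMM1));	\
}

/* Rotate right by 1: low bit wraps around to the top bit.  */
ROT (uint8_t, ror, >>, <<, 1)
ROT (uint32_t, ror, >>, <<, 1)

/* Rotate left by 1: top bit wraps around to the low bit.  */
ROT (uint32_t, rol, <<, >>, 1)
ROT (uint64_t, rol, <<, >>, 1)
```

For example, rot_ror_uint32_t (1) yields 0x80000000 and rot_rol_uint32_t (0x80000000) yields 1. Whether a given compiler emits the three-operand NDD form for these functions depends on the target flags, which this sketch does not exercise.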