From patchwork Tue Oct 31 15:10:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 160141 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b90f:0:b0:403:3b70:6f57 with SMTP id t15csp308814vqg; Tue, 31 Oct 2023 08:10:54 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGRZ0N9LlLoz0eT7m6JQH9YsxwOVMzG1ujrNX1cKI4jv2zjTz5XoqcCvQMKVPmmHBf73+dV X-Received: by 2002:a05:6870:1017:b0:1eb:7a0b:9dcc with SMTP id 23-20020a056870101700b001eb7a0b9dccmr13445552oai.5.1698765054285; Tue, 31 Oct 2023 08:10:54 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1698765054; cv=pass; d=google.com; s=arc-20160816; b=TXK9HHN0oVfwDtKAY77JEyBY85vNvD8Wsq/CYoVfe1eiMTU2NoKYvvLwg0R6jZS3CS toDcUqa/SPnoB0NSH7GsELqtEyM4wpn6nPiuMcRT58M5X1ESXJpKgptO3YUk9LQzW+Xf QTv25JzKx4nVwjnBjsZ7h4BXGbYlTi6CmOdYrRPcRCOIpn+cdVujM4yCl+S5PKN8dWvu ORFwv9y9IMp5pfgLqX/Idpbtej0i+/Oq+3iVdwpvJ4J1QtBn9rfxXP5pgFVPUPWwQjjD IpOwDS5ER8bBkF3lI1eoV7SDLgzfb+bwgSJbrCa3hBxRsL59UyFxuA59Q+Ij2vYfm0aH lt2Q== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=vyRJxQQLw15d3/eJelJ5jF9lpCalajbjmrXjHkV9uWg=; fh=ZhoqF74dgOqAZwaTs8Qqd9D4B/xVeIfaG9dzbeNwyjI=; b=xxG7Xy6hF/mLakaqQo2QRaOv5yDRWf1rhXjTwOatHFipM87v0fKyH09j8Bar3Wm45D HjmXiM3B8BThOX4StK8ZSbWvITfVG5ioHuIw5qCdqTS98ynirevWrMPAOG2Yr+/X4k5S ett81GPVS2hqxeGY6/oj9aafUInqtQwtppVgDEtMsyatzFVyHSsd8zVHOJ9E0Qedk9ll MEHdaHwRhAaHAglxQI0XkMuLrETaVQL7H84dGoETUL7xwijjSykWegIGF+J7lW0+LpRD V3Ffv850Cd209JzoVcVBAM2BLd6JyD8K7V+cEg375LLtCeuP9dyyDoNO0g5QrX+18lXR xh2g== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iHyHYJzt; arc=pass (i=1); spf=pass (google.com: domain 
of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id v9-20020a67f1c9000000b004526a29920asi175724vsm.827.2023.10.31.08.10.54 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Oct 2023 08:10:54 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iHyHYJzt; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0A3B93858413 for ; Tue, 31 Oct 2023 15:10:54 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by sourceware.org (Postfix) with ESMTPS id D09CF3858D1E for ; Tue, 31 Oct 2023 15:10:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D09CF3858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D09CF3858D1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=134.134.136.126 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698765029; 
cv=none; b=x04iH8kwceblY1aknDbEPOVx8sJJDzIksxb1wC6C3+3TfKGkIX1PwDQ8ByCpHcu1G2WyAvzxOHY07/yaKC/BBioMAIAq+uxpH6AuPD0q4W8f7MEGIGbAjVqADIkEky9R/xwhn2FpNhh5m5FZGGU3tlUDhV50rPgYAaagdQ7keOs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698765029; c=relaxed/simple; bh=63Bjv0UTCTk2UrLHV8N8CZYOClSVHPRSjPt/PcVN6W4=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=mZHeigbHovV4Xrr3U33L8inxZ0OM/bxMqdZ8kMPhxXQeJwMFOAbY4mTf6b5n5ZKz3eArN5UcpgCWo/6dEkF0+OcmBrgudlA/l16f4vht3bf/uYeiQ8VvEQAaxDLYMDF3RJrW4IC3wH5HXbX+UkTzG38lgNf8ml8u2Cohs9EIqJ0= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698765026; x=1730301026; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=63Bjv0UTCTk2UrLHV8N8CZYOClSVHPRSjPt/PcVN6W4=; b=iHyHYJztxgg2rD2MCKEgLwdYy2xwzcekEq0oLjL1LenkK+tHt50PZzKj 3xglsZbqqf/uGvcdqDtK6DdFadoFzCb/SFd0IcH5i0vWfDqk9wm0WD0LH SgR1ZPvt9ODAHGY+b9hB9p5yw0bQxLovSZhOZnt6eP11vgA7bW88Eylqn 4PuR/jO2z2xDtqU6T0Z6k8U8q+LKMb9FgcGYWjMeVicuoEjEB6JwbHD90 vsOx17bqbeOkP/lbp3L0hdmOvlNZPx3cHM2M4KT8FgKOqOi7ER+R5iRbI VbF8QtwcMMWw4OoFMCdbVAaGAxkYLQSTOSkrhDhyODq1o97RnyLl330t0 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10880"; a="373357042" X-IronPort-AV: E=Sophos;i="6.03,265,1694761200"; d="scan'208";a="373357042" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Oct 2023 08:10:07 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10880"; a="710452481" X-IronPort-AV: E=Sophos;i="6.03,265,1694761200"; d="scan'208";a="710452481" Received: from shvmail02.sh.intel.com ([10.239.244.9]) by orsmga003.jf.intel.com with ESMTP; 31 Oct 2023 08:10:05 -0700 Received: from pli-ubuntu.sh.intel.com (pli-ubuntu.sh.intel.com [10.239.159.47]) by shvmail02.sh.intel.com (Postfix) with ESMTP id 65FFF10056B1; Tue, 31 Oct 2023 23:10:04 +0800 (CST) 
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com, kito.cheng@gmail.com, hongtao.liu@intel.com, richard.guenther@gmail.com
Subject: [PATCH v4] VECT: Refine the type size restriction of call vectorizer
Date: Tue, 31 Oct 2023 23:10:03 +0800
Message-Id: <20231031151003.80256-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231018012009.849697-1-pan2.li@intel.com>
References: <20231018012009.849697-1-pan2.li@intel.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1781183137273798896
X-GMAIL-MSGID: 1781284265451593022

From: Pan Li

Update in v4:

* Append the check to vectorizable_internal_function.

Update in v3:

* Add a function to predicate whether the type size is legal for the
  vectorizer call.

Update in v2:

* Fix one ICE caused by a type assertion.
* Adjust some test cases for AArch64 SVE and RISC-V vector.

Original log:

vectorizable_call has a restriction on the size of the data type: DF to DI
is allowed, but SF to DI is not. You may see the messages below when trying
to vectorize a function call such as lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern such as lrintmn2 cannot work when the
data type sizes differ, as in SF => DI. This patch refines the data type
size check and unblocks standard names such as lrintmn2 under the
following conditions.

The type size of vectype_out must be exactly the same as the type size of
vectype_in when vectype_out does not participate in the optab selection.
There is no such restriction when vectype_out is part of the optab query.

The tests below passed for this patch.

* The RISC-V regression tests.
* Ensure the lrintf standard name works on RISC-V.

The tests below are ongoing.

* The x86 bootstrap and regression test.
* The AArch64 regression test.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_internal_function): Add type
        size check for when vectype_out doesn't participate in the optab
        query.
        (vectorizable_call): Remove the type size check.

Signed-off-by: Pan Li
---
 gcc/tree-vect-stmts.cc | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..799b4ab10c7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1420,8 +1420,17 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
       const direct_internal_fn_info &info = direct_internal_fn (ifn);
       if (info.vectorizable)
         {
+          bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
           tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
           tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
+
+          /* The type size of both the vectype_in and vectype_out should be
+             exactly the same when vectype_out isn't participating the optab.
+             While there is no restriction for type size when vectype_out
+             is part of the optab query.  */
+          if (type0 != vectype_out && type1 != vectype_out && !same_size_p)
+            return IFN_LAST;
+
           if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
                                               OPTIMIZE_FOR_SPEED))
             return ifn;
@@ -3361,19 +3370,6 @@ vectorizable_call (vec_info *vinfo,
       return false;
     }
 
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
-     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
-     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
-     by a pack of the two vectors into an SI vector.  We would need
-     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "mismatched vector sizes %T and %T\n",
-                         vectype_in, vectype_out);
-      return false;
-    }
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
       != VECTOR_BOOLEAN_TYPE_P (vectype_in))
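
For readers who want to reason about the new size rule outside of GCC, it can be modelled roughly as below. This is only a sketch under stated assumptions: VecType, FnInfo, size_check_passes and run_examples are hypothetical stand-ins rather than GCC types or functions, and the index values in the examples are illustrative, not taken from GCC's internal function tables. The only behaviour borrowed from the patch is that a negative type index selects vectype_out and that a size mismatch is rejected only when neither index selects vectype_out.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type; only its total size in bits is modelled.
struct VecType { unsigned size_bits; };

// Hypothetical stand-in for the two type indices of direct_internal_fn_info:
// a negative index selects vectype_out, a non-negative one selects vectype_in.
struct FnInfo { int type0; int type1; };

// Returns true when the optab query would still be attempted, and false when
// the check added by the patch would return IFN_LAST first.
bool size_check_passes (const FnInfo &info, const VecType &in,
                        const VecType &out)
{
  const bool same_size_p = in.size_bits == out.size_bits;
  const bool out_in_query = info.type0 < 0 || info.type1 < 0;
  return out_in_query || same_size_p;
}

// Illustrative cases only; the index values are assumptions.
void run_examples ()
{
  const VecType v128{128}, v256{256};
  // vectype_out takes part in the query: differing sizes are allowed.
  assert (size_check_passes (FnInfo{-1, 0}, v128, v256));
  // Only vectype_in takes part in the query: differing sizes are rejected.
  assert (!size_check_passes (FnInfo{0, 0}, v256, v128));
  // Equal sizes always pass the size check.
  assert (size_check_passes (FnInfo{0, 0}, v128, v128));
}
```

The sketch compares index signs rather than type pointers as the patch does. The two should agree, because when vectype_in and vectype_out are the same type their sizes are equal and the check passes either way.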