From patchwork Mon Oct 30 12:22:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 159723
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com,
 kito.cheng@gmail.com, hongtao.liu@intel.com, richard.guenther@gmail.com
Subject: [PATCH v3] VECT: Refine the type size restriction of call vectorizer
Date: Mon, 30 Oct 2023 20:22:56 +0800
Message-Id: <20231030122256.3710809-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231018012009.849697-1-pan2.li@intel.com>
References: <20231018012009.849697-1-pan2.li@intel.com>
List-Id: Gcc-patches mailing list

From: Pan Li

Update in v3:

* Add a function that checks whether the type size is legal for the
  call vectorizer.

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

vectorizable_call has one restriction on the size of the data types:
DF to DI is allowed, but SF to DI isn't. You may see the messages below
when trying to vectorize a function call like lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

As a result, a standard name pattern like lrintmn2 cannot work when the
data type sizes differ, as in SF => DI.
This patch refines this data type size check and unblocks standard names
like lrintmn2 under certain conditions. The type size of vectype_out
needs to be exactly the same as the type size of vectype_in when
vectype_out does not participate in the optab selection. There is no
such restriction when vectype_out is part of the optab query.

The tests below pass with this patch.

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression tests.
* Ensure the lrintf standard name in risc-v.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_type_size_legal_p): New
        func impl to predicate the type size is legal or not.
        (vectorizable_call): Leverage vectorizable_type_size_legal_p.

Signed-off-by: Pan Li
---
 gcc/tree-vect-stmts.cc | 51 +++++++++++++++++++++++++++++++-----------
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..24b3448d961 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1430,6 +1430,35 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
   return IFN_LAST;
 }
 
+/* Return TRUE when the type size is legal for the call vectorizer,
+   or FALSE.
+   The type size of both the vectype_in and vectype_out should be
+   exactly the same when vectype_out isn't participating the optab.
+   While there is no restriction for type size when vectype_out
+   is part of the optab query.
+ */
+static bool
+vectorizable_type_size_legal_p (internal_fn ifn, tree vectype_out,
+                                tree vectype_in)
+{
+  bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
+
+  if (ifn == IFN_LAST || !direct_internal_fn_p (ifn))
+    return same_size_p;
+
+  const direct_internal_fn_info &difn_info = direct_internal_fn (ifn);
+
+  if (!difn_info.vectorizable)
+    return same_size_p;
+
+  /* According to vectorizable_internal_function, the type0/1 < 0 indicates
+     the vectype_out participating the optable selection.  Aka the type size
+     check can be skipped here.  */
+  if (difn_info.type0 < 0 || difn_info.type1 < 0)
+    return true;
+
+  return same_size_p;
+}
 
 static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info,
                                   gimple_stmt_iterator *);
@@ -3361,19 +3390,6 @@ vectorizable_call (vec_info *vinfo,
 
       return false;
     }
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
-     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
-     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
-     by a pack of the two vectors into an SI vector.  We would need
-     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-    {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                         "mismatched vector sizes %T and %T\n",
-                         vectype_in, vectype_out);
-      return false;
-    }
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
       != VECTOR_BOOLEAN_TYPE_P (vectype_in))
@@ -3431,6 +3447,15 @@ vectorizable_call (vec_info *vinfo,
     ifn = vectorizable_internal_function (cfn, callee, vectype_out,
                                           vectype_in);
 
+  if (!vectorizable_type_size_legal_p (ifn, vectype_out, vectype_in))
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                         "mismatched vector sizes %T and %T\n",
+                         vectype_in, vectype_out);
+      return false;
+    }
+
   /* If that fails, try asking for a target-specific built-in function.  */
   if (ifn == IFN_LAST)
     {
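
The decision made by the new predicate can be tried outside of GCC with
a small standalone model. The sketch below is not GCC code and makes
several assumptions: FnInfo is a hypothetical stand-in for
direct_internal_fn_info, type_size_legal_p is a hypothetical name, a
null pointer stands for the "ifn == IFN_LAST or not a direct internal
function" case, and vector sizes are plain bit counts rather than
TYPE_SIZE trees.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's direct_internal_fn_info.  As in the
// patch comment, type0/type1 < 0 is taken to mean that the return type
// (vectype_out) takes part in the optab selection.
struct FnInfo {
  int type0;
  int type1;
  bool vectorizable;
};

// Hypothetical model of the size check.  `info` is nullptr when there is
// no direct internal function for the call; out_bits and in_bits are the
// vector sizes in bits (an assumption made only for this sketch).
bool type_size_legal_p(const FnInfo *info, unsigned out_bits,
                       unsigned in_bits) {
  const bool same_size_p = out_bits == in_bits;

  // No usable direct internal function: keep the old equal-size rule.
  if (info == nullptr || !info->vectorizable)
    return same_size_p;

  // vectype_out is part of the optab query, so the target has already
  // agreed to this pair of modes; skip the size check.
  if (info->type0 < 0 || info->type1 < 0)
    return true;

  return same_size_p;
}
```

With a hypothetical entry FnInfo{-1, 0, true}, a 128-bit input paired
with a 256-bit output is accepted, while FnInfo{0, 0, true} still
rejects that pair and accepts only equal sizes.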