From patchwork Mon Oct 30 10:39:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 159681
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, rguenther@suse.de, jeffreyalaw@gmail.com, Juzhe-Zhong
Subject: [PATCH] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN
Date: Mon, 30 Oct 2023 18:39:04 +0800
Message-Id: <20231030103904.2394773-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
List-Id: Gcc-patches mailing list
Errors-To:
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org

As Richard suggested previously, we should support strided load/store in
the loop vectorizer instead of hacking the RISC-V backend.

This patch adds the MASK_LEN_STRIDED_LOAD/MASK_LEN_STRIDED_STORE optabs
and internal functions.

The GIMPLE IR is:

  v = mask_len_strided_load (ptr, stride, mask, len, bias)
  mask_len_strided_store (ptr, stride, v, mask, len, bias)

This patch is a prerequisite for the follow-up loop vectorizer patch.

gcc/ChangeLog:

	* doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
	* internal-fn.cc (expand_scatter_store_optab_fn): Ditto.
	(expand_gather_load_optab_fn): Ditto.
	(internal_load_fn_p): Ditto.
	(internal_strided_fn_p): Ditto.
	(internal_fn_len_index): Ditto.
	(internal_fn_mask_index): Ditto.
	(internal_fn_stored_value_index): Ditto.
	* internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
	(MASK_LEN_STRIDED_STORE): Ditto.
	* internal-fn.h (internal_strided_fn_p): Ditto.
	* optabs.def (OPTAB_CD): Ditto.

---
 gcc/doc/md.texi     | 23 +++++++++++++++++++++
 gcc/internal-fn.cc  | 49 +++++++++++++++++++++++++++++++++++++--------
 gcc/internal-fn.def |  4 ++++
 gcc/internal-fn.h   |  1 +
 gcc/optabs.def      |  2 ++
 5 files changed, 71 insertions(+), 8 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index fab2513105a..f27148c3a3c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5094,6 +5094,18 @@ Bit @var{i} of the mask is set if element @var{i} of the result should
 be loaded from memory and clear if element @var{i} of the result should be undefined.
 Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.

+@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern
+@item @samp{mask_len_strided_load@var{m}@var{n}}
+Load several separate memory locations into a vector of mode @var{m}.
+Operand 1 is a scalar base address and operand 2 has mode @var{n}
+and specifies the uniform stride between consecutive elements.
+Operand 3 is the mask operand, operand 4 is the length operand and operand 5 is
+the bias operand.  Similar to mask_len_load, the instruction loads at most
+(operand 4 + operand 5) elements from memory.  Bit @var{i} of the mask is set
+if element @var{i} of the result should be loaded from memory and clear if
+element @var{i} of the result should be undefined.
+Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.
+
 @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
 @item @samp{scatter_store@var{m}@var{n}}
 Store a vector of mode @var{m} into several distinct memory locations.
@@ -5131,6 +5143,17 @@ at most (operand 6 + operand 7) elements of (operand 4) to memory.
 Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be stored.
 Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.

+@cindex @code{mask_len_strided_store@var{m}@var{n}} instruction pattern
+@item @samp{mask_len_strided_store@var{m}@var{n}}
+Store a vector of mode @var{m} into several distinct memory locations.
+Operand 0 is a scalar base address, operand 2 is the vector to be stored,
+and operand 1 has mode @var{n} and specifies the uniform stride between consecutive elements.
+Operand 3 is the mask operand, operand 4 is the length operand and operand 5 is
+the bias operand.  Similar to mask_len_store, the instruction stores at most
+(operand 4 + operand 5) elements to memory.  Bit @var{i} of the mask is set
+if element @var{i} of (operand 2) should be stored.
+Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.
+
 @cindex @code{vec_set@var{m}} instruction pattern
 @item @samp{vec_set@var{m}}
 Set given field in the vector value.  Operand 0 is the vector to modify,
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index e7451b96353..5c1a6015de4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3570,20 +3570,23 @@ expand_scatter_store_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
   int rhs_index = internal_fn_stored_value_index (ifn);
   tree base = gimple_call_arg (stmt, 0);
   tree offset = gimple_call_arg (stmt, 1);
-  tree scale = gimple_call_arg (stmt, 2);
   tree rhs = gimple_call_arg (stmt, rhs_index);

   rtx base_rtx = expand_normal (base);
   rtx offset_rtx = expand_normal (offset);
-  HOST_WIDE_INT scale_int = tree_to_shwi (scale);
   rtx rhs_rtx = expand_normal (rhs);

   class expand_operand ops[8];
   int i = 0;
   create_address_operand (&ops[i++], base_rtx);
   create_input_operand (&ops[i++], offset_rtx, TYPE_MODE (TREE_TYPE (offset)));
-  create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
-  create_integer_operand (&ops[i++], scale_int);
+  if (!internal_strided_fn_p (ifn))
+    {
+      create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
+      tree scale = gimple_call_arg (stmt, 2);
+      HOST_WIDE_INT scale_int = tree_to_shwi (scale);
+      create_integer_operand (&ops[i++], scale_int);
+    }
   create_input_operand (&ops[i++], rhs_rtx, TYPE_MODE (TREE_TYPE (rhs)));
   i = add_mask_and_len_args (ops, i, stmt);

@@ -3597,23 +3600,27 @@ expand_scatter_store_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
 static void
 expand_gather_load_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
 {
+  internal_fn ifn = gimple_call_internal_fn (stmt);
   tree lhs = gimple_call_lhs (stmt);
   tree base = gimple_call_arg (stmt, 0);
   tree offset = gimple_call_arg (stmt, 1);
-  tree scale = gimple_call_arg (stmt, 2);

   rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   rtx base_rtx = expand_normal (base);
   rtx offset_rtx = expand_normal (offset);
-  HOST_WIDE_INT scale_int = tree_to_shwi (scale);

   int i = 0;
   class expand_operand ops[8];
   create_output_operand (&ops[i++], lhs_rtx, TYPE_MODE (TREE_TYPE (lhs)));
   create_address_operand (&ops[i++], base_rtx);
   create_input_operand (&ops[i++], offset_rtx, TYPE_MODE (TREE_TYPE (offset)));
-  create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
-  create_integer_operand (&ops[i++], scale_int);
+  if (!internal_strided_fn_p (ifn))
+    {
+      create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
+      tree scale = gimple_call_arg (stmt, 2);
+      HOST_WIDE_INT scale_int = tree_to_shwi (scale);
+      create_integer_operand (&ops[i++], scale_int);
+    }
   i = add_mask_and_len_args (ops, i, stmt);
   insn_code icode = convert_optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)),
                                            TYPE_MODE (TREE_TYPE (offset)));
@@ -4596,6 +4603,7 @@ internal_load_fn_p (internal_fn fn)
     case IFN_GATHER_LOAD:
     case IFN_MASK_GATHER_LOAD:
     case IFN_MASK_LEN_GATHER_LOAD:
+    case IFN_MASK_LEN_STRIDED_LOAD:
     case IFN_LEN_LOAD:
     case IFN_MASK_LEN_LOAD:
       return true;
@@ -4648,6 +4656,22 @@ internal_gather_scatter_fn_p (internal_fn fn)
     }
 }

+/* Return true if IFN is some form of strided load or strided store.  */
+
+bool
+internal_strided_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    case IFN_MASK_LEN_STRIDED_LOAD:
+    case IFN_MASK_LEN_STRIDED_STORE:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
 /* If FN takes a vector len argument, return the index of that argument,
    otherwise return -1.  */

@@ -4683,6 +4707,8 @@ internal_fn_len_index (internal_fn fn)
     case IFN_COND_LEN_XOR:
     case IFN_COND_LEN_SHL:
     case IFN_COND_LEN_SHR:
+    case IFN_MASK_LEN_STRIDED_LOAD:
+    case IFN_MASK_LEN_STRIDED_STORE:
       return 4;

     case IFN_COND_LEN_NEG:
@@ -4715,6 +4741,10 @@ internal_fn_mask_index (internal_fn fn)
     case IFN_MASK_LEN_STORE:
       return 2;

+    case IFN_MASK_LEN_STRIDED_LOAD:
+    case IFN_MASK_LEN_STRIDED_STORE:
+      return 3;
+
     case IFN_MASK_GATHER_LOAD:
     case IFN_MASK_SCATTER_STORE:
     case IFN_MASK_LEN_GATHER_LOAD:
@@ -4735,6 +4765,9 @@ internal_fn_stored_value_index (internal_fn fn)
 {
   switch (fn)
     {
+    case IFN_MASK_LEN_STRIDED_STORE:
+      return 2;
+
     case IFN_MASK_STORE:
     case IFN_MASK_STORE_LANES:
     case IFN_SCATTER_STORE:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index a2023ab9c3d..0fa532e8f6b 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -199,6 +199,8 @@ DEF_INTERNAL_OPTAB_FN (MASK_GATHER_LOAD, ECF_PURE,
                        mask_gather_load, gather_load)
 DEF_INTERNAL_OPTAB_FN (MASK_LEN_GATHER_LOAD, ECF_PURE,
                        mask_len_gather_load, gather_load)
+DEF_INTERNAL_OPTAB_FN (MASK_LEN_STRIDED_LOAD, ECF_PURE,
+                       mask_len_strided_load, gather_load)

 DEF_INTERNAL_OPTAB_FN (LEN_LOAD, ECF_PURE, len_load, len_load)
 DEF_INTERNAL_OPTAB_FN (MASK_LEN_LOAD, ECF_PURE, mask_len_load, mask_len_load)
@@ -208,6 +210,8 @@ DEF_INTERNAL_OPTAB_FN (MASK_SCATTER_STORE, 0,
                        mask_scatter_store, scatter_store)
 DEF_INTERNAL_OPTAB_FN (MASK_LEN_SCATTER_STORE, 0,
                        mask_len_scatter_store, scatter_store)
+DEF_INTERNAL_OPTAB_FN (MASK_LEN_STRIDED_STORE, 0,
+                       mask_len_strided_store, scatter_store)

 DEF_INTERNAL_OPTAB_FN (MASK_STORE, 0, maskstore, mask_store)
 DEF_INTERNAL_OPTAB_FN (STORE_LANES, ECF_CONST, vec_store_lanes, store_lanes)
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 99de13a0199..d25925b9a10 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -235,6 +235,7 @@ extern bool can_interpret_as_conditional_op_p (gimple *, tree *,
 extern bool internal_load_fn_p (internal_fn);
 extern bool internal_store_fn_p (internal_fn);
 extern bool internal_gather_scatter_fn_p (internal_fn);
+extern bool internal_strided_fn_p (internal_fn);
 extern int internal_fn_mask_index (internal_fn);
 extern int internal_fn_len_index (internal_fn);
 extern int internal_fn_stored_value_index (internal_fn);
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 2ccbe4197b7..3d85ac5f678 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -98,9 +98,11 @@ OPTAB_CD(mask_len_store_optab, "mask_len_store$a$b")
 OPTAB_CD(gather_load_optab, "gather_load$a$b")
 OPTAB_CD(mask_gather_load_optab, "mask_gather_load$a$b")
 OPTAB_CD(mask_len_gather_load_optab, "mask_len_gather_load$a$b")
+OPTAB_CD(mask_len_strided_load_optab, "mask_len_strided_load$a$b")
 OPTAB_CD(scatter_store_optab, "scatter_store$a$b")
 OPTAB_CD(mask_scatter_store_optab, "mask_scatter_store$a$b")
 OPTAB_CD(mask_len_scatter_store_optab, "mask_len_scatter_store$a$b")
+OPTAB_CD(mask_len_strided_store_optab, "mask_len_strided_store$a$b")
 OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
 OPTAB_CD(vec_init_optab, "vec_init$a$b")
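
For reviewers, the semantics of the two internal functions described in the commit message and in the md.texi text can be sketched as a scalar reference model. This is only an illustration, not part of the patch: the names model_strided_load and model_strided_store are hypothetical, the element type is fixed to 32 bits, and the stride is assumed to be a byte distance, which the documentation in this patch does not state explicitly. Elements that are masked off or lie at or beyond len + bias are left untouched here, which is one permitted outcome for elements the documentation calls undefined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of
     v = mask_len_strided_load (ptr, stride, mask, len, bias)
   for 32-bit elements.  STRIDE is assumed to be in bytes.  */
static void
model_strided_load (int32_t *v, const void *ptr, ptrdiff_t stride,
                    const bool *mask, int len, int bias)
{
  assert (len + bias >= 0);
  for (int i = 0; i < len + bias; i++)
    if (mask[i])
      /* Active element: read from ptr + i * stride.  */
      memcpy (&v[i], (const char *) ptr + (ptrdiff_t) i * stride,
              sizeof v[i]);
}

/* Hypothetical scalar model of
     mask_len_strided_store (ptr, stride, v, mask, len, bias)
   under the same assumptions.  */
static void
model_strided_store (void *ptr, ptrdiff_t stride, const int32_t *v,
                     const bool *mask, int len, int bias)
{
  assert (len + bias >= 0);
  for (int i = 0; i < len + bias; i++)
    if (mask[i])
      /* Active element: write to ptr + i * stride.  */
      memcpy ((char *) ptr + (ptrdiff_t) i * stride, &v[i],
              sizeof v[i]);
}
```

With a stride of 2 * sizeof (int32_t), the load model reads every other element of an int32_t array, skipping any index whose mask bit is clear.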