From patchwork Tue Dec 12 08:28:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 177175
From: pan2.li@intel.com
To:
gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com, kito.cheng@gmail.com
Subject: [PATCH v1] RISC-V: Disable RVV VCOMPRESS avl propagation
Date: Tue, 12 Dec 2023 16:28:49 +0800
Message-Id: <20231212082849.1845268-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1

From: Pan Li

This patch disables AVL propagation for vcompress, for the following
reason.

According to the ISA, vcompress extracts and packs the first vl elements
of vector register group vs2.  The mask may select the highest element of
vs2, but AVL propagation can shrink vl so that this element is no longer
covered.

For example, given original vl = 4 here.  We have:

  v0 = 0b1000
  v1 = {0x1, 0x2, 0x3, 0x4}
  v2 = {0x5, 0x6, 0x7, 0x8}

Then:

  vcompress v1, v2, v0 (avl = 4), v1 = {0x8, 0x2, 0x3, 0x4}. <== Correct.
  vcompress v1, v2, v0 (avl = 2), v1 will be unchanged.      <== Wrong.

Therefore, we cannot propagate the AVL of vcompress because doing so may
change the semantics of the result.

This patch also fixes the failure of gcc.c-torture/execute/990128-1.c for
the following configurations.

  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
  riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
  riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax

gcc/ChangeLog:

	* config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p): Disable
	the AVL propagation for vcompress.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-avlprop.cc             | 35 ++++++++++++------
 .../rvv/autovec/binop/vcompress-avlprop-1.c   | 36 +++++++++++++++++++
 2 files changed, 61 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c

diff --git a/gcc/config/riscv/riscv-avlprop.cc b/gcc/config/riscv/riscv-avlprop.cc
index 02f006742f1..a6159816cf7 100644
--- a/gcc/config/riscv/riscv-avlprop.cc
+++ b/gcc/config/riscv/riscv-avlprop.cc
@@ -113,19 +113,34 @@ avl_can_be_propagated_p (rtx_insn *rinsn)
      touching the element with i > AVL.  So, we don't do AVL propagation
      on these following situations:
 
-       - The index of "vrgather dest, source, index" may pick up the
-         element which has index >= AVL, so we can't strip the elements
-         that has index >= AVL of source register.
-       - The last element of vslide1down is AVL + 1 according to RVV ISA:
-         vstart <= i < vl-1    vd[i] = vs2[i+1] if v0.mask[i] enabled
-       - The last multiple elements of vslidedown can be the element
-         has index >= AVL according to RVV ISA:
-         0 <= i+OFFSET < VLMAX   src[i] = vs2[i+OFFSET]
-         vstart <= i < vl vd[i] = src[i] if v0.mask[i] enabled.  */
+       vgather:
+         - The index of "vrgather dest, source, index" may pick up the
+           element which has index >= AVL, so we can't strip the elements
+           that has index >= AVL of source register.
+       vslide1down:
+         - The last element of vslide1down is AVL + 1 according to RVV ISA:
+           vstart <= i < vl-1    vd[i] = vs2[i+1] if v0.mask[i] enabled
+         - The last multiple elements of vslidedown can be the element
+           has index >= AVL according to RVV ISA:
+           0 <= i+OFFSET < VLMAX   src[i] = vs2[i+OFFSET]
+           vstart <= i < vl vd[i] = src[i] if v0.mask[i] enabled.
+       vcompress:
+         - According to the ISA, the first vl elements of vector register
+           group vs2 should be extracted and packed for vcompress.  And the
+           highest element of vs2 vector may be selected by the mask.  For
+           example, given vlmax = 4 here.
+             v0 = 0b1000
+             v1 = {0x1, 0x2, 0x3, 0x4}
+             v2 = {0x5, 0x6, 0x7, 0x8}
+             vcompress v1, v2, v0 with avl = 4, v1 = {0x8, 0x2, 0x3, 0x4}.
+             vcompress v1, v2, v0 with avl = 2, v1 will be unchanged.
+           Thus, we cannot propagate the AVL of vcompress because doing so
+           may change the semantics of the result.  */
   return get_attr_type (rinsn) != TYPE_VGATHER
          && get_attr_type (rinsn) != TYPE_VSLIDEDOWN
          && get_attr_type (rinsn) != TYPE_VISLIDE1DOWN
-         && get_attr_type (rinsn) != TYPE_VFSLIDE1DOWN;
+         && get_attr_type (rinsn) != TYPE_VFSLIDE1DOWN
+         && get_attr_type (rinsn) != TYPE_VCOMPRESS;
 }
 
 static bool
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c
new file mode 100644
index 00000000000..43f79fe3b7b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d -O3 --param=riscv-autovec-preference=fixed-vlmax -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#define MAX 10
+
+struct s { struct s *n; } *p;
+struct s ss;
+struct s sss[MAX];
+
+/*
+** build_linked_list:
+** ...
+** vsetivli\s+zero,\s*8,\s*e64,\s*m1,\s*ta,\s*ma
+** ...
+** vcompress\.vm\s+v[0-9]+,\s*v[0-9]+,\s*v0
+** ...
+** vcompress\.vm\s+v[0-9]+,\s*v[0-9]+,\s*v0
+** vsetivli\s+zero,\s*2,\s*e64,\s*m1,\s*ta,\s*ma
+** ...
+*/
+void
+build_linked_list ()
+{
+  int i;
+  struct s *next;
+
+  p = &ss;
+  next = p;
+
+  for (i = 0; i < MAX; i++) {
+      next->n = &sss[i];
+      next = next->n;
+  }
+
+  next->n = 0;
+}
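
For reference, the vcompress example from the commit message can be reproduced with a small scalar model. This is only an illustrative sketch and not GCC or simulator code: model_vcompress is a hypothetical helper, the mask register is passed as a plain integer bitmask, and destination elements beyond the packed ones are assumed to be left undisturbed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of "vcompress.vm vd, vs2, v0" (not a GCC API).
   Only the first VL elements of VS2 and the first VL mask bits are
   examined; selected elements are packed into VD starting at index 0.
   Assumption: remaining elements of VD are left undisturbed.  */
static void
model_vcompress (uint64_t *vd, const uint64_t *vs2, unsigned mask, size_t vl)
{
  /* The mask is modeled as one unsigned integer, so VL is bounded by
     its bit width.  */
  assert (vl <= 8 * sizeof (mask));

  size_t out = 0;
  for (size_t i = 0; i < vl; i++)
    if ((mask >> i) & 1)
      vd[out++] = vs2[i];
}
```

With v1 = {0x1, 0x2, 0x3, 0x4}, v2 = {0x5, 0x6, 0x7, 0x8} and mask 0b1000, the model gives v1 = {0x8, 0x2, 0x3, 0x4} for vl = 4 and leaves v1 untouched for vl = 2, because mask bit 3 falls outside the first two elements. That is the semantic change the patch avoids.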