From patchwork Mon Dec 11 09:47:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sergei Lewis X-Patchwork-Id: 176564 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp6932151vqy; Mon, 11 Dec 2023 01:48:21 -0800 (PST) X-Google-Smtp-Source: AGHT+IFHO4WuRInvJIDDQd41jfQ3oQ7i5y8U9pNa0x6ISYKhZ5OjOKvqYS3x11vKVGJMEgNNOxS+ X-Received: by 2002:a05:622a:82:b0:425:93d0:8267 with SMTP id o2-20020a05622a008200b0042593d08267mr5860659qtw.48.1702288101523; Mon, 11 Dec 2023 01:48:21 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1702288101; cv=pass; d=google.com; s=arc-20160816; b=PlLVh3yHhmsF/+ROX5d1RAXUZqsZ8pn92VHxSeMnDz/IAHDc4Qd8fTXBYH9UWiGCw9 GHedXFbIxjk7fCEdI5l51J+j1Pe8KPOVqK9JupG1Fc730ZZ+h55TGA7+18ALgejytlyE HlFk22zfeDJIrJLeLWhHk4eowjcXCar+vcPhKn3/N/YnMou7RPX0H4HQ0pzwVSv3DLam 5M/OmAepC/dgLJqkLRUNWTHgN/KuktTeT/HCHGWa1GG5upn81rB3VF2pY1v7pphpVqWS M2AFnF8LIYnULiyzmNkwft+QeTrX+vWJIL4WG/ljCjQWEq5vfGKEE6dGxzZcXKSXR8ru 2SXQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=UMYAYhj0juR5BhwJDXUd2WPJyHBggrXYqN+rD8C7fSQ=; fh=hPrbWPhweUx4V0GV9uXJqbyAzg2ABmTz7kczrAQqMmM=; b=DGZOsoaMjbPpFRuFI+i5I1CwZ/frLDf5BGDYbux9G39EvMlA6Ncq9HSeJ99cSoqvzs BjXXMcCp+hKAt9g4zM44/Qfrd9RY2HxOKYOUkhwBwjBygXyFWupuF5gtFNeKCcbbiG8Z TKA7aPP7m4ZgccYPZkrzlqJp9b89OXoffTvsr+BzA3ggvAaY/oddN1QGJoou2ppwKr/K 8sfI07Uv36667qsECigNjT8xavvF9fCREiIsg0W7UcZJ5tjvVYdZdkCoR2RWPG6g4bvM Ia73UHXFq8JdJOMyo861uGKTK995MyWRvLlv9lo0+X/mjwOGrqoDEZVgP3LcYWrZwL9p QyCg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=gwDFJI9F; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id h12-20020ac8514c000000b004258582ed33si7425405qtn.692.2023.12.11.01.48.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 01:48:21 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=gwDFJI9F; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D2246385AC1A for ; Mon, 11 Dec 2023 09:48:16 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lf1-x130.google.com (mail-lf1-x130.google.com [IPv6:2a00:1450:4864:20::130]) by sourceware.org (Postfix) with ESMTPS id A243A3858C98 for ; Mon, 11 Dec 2023 09:47:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A243A3858C98 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A243A3858C98 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::130 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702288065; cv=none; b=tyDSRuYF3/UV5LVpIifX8314Lmf34HxZJBRCf+zlWbqBLaYseJ9/oVQ5mQUoBTbqkuufM/Nt5aP9Lr9p6kUCNGk15BD/VRJjtuX6MiGdQqGRPEpZljRG/N6Pky7zm3IA/5bFM2+PrJ15Wlzs8E3K3VTImMxEW1DePlp3iph9FsE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702288065; c=relaxed/simple; bh=EQxveASLDB5n1tRzibO1gyG7GUMyfPNHlBz4f1xYJDM=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=O3luZ3YY2P1KYWgIey+h5OXor7oxYQMHfyjOl3vOFJnsDJ/ettkSMP+viN551vhyI8JISUExfq4hLAHV7rlVfOQm86sdqLuMNUHPQ9H+xxJNEc9nEMRkKLFgqmHQkJHk67H5x5NpiFCe0IYYWiA9ag19sJ/A7k2iZAx4su1Cuzc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x130.google.com with SMTP id 2adb3069b0e04-50bfd8d5c77so4749460e87.1 for ; Mon, 11 Dec 2023 01:47:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1702288061; x=1702892861; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=UMYAYhj0juR5BhwJDXUd2WPJyHBggrXYqN+rD8C7fSQ=; b=gwDFJI9F18GD2rzmt0VXKV2GK91TL0ccr/x+Rlo+eUbnNiCiRZGOo4QCkzwwqMkFrU UfDhOjs54HLaxHi1xBLLYr7Rbgo5Ua64lRDaXdBFtK7YUTh+XlLy3+YvNRjF8QiByy9T GoHzDnqJnCpkDrxQoQ2ntxSfqffXk60u498N1M9E4U/fdWnAdEe+vQ2TZvafyRIYpaUK 2eiKxdYNLMGtK5Byr2zuRsopI96Y67oLAXMWphSGhBi30qtH26cSJem46vfONfaDoYTO 2/wNn5YEjXl5aOhCRGrSHy4ukkFwDNlmv4MIaAYjJ89EmtICzfskQx9ozpD0EHZI9OpV KfPw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702288061; x=1702892861; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UMYAYhj0juR5BhwJDXUd2WPJyHBggrXYqN+rD8C7fSQ=; b=aEtmv4zXL/iwDbqmhmXAeC8S/+j8LyfCkhvFnD9S+s8HrwsDmOVqv8w63f36KRaKRL IvUhQZLmsbgsuldoM7c133fM1VDtGprZO+50mgHxKrKrPDfUdi5kjO1Wlb1aAHCEiC5g w1zIZ3e89O+kLYHH2m64nsZY+IfCOY4mgWubzp4XL4/DHaRGfePTCg0YufCEdpc5pfkc shb10WvMeKghQ+y0kCQ1NhEFs11xcrxeiKOtsTBjIfhRjXssZOw9oJ2xibutPRRwkEVO Op/xjJH8/ua7e/Y69DsOvwhsc6j6OkZkgSDg0c8jz+qVo42EUmE+ej6McabZSFQ288ki lzXA== X-Gm-Message-State: AOJu0YzKD5hPnZRUXEBrApcilus9Q5TFDFzbFnwUDg6g3r7Ul+JVLn8g 2D2ih5RFYqZWQvl77Xc24YfDeTk7fI6ejcgY0GH7/Q== X-Received: by 2002:a05:6512:6cb:b0:50b:fbdf:de7d with SMTP id u11-20020a05651206cb00b0050bfbdfde7dmr1038789lff.154.1702288061160; Mon, 11 Dec 2023 01:47:41 -0800 (PST) Received: from slewis-laptop.ba.rivosinc.com ([51.52.155.69]) by smtp.gmail.com with ESMTPSA id a16-20020adffad0000000b003333b8eb84fsm8128298wrs.113.2023.12.11.01.47.40 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 01:47:40 -0800 (PST) From: Sergei Lewis To: gcc-patches@gcc.gnu.org Subject: [PATCH 1/3] RISC-V: movmem for RISCV with V extension Date: Mon, 11 Dec 2023 09:47:26 +0000 Message-Id: <20231211094728.1623032-2-slewis@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231211094728.1623032-1-slewis@rivosinc.com> References: <20231211094728.1623032-1-slewis@rivosinc.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784978448653246015 X-GMAIL-MSGID: 1784978448653246015 gcc/ChangeLog * config/riscv/riscv.md (movmem): Use riscv_vector::expand_block_move, if and only if we know the entire operation can be performed using one vector load followed by one vector store gcc/testsuite/ChangeLog * gcc.target/riscv/rvv/base/movmem-1.c: New test --- gcc/config/riscv/riscv.md | 22 +++++++ .../gcc.target/riscv/rvv/base/movmem-1.c | 59 +++++++++++++++++++ 2 files changed, 81 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index eed997116b0..88fde290a8a 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -2359,6 +2359,28 @@ FAIL; }) +;; inlining general memmove is a pessimisation: we can't avoid having to decide +;; which direction to go at runtime, which is costly in instruction count +;; however for situations where the entire move fits in one vector operation +;; we can do all reads before doing any writes so we don't have to worry +;; so generate the inline vector code in such situations +;; nb. prefer scalar path for tiny memmoves +(define_expand "movmem" + [(parallel [(set (match_operand:BLK 0 "general_operand") + (match_operand:BLK 1 "general_operand")) + (use (match_operand:P 2 "")) + (use (match_operand:SI 3 "const_int_operand"))])] + "TARGET_VECTOR" +{ + if ((INTVAL (operands[2]) >= TARGET_MIN_VLEN/8) + && (INTVAL (operands[2]) <= TARGET_MIN_VLEN) + && riscv_vector::expand_block_move (operands[0], operands[1], + operands[2])) + DONE; + else + FAIL; +}) + ;; Expand in-line code to clear the instruction cache between operand[0] and ;; operand[1]. (define_expand "clear_cache" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c new file mode 100644 index 00000000000..b930241ae5d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c @@ -0,0 +1,59 @@ +/* { dg-do compile } */ +/* { dg-add-options riscv_v } */ +/* { dg-additional-options "-O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define MIN_VECTOR_BYTES (__riscv_v_min_vlen/8) + +/* tiny memmoves should not be vectorised +** f1: +** li\s+a2,15 +** tail\s+memmove +*/ +char * f1 (char *a, char const *b) +{ + return memmove (a, b, 15); +} + +/* vectorise+inline minimum vector register width with LMUL=1 +** f2: +** ( +** vsetivli\s+zero,16,e8,m1,ta,ma +** | +** li\s+[ta][0-7],\d+ +** vsetvli\s+zero,[ta][0-7],e8,m1,ta,ma +** ) +** vle8\.v\s+v\d+,0\(a1\) +** vse8\.v\s+v\d+,0\(a0\) +** ret +*/ +char * f2 (char *a, char const *b) +{ + return memmove (a, b, MIN_VECTOR_BYTES); +} + +/* vectorise+inline up to LMUL=8 +** f3: +** li\s+[ta][0-7],\d+ +** vsetvli\s+zero,[ta][0-7],e8,m8,ta,ma +** vle8\.v\s+v\d+,0\(a1\) +** vse8\.v\s+v\d+,0\(a0\) +** ret +*/ +char * f3 (char *a, char const *b) +{ + return memmove (a, b, MIN_VECTOR_BYTES*8); +} + +/* don't vectorise if the move is too large for one operation +** f4: +** li\s+a2,\d+ +** tail\s+memmove +*/ +char * f4 (char *a, char const *b) +{ + return memmove (a, b, MIN_VECTOR_BYTES*8+1); +} + From patchwork Mon Dec 11 09:47:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sergei Lewis X-Patchwork-Id: 176565 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp6932152vqy; Mon, 11 Dec 2023 01:48:21 -0800 (PST) X-Google-Smtp-Source: AGHT+IFPxzwKhwct0hPqaNrDRCR4yzZP/rq8uss15njYr5chdxgAArAubmNA1K2oQpH4fReEeIxC X-Received: by 2002:a05:622a:118f:b0:425:77ca:5798 with SMTP id m15-20020a05622a118f00b0042577ca5798mr7190244qtk.102.1702288101610; Mon, 11 Dec 2023 01:48:21 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1702288101; cv=pass; d=google.com; s=arc-20160816; b=aFJIHOE8CRAmvdN7dmQ1DSEIwMZtvZbUq53vV4alfoTVU5RGKanTye2n7k+zXNnhdY xnGAm00tYnSM/vdTL6GZ+w7Ztr3ewo7LI2rAz484kCJU6iZcoTH3zqdskpfgCy9JTtR9 QxgzQuqMXVf1pscJiNRrjNsuHdjMpViH074R4HLQKX8dfZxsv+AVfTAIg3qnkQSvhppU +5ByOFGNCWn38Oqd1sVZ3c4FcB+i0tlGHTtC85+agZlnP85RoBfdYBGksLw2Q1gQc/mm qy/oA5VSM/yInBwN/irPSoeNs5wfw9RPK0GugVk3u+3lmmTwRCzOpRv5qsg1dTQb74dU KFSg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=DLJtSj3eJtLuwHT2ondHly641hqYKY4Kj6abchZpEhg=; fh=hPrbWPhweUx4V0GV9uXJqbyAzg2ABmTz7kczrAQqMmM=; b=jyWcv+yCkjhHXOGJ/zMcSZlAx3ATbcJccisx+FYttZ+jCa38eJQByD0VqkFb1K60F+ O0hTxW5ysfhHivVP5/1H/xtwK3HjAJM74L5bO78B/SSTykY481aBMqu6MFZjIH0FGi62 gTladJfLWMJZbtZMBVGguRIkz2hKyx+dfgYWHomVDiVx+37fMdOZvw9tEaB2gfCqWErw aT2WxSn18zWJkDmNkp8BgysU4ngANv4u4M1J03fnub4sAAMFDPpoGlrxXZmIZ2dm/HVM 0BIEGXnswurynpv53tb784fxLC4rj6AYj06lVmqgDk//C0U3BSXJt1zsTBOZrPAA+phg LFig== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=eQT1UcnQ; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id s29-20020a05622a1a9d00b004255266675dsi8229959qtc.393.2023.12.11.01.48.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 01:48:21 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=eQT1UcnQ; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E912638582B6 for ; Mon, 11 Dec 2023 09:48:16 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by sourceware.org (Postfix) with ESMTPS id 3B8493858CD1 for ; Mon, 11 Dec 2023 09:47:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3B8493858CD1 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 3B8493858CD1 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::336 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702288067; cv=none; b=PMAzEeGpDHjPrZgwRYbjeSiBI4tm6EBtcLmKPDkGdM99YkDu4lC6Fe+6PHLkNgAdzonTXyMXbe0TT8qsVzkJz43viTfcfpTkgPXPQPIg480Zz59nyXk94jC5xEAAi7L7TCsVCKg176PmZm6eu/IhsQRdLAOyx/pCz+pGA/hFzFQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702288067; c=relaxed/simple; bh=UWN7OBjUk3BPy1dGYa78YGbtk+wFYQ1l6ANbHDZByDg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Uihk/b2Q07UDjXzICpkuNLeEwFjh1tCSsASZ7iZ8FqRu5z3vk2/m8r/MHqEhTMNEwlkDFPzWCwNAvIINpcgmqWpbcC4hWy7TEqEH8TpnWLcHfptVL6A7tHjM6EbV1QQoTL9hYcXyH0IiqaeurVTekhHpgHxFx3FVA0GZA7duUcg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-x336.google.com with SMTP id 5b1f17b1804b1-40c317723a8so33439145e9.3 for ; Mon, 11 Dec 2023 01:47:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1702288063; x=1702892863; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=DLJtSj3eJtLuwHT2ondHly641hqYKY4Kj6abchZpEhg=; b=eQT1UcnQBGswRW1eTwFtZM5CJ1Z4/LInjRexT1rTOPc3YVqbStd5tRxdhRFWJIJwf9 dlIFlU//Jq8AZ0agJW8do8scByDVIyd8qoGitkR2ukJnMI5Ep5OVCM2gRY8ZYWxPLbvL NmE1V9fUAgLzYCx+Lgq9Lhg9yQbJiot4hwmtbrW0+9630lpU2dGZ1vjuc1SjxtZurVFY Wc/I1PgtnvWekPxnHObryjkNIMCpTgfV9YS0P206BGWdN+fQd+Z1AEtgm4oZlNeV2sw/ dy+yt73s/mKJkuDkJPQALlTKslqG4IAUpHr5N6HQP1gvmZrRZzuiIb467MTCp4vAO0yt ExkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702288063; x=1702892863; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DLJtSj3eJtLuwHT2ondHly641hqYKY4Kj6abchZpEhg=; b=rpcNe/b50dgP8YW67jF8oG77aHxgQD0HZe0VeNduEBJleCXOWd9G08Zx3DkFBREXp4 8SXUTAUlML8lMrSjdtINvzpN7EK3bZcLBZi3lTBqEf7HGiMnrk4ye3x934HvlqaBjYwL 5flM+teF8QruFxYNu73xOo6u0x84pkjNOxkzKzsyXQTWRAUlhgk16l3hpWEzSHp2u57X oaDQo5jo6nteprzaaje/G5gz/8Zx5a1AhnFldv8g4NIUtBtxVEeknGCrS4Rd0E7px37K FVx1hJLu70HqvrCFUL/n39CQr4G6VGS9t5KrwI8x2hYGRslhsxVEWuAPCYmrW2N6fM/w gv5Q== X-Gm-Message-State: AOJu0YzRAcjoto+d2hhcJ/kUmM/NdfBUeubnARAwrCzl0l9yXFFDhax2 34/8SDSQ+zXck1TOZX/KGtiw2IjzZ24XBSTV03NY7g== X-Received: by 2002:a05:600c:35c5:b0:40c:2a69:6c2d with SMTP id r5-20020a05600c35c500b0040c2a696c2dmr1997702wmq.163.1702288062729; Mon, 11 Dec 2023 01:47:42 -0800 (PST) Received: from slewis-laptop.ba.rivosinc.com ([51.52.155.69]) by smtp.gmail.com with ESMTPSA id a16-20020adffad0000000b003333b8eb84fsm8128298wrs.113.2023.12.11.01.47.41 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 01:47:41 -0800 (PST) From: Sergei Lewis To: gcc-patches@gcc.gnu.org Subject: [PATCH 2/3] RISC-V: setmem for RISCV with V extension Date: Mon, 11 Dec 2023 09:47:27 +0000 Message-Id: <20231211094728.1623032-3-slewis@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231211094728.1623032-1-slewis@rivosinc.com> References: <20231211094728.1623032-1-slewis@rivosinc.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784978448196490878 X-GMAIL-MSGID: 1784978448196490878 gcc/ChangeLog * config/riscv/riscv-protos.h (riscv_vector::expand_vec_setmem): New function declaration. * config/riscv/riscv-string.cc (riscv_vector::expand_vec_setmem): New function: this generates an inline vectorised memory set, if and only if we know the entire operation can be performed in a single vector store * config/riscv/riscv.md (setmem): Try riscv_vector::expand_vec_setmem for constant lengths gcc/testsuite/ChangeLog * gcc.target/riscv/rvv/base/setmem-1.c: New tests --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-string.cc | 82 +++++++++++++++ gcc/config/riscv/riscv.md | 14 +++ .../gcc.target/riscv/rvv/base/setmem-1.c | 99 +++++++++++++++++++ 4 files changed, 196 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/setmem-1.c diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 20bbb5b859c..950cb65c910 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -560,6 +560,7 @@ void expand_popcount (rtx *); void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false); bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool); void emit_vec_extract (rtx, rtx, poly_int64); +bool expand_vec_setmem (rtx, rtx, rtx, rtx); /* Rounding mode bitfield for fixed point VXRM. */ enum fixed_point_rounding_mode diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc index 11c1f74d0b3..0abbd5f8b28 100644 --- a/gcc/config/riscv/riscv-string.cc +++ b/gcc/config/riscv/riscv-string.cc @@ -1247,4 +1247,86 @@ expand_strcmp (rtx result, rtx src1, rtx src2, rtx nbytes, return true; } + +/* Select appropriate LMUL for a single vector operation based on + byte size of data to be processed. + On success, return true and populate lmul_out. + If length_in is too wide for a single vector operation, return false + and leave lmul_out unchanged. */ + +static bool +select_appropriate_lmul (HOST_WIDE_INT length_in, + HOST_WIDE_INT &lmul_out) +{ + /* if it's tiny, default operation is likely better; maybe worth + considering fractional lmul in the future as well. */ + if (length_in < (TARGET_MIN_VLEN/8)) + return false; + + /* find smallest lmul large enough for entire op. */ + HOST_WIDE_INT lmul = 1; + while ((lmul <= 8) && (length_in > ((lmul*TARGET_MIN_VLEN)/8))) + { + lmul <<= 1; + } + + if (lmul > 8) + return false; + + lmul_out = lmul; + return true; +} + +/* Used by setmemdi in riscv.md. */ +bool +expand_vec_setmem (rtx dst_in, rtx length_in, rtx fill_value_in, + rtx alignment_in) +{ + /* we're generating vector code. */ + if (!TARGET_VECTOR) + return false; + /* if we can't reason about the length, let libc handle the operation. */ + if (!CONST_INT_P (length_in)) + return false; + + HOST_WIDE_INT length = INTVAL (length_in); + HOST_WIDE_INT lmul; + + /* select an lmul such that the data just fits into one vector operation; + bail if we can't. */ + if (!select_appropriate_lmul (length, lmul)) + return false; + + machine_mode vmode = riscv_vector::get_vector_mode (QImode, + BYTES_PER_RISCV_VECTOR * lmul).require (); + rtx dst_addr = copy_addr_to_reg (XEXP (dst_in, 0)); + rtx dst = change_address (dst_in, vmode, dst_addr); + + rtx fill_value = gen_reg_rtx (vmode); + rtx broadcast_ops[] = {fill_value, fill_value_in}; + + /* If the length is exactly vlmax for the selected mode, do that. + Otherwise, use a predicated store. */ + if (known_eq (GET_MODE_SIZE (vmode), INTVAL (length_in))) + { + emit_vlmax_insn (code_for_pred_broadcast (vmode), + UNARY_OP, broadcast_ops); + emit_move_insn (dst, fill_value); + } + else + { + if (!satisfies_constraint_K (length_in)) + length_in= force_reg (Pmode, length_in); + emit_nonvlmax_insn (code_for_pred_broadcast (vmode), UNARY_OP, + broadcast_ops, length_in); + machine_mode mask_mode = riscv_vector::get_vector_mode + (BImode, GET_MODE_NUNITS (vmode)).require (); + rtx mask = CONSTM1_RTX (mask_mode); + emit_insn (gen_pred_store (vmode, dst, mask, fill_value, length_in, + get_avl_type_rtx (riscv_vector::NONVLMAX))); + } + + return true; +} + } diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 88fde290a8a..29d3b1aa342 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -2381,6 +2381,20 @@ FAIL; }) +(define_expand "setmemsi" + [(set (match_operand:BLK 0 "memory_operand") ;; Dest + (match_operand:QI 2 "nonmemory_operand")) ;; Value + (use (match_operand:SI 1 "const_int_operand")) ;; Length + (match_operand:SI 3 "const_int_operand")] ;; Align + "TARGET_VECTOR" +{ + if (riscv_vector::expand_vec_setmem (operands[0], operands[1], operands[2], + operands[3])) + DONE; + else + FAIL; +}) + ;; Expand in-line code to clear the instruction cache between operand[0] and ;; operand[1]. (define_expand "clear_cache" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/setmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/setmem-1.c new file mode 100644 index 00000000000..d1a5ff002a9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/setmem-1.c @@ -0,0 +1,99 @@ +/* { dg-do compile } */ +/* { dg-add-options riscv_v } */ +/* { dg-additional-options "-O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define MIN_VECTOR_BYTES (__riscv_v_min_vlen/8) + +/* tiny memsets should use scalar ops +** f1: +** sb\s+a1,0\(a0\) +** ret +*/ +void * f1 (void *a, int const b) +{ + return memset (a, b, 1); +} + +/* tiny memsets should use scalar ops +** f2: +** sb\s+a1,0\(a0\) +** sb\s+a1,1\(a0\) +** ret +*/ +void * f2 (void *a, int const b) +{ + return memset (a, b, 2); +} + +/* tiny memsets should use scalar ops +** f3: +** sb\s+a1,0\(a0\) +** sb\s+a1,1\(a0\) +** sb\s+a1,2\(a0\) +** ret +*/ +void * f3 (void *a, int const b) +{ + return memset (a, b, 3); +} + +/* vectorise+inline minimum vector register width with LMUL=1 +** f4: +** ( +** vsetivli\s+zero,\d+,e8,m1,ta,ma +** | +** li\s+a\d+,\d+ +** vsetvli\s+zero,a\d+,e8,m1,ta,ma +** ) +** vmv\.v\.x\s+v\d+,a1 +** vse8\.v\s+v\d+,0\(a0\) +** ret +*/ +void * f4 (void *a, int const b) +{ + return memset (a, b, MIN_VECTOR_BYTES); +} + +/* vectorised code should use smallest lmul known to fit length +** f5: +** ( +** vsetivli\s+zero,\d+,e8,m2,ta,ma +** | +** li\s+a\d+,\d+ +** vsetvli\s+zero,a\d+,e8,m2,ta,ma +** ) +** vmv\.v\.x\s+v\d+,a1 +** vse8\.v\s+v\d+,0\(a0\) +** ret +*/ +void * f5 (void *a, int const b) +{ + return memset (a, b, MIN_VECTOR_BYTES+1); +} + +/* vectorise+inline up to LMUL=8 +** f6: +** li\s+a\d+,\d+ +** vsetvli\s+zero,a\d+,e8,m8,ta,ma +** vmv\.v\.x\s+v\d+,a1 +** vse8\.v\s+v\d+,0\(a0\) +** ret +*/ +void * f6 (void *a, int const b) +{ + return memset (a, b, MIN_VECTOR_BYTES*8); +} + +/* don't vectorise if the move is too large for one operation +** f7: +** li\s+a2,\d+ +** tail\s+memset +*/ +void * f7 (void *a, int const b) +{ + return memset (a, b, MIN_VECTOR_BYTES*8+1); +} + From patchwork Mon Dec 11 09:47:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sergei Lewis X-Patchwork-Id: 176566 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp6932315vqy; Mon, 11 Dec 2023 01:48:47 -0800 (PST) X-Google-Smtp-Source: AGHT+IF9GAC3Gux+pljuJzxFJRyxdoRQkthPRPkM67ScgyttGtMCnGBPUxZtsFUs21xKFpxNv2bS X-Received: by 2002:a05:6830:16c5:b0:6cd:a989:c7ea with SMTP id l5-20020a05683016c500b006cda989c7eamr3488111otr.16.1702288126847; Mon, 11 Dec 2023 01:48:46 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1702288126; cv=pass; d=google.com; s=arc-20160816; b=c7uiHWG8K3aEzBrqbyI1qHo2WKtJx+47EvYiiNvHMANDiPzL5FMxlHmJKCK10E1niv 4EF7C67W4au2EUZ5QLp3TPsyZNFX6zu1bgEwjHcVNVcc5Ss3vSklS180QbDUuCNivRxj OCDEQUGmrwNuZxVvSqwQOly2owS9eefmGPajC/7IChvWEY8aLuY8atC2E7xfeRj+Dfyx G1p5lAdTLP+k1powv8jxrXW8lOrq2ZKLL5zREVSxcb/Jbl9/cGO5Vno8rxJRbfjkueHE oMFo+2hfz/7ZkOFrKe4w4zgHLgTfcsUGK43QKWUph+VNOoKXldccfL561/0jTHIvq1Ii E9jA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature:arc-filter:dmarc-filter:delivered-to; bh=+vNnEJyT+71zNWqBrK7LEjZc5a8gaFOso0RLnK5N8DY=; fh=hPrbWPhweUx4V0GV9uXJqbyAzg2ABmTz7kczrAQqMmM=; b=Dl5LZgpx69i5BGzeVL4YJbOsspIxcthL7WeYdEkNNXgUmkD8aC148DL0casZQd0mmP rDU0htWjeTAxp9AQ9Rt5xBkDqydCH6Ry35YbQ4WqrW/zd4IaMLc5Ib3GRec7fdF59WXr hBI5Xi/af/7ywWCrEOeKD8Igm1QRx2FhFCxHiMsnlOD/V5+5FgNXtM93mjkQaDDlCMSE an/KQAvtk7+YkzK9n9ytdRWp9ZfwjyyBmwzCCmfOXQmqdzK0IfD/oREFo64BAAXQoRUV uI9ogS2DT/JEtAYUQFA/HvyONxr5ZW9oCBwsQBIkrADN1xTMgc6JndIbRjxJSoX1GlQi MXPQ== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=HitZyh66; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id do2-20020a05622a478200b0042393d2035esi7745877qtb.344.2023.12.11.01.48.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 01:48:46 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=HitZyh66; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 905F13858C5E for ; Mon, 11 Dec 2023 09:48:46 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by sourceware.org (Postfix) with ESMTPS id DFCBA3858C53 for ; Mon, 11 Dec 2023 09:47:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DFCBA3858C53 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DFCBA3858C53 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::434 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702288068; cv=none; b=XxK3yEtG8D/tJAQn1PdkmIvsN2rMP/DC86LmOKszLbzD7lLU00K7DFiuMSlTgzIADmzF8kTfBcUqUoqGaTaxy/CvYCMgXe9hx7+3orInBdacYNU/xwtLdJAZrxi5Usn9GCrRVo6IJTYEy4EfgA3MROvY2gmI8I//RLAgktksR04= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702288068; c=relaxed/simple; bh=IBQNzaJYngU5cFF+LSo2CGGn2H/Qsh6mdpPw80wo5ao=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=BWH4h7bH6M0jMBgXDMu2lAK4/cwj4nQmDmRLYZTCGt59i3XYc9tYREvYbzT40CdebSXL2txpsaeCGbsQScrmksNv0rLlFDycFu3xLtfvuoeljllOZ9EZiUOlxhLTUnviWoAuTf2lKJBoW2qZoatcjgUuxitiEFy8OjKdQ6qJFgg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wr1-x434.google.com with SMTP id ffacd0b85a97d-3360ae2392eso2563221f8f.2 for ; Mon, 11 Dec 2023 01:47:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1702288063; x=1702892863; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=+vNnEJyT+71zNWqBrK7LEjZc5a8gaFOso0RLnK5N8DY=; b=HitZyh66Z+l8gSRmhmNe6/7L3FZZWUNtz2hTMwiyH740GkkcfqhdBEICCrAgHAxryH hZ5aisUA3lG6xpCIKVNtNEtDHU0e71zMsLJs9Nl1sLlRnOvVDKBXA9VLZ86rCPWmGi5j kz0KOZq7DuFKS3jF0yIzdjRmR4lKc0XB9KgOFdba2fcR245BmIZuMc5Ld5P+JRGCRcXf VxBu6DRtupEVrp8BgV79PQJ6cljR2isIbau2C9lgtowOZN/3tM3HhqHpdb6NbbqojM/u RHq3XbTqdqXgpv7n3uIOexU8dsRCwDcGSNPKO20YX/c0Y+rkRWonApDfohNNpkx0v4lw Zilw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702288063; x=1702892863; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+vNnEJyT+71zNWqBrK7LEjZc5a8gaFOso0RLnK5N8DY=; b=EOH0MqufyZf6BrUynVXTe4prOWOSR/tpAIMGmeUmzuvX6zFSIHxPzHwpkC/Ko9ZOTm uC4E+Q5nBeg2ndB7YSoTG+wu5DvVcOP8iYOOTa2jFU+KqPTmD0AAX1kX0DYoF9SKwQe+ 979cdzNeQz3nS7vw+CazAz0gtbEYZ9InTLtkrySLZaRTgH5JAdLFvDWk015Dd8LDIwOk Pqqbgq4YEG02oNE7UO+1c2cDgJ48vT0aHS7xvekSIbnPt+iKgx6HwooDF2Qbg+drWSt/ J6yt65tUtyQXcJZazCWzLy5P03LG05x7MgOgX3viU9CDLaoTA+fb+2rNmjA1voPXlP+k GTQQ== X-Gm-Message-State: AOJu0Yzxltjd0VaoW9Lz8jaHh+L/jLXEkUv9VZ8o8vyqjk3FqD/91Co8 9TeJOBAWKIddxEv7GsaXUG+u7OYJg/enNzOyjr7ITA== X-Received: by 2002:adf:fe85:0:b0:333:2155:67ee with SMTP id l5-20020adffe85000000b00333215567eemr1902724wrr.35.1702288063301; Mon, 11 Dec 2023 01:47:43 -0800 (PST) Received: from slewis-laptop.ba.rivosinc.com ([51.52.155.69]) by smtp.gmail.com with ESMTPSA id a16-20020adffad0000000b003333b8eb84fsm8128298wrs.113.2023.12.11.01.47.42 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 01:47:42 -0800 (PST) From: Sergei Lewis To: gcc-patches@gcc.gnu.org Subject: [PATCH 3/3] RISC-V: cmpmem for RISCV with V extension Date: Mon, 11 Dec 2023 09:47:28 +0000 Message-Id: <20231211094728.1623032-4-slewis@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231211094728.1623032-1-slewis@rivosinc.com> References: <20231211094728.1623032-1-slewis@rivosinc.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784978474669320439 X-GMAIL-MSGID: 1784978474669320439 gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_vector::expand_vec_cmpmem): New function declaration. * config/riscv/riscv-string.cc (riscv_vector::expand_vec_cmpmem): New function; this generates an inline vectorised memory compare, if and only if we know the entire operation can be performed in a single vector load per input * config/riscv/riscv.md (cmpmemsi): Try riscv_vector::expand_vec_cmpmem for constant lengths gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/cmpmem-1.c: New codegen tests * gcc.target/riscv/rvv/base/cmpmem-2.c: New execution tests --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-string.cc | 111 ++++++++++++++++++ gcc/config/riscv/riscv.md | 15 +++ .../gcc.target/riscv/rvv/base/cmpmem-1.c | 85 ++++++++++++++ .../gcc.target/riscv/rvv/base/cmpmem-2.c | 69 +++++++++++ 5 files changed, 281 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-2.c diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 950cb65c910..72378438552 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -561,6 +561,7 @@ void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false); bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool); void emit_vec_extract (rtx, rtx, poly_int64); bool expand_vec_setmem (rtx, rtx, rtx, rtx); +bool expand_vec_cmpmem (rtx, rtx, rtx, rtx); /* Rounding mode bitfield for fixed point VXRM. */ enum fixed_point_rounding_mode diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc index 0abbd5f8b28..6128565310b 100644 --- a/gcc/config/riscv/riscv-string.cc +++ b/gcc/config/riscv/riscv-string.cc @@ -1329,4 +1329,115 @@ expand_vec_setmem (rtx dst_in, rtx length_in, rtx fill_value_in, return true; } + +/* Used by cmpmemsi in riscv.md. */ + +bool +expand_vec_cmpmem (rtx result_out, rtx blk_a_in, rtx blk_b_in, rtx length_in) +{ + /* we're generating vector code. */ + if (!TARGET_VECTOR) + return false; + /* if we can't reason about the length, let libc handle the operation. */ + if (!CONST_INT_P (length_in)) + return false; + + HOST_WIDE_INT length = INTVAL (length_in); + HOST_WIDE_INT lmul; + + /* select an lmul such that the data just fits into one vector operation; + bail if we can't. */ + if (!select_appropriate_lmul (length, lmul)) + return false; + + /* strategy: + load entire blocks at a and b into vector regs + generate mask of bytes that differ + find first set bit in mask + find offset of first set bit in mask, use 0 if none set + result is ((char*)a[offset] - (char*)b[offset]) + */ + + machine_mode vmode = riscv_vector::get_vector_mode (QImode, + BYTES_PER_RISCV_VECTOR * lmul).require (); + rtx blk_a_addr = copy_addr_to_reg (XEXP (blk_a_in, 0)); + rtx blk_a = change_address (blk_a_in, vmode, blk_a_addr); + rtx blk_b_addr = copy_addr_to_reg (XEXP (blk_b_in, 0)); + rtx blk_b = change_address (blk_b_in, vmode, blk_b_addr); + + rtx vec_a = gen_reg_rtx (vmode); + rtx vec_b = gen_reg_rtx (vmode); + + machine_mode mask_mode = get_mask_mode (vmode); + rtx mask = gen_reg_rtx (mask_mode); + rtx mismatch_ofs = gen_reg_rtx (Pmode); + + rtx ne = gen_rtx_NE (mask_mode, vec_a, vec_b); + rtx vmsops[] = {mask, ne, vec_a, vec_b}; + rtx vfops[] = {mismatch_ofs, mask}; + + /* If the length is exactly vlmax for the selected mode, do that. + Otherwise, use a predicated store. */ + + if (known_eq (GET_MODE_SIZE (vmode), INTVAL (length_in))) + { + emit_move_insn (vec_a, blk_a); + emit_move_insn (vec_b, blk_b); + emit_vlmax_insn (code_for_pred_cmp (vmode), + riscv_vector::COMPARE_OP, vmsops); + + emit_vlmax_insn (code_for_pred_ffs (mask_mode, Pmode), + riscv_vector::CPOP_OP, vfops); + } + else + { + if (!satisfies_constraint_K (length_in)) + length_in= force_reg (Pmode, length_in); + + rtx memmask = CONSTM1_RTX (mask_mode); + + rtx m_ops_a[] = {vec_a, memmask, blk_a}; + rtx m_ops_b[] = {vec_b, memmask, blk_b}; + + emit_nonvlmax_insn (code_for_pred_mov (vmode), + riscv_vector::UNARY_OP_TAMA, m_ops_a, length_in); + emit_nonvlmax_insn (code_for_pred_mov (vmode), + riscv_vector::UNARY_OP_TAMA, m_ops_b, length_in); + + emit_nonvlmax_insn (code_for_pred_cmp (vmode), + riscv_vector::COMPARE_OP, vmsops, length_in); + + emit_nonvlmax_insn (code_for_pred_ffs (mask_mode, Pmode), + riscv_vector::CPOP_OP, vfops, length_in); + } + + /* mismatch_ofs is -1 if blocks match, or the offset of + the first mismatch otherwise. */ + rtx ltz = gen_reg_rtx (Xmode); + emit_insn (gen_slt_3 (LT, Xmode, Xmode, ltz, mismatch_ofs, const0_rtx)); + /* mismatch_ofs += (mismatch_ofs < 0) ? 1 : 0. */ + emit_insn (gen_rtx_SET (mismatch_ofs, gen_rtx_PLUS (Pmode, + mismatch_ofs, ltz))); + + /* unconditionally load the bytes at mismatch_ofs and subtract them + to get our result. */ + emit_insn (gen_rtx_SET (blk_a_addr, gen_rtx_PLUS (Pmode, + mismatch_ofs, blk_a_addr))); + emit_insn (gen_rtx_SET (blk_b_addr, gen_rtx_PLUS (Pmode, + mismatch_ofs, blk_b_addr))); + + blk_a = change_address (blk_a, QImode, blk_a_addr); + blk_b = change_address (blk_b, QImode, blk_b_addr); + + rtx byte_a = gen_reg_rtx (SImode); + rtx byte_b = gen_reg_rtx (SImode); + do_zero_extendqi2 (byte_a, blk_a); + do_zero_extendqi2 (byte_b, blk_b); + + emit_insn (gen_rtx_SET (result_out, gen_rtx_MINUS (SImode, + byte_a, byte_b))); + + + return true; +} } diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 29d3b1aa342..39829c8566c 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -2395,6 +2395,21 @@ FAIL; }) +(define_expand "cmpmemsi" + [(set (match_operand:SI 0 "register_operand" "") + (compare:SI (match_operand:BLK 1 "memory_operand" "") + (match_operand:BLK 2 "memory_operand" ""))) + (use (match_operand:SI 3 "general_operand" "")) + (use (match_operand:SI 4 "" ""))] + "TARGET_VECTOR" +{ + if (riscv_vector::expand_vec_cmpmem (operands[0], operands[1], + operands[2], operands[3])) + DONE; + else + FAIL; +}) + ;; Expand in-line code to clear the instruction cache between operand[0] and ;; operand[1]. (define_expand "clear_cache" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-1.c new file mode 100644 index 00000000000..686ac6d6b0c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-1.c @@ -0,0 +1,85 @@ +/* { dg-do compile } */ +/* { dg-add-options riscv_v } */ +/* { dg-additional-options "-O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +#define MIN_VECTOR_BYTES (__riscv_v_min_vlen/8) + +/* trivial memcmp should use inline scalar ops +** f1: +** lbu\s+a\d+,0\(a0\) +** lbu\s+a\d+,0\(a1\) +** subw\s+a0,a\d+,a\d+ +** ret +*/ +int f1 (void * a, void * b) +{ + return memcmp (a, b, 1); +} + +/* tiny memcmp should use libc +** f2: +** li\s+a2,\d+ +** tail\s+memcmp +*/ +int f2 (void * a, void * b) +{ + return memcmp (a, b, MIN_VECTOR_BYTES-1); +} + +/* vectorise+inline minimum vector register width with LMUL=1 +** f3: +** ( +** vsetivli\s+zero,\d+,e8,m1,ta,ma +** | +** li\s+a\d+,\d+ +** vsetvli\s+zero,a\d+,e8,m1,ta,ma +** ) +** ... +** ret +*/ +int f3 (void * a, void * b) +{ + return memcmp (a, b, MIN_VECTOR_BYTES); +} + +/* vectorised code should use smallest lmul known to fit length +** f4: +** ( +** vsetivli\s+zero,\d+,e8,m2,ta,ma +** | +** li\s+a\d+,\d+ +** vsetvli\s+zero,a\d+,e8,m2,ta,ma +** ) +** ... +** ret +*/ +int f4 (void * a, void * b) +{ + return memcmp (a, b, MIN_VECTOR_BYTES+1); +} + +/* vectorise+inline up to LMUL=8 +** f5: +** li\s+a\d+,\d+ +** vsetvli\s+zero,a\d+,e8,m8,ta,ma +** ... +** ret +*/ +int f5 (void * a, void * b) +{ + return memcmp (a, b, MIN_VECTOR_BYTES*8); +} + +/* don't inline if the length is too large for one operation +** f6: +** li\s+a2,\d+ +** tail\s+memcmp +*/ +int f6 (void * a, void * b) +{ + return memcmp (a, b, MIN_VECTOR_BYTES*8+1); +} + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-2.c new file mode 100644 index 00000000000..eedd23d4db0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-2.c @@ -0,0 +1,69 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-options "-O2" } */ + +#include +#include + +#define MIN_VECTOR_BYTES (__riscv_v_min_vlen/8) + +static inline __attribute__((always_inline)) +void do_one_test( int const size, int const diff_offset, + int const diff_dir ) +{ + unsigned char A[size]; + unsigned char B[size]; + unsigned char const fill_value = 0x55; + memset( A, fill_value, size ); + memset( B, fill_value, size ); + + if( diff_dir != 0 ) { + if( diff_dir < 0 ) { + A[diff_offset] = fill_value-1; + } else { + A[diff_offset] = fill_value+1; + } + } + + if( memcmp( A, B, size ) != diff_dir ) { + abort (); + } +} + +int main() +{ + do_one_test( 0, 0, 0 ); + + do_one_test( 1, 0, -1 ); + do_one_test( 1, 0, 0 ); + do_one_test( 1, 0, 1 ); + + do_one_test( MIN_VECTOR_BYTES-1, 0, -1 ); + do_one_test( MIN_VECTOR_BYTES-1, 0, 0 ); + do_one_test( MIN_VECTOR_BYTES-1, 0, 1 ); + do_one_test( MIN_VECTOR_BYTES-1, 1, -1 ); + do_one_test( MIN_VECTOR_BYTES-1, 1, 0 ); + do_one_test( MIN_VECTOR_BYTES-1, 1, 1 ); + + do_one_test( MIN_VECTOR_BYTES, 0, -1 ); + do_one_test( MIN_VECTOR_BYTES, 0, 0 ); + do_one_test( MIN_VECTOR_BYTES, 0, 1 ); + do_one_test( MIN_VECTOR_BYTES, MIN_VECTOR_BYTES-1, -1 ); + do_one_test( MIN_VECTOR_BYTES, MIN_VECTOR_BYTES-1, 0 ); + do_one_test( MIN_VECTOR_BYTES, MIN_VECTOR_BYTES-1, 1 ); + + do_one_test( MIN_VECTOR_BYTES+1, 0, -1 ); + do_one_test( MIN_VECTOR_BYTES+1, 0, 0 ); + do_one_test( MIN_VECTOR_BYTES+1, 0, 1 ); + do_one_test( MIN_VECTOR_BYTES+1, MIN_VECTOR_BYTES, -1 ); + do_one_test( MIN_VECTOR_BYTES+1, MIN_VECTOR_BYTES, 0 ); + do_one_test( MIN_VECTOR_BYTES+1, MIN_VECTOR_BYTES, 1 ); + + do_one_test( MIN_VECTOR_BYTES*8, 0, -1 ); + do_one_test( MIN_VECTOR_BYTES*8, 0, 0 ); + do_one_test( MIN_VECTOR_BYTES*8, 0, 1 ); + do_one_test( MIN_VECTOR_BYTES*8, MIN_VECTOR_BYTES*8-1, -1 ); + do_one_test( MIN_VECTOR_BYTES*8, MIN_VECTOR_BYTES*8-1, 0 ); + do_one_test( MIN_VECTOR_BYTES*8, MIN_VECTOR_BYTES*8-1, 1 ); + + return 0; +}