From patchwork Tue Sep 12 11:57:25 2023
X-Patchwork-Submitter: Björn Töpel
X-Patchwork-Id: 138437
From: Björn Töpel
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv@lists.infradead.org, Andy Chiu, Greentime Hu, "Jason A . Donenfeld", Samuel Neves
Cc: Heiko Stuebner, Herbert Xu, "David S. Miller", linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org, Han-Kuan Chen, Conor Dooley
Subject: [RFC PATCH 3/6] riscv: Add vector extension XOR implementation
Date: Tue, 12 Sep 2023 13:57:25 +0200
Message-Id: <20230912115728.172982-4-bjorn@kernel.org>
In-Reply-To: <20230912115728.172982-1-bjorn@kernel.org>
References: <20230912115728.172982-1-bjorn@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Greentime Hu

This patch adds support for vector optimized XOR and it is tested in qemu.

Co-developed-by: Han-Kuan Chen
Signed-off-by: Han-Kuan Chen
Signed-off-by: Greentime Hu
Signed-off-by: Andy Chiu
Acked-by: Conor Dooley
---
 arch/riscv/include/asm/xor.h | 82 ++++++++++++++++++++++++++++++++++++
 arch/riscv/lib/Makefile      |  1 +
 arch/riscv/lib/xor.S         | 81 +++++++++++++++++++++++++++++++++++
 3 files changed, 164 insertions(+)
 create mode 100644 arch/riscv/include/asm/xor.h
 create mode 100644 arch/riscv/lib/xor.S

diff --git a/arch/riscv/include/asm/xor.h b/arch/riscv/include/asm/xor.h
new file mode 100644
index 000000000000..903c3275f8d0
--- /dev/null
+++ b/arch/riscv/include/asm/xor.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+
+#include
+#include
+#ifdef CONFIG_RISCV_ISA_V
+#include
+#include
+
+void xor_regs_2_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2);
+void xor_regs_3_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3);
+void xor_regs_4_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4);
+void xor_regs_5_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4,
+		 const unsigned long *__restrict p5);
+
+static void xor_vector_2(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2)
+{
+	kernel_vector_begin();
+	xor_regs_2_(bytes, p1, p2);
+	kernel_vector_end();
+}
+
+static void xor_vector_3(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3)
+{
+	kernel_vector_begin();
+	xor_regs_3_(bytes, p1, p2, p3);
+	kernel_vector_end();
+}
+
+static void xor_vector_4(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3,
+			 const unsigned long *__restrict p4)
+{
+	kernel_vector_begin();
+	xor_regs_4_(bytes, p1, p2, p3, p4);
+	kernel_vector_end();
+}
+
+static void xor_vector_5(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3,
+			 const unsigned long *__restrict p4,
+			 const unsigned long *__restrict p5)
+{
+	kernel_vector_begin();
+	xor_regs_5_(bytes, p1, p2, p3, p4, p5);
+	kernel_vector_end();
+}
+
+static struct xor_block_template xor_block_rvv = {
+	.name = "rvv",
+	.do_2 = xor_vector_2,
+	.do_3 = xor_vector_3,
+	.do_4 = xor_vector_4,
+	.do_5 = xor_vector_5
+};
+
+#undef XOR_TRY_TEMPLATES
+#define XOR_TRY_TEMPLATES \
+	do { \
+		xor_speed(&xor_block_8regs); \
+		xor_speed(&xor_block_32regs); \
+		if (has_vector()) { \
+			xor_speed(&xor_block_rvv);\
+		} \
+	} while (0)
+#endif
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 26cb2502ecf8..494f9cd1a00c 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -11,3 +11,4 @@ lib-$(CONFIG_64BIT) += tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ) += clear_page.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+lib-$(CONFIG_RISCV_ISA_V) += xor.o
diff --git a/arch/riscv/lib/xor.S b/arch/riscv/lib/xor.S
new file mode 100644
index 000000000000..3bc059e18171
--- /dev/null
+++ b/arch/riscv/lib/xor.S
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+#include
+#include
+#include
+
+ENTRY(xor_regs_2_)
+	vsetvli a3, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a3
+	vxor.vv v16, v0, v8
+	add a2, a2, a3
+	vse8.v v16, (a1)
+	add a1, a1, a3
+	bnez a0, xor_regs_2_
+	ret
+END(xor_regs_2_)
+EXPORT_SYMBOL(xor_regs_2_)
+
+ENTRY(xor_regs_3_)
+	vsetvli a4, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a4
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a4
+	vxor.vv v16, v0, v16
+	add a3, a3, a4
+	vse8.v v16, (a1)
+	add a1, a1, a4
+	bnez a0, xor_regs_3_
+	ret
+END(xor_regs_3_)
+EXPORT_SYMBOL(xor_regs_3_)
+
+ENTRY(xor_regs_4_)
+	vsetvli a5, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a5
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a5
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a5
+	vxor.vv v16, v0, v24
+	add a4, a4, a5
+	vse8.v v16, (a1)
+	add a1, a1, a5
+	bnez a0, xor_regs_4_
+	ret
+END(xor_regs_4_)
+EXPORT_SYMBOL(xor_regs_4_)
+
+ENTRY(xor_regs_5_)
+	vsetvli a6, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a6
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a6
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a6
+	vxor.vv v0, v0, v24
+	vle8.v v8, (a5)
+	add a4, a4, a6
+	vxor.vv v16, v0, v8
+	add a5, a5, a6
+	vse8.v v16, (a1)
+	add a1, a1, a6
+	bnez a0, xor_regs_5_
+	ret
+END(xor_regs_5_)
+EXPORT_SYMBOL(xor_regs_5_)
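
Note for readers (not part of the patch): the loop structure of xor_regs_2_ can be modelled in portable C as below. This is only a sketch under stated assumptions. The name xor_ref_2 and the constant CHUNK_BYTES are hypothetical; CHUNK_BYTES stands in for VLMAX at e8/m8, which depends on the hardware vector length. The model also assumes vsetvli returns min(bytes, VLMAX), whereas the vector specification allows a smaller vl in some cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VLMAX: bytes per register group at e8/m8. */
#define CHUNK_BYTES 64

/*
 * Scalar model of the xor_regs_2_ loop: p1[i] ^= p2[i] for `bytes` bytes,
 * consumed in chunks the way vsetvli hands out vl on each iteration.
 */
static void xor_ref_2(size_t bytes, uint8_t *p1, const uint8_t *p2)
{
	assert(bytes == 0 || (p1 != NULL && p2 != NULL));

	while (bytes != 0) {
		/* vsetvli a3, a0, ...: vl assumed to be min(bytes, VLMAX) */
		size_t vl = bytes < CHUNK_BYTES ? bytes : CHUNK_BYTES;

		/* vle8.v / vle8.v / vxor.vv / vse8.v on vl bytes */
		for (size_t i = 0; i < vl; i++)
			p1[i] ^= p2[i];

		bytes -= vl;	/* sub a0, a0, a3 */
		p2 += vl;	/* add a2, a2, a3 */
		p1 += vl;	/* add a1, a1, a3 */
	}
}
```

The 3-, 4- and 5-source variants follow the same pattern with one extra load and XOR per additional source; the result is always stored back to p1.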