From patchwork Wed Sep 20 07:46:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Wang, Xiao W"
X-Patchwork-Id: 142267
From: Xiao Wang
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 ardb@kernel.org
Cc: anup@brainfault.org, haicheng.li@intel.com, ajones@ventanamicro.com,
 linux-riscv@lists.infradead.org, linux-efi@vger.kernel.org,
 linux-kernel@vger.kernel.org, Xiao Wang
Subject: [PATCH v2 1/2] riscv: Rearrange hwcap.h and cpufeature.h
Date: Wed, 20 Sep 2023 15:46:52 +0800
Message-Id: <20230920074653.2509631-2-xiao.w.wang@intel.com>
In-Reply-To: <20230920074653.2509631-1-xiao.w.wang@intel.com>
References: <20230920074653.2509631-1-xiao.w.wang@intel.com>

hwcap.h and cpufeature.h currently include each other, and most of the
variables and APIs declared in hwcap.h are implemented in cpufeature.c.
Move those declarations into cpufeature.h and leave only the macros for
ISA extension logical IDs in hwcap.h.

The riscv_isa_extension_mask macro is no longer used, so remove it as
well.

Signed-off-by: Xiao Wang
Reviewed-by: Andrew Jones
---
 arch/riscv/include/asm/cpufeature.h | 83 ++++++++++++++++++++++++++
 arch/riscv/include/asm/hwcap.h      | 91 -----------------------------
 arch/riscv/include/asm/pgtable.h    |  1 +
 arch/riscv/include/asm/switch_to.h  |  2 +-
 arch/riscv/include/asm/vector.h     |  2 +-
 5 files changed, 86 insertions(+), 93 deletions(-)

diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index 13b7d35648a9..3061d33abc2f 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -7,7 +7,10 @@
 #define _ASM_CPUFEATURE_H
 
 #include
+#include
 #include
+#include
+#include
 
 /*
  * These are probed via a device_initcall(), via either the SBI or directly
@@ -33,4 +36,84 @@ extern struct riscv_isainfo hart_isa[NR_CPUS];
 void check_unaligned_access(int cpu);
 void riscv_user_isa_enable(void);
 
+unsigned long riscv_get_elf_hwcap(void);
+
+struct riscv_isa_ext_data {
+	const unsigned int id;
+	const char *name;
+	const char *property;
+};
+
+extern const struct riscv_isa_ext_data riscv_isa_ext[];
+extern const size_t riscv_isa_ext_count;
+extern bool riscv_isa_fallback;
+
+unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap);
+
+bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit);
+#define riscv_isa_extension_available(isa_bitmap, ext)	\
+	__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
+
+static __always_inline bool
+riscv_has_extension_likely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+		asm_volatile_goto(
+		ALTERNATIVE("j %l[l_no]", "nop", 0, %[ext], 1)
+		:
+		: [ext] "i" (ext)
+		:
+		: l_no);
+	} else {
+		if (!__riscv_isa_extension_available(NULL, ext))
+			goto l_no;
+	}
+
+	return true;
+l_no:
+	return false;
+}
+
+static __always_inline bool
+riscv_has_extension_unlikely(const unsigned long ext)
+{
+	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+			   "ext must be < RISCV_ISA_EXT_MAX");
+
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+		asm_volatile_goto(
+		ALTERNATIVE("nop", "j %l[l_yes]", 0, %[ext], 1)
+		:
+		: [ext] "i" (ext)
+		:
+		: l_yes);
+	} else {
+		if (__riscv_isa_extension_available(NULL, ext))
+			goto l_yes;
+	}
+
+	return false;
+l_yes:
+	return true;
+}
+
+static __always_inline bool riscv_cpu_has_extension_likely(int cpu, const unsigned long ext)
+{
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_likely(ext))
+		return true;
+
+	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
+}
+
+static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsigned long ext)
+{
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_unlikely(ext))
+		return true;
+
+	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
+}
+
 #endif
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 31774bcdf1c6..141b7109c25c 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -8,9 +8,6 @@
 #ifndef _ASM_RISCV_HWCAP_H
 #define _ASM_RISCV_HWCAP_H
 
-#include
-#include
-#include
 #include
 
 #define RISCV_ISA_EXT_a		('a' - 'a')
@@ -67,92 +64,4 @@
 #define RISCV_ISA_EXT_SxAIA		RISCV_ISA_EXT_SSAIA
 #endif
 
-#ifndef __ASSEMBLY__
-
-#include
-#include
-
-unsigned long riscv_get_elf_hwcap(void);
-
-struct riscv_isa_ext_data {
-	const unsigned int id;
-	const char *name;
-	const char *property;
-};
-
-extern const struct riscv_isa_ext_data riscv_isa_ext[];
-extern const size_t riscv_isa_ext_count;
-extern bool riscv_isa_fallback;
-
-unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap);
-
-#define riscv_isa_extension_mask(ext) BIT_MASK(RISCV_ISA_EXT_##ext)
-
-bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit);
-#define riscv_isa_extension_available(isa_bitmap, ext)	\
-	__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
-
-static __always_inline bool
-riscv_has_extension_likely(const unsigned long ext)
-{
-	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
-			   "ext must be < RISCV_ISA_EXT_MAX");
-
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
-		asm_volatile_goto(
-		ALTERNATIVE("j %l[l_no]", "nop", 0, %[ext], 1)
-		:
-		: [ext] "i" (ext)
-		:
-		: l_no);
-	} else {
-		if (!__riscv_isa_extension_available(NULL, ext))
-			goto l_no;
-	}
-
-	return true;
-l_no:
-	return false;
-}
-
-static __always_inline bool
-riscv_has_extension_unlikely(const unsigned long ext)
-{
-	compiletime_assert(ext < RISCV_ISA_EXT_MAX,
-			   "ext must be < RISCV_ISA_EXT_MAX");
-
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
-		asm_volatile_goto(
-		ALTERNATIVE("nop", "j %l[l_yes]", 0, %[ext], 1)
-		:
-		: [ext] "i" (ext)
-		:
-		: l_yes);
-	} else {
-		if (__riscv_isa_extension_available(NULL, ext))
-			goto l_yes;
-	}
-
-	return false;
-l_yes:
-	return true;
-}
-
-static __always_inline bool riscv_cpu_has_extension_likely(int cpu, const unsigned long ext)
-{
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_likely(ext))
-		return true;
-
-	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
-}
-
-static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsigned long ext)
-{
-	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE) && riscv_has_extension_unlikely(ext))
-		return true;
-
-	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
-}
-#endif
-
 #endif /* _ASM_RISCV_HWCAP_H */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index b2ba3f79cfe9..e05b5dc1f0cb 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -291,6 +291,7 @@ static inline pte_t pud_pte(pud_t pud)
 }
 
 #ifdef CONFIG_RISCV_ISA_SVNAPOT
+#include
 
 static __always_inline bool has_svnapot(void)
 {
diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index a727be723c56..f90d8e42f3c7 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -9,7 +9,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 #include
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index c5ee07b3df07..87aaef656257 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -15,7 +15,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 

From patchwork Wed Sep 20 07:46:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Wang, Xiao W"
X-Patchwork-Id: 142517
From: Xiao Wang
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 ardb@kernel.org
Cc: anup@brainfault.org, haicheng.li@intel.com, ajones@ventanamicro.com,
 linux-riscv@lists.infradead.org, linux-efi@vger.kernel.org,
 linux-kernel@vger.kernel.org, Xiao Wang
Subject: [PATCH v2 2/2] riscv: Optimize bitops with Zbb extension
Date: Wed, 20 Sep 2023 15:46:53 +0800
Message-Id: <20230920074653.2509631-3-xiao.w.wang@intel.com>
In-Reply-To: <20230920074653.2509631-1-xiao.w.wang@intel.com>
References: <20230920074653.2509631-1-xiao.w.wang@intel.com>

This patch uses the alternative mechanism to optimize bitops (__ffs,
__fls, ffs and fls) with Zbb instructions at runtime. If the running CPU
does not support the Zbb extension, the legacy implementation is used;
if it does, the optimized variants are selected via alternative
patching. The legacy implementation is taken from the generic C
implementation and serves as the fallback.

If the parameter is a build-time constant, a compiler builtin computes
the result directly. This approach is inspired by the x86 bitops
implementation.

The EFI stub runs before the kernel, so the alternative mechanism must
not be used there. This patch introduces the NO_ALTERNATIVE macro for
that purpose.

Signed-off-by: Xiao Wang
---
 arch/riscv/include/asm/bitops.h       | 266 +++++++++++++++++++++++++-
 drivers/firmware/efi/libstub/Makefile |   2 +-
 2 files changed, 264 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h
index 3540b690944b..c97e774cb647 100644
--- a/arch/riscv/include/asm/bitops.h
+++ b/arch/riscv/include/asm/bitops.h
@@ -15,13 +15,273 @@
 #include
 #include
 
+#if !defined(CONFIG_RISCV_ISA_ZBB) || defined(NO_ALTERNATIVE)
 #include
-#include
-#include
 #include
+#include
+#include
+
+#else
+#include
+#include
+
+#if (BITS_PER_LONG == 64)
+#define CTZW	"ctzw "
+#define CLZW	"clzw "
+#elif (BITS_PER_LONG == 32)
+#define CTZW	"ctz "
+#define CLZW	"clz "
+#else
+#error "Unexpected BITS_PER_LONG"
+#endif
+
+static __always_inline unsigned long variable__ffs(unsigned long word)
+{
+	int num;
+
+	asm_volatile_goto(
+		ALTERNATIVE("j %l[legacy]", "nop", 0, RISCV_ISA_EXT_ZBB, 1)
+		: : : : legacy);
+
+	asm volatile (
+		".option push\n"
+		".option arch,+zbb\n"
+		"ctz %0, %1\n"
+		".option pop\n"
+		: "=r" (word) : "r" (word) :);
+
+	return word;
+
+legacy:
+	num = 0;
+#if BITS_PER_LONG == 64
+	if ((word & 0xffffffff) == 0) {
+		num += 32;
+		word >>= 32;
+	}
+#endif
+	if ((word & 0xffff) == 0) {
+		num += 16;
+		word >>= 16;
+	}
+	if ((word & 0xff) == 0) {
+		num += 8;
+		word >>= 8;
+	}
+	if ((word & 0xf) == 0) {
+		num += 4;
+		word >>= 4;
+	}
+	if ((word & 0x3) == 0) {
+		num += 2;
+		word >>= 2;
+	}
+	if ((word & 0x1) == 0)
+		num += 1;
+	return num;
+}
+
+/**
+ * __ffs - find first set bit in a long word
+ * @word: The word to search
+ *
+ * Undefined if no set bit exists, so code should check against 0 first.
+ */
+#define __ffs(word)				\
+	(__builtin_constant_p(word) ?		\
+	 (unsigned long)__builtin_ctzl(word) :	\
+	 variable__ffs(word))
+
+static __always_inline unsigned long variable__fls(unsigned long word)
+{
+	int num;
+
+	asm_volatile_goto(
+		ALTERNATIVE("j %l[legacy]", "nop", 0, RISCV_ISA_EXT_ZBB, 1)
+		: : : : legacy);
+
+	asm volatile (
+		".option push\n"
+		".option arch,+zbb\n"
+		"clz %0, %1\n"
+		".option pop\n"
+		: "=r" (word) : "r" (word) :);
+
+	return BITS_PER_LONG - 1 - word;
+
+legacy:
+	num = BITS_PER_LONG - 1;
+#if BITS_PER_LONG == 64
+	if (!(word & (~0ul << 32))) {
+		num -= 32;
+		word <<= 32;
+	}
+#endif
+	if (!(word & (~0ul << (BITS_PER_LONG-16)))) {
+		num -= 16;
+		word <<= 16;
+	}
+	if (!(word & (~0ul << (BITS_PER_LONG-8)))) {
+		num -= 8;
+		word <<= 8;
+	}
+	if (!(word & (~0ul << (BITS_PER_LONG-4)))) {
+		num -= 4;
+		word <<= 4;
+	}
+	if (!(word & (~0ul << (BITS_PER_LONG-2)))) {
+		num -= 2;
+		word <<= 2;
+	}
+	if (!(word & (~0ul << (BITS_PER_LONG-1))))
+		num -= 1;
+	return num;
+}
+
+/**
+ * __fls - find last set bit in a long word
+ * @word: the word to search
+ *
+ * Undefined if no set bit exists, so code should check against 0 first.
+ */
+#define __fls(word)							\
+	(__builtin_constant_p(word) ?					\
+	 (unsigned long)(BITS_PER_LONG - 1 - __builtin_clzl(word)) :	\
+	 variable__fls(word))
+
+static __always_inline int variable_ffs(int x)
+{
+	int r;
+
+	asm_volatile_goto(
+		ALTERNATIVE("j %l[legacy]", "nop", 0, RISCV_ISA_EXT_ZBB, 1)
+		: : : : legacy);
+
+	asm volatile (
+		".option push\n"
+		".option arch,+zbb\n"
+		"bnez %1, 1f\n"
+		"li %0, 0\n"
+		"j 2f\n"
+		"1:\n"
+		CTZW "%0, %1\n"
+		"addi %0, %0, 1\n"
+		"2:\n"
+		".option pop\n"
+		: "=r" (r) : "r" (x) :);
+
+	return r;
+
+legacy:
+	r = 1;
+	if (!x)
+		return 0;
+	if (!(x & 0xffff)) {
+		x >>= 16;
+		r += 16;
+	}
+	if (!(x & 0xff)) {
+		x >>= 8;
+		r += 8;
+	}
+	if (!(x & 0xf)) {
+		x >>= 4;
+		r += 4;
+	}
+	if (!(x & 3)) {
+		x >>= 2;
+		r += 2;
+	}
+	if (!(x & 1)) {
+		x >>= 1;
+		r += 1;
+	}
+	return r;
+}
+
+/**
+ * ffs - find first set bit in a word
+ * @x: the word to search
+ *
+ * This is defined the same way as the libc and compiler builtin ffs routines.
+ *
+ * ffs(value) returns 0 if value is 0 or the position of the first set bit if
+ * value is nonzero. The first (least significant) bit is at position 1.
+ */
+#define ffs(x) (__builtin_constant_p(x) ? __builtin_ffs(x) : variable_ffs(x))
+
+static __always_inline int variable_fls(unsigned int x)
+{
+	int r;
+
+	asm_volatile_goto(
+		ALTERNATIVE("j %l[legacy]", "nop", 0, RISCV_ISA_EXT_ZBB, 1)
+		: : : : legacy);
+
+	asm volatile (
+		".option push\n"
+		".option arch,+zbb\n"
+		"bnez %1, 1f\n"
+		"li %0, 0\n"
+		"j 2f\n"
+		"1:\n"
+		CLZW "%0, %1\n"
+		"neg %0, %0\n"
+		"addi %0, %0, 32\n"
+		"2:\n"
+		".option pop\n"
+		: "=r" (r) : "r" (x) :);
+
+	return r;
+
+legacy:
+	r = 32;
+	if (!x)
+		return 0;
+	if (!(x & 0xffff0000u)) {
+		x <<= 16;
+		r -= 16;
+	}
+	if (!(x & 0xff000000u)) {
+		x <<= 8;
+		r -= 8;
+	}
+	if (!(x & 0xf0000000u)) {
+		x <<= 4;
+		r -= 4;
+	}
+	if (!(x & 0xc0000000u)) {
+		x <<= 2;
+		r -= 2;
+	}
+	if (!(x & 0x80000000u)) {
+		x <<= 1;
+		r -= 1;
+	}
+	return r;
+}
+
+/**
+ * fls - find last set bit in a word
+ * @x: the word to search
+ *
+ * This is defined in a similar way as ffs, but returns the position of the most
+ * significant set bit.
+ *
+ * fls(value) returns 0 if value is 0 or the position of the last set bit if
+ * value is nonzero. The last (most significant) bit is at position 32.
+ */
+#define fls(x)								\
+	(__builtin_constant_p(x) ?					\
+	 (int)(((x) != 0) ?						\
+	  (sizeof(unsigned int) * 8 - __builtin_clz(x)) : 0) :		\
+	 variable_fls(x))
+
+#endif
+
+#include
 #include
 #include
-#include
 
 #include
 
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index a1157c2a7170..d68cacd4e3af 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -28,7 +28,7 @@ cflags-$(CONFIG_ARM)		+= -DEFI_HAVE_STRLEN -DEFI_HAVE_STRNLEN \
 				   -DEFI_HAVE_MEMCHR -DEFI_HAVE_STRRCHR \
 				   -DEFI_HAVE_STRCMP -fno-builtin -fpic \
 				   $(call cc-option,-mno-single-pic-base)
-cflags-$(CONFIG_RISCV)		+= -fpic
+cflags-$(CONFIG_RISCV)		+= -fpic -DNO_ALTERNATIVE
 cflags-$(CONFIG_LOONGARCH)	+= -fpie
 
 cflags-$(CONFIG_EFI_PARAMS_FROM_FDT)	+= -I$(srctree)/scripts/dtc/libfdt
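
The return-value conventions of ffs() and fls() described in the patch can be checked outside the kernel with a small userspace sketch. It is not part of the series and makes several assumptions: all sketch_* names are hypothetical, unsigned int is assumed to be 32 bits wide, and the GCC/Clang builtins __builtin_ctz and __builtin_clz stand in for the Zbb ctzw/clzw instructions. The explicit zero check plays the role of the bnez guard in the inline asm, since the builtins are undefined for a zero argument.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace helpers; none of these names exist in the patch. */

/* Fallback path: binary search for the lowest set bit, 1-based; 0 if none. */
static int sketch_ffs_generic(unsigned int x)
{
	int r = 1;

	if (!x)
		return 0;
	if (!(x & 0xffff)) { x >>= 16; r += 16; }
	if (!(x & 0xff))   { x >>= 8;  r += 8; }
	if (!(x & 0xf))    { x >>= 4;  r += 4; }
	if (!(x & 3))      { x >>= 2;  r += 2; }
	if (!(x & 1))      { r += 1; }
	return r;
}

/* Fallback path: binary search for the highest set bit, 1-based; 0 if none. */
static int sketch_fls_generic(unsigned int x)
{
	int r = 32;

	if (!x)
		return 0;
	if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; }
	if (!(x & 0xff000000u)) { x <<= 8;  r -= 8; }
	if (!(x & 0xf0000000u)) { x <<= 4;  r -= 4; }
	if (!(x & 0xc0000000u)) { x <<= 2;  r -= 2; }
	if (!(x & 0x80000000u)) { r -= 1; }
	return r;
}

/* Stand-in for the ctzw path: trailing zero count plus one. */
static int sketch_ffs_fast(unsigned int x)
{
	return x ? __builtin_ctz(x) + 1 : 0;
}

/* Stand-in for the clzw path: 32 minus the leading zero count. */
static int sketch_fls_fast(unsigned int x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/* Returns 1 if both paths agree on a handful of edge cases, 0 otherwise. */
static int sketch_paths_agree(void)
{
	static const unsigned int samples[] = {
		0u, 1u, 2u, 0x10u, 0x8000u, 0x10000u, 0x7fffffffu,
		0x80000000u, 0xffffffffu, 0x00f0f000u,
	};
	unsigned int i;

	/* The masks above assume a 32-bit unsigned int. */
	assert(sizeof(unsigned int) * CHAR_BIT == 32);

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		if (sketch_ffs_generic(samples[i]) != sketch_ffs_fast(samples[i]))
			return 0;
		if (sketch_fls_generic(samples[i]) != sketch_fls_fast(samples[i]))
			return 0;
	}
	return 1;
}
```

Calling sketch_paths_agree() from a test program compares the two paths on the sample values. This only exercises the arithmetic; it says nothing about alternative patching itself, which needs a kernel build and a Zbb-capable hart to test.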