From patchwork Wed Oct 19 08:47:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxi Chen X-Patchwork-Id: 4633 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp212106wrs; Wed, 19 Oct 2022 02:06:45 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7WvPk8bcQf54HeGDxBLhSie/qaJ+npmzBR6qlIgo6ZCwW4daNIXi7hyESialRkMtq7sQfV X-Received: by 2002:a17:906:1b49:b0:78d:b7b5:71cc with SMTP id p9-20020a1709061b4900b0078db7b571ccmr5778924ejg.536.1666170405048; Wed, 19 Oct 2022 02:06:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666170405; cv=none; d=google.com; s=arc-20160816; b=emq2cqB+2/GhDQi9/if5G0iAjTijEdHubJrdSV0QYPkv/E6ALt3ePSuLkM7SwIAvaC TK58ieoherW7M62c2dOSAYKHiIYTDamZ0uBtIE/ui3VS9hUWHmexYsMjCWaQi+HwDTU0 jZhtKbNfczZQejbqD0mKxTtMcYZsCV/Zzj62IeMVyMGbJeuBPHHYzjPJplACCBV95kYM rgLGO4FI0Av95xIF4+3mmSXUrOWewr3WsQVJ2xRcPcVfB7E18CYr4+zHClsPBr+V3MnD mgE5rNh5JnLZ0LM8h904i3eTekPrKgIKnAzbM5ywP9O/1BabEv/1bg/MlE3jkGkUTj+3 lhpA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=5xKVjl5I36BQWxj5Vs6UUZBVoB7j2A9SYeSmkZFMphM=; b=DxVtGmAY07qd+aBFfZOD7qBno6+1945IYWY/ES7YUs5Y00A8DZvnhxkIJlJ6/OBdft 4Otb0ZU10o8DJ3XzGd2XGEn/y6QfYL2FYfKXVoka0E361wqW0TNNFpGKyD4Jj5V4M38k b7p4BMe4mEkC71z5dFtK0HCVdJ15I0XnRavXXvf5y/xvMlfhQ8mtbj48vYubCdHb0a8h A/gy4tTtdnhGFrl9iv3T641gWmevcFUwprpXXTkhG+IJd6ZdhYgKC6dfwohiAHUNw5eA AyTxhVrrcAwNH99db+DIhVYtryYGYKPhKIvaFv8hkqFCT5gTiO/p36p+g3Lyk9FiqnU7 1u8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=lOlt412V; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y66-20020a50bb48000000b00458ea8cbfb2si13038687ede.505.2022.10.19.02.06.13; Wed, 19 Oct 2022 02:06:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=lOlt412V; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231966AbiJSI4l (ORCPT + 99 others); Wed, 19 Oct 2022 04:56:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49098 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231860AbiJSIzk (ORCPT ); Wed, 19 Oct 2022 04:55:40 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E4DCB9B875; Wed, 19 Oct 2022 01:52:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666169533; x=1697705533; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mMORpyBJhxkfSV067Kavho36Iu4oPBvAncoY3FxSR6s=; b=lOlt412Vld3aJ4gvn8Y8sVJsRnDzLzFQuMHwsCfyY7NjbtZhfxBYLo0K G0tIB+hacNyyfuvIqVHct2KpHCVHdD/QXW6Tay1zA34asG6rmkqNWYgiy V3F0pACWE2FAujUxGMm7BnpUKzClEJ3nEtUJbUfCK1Hh96g2vok27zXLy iG2FNM628VsVDNFgpGQWkKhYLexKNIW0DoTDzhxCc4pXY4bPjhHaPepYc RhOuhh9XgnbDt27sibK/UtkjD/eyMe/iH3+4L/P7IosAGaJOFeRKHtreD PYSmWYfFmvDNobIsEtunvbq0/UOLEvbDNtTqu41JcdOl+hQCln+9qbOaV Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10504"; a="392649505" X-IronPort-AV: E=Sophos;i="5.95,195,1661842800"; d="scan'208";a="392649505" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Oct 2022 01:48:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10504"; a="804195982" X-IronPort-AV: E=Sophos;i="5.95,195,1661842800"; d="scan'208";a="804195982" Received: from jiaxichen-precision-3650-tower.sh.intel.com ([10.239.159.75]) by orsmga005.jf.intel.com with ESMTP; 19 Oct 2022 01:47:56 -0700 From: Jiaxi Chen To: kvm@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com, ndesaulniers@google.com, alexandre.belloni@bootlin.com, peterz@infradead.org, jiaxi.chen@linux.intel.com, jpoimboe@kernel.org, chang.seok.bae@intel.com, pawan.kumar.gupta@linux.intel.com, babu.moger@amd.com, jmattson@google.com, sandipan.das@amd.com, tony.luck@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com, fenghua.yu@intel.com, keescook@chromium.org, jane.malalane@citrix.com, nathan@kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 4/6] x86: KVM: Enable AVX-VNNI-INT8 CPUID and expose it to guest Date: Wed, 19 Oct 2022 16:47:32 +0800 Message-Id: <20221019084734.3590760-5-jiaxi.chen@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221019084734.3590760-1-jiaxi.chen@linux.intel.com> References: <20221019084734.3590760-1-jiaxi.chen@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747106298932447051?= X-GMAIL-MSGID: =?utf-8?q?1747106298932447051?= AVX-VNNI-INT8 is a new set of instructions in the latest Intel platform Sierra Forest. It multiplies the individual bytes of two unsigned or unsigned source operands, then add and accumulate the results into the destination dword element size operand. This instruction allows for the platform to have superior AI capabilities. The bit definition: CPUID.(EAX=7,ECX=1):EDX[bit 4] This patch enables this CPUID in the kernel feature bits and expose it to guest OS. Since the CPUID involves a bit of EDX (EAX=7,ECX=1) which has not been enumerated yet, this patch adds CPUID_7_1_EDX to CPUID subleaves. At the same time, word 20 is newly-defined in CPU features for CPUID level 0x00000007:1 (EDX). Signed-off-by: Jiaxi Chen --- arch/x86/include/asm/cpufeature.h | 7 +++++-- arch/x86/include/asm/cpufeatures.h | 5 ++++- arch/x86/include/asm/disabled-features.h | 3 ++- arch/x86/include/asm/required-features.h | 3 ++- arch/x86/kernel/cpu/common.c | 1 + arch/x86/kvm/cpuid.c | 5 ++++- arch/x86/kvm/reverse_cpuid.h | 1 + 7 files changed, 19 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 1a85e1fb0922..d46b802930b0 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -32,6 +32,7 @@ enum cpuid_leafs CPUID_8000_0007_EBX, CPUID_7_EDX, CPUID_8000_001F_EAX, + CPUID_7_1_EDX, }; #define X86_CAP_FMT_NUM "%d:%d" @@ -94,8 +95,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 19, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 20, feature_bit) || \ REQUIRED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 20)) + BUILD_BUG_ON_ZERO(NCAPINTS != 21)) #define DISABLED_MASK_BIT_SET(feature_bit) \ ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 0, feature_bit) || \ @@ -118,8 +120,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 19, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 20, feature_bit) || \ DISABLED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 20)) + BUILD_BUG_ON_ZERO(NCAPINTS != 21)) #define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index a682f646243f..b2aa761ea110 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -13,7 +13,7 @@ /* * Defines x86 CPU feature bits */ -#define NCAPINTS 20 /* N 32-bit words worth of info */ +#define NCAPINTS 21 /* N 32-bit words worth of info */ #define NBUGINTS 1 /* N 32-bit bug flags */ /* @@ -423,6 +423,9 @@ #define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* "" Virtual TSC_AUX */ #define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */ +/* Intel-defined CPU features, CPUID level 0x00000007:1 (EDX), word 20 */ +#define X86_FEATURE_AVX_VNNI_INT8 (20*32+ 4) /* Support for VPDPB[SU,UU,SS]D[,S] */ + /* * BUG word(s) */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 33d2cd04d254..2db6929ca868 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -111,6 +111,7 @@ #define DISABLED_MASK17 0 #define DISABLED_MASK18 0 #define DISABLED_MASK19 0 -#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20) +#define DISABLED_MASK20 0 +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21) #endif /* _ASM_X86_DISABLED_FEATURES_H */ diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h index aff774775c67..3b5522b3f051 100644 --- a/arch/x86/include/asm/required-features.h +++ b/arch/x86/include/asm/required-features.h @@ -98,6 +98,7 @@ #define REQUIRED_MASK17 0 #define REQUIRED_MASK18 0 #define REQUIRED_MASK19 0 -#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20) +#define REQUIRED_MASK20 0 +#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21) #endif /* _ASM_X86_REQUIRED_FEATURES_H */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 3e508f239098..7c659127ed89 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1031,6 +1031,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c) if (eax >= 1) { cpuid_count(0x00000007, 1, &eax, &ebx, &ecx, &edx); c->x86_capability[CPUID_7_1_EAX] = eax; + c->x86_capability[CPUID_7_1_EDX] = edx; } } diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 837bcd1373e5..b1b53a5c788a 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -660,6 +660,9 @@ void kvm_set_cpu_caps(void) F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) | F(AVX_IFMA)); + kvm_cpu_cap_mask(CPUID_7_1_EDX, + F(AVX_VNNI_INT8)); + kvm_cpu_cap_mask(CPUID_D_1_EAX, F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | F(XSAVES) | f_xfd ); @@ -913,9 +916,9 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) goto out; cpuid_entry_override(entry, CPUID_7_1_EAX); + cpuid_entry_override(entry, CPUID_7_1_EDX); entry->ebx = 0; entry->ecx = 0; - entry->edx = 0; } break; case 0xa: { /* Architectural Performance Monitoring */ diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h index a19d473d0184..fb4f4bffad5c 100644 --- a/arch/x86/kvm/reverse_cpuid.h +++ b/arch/x86/kvm/reverse_cpuid.h @@ -48,6 +48,7 @@ static const struct cpuid_reg reverse_cpuid[] = { [CPUID_7_1_EAX] = { 7, 1, CPUID_EAX}, [CPUID_12_EAX] = {0x00000012, 0, CPUID_EAX}, [CPUID_8000_001F_EAX] = {0x8000001f, 0, CPUID_EAX}, + [CPUID_7_1_EDX] = { 7, 1, CPUID_EDX}, }; /*