Message ID | 20231005131402.14611-5-kirill.shutemov@linux.intel.com |
---|---|
State | New |
Headers |
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, x86@kernel.org
Cc: "Rafael J. Wysocki" <rafael@kernel.org>, Peter Zijlstra <peterz@infradead.org>, Adrian Hunter <adrian.hunter@intel.com>, Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>, Elena Reshetova <elena.reshetova@intel.com>, Jun Nakajima <jun.nakajima@intel.com>, Rick Edgecombe <rick.p.edgecombe@intel.com>, Tom Lendacky <thomas.lendacky@amd.com>, kexec@lists.infradead.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH 04/13] x86/kvm: Do not try to disable kvmclock if it was not enabled
Date: Thu, 5 Oct 2023 16:13:53 +0300
Message-ID: <20231005131402.14611-5-kirill.shutemov@linux.intel.com>
In-Reply-To: <20231005131402.14611-1-kirill.shutemov@linux.intel.com>
References: <20231005131402.14611-1-kirill.shutemov@linux.intel.com> |
Series | x86/tdx: Add kexec support |
Commit Message
Kirill A. Shutemov
Oct. 5, 2023, 1:13 p.m. UTC
kvm_guest_cpu_offline() tries to disable kvmclock regardless of whether it
is present in the VM. This leads to a write to an MSR that doesn't exist on
some configurations, namely in a TDX guest:
unchecked MSR access error: WRMSR to 0x12 (tried to write 0x0000000000000000)
at rIP: 0xffffffff8110687c (kvmclock_disable+0x1c/0x30)
kvmclock enabling is gated by CLOCKSOURCE and CLOCKSOURCE2 KVM paravirt
features.
Do not disable kvmclock if it was not enumerated or was disabled by the
user on the kernel command line.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: c02027b5742b ("x86/kvm: Disable kvmclock on all CPUs on shutdown")
---
arch/x86/kernel/kvmclock.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
Comments
+Paolo

Please use get_maintainers...

On Thu, Oct 05, 2023, Kirill A. Shutemov wrote:
> kvm_guest_cpu_offline() tries to disable kvmclock regardless if it is
> present in the VM. It leads to write to a MSR that doesn't exist on some
> configurations, namely in TDX guest:
>
> unchecked MSR access error: WRMSR to 0x12 (tried to write 0x0000000000000000)
> at rIP: 0xffffffff8110687c (kvmclock_disable+0x1c/0x30)
>
> kvmclock enabling is gated by CLOCKSOURCE and CLOCKSOURCE2 KVM paravirt
> features.
>
> Do not disable kvmclock if it was not enumerated or disabled by user
> from kernel command line.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: c02027b5742b ("x86/kvm: Disable kvmclock on all CPUs on shutdown")
> ---
>  arch/x86/kernel/kvmclock.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index fb8f52149be9..cba2e732e53f 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -22,7 +22,7 @@
>  #include <asm/x86_init.h>
>  #include <asm/kvmclock.h>
>
> -static int kvmclock __initdata = 1;
> +static int kvmclock __ro_after_init = 1;
>  static int kvmclock_vsyscall __initdata = 1;
>  static int msr_kvm_system_time __ro_after_init = MSR_KVM_SYSTEM_TIME;
>  static int msr_kvm_wall_clock __ro_after_init = MSR_KVM_WALL_CLOCK;
> @@ -195,7 +195,12 @@ static void kvm_setup_secondary_clock(void)
>
>  void kvmclock_disable(void)
>  {
> -	native_write_msr(msr_kvm_system_time, 0, 0);
> +	if (!kvm_para_available() || !kvmclock)
> +		return;
> +
> +	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE) ||
> +	    kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2))
> +		native_write_msr(msr_kvm_system_time, 0, 0);

Rather than recheck everything and preserve kvmclock, what about leaving the MSR
indices '0' by default and then disable msr_kvm_system_time iff it's non-zero.
That way the disable path won't become stale if the conditions for enabling
kvmclock change.

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index fb8f52149be9..f2fff625576d 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -24,8 +24,8 @@

 static int kvmclock __initdata = 1;
 static int kvmclock_vsyscall __initdata = 1;
-static int msr_kvm_system_time __ro_after_init = MSR_KVM_SYSTEM_TIME;
-static int msr_kvm_wall_clock __ro_after_init = MSR_KVM_WALL_CLOCK;
+static int msr_kvm_system_time __ro_after_init;
+static int msr_kvm_wall_clock __ro_after_init;
 static u64 kvm_sched_clock_offset __ro_after_init;

 static int __init parse_no_kvmclock(char *arg)
@@ -195,7 +195,8 @@ static void kvm_setup_secondary_clock(void)

 void kvmclock_disable(void)
 {
-	native_write_msr(msr_kvm_system_time, 0, 0);
+	if (msr_kvm_system_time)
+		native_write_msr(msr_kvm_system_time, 0, 0);
 }

 static void __init kvmclock_init_mem(void)
@@ -294,7 +295,10 @@ void __init kvmclock_init(void)
 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2)) {
 		msr_kvm_system_time = MSR_KVM_SYSTEM_TIME_NEW;
 		msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK_NEW;
-	} else if (!kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE)) {
+	} else if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE)) {
+		msr_kvm_system_time = MSR_KVM_SYSTEM_TIME;
+		msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK;
+	} else {
 		return;
 	}
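The "index stays zero until kvmclock is enabled" idea in the reply above can be tried out in user space. What follows is only a rough model, not kernel code: every `model_` and `MODEL_` name is hypothetical, the WRMSR is replaced by a counter, and the feature bit numbers and MSR indices are my reading of the KVM paravirt ABI (0x12 matches the index in the warning quoted above; the others are assumptions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values mirroring the KVM paravirt ABI; the names are hypothetical. */
#define MODEL_FEATURE_CLOCKSOURCE   0
#define MODEL_FEATURE_CLOCKSOURCE2  3
#define MODEL_MSR_SYSTEM_TIME       0x12u
#define MODEL_MSR_SYSTEM_TIME_NEW   0x4b564d01u

/* 0 is the sentinel for "kvmclock was never enabled". */
static uint32_t model_msr_system_time;

/* Bookkeeping that stands in for a real WRMSR. */
static unsigned int model_wrmsr_count;
static uint32_t model_last_wrmsr_index;

static void model_write_msr(uint32_t index, uint64_t value)
{
	/* In this model the disable path must never reach index 0. */
	assert(index != 0);
	(void)value;
	model_wrmsr_count++;
	model_last_wrmsr_index = index;
}

/* Init path: the index is assigned only on branches that enable the clock. */
void model_kvmclock_init(uint32_t features, bool no_kvmclock_param)
{
	model_msr_system_time = 0;
	model_wrmsr_count = 0;
	model_last_wrmsr_index = 0;

	/* A command-line disable returns before any index is assigned. */
	if (no_kvmclock_param)
		return;

	if (features & (1u << MODEL_FEATURE_CLOCKSOURCE2))
		model_msr_system_time = MODEL_MSR_SYSTEM_TIME_NEW;
	else if (features & (1u << MODEL_FEATURE_CLOCKSOURCE))
		model_msr_system_time = MODEL_MSR_SYSTEM_TIME;
}

/* Disable path: a single check, independent of how enabling was decided. */
void model_kvmclock_disable(void)
{
	if (model_msr_system_time)
		model_write_msr(model_msr_system_time, 0);
}
```

Calling `model_kvmclock_init(0, false)` followed by `model_kvmclock_disable()` leaves the write counter at zero, which corresponds to the guest with no clocksource feature enumerated. The disable path never consults the feature bits or the command-line flag, so it would not need to change if the enabling conditions did.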
On Fri, Oct 06, 2023 at 07:36:54AM -0700, Sean Christopherson wrote:
> +Paolo
>
> Please use get_maintainers...

Will do, sorry.

> On Thu, Oct 05, 2023, Kirill A. Shutemov wrote:
> > kvm_guest_cpu_offline() tries to disable kvmclock regardless if it is
> > present in the VM. It leads to write to a MSR that doesn't exist on some
> > configurations, namely in TDX guest:
> >
> > unchecked MSR access error: WRMSR to 0x12 (tried to write 0x0000000000000000)
> > at rIP: 0xffffffff8110687c (kvmclock_disable+0x1c/0x30)
> >
> > kvmclock enabling is gated by CLOCKSOURCE and CLOCKSOURCE2 KVM paravirt
> > features.
> >
> > Do not disable kvmclock if it was not enumerated or disabled by user
> > from kernel command line.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Fixes: c02027b5742b ("x86/kvm: Disable kvmclock on all CPUs on shutdown")
> > ---
> >  arch/x86/kernel/kvmclock.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> > index fb8f52149be9..cba2e732e53f 100644
> > --- a/arch/x86/kernel/kvmclock.c
> > +++ b/arch/x86/kernel/kvmclock.c
> > @@ -22,7 +22,7 @@
> >  #include <asm/x86_init.h>
> >  #include <asm/kvmclock.h>
> >
> > -static int kvmclock __initdata = 1;
> > +static int kvmclock __ro_after_init = 1;
> >  static int kvmclock_vsyscall __initdata = 1;
> >  static int msr_kvm_system_time __ro_after_init = MSR_KVM_SYSTEM_TIME;
> >  static int msr_kvm_wall_clock __ro_after_init = MSR_KVM_WALL_CLOCK;
> > @@ -195,7 +195,12 @@ static void kvm_setup_secondary_clock(void)
> >
> >  void kvmclock_disable(void)
> >  {
> > -	native_write_msr(msr_kvm_system_time, 0, 0);
> > +	if (!kvm_para_available() || !kvmclock)
> > +		return;
> > +
> > +	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE) ||
> > +	    kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2))
> > +		native_write_msr(msr_kvm_system_time, 0, 0);
>
> Rather than recheck everything and preserve kvmclock, what about leaving the MSR
> indices '0' by default and then disable msr_kvm_system_time iff it's non-zero.
> That way the disable path won't become stale if the conditions for enabling
> kvmclock change.

Okay, works for me too.
On 10/5/2023 6:13 AM, Kirill A. Shutemov wrote:
> kvm_guest_cpu_offline() tries to disable kvmclock regardless if it is
> present in the VM. It leads to write to a MSR that doesn't exist on some
> configurations, namely in TDX guest:
>
> unchecked MSR access error: WRMSR to 0x12 (tried to write 0x0000000000000000)
> at rIP: 0xffffffff8110687c (kvmclock_disable+0x1c/0x30)
>
> kvmclock enabling is gated by CLOCKSOURCE and CLOCKSOURCE2 KVM paravirt
> features.
>
> Do not disable kvmclock if it was not enumerated or disabled by user
> from kernel command line.

For the above warning, check for CLOCKSOURCE and CLOCKSOURCE2
feature is sufficient, right? Do we need to include user/command-line
disable check here?

>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: c02027b5742b ("x86/kvm: Disable kvmclock on all CPUs on shutdown")
> ---
>  arch/x86/kernel/kvmclock.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index fb8f52149be9..cba2e732e53f 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -22,7 +22,7 @@
>  #include <asm/x86_init.h>
>  #include <asm/kvmclock.h>
>
> -static int kvmclock __initdata = 1;
> +static int kvmclock __ro_after_init = 1;
>  static int kvmclock_vsyscall __initdata = 1;
>  static int msr_kvm_system_time __ro_after_init = MSR_KVM_SYSTEM_TIME;
>  static int msr_kvm_wall_clock __ro_after_init = MSR_KVM_WALL_CLOCK;
> @@ -195,7 +195,12 @@ static void kvm_setup_secondary_clock(void)
>
>  void kvmclock_disable(void)
>  {
> -	native_write_msr(msr_kvm_system_time, 0, 0);
> +	if (!kvm_para_available() || !kvmclock)
> +		return;
> +
> +	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE) ||
> +	    kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2))
> +		native_write_msr(msr_kvm_system_time, 0, 0);
>  }
>
>  static void __init kvmclock_init_mem(void)
On Tue, Oct 10, 2023 at 06:53:27AM -0700, Kuppuswamy Sathyanarayanan wrote:
>
>
> On 10/5/2023 6:13 AM, Kirill A. Shutemov wrote:
> > kvm_guest_cpu_offline() tries to disable kvmclock regardless if it is
> > present in the VM. It leads to write to a MSR that doesn't exist on some
> > configurations, namely in TDX guest:
> >
> > unchecked MSR access error: WRMSR to 0x12 (tried to write 0x0000000000000000)
> > at rIP: 0xffffffff8110687c (kvmclock_disable+0x1c/0x30)
> >
> > kvmclock enabling is gated by CLOCKSOURCE and CLOCKSOURCE2 KVM paravirt
> > features.
> >
> > Do not disable kvmclock if it was not enumerated or disabled by user
> > from kernel command line.
>
> For the above warning, check for CLOCKSOURCE and CLOCKSOURCE2
> feature is sufficient, right? Do we need to include user/command-line
> disable check here?

The command line disables kvmclock even if it is enumerated, so disabling it
again is not needed.

Anyway, I have already reworked the patch based on Sean's feedback. There is
no need to take the parameter into account directly now.
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index fb8f52149be9..cba2e732e53f 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -22,7 +22,7 @@
 #include <asm/x86_init.h>
 #include <asm/kvmclock.h>

-static int kvmclock __initdata = 1;
+static int kvmclock __ro_after_init = 1;
 static int kvmclock_vsyscall __initdata = 1;
 static int msr_kvm_system_time __ro_after_init = MSR_KVM_SYSTEM_TIME;
 static int msr_kvm_wall_clock __ro_after_init = MSR_KVM_WALL_CLOCK;
@@ -195,7 +195,12 @@ static void kvm_setup_secondary_clock(void)

 void kvmclock_disable(void)
 {
-	native_write_msr(msr_kvm_system_time, 0, 0);
+	if (!kvm_para_available() || !kvmclock)
+		return;
+
+	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE) ||
+	    kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2))
+		native_write_msr(msr_kvm_system_time, 0, 0);
 }

 static void __init kvmclock_init_mem(void)
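
The guard this version of the patch adds to kvmclock_disable() amounts to a small truth table over three inputs. The sketch below restates it as a pure function in plain C so the cases can be checked outside the kernel. It is an illustration only: `struct model_guest`, `model_should_disable()` and the `MODEL_` constants are hypothetical names, and the feature bit numbers are assumed from the KVM paravirt ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit numbers for the two clocksource features; names are hypothetical. */
#define MODEL_FEATURE_CLOCKSOURCE   0
#define MODEL_FEATURE_CLOCKSOURCE2  3

struct model_guest {
	bool para_available;  /* stand-in for kvm_para_available() */
	bool kvmclock_param;  /* stand-in for the kvmclock flag; false models a command-line disable */
	uint32_t features;    /* stand-in for the bits kvm_para_has_feature() tests */
};

/* Returns true only when the patched kvmclock_disable() would write the MSR. */
bool model_should_disable(const struct model_guest *g)
{
	assert(g != NULL);

	/* First check in the patch: no KVM paravirt interface, or clock disabled by the user. */
	if (!g->para_available || !g->kvmclock_param)
		return false;

	/* Second check: at least one of the two clocksource features is enumerated. */
	return (g->features & (1u << MODEL_FEATURE_CLOCKSOURCE)) ||
	       (g->features & (1u << MODEL_FEATURE_CLOCKSOURCE2));
}
```

A guest that enumerates neither feature yields `false`, which is the case behind the unchecked WRMSR warning in the commit message. The function repeats the conditions used when enabling kvmclock, so it has to be kept in step with them by hand.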