From patchwork Tue Oct 10 09:40:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul Durrant
X-Patchwork-Id: 150620
From: Paul Durrant
To: Paolo Bonzini, Jonathan Corbet, Sean Christopherson,
 Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H. Peter Anvin", David Woodhouse, Paul Durrant,
 kvm@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH] KVM x86/xen: add an override for PVCLOCK_TSC_STABLE_BIT
Date: Tue, 10 Oct 2023 09:40:47 +0000
Message-Id: <20231010094047.3850928-1-paul@xen.org>
X-Mailer: git-send-email 2.39.2
X-Mailing-List: linux-kernel@vger.kernel.org

From: Paul Durrant

Unless explicitly told to do so (by passing 'clocksource=tsc' and
'tsc=stable:socket', and then jumping through some hoops concerning
potential CPU hotplug) Xen will never use TSC as its clocksource.
Hence, by default, a Xen guest will not see PVCLOCK_TSC_STABLE_BIT set
in either the primary or secondary pvclock memory areas. This has
led to bugs in some guest kernels which only become evident if
PVCLOCK_TSC_STABLE_BIT *is* set in the pvclocks.

Hence, to support such guests, give the VMM a new attribute to tell KVM to
forcibly clear the bit in the Xen pvclocks.

Signed-off-by: Paul Durrant
---
 Documentation/virt/kvm/api.rst  |  9 +++++++++
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 25 ++++++++++++++++++++-----
 arch/x86/kvm/xen.c              | 13 +++++++++++++
 include/uapi/linux/kvm.h        |  5 +++++
 5 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 21a7578142a1..d06f971a2ce0 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5544,6 +5544,7 @@ attribute cannot be read.
 			__u64 expires_ns;
 		} timer;
 		__u8 vector;
+		__u32 flags;
 	} u;
   };

@@ -5610,6 +5611,14 @@ KVM_XEN_VCPU_ATTR_TYPE_UPCALL_VECTOR
   vector configured with HVM_PARAM_CALLBACK_IRQ. It is disabled by
   setting the vector to zero.

+KVM_XEN_VCPU_ATTR_TYPE_PVCLOCK
+  This attribute is available when the KVM_CAP_XEN_HVM ioctl indicates
+  support for KVM_XEN_HVM_CONFIG_PVCLOCK feature. It modifies the
+  pvclock information available to the guest. Currently the only defined
+  flag is KVM_XEN_PVCLOCK_TSC_UNSTABLE. If this flag is set then the
+  PVCLOCK_TSC_STABLE_BIT flag will not be set in any of the Xen pvclock
+  sources. This aligns with Xen's behaviour when it is not using TSC
+  as its clock source, which is the default behaviour.

 4.129 KVM_XEN_VCPU_GET_ATTR
 ---------------------------
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 17715cb8731d..2edc48e94d56 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -685,6 +685,7 @@ struct kvm_vcpu_xen {
 	u64 hypercall_rip;
 	u32 current_runstate;
 	u8 upcall_vector;
+	bool tsc_is_unstable;
 	struct gfn_to_pfn_cache vcpu_info_cache;
 	struct gfn_to_pfn_cache vcpu_time_info_cache;
 	struct gfn_to_pfn_cache runstate_cache;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9f18b06bbda6..1c6556e14d40 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3096,7 +3096,8 @@ u64 get_kvmclock_ns(struct kvm *kvm)

 static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 				    struct gfn_to_pfn_cache *gpc,
-				    unsigned int offset)
+				    unsigned int offset,
+				    bool force_tsc_unstable)
 {
 	struct kvm_vcpu_arch *vcpu = &v->arch;
 	struct pvclock_vcpu_time_info *guest_hv_clock;
@@ -3133,6 +3134,10 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 	}

 	memcpy(guest_hv_clock, &vcpu->hv_clock, sizeof(*guest_hv_clock));
+
+	if (force_tsc_unstable)
+		guest_hv_clock->flags &= ~PVCLOCK_TSC_STABLE_BIT;
+
 	smp_wmb();

 	guest_hv_clock->version = ++vcpu->hv_clock.version;
@@ -3231,12 +3236,21 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	vcpu->hv_clock.flags = pvclock_flags;

 	if (vcpu->pv_time.active)
-		kvm_setup_guest_pvclock(v, &vcpu->pv_time, 0);
+		kvm_setup_guest_pvclock(v, &vcpu->pv_time, 0, false);
+
+	/*
+	 * For Xen guests we may need to override PVCLOCK_TSC_STABLE_BIT as unless
+	 * explicitly told to use TSC as its clocksource Xen will not set this bit.
+	 * This default behaviour led to bugs in some guest kernels which cause
+	 * problems if they observe PVCLOCK_TSC_STABLE_BIT in the pvclock flags.
+	 */
 	if (vcpu->xen.vcpu_info_cache.active)
 		kvm_setup_guest_pvclock(v, &vcpu->xen.vcpu_info_cache,
-					offsetof(struct compat_vcpu_info, time));
+					offsetof(struct compat_vcpu_info, time),
+					vcpu->xen.tsc_is_unstable);
 	if (vcpu->xen.vcpu_time_info_cache.active)
-		kvm_setup_guest_pvclock(v, &vcpu->xen.vcpu_time_info_cache, 0);
+		kvm_setup_guest_pvclock(v, &vcpu->xen.vcpu_time_info_cache, 0,
+					vcpu->xen.tsc_is_unstable);
 	kvm_hv_setup_tsc_page(v->kvm, &vcpu->hv_clock);
 	return 0;
 }
@@ -4531,7 +4545,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		    KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL |
 		    KVM_XEN_HVM_CONFIG_SHARED_INFO |
 		    KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL |
-		    KVM_XEN_HVM_CONFIG_EVTCHN_SEND;
+		    KVM_XEN_HVM_CONFIG_EVTCHN_SEND |
+		    KVM_XEN_HVM_CONFIG_PVCLOCK;
 		if (sched_info_on())
 			r |= KVM_XEN_HVM_CONFIG_RUNSTATE |
 			     KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG;
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 40edf4d1974c..08e64df2e27d 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -938,6 +938,12 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 		}
 		break;

+	case KVM_XEN_VCPU_ATTR_TYPE_PVCLOCK:
+		vcpu->arch.xen.tsc_is_unstable = data->u.flags & KVM_XEN_PVCLOCK_TSC_UNSTABLE;
+		kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
+		r = 0;
+		break;
+
 	default:
 		break;
 	}
@@ -1030,6 +1036,13 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 		r = 0;
 		break;

+	case KVM_XEN_VCPU_ATTR_TYPE_PVCLOCK:
+		data->u.flags = 0;
+		if (vcpu->arch.xen.tsc_is_unstable)
+			data->u.flags |= KVM_XEN_PVCLOCK_TSC_UNSTABLE;
+		r = 0;
+		break;
+
 	default:
 		break;
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 13065dd96132..a101fe60f2e1 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1282,6 +1282,7 @@ struct kvm_x86_mce {
 #define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL	(1 << 4)
 #define KVM_XEN_HVM_CONFIG_EVTCHN_SEND		(1 << 5)
 #define KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG	(1 << 6)
+#define KVM_XEN_HVM_CONFIG_PVCLOCK		(1 << 7)

 struct kvm_xen_hvm_config {
 	__u32 flags;
@@ -1870,6 +1871,8 @@ struct kvm_xen_vcpu_attr {
 			__u64 expires_ns;
 		} timer;
 		__u8 vector;
+		__u32 flags;
+#define KVM_XEN_PVCLOCK_TSC_UNSTABLE	(1 << 0)
 	} u;
 };

@@ -1884,6 +1887,8 @@ struct kvm_xen_vcpu_attr {
 #define KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID		0x6
 #define KVM_XEN_VCPU_ATTR_TYPE_TIMER		0x7
 #define KVM_XEN_VCPU_ATTR_TYPE_UPCALL_VECTOR	0x8
+/* Available with KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_PVCLOCK */
+#define KVM_XEN_VCPU_ATTR_TYPE_PVCLOCK		0x9

 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
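
Illustration only, not part of the patch: the override that the x86.c hunk
adds to kvm_setup_guest_pvclock() can be modelled in plain user-space C.
The sketch below is an assumption-laden simplification. struct pvclock_model
and publish_pvclock() are hypothetical names, the smp_wmb() barriers and the
odd/even version protocol are omitted, and only the copy, the optional
clearing of PVCLOCK_TSC_STABLE_BIT and the final version bump are kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Same value as the kernel's PVCLOCK_TSC_STABLE_BIT (bit 0 of the flags byte). */
#define PVCLOCK_TSC_STABLE_BIT (1 << 0)

/*
 * Hypothetical stand-in for struct pvclock_vcpu_time_info; the field order
 * follows the pvclock ABI but this is a model, not the kernel definition.
 */
struct pvclock_model {
	uint32_t version;
	uint32_t pad0;
	uint64_t tsc_timestamp;
	uint64_t system_time;
	uint32_t tsc_to_system_mul;
	int8_t   tsc_shift;
	uint8_t  flags;
	uint8_t  pad[2];
};

/*
 * Copy the host-side record into the guest-visible one. When
 * force_tsc_unstable is true, clear PVCLOCK_TSC_STABLE_BIT in the guest
 * copy only; the host-side record keeps its flags unchanged.
 * Simplification: no memory barriers, single version increment.
 */
static void publish_pvclock(struct pvclock_model *guest,
			    struct pvclock_model *host,
			    bool force_tsc_unstable)
{
	assert(guest != NULL && host != NULL);

	memcpy(guest, host, sizeof(*guest));

	if (force_tsc_unstable)
		guest->flags &= (uint8_t)~PVCLOCK_TSC_STABLE_BIT;

	guest->version = ++host->version;
}
```

Calling publish_pvclock(&guest, &host, true) on a host record whose flags
include PVCLOCK_TSC_STABLE_BIT leaves the host copy untouched while the guest
copy has the bit cleared. This mirrors what the patch does for the two Xen
pvclock areas once a VMM has set KVM_XEN_PVCLOCK_TSC_UNSTABLE through
KVM_XEN_VCPU_ATTR_TYPE_PVCLOCK; the KVM (non-Xen) pv_time area always passes
false.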