From patchwork Mon Feb 26 08:25:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 206378 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp1942273dyb; Mon, 26 Feb 2024 00:42:46 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCUTBq2gn1RkyOUSLQvsvx71vZHekiQUwZUr8UHtmLFQEiy7LrX9UCXwvZexE7b0kvXNn5YOusMJRX7Qr159EFyndYu+kQ== X-Google-Smtp-Source: AGHT+IEAXlEYHkFt4mq9p7W8nlx6LXOoAMIEfpcGPml84qQ2OcgYYF4ZSlO91eJW02sn0dufg95t X-Received: by 2002:a17:906:454a:b0:a43:1b37:521e with SMTP id s10-20020a170906454a00b00a431b37521emr2202734ejq.72.1708936966395; Mon, 26 Feb 2024 00:42:46 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708936966; cv=pass; d=google.com; s=arc-20160816; b=oSsZhAeRb3w6O+4F1jo19pTzcALUSroQlKMUWgbRnDGNbws2BuE4bOuyfxU8YUTpQp OXqdFU7eCCiJGZq9r49VaRYwm24LLEzD1TCvUT3o12Hq24WO5Qq2Obazjgz7TLHhcPG/ kPy2wrBHHGFr2kDyZed51iKz1FsS1cZlanSFuf8WlIxHqqUtQdTloa9rHnot3zuyaibq w4Oxyo74kDF9Gl896nHyl1fUbvcs3Ik5zIZuNgY3t+lhqMMY8b2E2R3KCNZuaH5H2y3J a6F1DFbA23ZkSz9ZWFYjQM2yTKsIEgWzkbXShXzPkmug1Hckl2cPAwkj3xIl/eTIoDbO Vdag== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=8fGTCkz2o4fDMAWmvQvf4EYtoSNOCX5i931LH4td4nw=; fh=GU4KR9NKf1qQHjmWlEi4zSAEAxyD6Pa0rUGDupPjr4Q=; b=UV81Jrb50S7LSFP1H+AUvccA+drr2XEeajVgOXLGlc42EqmvSytk2wiLgNF7mmdnXz iG46Ov2FxGNLMPgDkK3NLK4Zg+Fq2WqsPwvDZPFdohDaONxMRggDJG9Pfuk2NZwdfl9X PnA6ZqklB3AprBLHvS1W7GQGVXmjnnmav6F5zKRk+q/C72bYmfcwEX4I4q1Dgc55z+Fn LfJ3VqgzR90sGhyPf9rW1S2fimWbj8BwSsP+9iYkX5emOW2mGHT8PJee7AmiNTMgOvUt ue2c6WxaVcSNPTcxySEN3Y1DwGr22UIYlYmno0AXPz/7/sZbUVSMUD///xZkrL7dRYci kq+w==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=nTtIUIC7; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80803-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80803-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id jg6-20020a170907970600b00a43239c4427si1179658ejc.967.2024.02.26.00.42.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Feb 2024 00:42:46 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-80803-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=nTtIUIC7; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80803-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80803-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id C94961F207C4 for ; Mon, 26 Feb 2024 08:42:37 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 818285F566; Mon, 26 Feb 2024 08:28:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="nTtIUIC7" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C1BE5D8EF; Mon, 26 Feb 2024 08:28:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.13 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936090; cv=none; b=Zt6jYr/yNu+WdlVPhAbP1UCiWA37UMjxe5F05EhGmaxFrKoVMceygYIv6dUaAnIrHvX8tai0uY9b9EHeCEiCBYFczBiG5AcuQBGttmS7cLUmScHzZH4ZN0fSI6XAPf1ISK0Ep0nlN6HrQShEOHr3SSCy6mYa9PzOv7J6jLEovxo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936090; c=relaxed/simple; bh=B6Szq744QTs7ccXNfQ/JakGeoI4rVafNlokaNEo2vm0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ZTasb/bcnA7XmPbP2e95DOk4Jq8os0kCNvi1etHVxA//UOrHgDxAjF+7oahVILfllnHVtsEg3OCdsdwVF6ifm/2AmpokpVDt/dnZl1ztuyARe+IY6pPSCaAEaFc+GrOG39Q7h4b8XuKKRa58rlDqNtxRBq3PDcYFoX8nNdl4aXk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=nTtIUIC7; arc=none smtp.client-ip=192.198.163.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708936088; x=1740472088; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=B6Szq744QTs7ccXNfQ/JakGeoI4rVafNlokaNEo2vm0=; b=nTtIUIC7WX/z9liZguyosSGGZNuMTgmNSkUFWcXuvnlQezUpdimbugf/ CZDbtdeTLdQD6AnWHd8Tw510G6lH0UPQgxhKmAQ/eEj83z+G0gq9e3i3K C4pWIuBFyT2pVSxQB/UoDUBjZjXutukg5Puge1y+7pX5geX0qnoHM+1bJ 62g7cL287xlED5HCahfwJ3s1G2JDFPpNSiflzbh5rJjGI5I+y1EsHbgOF XOJ/OpHFNLWtNLE9lYwwlgwqsiuc1peqXhKnz+jQS5+5zRAYMO/494Iab k9qlZsDGcD75edtfNYzzshe4xRl3R+Mi7mt5+v0m/s3SQ1VY0zeU8F0sc A==; X-IronPort-AV: E=McAfee;i="6600,9927,10995"; a="6155331" X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="6155331" Received: from orviesa009.jf.intel.com ([10.64.159.149]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:28:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="6615708" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by orviesa009-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:28:04 -0800 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Binbin Wu Subject: [PATCH v19 041/130] KVM: TDX: Refuse to unplug the last cpu on the package Date: Mon, 26 Feb 2024 00:25:43 -0800 Message-Id: <15d4d08cf1fe16777f8f8d84ee940e41afbad406.1708933498.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791950288410660761 X-GMAIL-MSGID: 1791950288410660761 From: Isaku Yamahata In order to reclaim TDX HKID, (i.e. when deleting guest TD), needs to call TDH.PHYMEM.PAGE.WBINVD on all packages. If we have active TDX HKID, refuse to offline the last online cpu to guarantee at least one CPU online per package. Add arch callback for cpu offline. Because TDX doesn't support suspend, this also refuses suspend if TDs are running. If no TD is running, suspend is allowed. Suggested-by: Sean Christopherson Signed-off-by: Isaku Yamahata Reviewed-by: Binbin Wu --- v18: - Added reviewed-by BinBin --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/vmx/main.c | 1 + arch/x86/kvm/vmx/tdx.c | 41 ++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/x86_ops.h | 2 ++ arch/x86/kvm/x86.c | 5 ++++ include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 12 +++++++-- 8 files changed, 62 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 97f0a681e02c..f78200492a3d 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -18,6 +18,7 @@ KVM_X86_OP(check_processor_compatibility) KVM_X86_OP(hardware_enable) KVM_X86_OP(hardware_disable) KVM_X86_OP(hardware_unsetup) +KVM_X86_OP_OPTIONAL_RET0(offline_cpu) KVM_X86_OP(has_emulated_msr) KVM_X86_OP(vcpu_after_set_cpuid) KVM_X86_OP(is_vm_type_supported) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 6dedb5cb71ef..0e2408a4707e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1600,6 +1600,7 @@ struct kvm_x86_ops { int (*hardware_enable)(void); void (*hardware_disable)(void); void (*hardware_unsetup)(void); + int (*offline_cpu)(void); bool (*has_emulated_msr)(struct kvm *kvm, u32 index); void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 437c6d5e802e..d69dd474775b 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -110,6 +110,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .check_processor_compatibility = vmx_check_processor_compat, .hardware_unsetup = vt_hardware_unsetup, + .offline_cpu = tdx_offline_cpu, /* TDX cpu enablement is done by tdx_hardware_setup(). */ .hardware_enable = vmx_hardware_enable, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index b11f105db3cd..f2ee5abac14e 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -97,6 +97,7 @@ int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap) */ static DEFINE_MUTEX(tdx_lock); static struct mutex *tdx_mng_key_config_lock; +static atomic_t nr_configured_hkid; static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid) { @@ -112,6 +113,7 @@ static inline void tdx_hkid_free(struct kvm_tdx *kvm_tdx) { tdx_guest_keyid_free(kvm_tdx->hkid); kvm_tdx->hkid = -1; + atomic_dec(&nr_configured_hkid); } static inline bool is_hkid_assigned(struct kvm_tdx *kvm_tdx) @@ -586,6 +588,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params, if (ret < 0) return ret; kvm_tdx->hkid = ret; + atomic_inc(&nr_configured_hkid); va = __get_free_page(GFP_KERNEL_ACCOUNT); if (!va) @@ -1071,3 +1074,41 @@ void tdx_hardware_unsetup(void) kfree(tdx_info); kfree(tdx_mng_key_config_lock); } + +int tdx_offline_cpu(void) +{ + int curr_cpu = smp_processor_id(); + cpumask_var_t packages; + int ret = 0; + int i; + + /* No TD is running. Allow any cpu to be offline. */ + if (!atomic_read(&nr_configured_hkid)) + return 0; + + /* + * In order to reclaim TDX HKID, (i.e. when deleting guest TD), need to + * call TDH.PHYMEM.PAGE.WBINVD on all packages to program all memory + * controller with pconfig. If we have active TDX HKID, refuse to + * offline the last online cpu. + */ + if (!zalloc_cpumask_var(&packages, GFP_KERNEL)) + return -ENOMEM; + for_each_online_cpu(i) { + if (i != curr_cpu) + cpumask_set_cpu(topology_physical_package_id(i), packages); + } + /* Check if this cpu is the last online cpu of this package. */ + if (!cpumask_test_cpu(topology_physical_package_id(curr_cpu), packages)) + ret = -EBUSY; + free_cpumask_var(packages); + if (ret) + /* + * Because it's hard for human operator to understand the + * reason, warn it. + */ +#define MSG_ALLPKG_ONLINE \ + "TDX requires all packages to have an online CPU. Delete all TDs in order to offline all CPUs of a package.\n" + pr_warn_ratelimited(MSG_ALLPKG_ONLINE); + return ret; +} diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 7f123ffe4d42..33ab6800eab8 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -138,6 +138,7 @@ void vmx_setup_mce(struct kvm_vcpu *vcpu); int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops); void tdx_hardware_unsetup(void); bool tdx_is_vm_type_supported(unsigned long type); +int tdx_offline_cpu(void); int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap); int tdx_vm_init(struct kvm *kvm); @@ -148,6 +149,7 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return -EOPNOTSUPP; } static inline void tdx_hardware_unsetup(void) {} static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; } +static inline int tdx_offline_cpu(void) { return 0; } static inline int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1f01c7f91652..e27ea5ed2968 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12508,6 +12508,11 @@ void kvm_arch_hardware_disable(void) drop_user_return_notifiers(); } +int kvm_arch_offline_cpu(unsigned int cpu) +{ + return static_call(kvm_x86_offline_cpu)(); +} + bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu) { return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index eeaf4e73317c..813952eca1bf 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1502,6 +1502,7 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {} int kvm_arch_hardware_enable(void); void kvm_arch_hardware_disable(void); #endif +int kvm_arch_offline_cpu(unsigned int cpu); int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu); bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu); int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2f0a8e28795e..de38f308738e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5653,13 +5653,21 @@ static void hardware_disable_nolock(void *junk) __this_cpu_write(hardware_enabled, false); } +__weak int kvm_arch_offline_cpu(unsigned int cpu) +{ + return 0; +} + static int kvm_offline_cpu(unsigned int cpu) { + int r = 0; + mutex_lock(&kvm_lock); - if (kvm_usage_count) + r = kvm_arch_offline_cpu(cpu); + if (!r && kvm_usage_count) hardware_disable_nolock(NULL); mutex_unlock(&kvm_lock); - return 0; + return r; } static void hardware_disable_all_nolock(void)