From patchwork Mon Feb 26 08:29:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206480
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
 erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang,
 chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com
Subject: [PATCH v8 01/14] KVM: Add transparent hugepage support for dedicated guest memory
Date: Mon, 26 Feb 2024 00:29:15 -0800
Message-Id: <6fdc566ffb45eeaa653ec21c0a539723b8ee056d.1708933624.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sean Christopherson

Extend guest_memfd to allow backing guest memory with transparent
hugepages. Require userspace to opt-in via a flag even though there's no
known/anticipated use case for forcing small pages as THP is optional,
i.e. to avoid ending up in a situation where userspace is unaware that
KVM can't provide hugepages.

For simplicity, require the guest_memfd size to be a multiple of the
hugepage size, e.g.
so that KVM doesn't need to do bounds checking when deciding whether or
not to allocate a huge folio.

When reporting the max order when KVM gets a pfn from guest_memfd, force
order-0 pages if the hugepage is not fully contained by the memslot
binding, e.g. if userspace requested hugepages but punches a hole in the
memslot bindings in order to emulate x86's VGA hole.

Signed-off-by: Sean Christopherson
Link: https://lore.kernel.org/r/20231027182217.3615211-18-seanjc@google.com
Signed-off-by: Isaku Yamahata
---
 Documentation/virt/kvm/api.rst |  7 ++++
 include/uapi/linux/kvm.h       |  2 +
 virt/kvm/guest_memfd.c         | 73 ++++++++++++++++++++++++++++++----
 3 files changed, 75 insertions(+), 7 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 4b70d2b43532..213738a38b07 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6312,6 +6312,8 @@ and cannot be resized (guest_memfd files do however support PUNCH_HOLE).
 	__u64 reserved[6];
   };
 
+  #define KVM_GUEST_MEMFD_ALLOW_HUGEPAGE (1ULL << 0)
+
 Conceptually, the inode backing a guest_memfd file represents physical memory,
 i.e. is coupled to the virtual machine as a thing, not to a "struct kvm". The
 file itself, which is bound to a "struct kvm", is that instance's view of the
@@ -6328,6 +6330,11 @@ most one mapping per page, i.e. binding multiple memory regions to a single
 guest_memfd range is not allowed (any number of memory regions can be bound to
 a single guest_memfd file, but the bound ranges must not overlap).
 
+If KVM_GUEST_MEMFD_ALLOW_HUGEPAGE is set in flags, KVM will attempt to allocate
+and map hugepages for the guest_memfd file. This is currently best effort. If
+KVM_GUEST_MEMFD_ALLOW_HUGEPAGE is set, the size must be aligned to the maximum
+transparent hugepage size supported by the kernel
+
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 
 4.143 KVM_MEMORY_MAPPING
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a7aa804ef021..47faaf71799f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -2317,6 +2317,8 @@ struct kvm_create_guest_memfd {
 	__u64 reserved[6];
 };
 
+#define KVM_GUEST_MEMFD_ALLOW_HUGEPAGE (1ULL << 0)
+
 #define KVM_MEMORY_MAPPING _IOWR(KVMIO, 0xd5, struct kvm_memory_mapping)
 
 struct kvm_memory_mapping {
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 3830d50b9b67..236443c3d8dc 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -13,14 +13,47 @@ struct kvm_gmem {
 	struct list_head entry;
 };
 
-static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
+static struct folio *kvm_gmem_get_huge_folio(struct inode *inode, pgoff_t index)
 {
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	unsigned long huge_index = round_down(index, HPAGE_PMD_NR);
+	unsigned long flags = (unsigned long)inode->i_private;
+	struct address_space *mapping = inode->i_mapping;
+	gfp_t gfp = mapping_gfp_mask(mapping);
 	struct folio *folio;
 
-	/* TODO: Support huge pages. */
-	folio = filemap_grab_folio(inode->i_mapping, index);
-	if (IS_ERR_OR_NULL(folio))
+	if (!(flags & KVM_GUEST_MEMFD_ALLOW_HUGEPAGE))
+		return NULL;
+
+	if (filemap_range_has_page(mapping, huge_index << PAGE_SHIFT,
+				   (huge_index + HPAGE_PMD_NR - 1) << PAGE_SHIFT))
+		return NULL;
+
+	folio = filemap_alloc_folio(gfp, HPAGE_PMD_ORDER);
+	if (!folio)
+		return NULL;
+
+	if (filemap_add_folio(mapping, folio, huge_index, gfp)) {
+		folio_put(folio);
 		return NULL;
+	}
+
+	return folio;
+#else
+	return NULL;
+#endif
+}
+
+static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
+{
+	struct folio *folio;
+
+	folio = kvm_gmem_get_huge_folio(inode, index);
+	if (!folio) {
+		folio = filemap_grab_folio(inode->i_mapping, index);
+		if (IS_ERR_OR_NULL(folio))
+			return NULL;
+	}
 
 	/*
 	 * Use the up-to-date flag to track whether or not the memory has been
@@ -363,6 +396,7 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 	inode->i_mode |= S_IFREG;
 	inode->i_size = size;
 	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_large_folios(inode->i_mapping);
 	mapping_set_unmovable(inode->i_mapping);
 	/* Unmovable mappings are supposed to be marked unevictable as well. */
 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
@@ -388,12 +422,21 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 	u64 flags = args->flags;
 	u64 valid_flags = 0;
 
+	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		valid_flags |= KVM_GUEST_MEMFD_ALLOW_HUGEPAGE;
+
 	if (flags & ~valid_flags)
 		return -EINVAL;
 
 	if (size <= 0 || !PAGE_ALIGNED(size))
 		return -EINVAL;
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	if ((flags & KVM_GUEST_MEMFD_ALLOW_HUGEPAGE) &&
+	    !IS_ALIGNED(size, HPAGE_PMD_SIZE))
+		return -EINVAL;
+#endif
+
 	return __kvm_gmem_create(kvm, size, flags);
 }
 
@@ -488,7 +531,7 @@ void kvm_gmem_unbind(struct kvm_memory_slot *slot)
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
 {
-	pgoff_t index = gfn - slot->base_gfn + slot->gmem.pgoff;
+	pgoff_t index, huge_index;
 	struct kvm_gmem *gmem;
 	struct folio *folio;
 	struct page *page;
@@ -501,6 +544,7 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 
 	gmem = file->private_data;
 
+	index = gfn - slot->base_gfn + slot->gmem.pgoff;
 	if (WARN_ON_ONCE(xa_load(&gmem->bindings, index) != slot)) {
 		r = -EIO;
 		goto out_fput;
@@ -520,9 +564,24 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 	page = folio_file_page(folio, index);
 
 	*pfn = page_to_pfn(page);
-	if (max_order)
-		*max_order = 0;
+	if (!max_order)
+		goto success;
+
+	*max_order = compound_order(compound_head(page));
+	if (!*max_order)
+		goto success;
 
+	/*
+	 * The folio can be mapped with a hugepage if and only if the folio is
+	 * fully contained by the range the memslot is bound to. Note, the
+	 * caller is responsible for handling gfn alignment, this only deals
+	 * with the file binding.
+	 */
+	huge_index = ALIGN(index, 1ull << *max_order);
+	if (huge_index < ALIGN(slot->gmem.pgoff, 1ull << *max_order) ||
+	    huge_index + (1ull << *max_order) > slot->gmem.pgoff + slot->npages)
+		*max_order = 0;
+success:
 	r = 0;
 
 out_unlock:

From patchwork Mon Feb 26 08:29:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206518
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
 erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang,
 chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com,
 Xiaoyao Li, Binbin Wu
Subject: [PATCH v8 02/14] KVM: TDX: Flush cache based on page size before TDX SEAMCALL
Date: Mon, 26 Feb 2024 00:29:16 -0800
Message-Id: <544c18765a2778afc4d3964629f116985554d1ee.1708933624.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Xiaoyao Li

tdh_mem_page_aug() will support 2MB large page in the near future. Cache
flush also needs to be 2MB instead of 4KB in such cases.
Introduce a helper function to flush cache with page size info in
preparation for large pages.

Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
Reviewed-by: Binbin Wu
---
v6:
- catch up tdx_seamcall() change
---
 arch/x86/kvm/vmx/tdx_ops.h | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index d27f281152cb..3af124711e98 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -6,6 +6,7 @@
 
 #include
 
+#include
 #include
 #include
 #include
@@ -50,6 +51,11 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
 	return level - 1;
 }
 
+static inline void tdx_clflush_page(hpa_t addr, enum pg_level level)
+{
+	clflush_cache_range(__va(addr), KVM_HPAGE_SIZE(level));
+}
+
 /*
  * TDX module acquires its internal lock for resources. It doesn't spin to get
  * locks because of its restrictions of allowed execution time. Instead, it
@@ -87,7 +93,7 @@ static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr)
 		.rdx = tdr,
 	};
 
-	clflush_cache_range(__va(addr), PAGE_SIZE);
+	tdx_clflush_page(addr, PG_LEVEL_4K);
 	return tdx_seamcall(TDH_MNG_ADDCX, &in, NULL);
 }
 
@@ -101,7 +107,7 @@ static inline u64 tdh_mem_page_add(hpa_t tdr, gpa_t gpa, hpa_t hpa, hpa_t source
 		.r9 = source,
 	};
 
-	clflush_cache_range(__va(hpa), PAGE_SIZE);
+	tdx_clflush_page(hpa, PG_LEVEL_4K);
 	return tdx_seamcall_sept(TDH_MEM_PAGE_ADD, &in, out);
 }
 
@@ -114,7 +120,7 @@ static inline u64 tdh_mem_sept_add(hpa_t tdr, gpa_t gpa, int level, hpa_t page,
 		.r8 = page,
 	};
 
-	clflush_cache_range(__va(page), PAGE_SIZE);
+	tdx_clflush_page(page, PG_LEVEL_4K);
 	return tdx_seamcall_sept(TDH_MEM_SEPT_ADD, &in, out);
 }
 
@@ -147,7 +153,7 @@ static inline u64 tdh_vp_addcx(hpa_t tdvpr, hpa_t addr)
 		.rdx = tdvpr,
 	};
 
-	clflush_cache_range(__va(addr), PAGE_SIZE);
+	tdx_clflush_page(addr, PG_LEVEL_4K);
 	return tdx_seamcall(TDH_VP_ADDCX, &in, NULL);
 }
 
@@ -160,7 +166,7 @@ static inline u64 tdh_mem_page_relocate(hpa_t tdr, gpa_t gpa, hpa_t hpa,
 		.r8 = hpa,
 	};
 
-	clflush_cache_range(__va(hpa), PAGE_SIZE);
+	tdx_clflush_page(hpa, PG_LEVEL_4K);
 	return tdx_seamcall_sept(TDH_MEM_PAGE_RELOCATE, &in, out);
 }
 
@@ -173,7 +179,7 @@ static inline u64 tdh_mem_page_aug(hpa_t tdr, gpa_t gpa, hpa_t hpa,
 		.r8 = hpa,
 	};
 
-	clflush_cache_range(__va(hpa), PAGE_SIZE);
+	tdx_clflush_page(hpa, PG_LEVEL_4K);
 	return tdx_seamcall_sept(TDH_MEM_PAGE_AUG, &in, out);
 }
 
@@ -204,7 +210,7 @@ static inline u64 tdh_mng_create(hpa_t tdr, int hkid)
 		.rdx = hkid,
 	};
 
-	clflush_cache_range(__va(tdr), PAGE_SIZE);
+	tdx_clflush_page(tdr, PG_LEVEL_4K);
 	return tdx_seamcall(TDH_MNG_CREATE, &in, NULL);
 }
 
@@ -215,7 +221,7 @@ static inline u64 tdh_vp_create(hpa_t tdr, hpa_t tdvpr)
 		.rdx = tdr,
 	};
 
-	clflush_cache_range(__va(tdvpr), PAGE_SIZE);
+	tdx_clflush_page(tdvpr, PG_LEVEL_4K);
 	return tdx_seamcall(TDH_VP_CREATE, &in, NULL);
 }
 

From patchwork Mon Feb 26 08:29:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206485
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
 erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang,
 chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com,
 Xiaoyao Li
Subject: [PATCH v8 03/14] KVM: TDX: Pass KVM page level to tdh_mem_page_aug()
Date: Mon, 26 Feb 2024 00:29:17 -0800
Message-Id: <8f4125c90898652317ae6bec5d46fe45d3f11eef.1708933624.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Xiaoyao Li

Level info is needed in tdx_clflush_page() to generate the correct page
size.

Besides, explicitly pass level info to SEAMCALL instead of assuming it's
zero. It works naturally when 2MB support lands.

Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
---
v7:
- Don't pass level to tdh_mem_page_add() as it supports only 4K page.
- catch up for change of tdx_seamcall()
---
 arch/x86/kvm/vmx/tdx.c     |  2 +-
 arch/x86/kvm/vmx/tdx_ops.h | 12 +++++++++---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index a71093f7c3e3..fd992966379c 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1470,7 +1470,7 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 	union tdx_sept_entry entry;
 	u64 err;

-	err = tdh_mem_page_aug(kvm_tdx->tdr_pa, gpa, hpa, &out);
+	err = tdh_mem_page_aug(kvm_tdx->tdr_pa, gpa, tdx_level, hpa, &out);
 	if (unlikely(err == TDX_ERROR_SEPT_BUSY)) {
 		tdx_unpin(kvm, pfn);
 		return -EAGAIN;
diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index 3af124711e98..ef4748943ac7 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -51,6 +51,11 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
 	return level - 1;
 }

+static inline enum pg_level tdx_sept_level_to_pg_level(int tdx_level)
+{
+	return tdx_level + 1;
+}
+
 static inline void tdx_clflush_page(hpa_t addr, enum pg_level level)
 {
 	clflush_cache_range(__va(addr), KVM_HPAGE_SIZE(level));
@@ -100,6 +105,7 @@ static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr)
 static inline u64 tdh_mem_page_add(hpa_t tdr, gpa_t gpa, hpa_t hpa,
 				   hpa_t source, struct tdx_module_args *out)
 {
+	/* TDH.MEM.PAGE.ADD() supports only 4K page. tdx 4K page level = 0 */
 	struct tdx_module_args in = {
 		.rcx = gpa,
 		.rdx = tdr,
@@ -170,16 +176,16 @@ static inline u64 tdh_mem_page_relocate(hpa_t tdr, gpa_t gpa, hpa_t hpa,
 	return tdx_seamcall_sept(TDH_MEM_PAGE_RELOCATE, &in, out);
 }

-static inline u64 tdh_mem_page_aug(hpa_t tdr, gpa_t gpa, hpa_t hpa,
+static inline u64 tdh_mem_page_aug(hpa_t tdr, gpa_t gpa, int level, hpa_t hpa,
 				   struct tdx_module_args *out)
 {
 	struct tdx_module_args in = {
-		.rcx = gpa,
+		.rcx = gpa | level,
 		.rdx = tdr,
 		.r8 = hpa,
 	};

-	tdx_clflush_page(hpa, PG_LEVEL_4K);
+	tdx_clflush_page(hpa, tdx_sept_level_to_pg_level(level));
 	return tdx_seamcall_sept(TDH_MEM_PAGE_AUG, &in, out);
 }


From patchwork Mon Feb 26 08:29:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206467
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
	erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang,
	chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li
Subject: [PATCH v8 04/14] KVM: TDX: Pass size to reclaim_page()
Date: Mon, 26 Feb 2024 00:29:18 -0800
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Xiaoyao Li

A 2MB large page can be added to a TD directly with tdh_mem_page_aug().
In that case, the page must be reclaimed and cleared at 2MB size.
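
The size that has to be cleared follows directly from the page level. The
sketch below is a userspace illustration under stated assumptions, not the
kernel code: `hpage_size()` is a hypothetical stand-in for `KVM_HPAGE_SIZE()`
that assumes 4KB base pages and 9 address bits per level, and `memset()` on
64-byte chunks stands in for the MOVDIR64B stores used by tdx_clear_page().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BASE_PAGE_SHIFT 12
#define BASE_PAGE_SIZE  ((size_t)1 << BASE_PAGE_SHIFT)

/* Assumed numbering, as in the x86 enum pg_level: 1 = 4K, 2 = 2M, 3 = 1G. */
enum pg_level { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2, PG_LEVEL_1G = 3 };

/* Hypothetical stand-in for KVM_HPAGE_SIZE(): each level adds 9 address bits. */
static size_t hpage_size(enum pg_level level)
{
	return (size_t)1 << (BASE_PAGE_SHIFT + 9 * ((int)level - 1));
}

/* Clear `size` bytes in 64-byte chunks; returns the number of chunks written. */
static size_t clear_range(unsigned char *buf, size_t size)
{
	size_t i, chunks = 0;

	/* Mirrors the WARN_ON_ONCE(size % PAGE_SIZE) check in the patch. */
	assert(size % BASE_PAGE_SIZE == 0);
	for (i = 0; i < size; i += 64) {
		memset(buf + i, 0, 64);	/* stand-in for one MOVDIR64B store */
		chunks++;
	}
	return chunks;
}
```

With these assumptions a 4K page takes 64 chunk stores and a 2M page takes
512 times as many, which is why the caller has to pass the size down.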

Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
---
v5:
- Change type of page size from int to unsigned long
---
 arch/x86/kvm/vmx/tdx.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index fd992966379c..8205d68ed477 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -233,12 +233,13 @@ static void tdx_disassociate_vp_on_cpu(struct kvm_vcpu *vcpu)
 	smp_call_function_single(cpu, tdx_disassociate_vp_arg, vcpu, 1);
 }

-static void tdx_clear_page(unsigned long page_pa)
+static void tdx_clear_page(unsigned long page_pa, unsigned long size)
 {
 	const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0)));
 	void *page = __va(page_pa);
 	unsigned long i;

+	WARN_ON_ONCE(size % PAGE_SIZE);
 	/*
 	 * When re-assign one page from old keyid to a new keyid, MOVDIR64B is
 	 * required to clear/write the page with new keyid to prevent integrity
@@ -247,7 +248,7 @@ static void tdx_clear_page(unsigned long page_pa)
 	 * clflush doesn't flush cache with HKID set. The cache line could be
 	 * poisoned (even without MKTME-i), clear the poison bit.
 	 */
-	for (i = 0; i < PAGE_SIZE; i += 64)
+	for (i = 0; i < size; i += 64)
 		movdir64b(page + i, zero_page);
 	/*
 	 * MOVDIR64B store uses WC buffer. Prevent following memory reads
@@ -256,7 +257,7 @@ static void tdx_clear_page(unsigned long page_pa)
 	__mb();
 }

-static int __tdx_reclaim_page(hpa_t pa)
+static int __tdx_reclaim_page(hpa_t pa, enum pg_level level)
 {
 	struct tdx_module_args out;
 	u64 err;
@@ -275,17 +276,19 @@ static int __tdx_reclaim_page(hpa_t pa)
 		pr_tdx_error(TDH_PHYMEM_PAGE_RECLAIM, err, &out);
 		return -EIO;
 	}
+	/* out.r8 == tdx sept page level */
+	WARN_ON_ONCE(out.r8 != pg_level_to_tdx_sept_level(level));

 	return 0;
 }

-static int tdx_reclaim_page(hpa_t pa)
+static int tdx_reclaim_page(hpa_t pa, enum pg_level level)
 {
 	int r;

-	r = __tdx_reclaim_page(pa);
+	r = __tdx_reclaim_page(pa, level);
 	if (!r)
-		tdx_clear_page(pa);
+		tdx_clear_page(pa, KVM_HPAGE_SIZE(level));
 	return r;
 }

@@ -299,7 +302,7 @@ static void tdx_reclaim_control_page(unsigned long td_page_pa)
 	 * was already flushed by TDH.PHYMEM.CACHE.WB before here, So
 	 * cache doesn't need to be flushed again.
 	 */
-	if (tdx_reclaim_page(td_page_pa))
+	if (tdx_reclaim_page(td_page_pa, PG_LEVEL_4K))
 		/*
 		 * Leak the page on failure:
 		 * tdx_reclaim_page() returns an error if and only if there's an
@@ -530,7 +533,7 @@ void tdx_vm_free(struct kvm *kvm)
 	if (!kvm_tdx->tdr_pa)
 		return;

-	if (__tdx_reclaim_page(kvm_tdx->tdr_pa))
+	if (__tdx_reclaim_page(kvm_tdx->tdr_pa, PG_LEVEL_4K))
 		return;
 	/*
 	 * TDX module maps TDR with TDX global HKID. TDX module may access TDR
@@ -543,7 +546,7 @@ void tdx_vm_free(struct kvm *kvm)
 		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL);
 		return;
 	}
-	tdx_clear_page(kvm_tdx->tdr_pa);
+	tdx_clear_page(kvm_tdx->tdr_pa, PAGE_SIZE);

 	free_page((unsigned long)__va(kvm_tdx->tdr_pa));
 	kvm_tdx->tdr_pa = 0;
@@ -1586,7 +1589,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 		 * The HKID assigned to this TD was already freed and cache
 		 * was already flushed. We don't have to flush again.
 		 */
-		err = tdx_reclaim_page(hpa);
+		err = tdx_reclaim_page(hpa, level);
 		if (KVM_BUG_ON(err, kvm))
 			return -EIO;
 		tdx_unpin(kvm, pfn);
@@ -1619,7 +1622,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL);
 		return -EIO;
 	}
-	tdx_clear_page(hpa);
+	tdx_clear_page(hpa, PAGE_SIZE);
 	tdx_unpin(kvm, pfn);
 	return 0;
 }
@@ -1753,7 +1756,7 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	 * already flushed. We don't have to flush again.
 	 */
 	if (!is_hkid_assigned(kvm_tdx))
-		return tdx_reclaim_page(__pa(private_spt));
+		return tdx_reclaim_page(__pa(private_spt), PG_LEVEL_4K);

 	/*
 	 * free_private_spt() is (obviously) called when a shadow page is being

From patchwork Mon Feb 26 08:29:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206468
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
	erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang,
	chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li
Subject: [PATCH v8 05/14] KVM: TDX: Update tdx_sept_{set,drop}_private_spte() to support large page
Date: Mon, 26 Feb 2024 00:29:19 -0800
Message-Id: <69f9845176b8a4f59440ce1c2d2d7f10c5585ed7.1708933624.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Xiaoyao Li

Allow large page level AUG and REMOVE for TDX pages.

Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c | 68 ++++++++++++++++++++++--------------------
 1 file changed, 35 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8205d68ed477..d73a32588ad8 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1454,11 +1454,12 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa);
 }

-static void tdx_unpin(struct kvm *kvm, kvm_pfn_t pfn)
+static void tdx_unpin(struct kvm *kvm, kvm_pfn_t pfn, enum pg_level level)
 {
-	struct page *page = pfn_to_page(pfn);
+	int i;

-	put_page(page);
+	for (i = 0; i < KVM_PAGES_PER_HPAGE(level); i++)
+		put_page(pfn_to_page(pfn + i));
 }

 static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
@@ -1475,7 +1476,7 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,

 	err = tdh_mem_page_aug(kvm_tdx->tdr_pa, gpa, tdx_level, hpa, &out);
 	if (unlikely(err == TDX_ERROR_SEPT_BUSY)) {
-		tdx_unpin(kvm, pfn);
+		tdx_unpin(kvm, pfn, level);
 		return -EAGAIN;
 	}
 	if (unlikely(err == (TDX_EPT_ENTRY_STATE_INCORRECT | TDX_OPERAND_ID_RCX))) {
@@ -1484,7 +1485,7 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 		if (level_state.level == tdx_level &&
 		    level_state.state == TDX_SEPT_PENDING &&
 		    entry.leaf && entry.pfn == pfn && entry.sve) {
-			tdx_unpin(kvm, pfn);
+			tdx_unpin(kvm, pfn, level);
 			WARN_ON_ONCE(!(to_kvm_tdx(kvm)->attributes &
 				       TDX_TD_ATTR_SEPT_VE_DISABLE));
 			return -EAGAIN;
@@ -1492,7 +1493,7 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 	}
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error(TDH_MEM_PAGE_AUG, err, &out);
-		tdx_unpin(kvm, pfn);
+		tdx_unpin(kvm, pfn, level);
 		return -EIO;
 	}

@@ -1519,7 +1520,7 @@ static int tdx_mem_page_add(struct kvm *kvm, gfn_t gfn,
 		return -EINVAL;

 	if (KVM_BUG_ON(!kvm_tdx->source_page, kvm)) {
-		tdx_unpin(kvm, pfn);
+		tdx_unpin(kvm, pfn, level);
 		return -EINVAL;
 	}

@@ -1537,7 +1538,7 @@ static int tdx_mem_page_add(struct kvm *kvm, gfn_t gfn,
 	 * fail with parameters user provided.
 	 */
 	if (err) {
-		tdx_unpin(kvm, pfn);
+		tdx_unpin(kvm, pfn, level);
 		return -EIO;
 	}

@@ -1548,10 +1549,7 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, kvm_pfn_t pfn)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
-	/* TODO: handle large pages. */
-	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EINVAL;
+	int i;

 	/*
 	 * Because restricted mem doesn't support page migration with
@@ -1561,7 +1559,8 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 	 * TODO: Once restricted mem introduces callback on page migration,
 	 * implement it and remove get_page/put_page().
 	 */
-	get_page(pfn_to_page(pfn));
+	for (i = 0; i < KVM_PAGES_PER_HPAGE(level); i++)
+		get_page(pfn_to_page(pfn + i));

 	if (likely(is_td_finalized(kvm_tdx)))
 		return tdx_mem_page_aug(kvm, gfn, level, pfn);
@@ -1578,11 +1577,9 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 	gpa_t gpa = gfn_to_gpa(gfn);
 	hpa_t hpa = pfn_to_hpa(pfn);
 	hpa_t hpa_with_hkid;
+	int r = 0;
 	u64 err;
-
-	/* TODO: handle large pages. */
-	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EINVAL;
+	int i;

 	if (unlikely(!is_hkid_assigned(kvm_tdx))) {
 		/*
@@ -1592,7 +1589,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 		err = tdx_reclaim_page(hpa, level);
 		if (KVM_BUG_ON(err, kvm))
 			return -EIO;
-		tdx_unpin(kvm, pfn);
+		tdx_unpin(kvm, pfn, level);
 		return 0;
 	}

@@ -1609,22 +1606,27 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 		return -EIO;
 	}

-	hpa_with_hkid = set_hkid_to_hpa(hpa, (u16)kvm_tdx->hkid);
-	do {
-		/*
-		 * TDX_OPERAND_BUSY can happen on locking PAMT entry. Because
-		 * this page was removed above, other thread shouldn't be
-		 * repeatedly operating on this page. Just retry loop.
-		 */
-		err = tdh_phymem_page_wbinvd(hpa_with_hkid);
-	} while (unlikely(err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX)));
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL);
-		return -EIO;
+	for (i = 0; i < KVM_PAGES_PER_HPAGE(level); i++) {
+		hpa_with_hkid = set_hkid_to_hpa(hpa, (u16)kvm_tdx->hkid);
+		do {
+			/*
+			 * TDX_OPERAND_BUSY can happen on locking PAMT entry.
+			 * Because this page was removed above, other thread
+			 * shouldn't be repeatedly operating on this page.
+			 * Simple retry should work.
+			 */
+			err = tdh_phymem_page_wbinvd(hpa_with_hkid);
+		} while (unlikely(err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX)));
+		if (KVM_BUG_ON(err, kvm)) {
+			pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL);
+			r = -EIO;
+		} else {
+			tdx_clear_page(hpa, PAGE_SIZE);
+			tdx_unpin(kvm, pfn + i, PG_LEVEL_4K);
+		}
+		hpa += PAGE_SIZE;
 	}
-	tdx_clear_page(hpa, PAGE_SIZE);
-	tdx_unpin(kvm, pfn);
-	return 0;
+	return r;
 }

 static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,

From patchwork Mon Feb 26 08:29:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206520
E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="6519404" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:33 -0800 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li Subject: [PATCH v8 06/14] KVM: MMU: Introduce level info in PFERR code Date: Mon, 26 Feb 2024 00:29:20 -0800 Message-Id: <4d61104bff388a081ff8f6ae4ac71e05a13e53c3.1708933624.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791953675438914297 X-GMAIL-MSGID: 1791953675438914297

From: Xiaoyao Li

For TDX, an EPT violation can happen when the TD guest invokes TDG.MEM.PAGE.ACCEPT, and TDG.MEM.PAGE.ACCEPT carries the page level that the TD guest wants to accept.

1. KVM can map the page as 4KB while the TD guest wants to accept a 2MB page. The TD guest gets TDX_PAGE_SIZE_MISMATCH and should retry the accept at 4KB.

2. KVM can map the page as 2MB while the TD guest wants to accept a 4KB page. KVM needs to honor the guest's request because a) there is no way to tell the guest that KVM maps it as 2MB, and b) the guest accepts it at 4KB because it knows some other 4KB page in the same 2MB range will be used as a shared page.

For case 2, the desired page level needs to be passed to the KVM MMU page fault handler. Use bits 29:31 of the KVM PF error code for this purpose.
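As an illustration only (not part of the patch), the bits 29:31 encoding described above can be sketched in userspace C. The bit positions and the PFERR_LEVEL() shape follow the patch; the mask is a stand-in for the kernel's GENMASK_ULL(), and pferr_set_level() and clamp_max_level() are hypothetical helper names introduced for this sketch. The level values assume the usual x86 numbering (0 = none, 1 = 4KB, 2 = 2MB, 3 = 1GB).

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from the patch; everything else here is illustrative. */
#define PFERR_LEVEL_START_BIT 29
#define PFERR_LEVEL_END_BIT   31

/* Userspace stand-in for the kernel's GENMASK_ULL(31, 29). */
#define PFERR_LEVEL_MASK \
	(((UINT64_C(1) << (PFERR_LEVEL_END_BIT - PFERR_LEVEL_START_BIT + 1)) - 1) \
	 << PFERR_LEVEL_START_BIT)

#define PFERR_LEVEL(err_code) \
	(((err_code) & PFERR_LEVEL_MASK) >> PFERR_LEVEL_START_BIT)

/* Hypothetical helper: store a page level (1 = 4KB, 2 = 2MB, 3 = 1GB). */
static inline uint64_t pferr_set_level(uint64_t err_code, unsigned int level)
{
	assert(level <= 7);	/* only three bits are available */
	/* Widen before shifting so the value never reaches an int's sign bit. */
	return (err_code & ~PFERR_LEVEL_MASK) |
	       (((uint64_t)level << PFERR_LEVEL_START_BIT) & PFERR_LEVEL_MASK);
}

/* Hypothetical helper mirroring the clamp added to kvm_tdp_page_fault(). */
static inline uint8_t clamp_max_level(uint8_t max_level, uint64_t err_code)
{
	uint8_t err_level = (uint8_t)PFERR_LEVEL(err_code);

	/* Level 0 means "no request", so the existing limit is kept. */
	if (err_level && err_level < max_level)
		return err_level;
	return max_level;
}
```

With these definitions, an error code built with pferr_set_level(0, 1) decodes to level 1 through PFERR_LEVEL(), and clamp_max_level(3, ...) on that code returns 1, so a fault that could otherwise be mapped at 1GB is limited to 4KB. An error code without the level bits leaves the limit unchanged.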
Signed-off-by: Xiaoyao Li Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm_host.h | 5 +++++ arch/x86/kvm/mmu/mmu.c | 5 +++++ 2 files changed, 10 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index e4d40e31fc31..c864a1ff2eb1 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -262,6 +262,8 @@ enum x86_intercept_stage; #define PFERR_FETCH_BIT 4 #define PFERR_PK_BIT 5 #define PFERR_SGX_BIT 15 +#define PFERR_LEVEL_START_BIT 29 +#define PFERR_LEVEL_END_BIT 31 #define PFERR_GUEST_FINAL_BIT 32 #define PFERR_GUEST_PAGE_BIT 33 #define PFERR_GUEST_ENC_BIT 34 @@ -274,6 +276,7 @@ enum x86_intercept_stage; #define PFERR_FETCH_MASK BIT(PFERR_FETCH_BIT) #define PFERR_PK_MASK BIT(PFERR_PK_BIT) #define PFERR_SGX_MASK BIT(PFERR_SGX_BIT) +#define PFERR_LEVEL_MASK GENMASK_ULL(PFERR_LEVEL_END_BIT, PFERR_LEVEL_START_BIT) #define PFERR_GUEST_FINAL_MASK BIT_ULL(PFERR_GUEST_FINAL_BIT) #define PFERR_GUEST_PAGE_MASK BIT_ULL(PFERR_GUEST_PAGE_BIT) #define PFERR_GUEST_ENC_MASK BIT_ULL(PFERR_GUEST_ENC_BIT) @@ -283,6 +286,8 @@ enum x86_intercept_stage; PFERR_WRITE_MASK | \ PFERR_PRESENT_MASK) +#define PFERR_LEVEL(err_code) (((err_code) & PFERR_LEVEL_MASK) >> PFERR_LEVEL_START_BIT) + /* apic attention bits */ #define KVM_APIC_CHECK_VAPIC 0 /* diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index b8d6ce02e66d..081df7855065 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4625,6 +4625,11 @@ bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma) int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) { + u8 err_level = PFERR_LEVEL(fault->error_code); + + if (err_level) + fault->max_level = min(fault->max_level, err_level); + /* * If the guest's MTRRs may be used to compute the "real" memtype, * restrict the mapping level to ensure KVM uses a consistent memtype From patchwork Mon Feb 26 08:29:21 2024 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 206519 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp1961933dyb; Mon, 26 Feb 2024 01:36:17 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCXS9jri1U37tDYIOeQfHxhORYyMESKaMOjRf7GbRv0tYZkkA6gbsiqlaW+CycRYBRf6LJjhAx+xuIoT9jHtTJ+GjqFFkA== X-Google-Smtp-Source: AGHT+IEJaO2pyptebYTifkIOxE3d+YwsvpIlkKTahOuHUU66bm9TVqRrahM0z6+8T2CREXFamyGI X-Received: by 2002:a9d:73cc:0:b0:6e4:934d:41fa with SMTP id m12-20020a9d73cc000000b006e4934d41famr5044923otk.22.1708940177005; Mon, 26 Feb 2024 01:36:17 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708940176; cv=pass; d=google.com; s=arc-20160816; b=0ZM4Efx5rEPQPOvbzv2xABZeiB7OpNG4YFqNQ84tPKJWfNNPvi2W3AKylVm4Ot2Jdi TUAYl9EfUpOxmy9skXaA7d4ZzIFXZYUkHzt69MT3JptCsQ2lsTJZ0PJX94Y+U2IVmIub OXMSnIEpwABgkHWLmBQAw7HKr/JS/kaoH+JG0yPOFqlBROQlhjFpqcGnWr0RQInq/Y9I AntUtXvktaIoSExGKsey/oEl5vHE96baIe96XWykNzE3WFQThIPMPz7IZ9tkwJH3/mh7 ojtjebYVJ10450Blu54Z29LxjSGuXmz4rRC5kELs+ztnqwGU3/pFDEtkkRI/sad7bjL+ ACSg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=ZJpPZ7oiE9TYhvxYpdgohiDNj0iZ+7xqW2typMs0ZYE=; fh=sMdP/xP2j0mAr7mmqXKi/DmIqVqVXObmf3aqGFg9BJU=; b=NTOtZBAkt6R5LIk21mGhP1yQ0n2F8qzSfyZq89d8iLgGFMtLENruK/jm+gUYOe6DqP xLCQ+K6IsMxBnei1hon4NXuiULK7R5xEVYeGtV5BIkyz5RZLlGNORLjxw2gM0yYoeC46 e907llJ1H8UWzuwb4PNHkY5/AVYsjumhbuR3uNbVV+/LgX49ZQYyb9u0Ne84E9KprT8m f+BidSeJaWk8xEas6KfHZ7XXaePuEtsaF5B4cXsnBeXQjCsH9DBTFPH80izdr2S74gD6 DvB6nvd4Pm1tj9kG7GkwvzDfh+o+vsRAes77E65pOjmcahkBOEXQiatnxxNvutGGjU60 9+JA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="Je6Xy/OQ"; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass 
dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80901-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80901-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id 66-20020a630045000000b005dc846e12adsi3426323pga.655.2024.02.26.01.36.16 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Feb 2024 01:36:16 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-80901-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="Je6Xy/OQ"; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80901-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80901-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id 317E7B20587 for ; Mon, 26 Feb 2024 09:14:09 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 67C9F1350C4; Mon, 26 Feb 2024 08:29:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Je6Xy/OQ" Received: from mgamail.intel.com (mgamail.intel.com 
[192.198.163.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16B7E1CD37; Mon, 26 Feb 2024 08:29:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936190; cv=none; b=jxB0qX+tkd7nreW2kXK6MQ9cevEtNisp725WjvCvXiaq0LqhzedSheXk0m77b97nljdlZQft+xCsK0hwXnHRfYLpYx4ajtwHGDeHYvbCmdeMOuON/p0Ad2Rgdq0quoz2rkb8cExADhLFjusN6Fz9CvysuoILbiENHKZztEpy7mA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936190; c=relaxed/simple; bh=cp1eBjoAT8NOI0oEDQWVIksfHFeL9PGp78iWB16k6Lw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=dLpri2oL0s4KF1ZU+qJZqS8LMPoocffkkgMdLwzEXftEF5CIUKRfXIPeKP9WfNznw+JlaO4MB6yAw+mI7SK0E+X3eh7rLSSMq3gpb94sW513l99Zf6umqRBi3gyivcn7et1cLEo8fmBqBTAsIk17GJ/tP4mYiQ1MJTuTH0jmY/E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Je6Xy/OQ; arc=none smtp.client-ip=192.198.163.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708936189; x=1740472189; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cp1eBjoAT8NOI0oEDQWVIksfHFeL9PGp78iWB16k6Lw=; b=Je6Xy/OQ+wax9XBNaqHN4iFYUlMJB8Fwyc5ieqP0gTJ9WjzO34iMHX/M WfB6S9IOXOKfvJM8z6Nh8pJVfrJePZquL8aTJ3Em72h60E+rbSooCZBuY h2JyY+PLh90Q733jNATUIj9kEk0mzQKWac+D0TcYqLTgkQSW5A6wctzEs ymBmBitL6qwxQdCz3DJWNxQGZhcxDZyayFepeRUgCLeCJ00FDzbR0a38e 
SoMTJeqf3732nnmp9Jx9BFtNiEIYovOEVZk0PB5P981ML5EEcjG2tVRGw zsTJkikycQreZh1vRqnAbBWkDNcvArfj761PqZkm0GWyggXIU3MLalwfb A==; X-IronPort-AV: E=McAfee;i="6600,9927,10995"; a="14623307" X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="14623307" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by fmvoesa104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:33 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="6519407" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:33 -0800 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li Subject: [PATCH v8 07/14] KVM: TDX: Pass desired page level in err code for page fault handler Date: Mon, 26 Feb 2024 00:29:21 -0800 Message-Id: <3d2a6bfb033ee1b51f7b875360bd295376c32b54.1708933624.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791953655165921332 X-GMAIL-MSGID: 1791953655165921332

From: Xiaoyao Li

For TDX, an EPT violation can happen when the TD guest invokes TDG.MEM.PAGE.ACCEPT, and TDG.MEM.PAGE.ACCEPT carries the page level that the TD guest wants to accept.

1. KVM can map the page as 4KB while the TD guest wants to accept a 2MB page. The TD guest gets TDX_PAGE_SIZE_MISMATCH and should retry the accept at 4KB.

2. KVM can map the page as 2MB while the TD guest wants to accept a 4KB page. KVM needs to honor the guest's request because a) there is no way to tell the guest that KVM maps it as 2MB, and b) the guest accepts it at 4KB because it knows some other 4KB page in the same 2MB range will be used as a shared page.

For case 2, the desired page level needs to be passed to the MMU's page fault handler. Use bits 29:31 of the KVM PF error code for this purpose.

Signed-off-by: Xiaoyao Li Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/common.h | 6 +++++- arch/x86/kvm/vmx/tdx.c | 18 ++++++++++++++++-- arch/x86/kvm/vmx/tdx_arch.h | 19 +++++++++++++++++++ arch/x86/kvm/vmx/vmx.c | 2 +- 4 files changed, 41 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/common.h b/arch/x86/kvm/vmx/common.h index 027aa4175d2c..787f59c44abc 100644 --- a/arch/x86/kvm/vmx/common.h +++ b/arch/x86/kvm/vmx/common.h @@ -67,7 +67,8 @@ static inline void vmx_handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu, } static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa, - unsigned long exit_qualification) + unsigned long exit_qualification, + int err_page_level) { u64 error_code; @@ -90,6 +91,9 @@ static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa, if (kvm_is_private_gpa(vcpu->kvm, gpa)) error_code |= PFERR_GUEST_ENC_MASK; + if (err_page_level > PG_LEVEL_NONE) + error_code |= (err_page_level << PFERR_LEVEL_START_BIT) & PFERR_LEVEL_MASK; + return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0); } diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index d73a32588ad8..6941e9483e7e 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1812,7 +1812,20 @@ void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu) { + union tdx_ext_exit_qualification ext_exit_qual; unsigned long exit_qual; + int err_page_level = 0; + + ext_exit_qual.full = tdexit_ext_exit_qual(vcpu); + + if (ext_exit_qual.type >= NUM_EXT_EXIT_QUAL) { + pr_err("EPT violation at gpa 0x%lx, with invalid ext exit qualification type 0x%x\n", + tdexit_gpa(vcpu), ext_exit_qual.type); +
kvm_vm_bugged(vcpu->kvm); + return 0; + } else if (ext_exit_qual.type == EXT_EXIT_QUAL_ACCEPT) { + err_page_level = tdx_sept_level_to_pg_level(ext_exit_qual.req_sept_level); + } if (kvm_is_private_gpa(vcpu->kvm, tdexit_gpa(vcpu))) { /* @@ -1839,7 +1852,7 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu) } trace_kvm_page_fault(vcpu, tdexit_gpa(vcpu), exit_qual); - return __vmx_handle_ept_violation(vcpu, tdexit_gpa(vcpu), exit_qual); + return __vmx_handle_ept_violation(vcpu, tdexit_gpa(vcpu), exit_qual, err_page_level); } static int tdx_handle_ept_misconfig(struct kvm_vcpu *vcpu) @@ -3027,7 +3040,8 @@ int tdx_pre_memory_mapping(struct kvm_vcpu *vcpu, /* TDX supports only 4K to pre-populate. */ *max_level = PG_LEVEL_4K; - *error_code = TDX_SEPT_PFERR; + *error_code = TDX_SEPT_PFERR | + ((PG_LEVEL_4K << PFERR_LEVEL_START_BIT) & PFERR_LEVEL_MASK); r = get_user_pages_fast(mapping->source, 1, 0, &page); if (r < 0) diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h index 87ef22e9cd49..19f2deafde5b 100644 --- a/arch/x86/kvm/vmx/tdx_arch.h +++ b/arch/x86/kvm/vmx/tdx_arch.h @@ -221,6 +221,25 @@ union tdx_sept_level_state { u64 raw; }; +union tdx_ext_exit_qualification { + struct { + u64 type : 4; + u64 reserved0 : 28; + u64 req_sept_level : 3; + u64 err_sept_level : 3; + u64 err_sept_state : 8; + u64 err_sept_is_leaf : 1; + u64 reserved1 : 17; + }; + u64 full; +}; + +enum tdx_ext_exit_qualification_type { + EXT_EXIT_QUAL_NONE = 0, + EXT_EXIT_QUAL_ACCEPT = 1, + NUM_EXT_EXIT_QUAL, +}; + /* * Global scope metadata field ID. * See Table "Global Scope Metadata", TDX module 1.5 ABI spec. 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index f8a00a766c40..a2004a0feb1c 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5752,7 +5752,7 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) if (unlikely(allow_smaller_maxphyaddr && !kvm_vcpu_is_legal_gpa(vcpu, gpa))) return kvm_emulate_instruction(vcpu, 0); - return __vmx_handle_ept_violation(vcpu, gpa, exit_qualification); + return __vmx_handle_ept_violation(vcpu, gpa, exit_qualification, PG_LEVEL_NONE); } static int handle_ept_misconfig(struct kvm_vcpu *vcpu) From patchwork Mon Feb 26 08:29:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 206469 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp1953920dyb; Mon, 26 Feb 2024 01:14:39 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVO05mDnqwxdgawH+e49bycIVkBbXdZztb46pVDUA7RLoluB5GOFv0lkkbanJ/MMgLUKvjiZKsdkPZl6m0umP1e2WFUmA== X-Google-Smtp-Source: AGHT+IHW0LxdnrbUPNbA0zVsez/5dorBGuN5lkrqyqMYnvEZrEOhOhlfkGzOtGwz6ZTg4QYSMcfJ X-Received: by 2002:a25:94c:0:b0:dcc:9d30:58a0 with SMTP id u12-20020a25094c000000b00dcc9d3058a0mr3431004ybm.64.1708938879627; Mon, 26 Feb 2024 01:14:39 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708938879; cv=pass; d=google.com; s=arc-20160816; b=iZezCiDGMHKAktZ1q21j2+ZlDnV82601hwFW1SutOL2uGfFQG2X6hHTr+OD8LSJb5k Fn6kmbQLmrHXbv1x/HBuOg1ZvcqsE4lzaz48bnhUXbo/nQuwdsfG+SDP1y79fdVqmNL9 ZlVi1+tGVnsvj3aRLk3DEbhHZvjBs+vahYXrhQNHwZQB3AAayxuXvKLM7fGI/T7T1UN7 K8RyqjBs4OunTax5np4Gm/5nm+UBhXFpDw9etTkf+rwRWzCXJeR3ToudDhhQLoykDCzb kIoq8F5VQL7H60MNHrEYxg6EnKjTk6/hhO0f+vASJUBaRsQjNd/ks9ffscd8PIeYM5Fn ROMA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id 
:date:subject:cc:to:from:dkim-signature; bh=TP3ITd0rLWpIi8dhZMFlcYfNKQehhQ2xUWuofknK8eY=; fh=Itbyk7CEvizIrzGEESCqq3I2tZgG1kc/GkVOa3S7Hsg=; b=rkZeUbGDPdQX3RhqimVHtfMyl/9q0Eqgoj5PfgMNTtH58hirrZfuWs+wTMgxExCNQ2 Njvl6SnakF1O4IQS+w9nUxgecGFClhP6KZ1GukWdrPCRIM678pvnUj7kxNyj1MjbibRV KcgCghcaLVCnLiDoqaiWtEtMNBVxvUdflMHcezIIP/LETLCAIkGP7idXXzBczP9dDRWY YPMI2GdS4SIcRRGvZmc0ev+rk5skdzlYP2uvDTniFviNcykF9U/uISuSlohTIOQyg4VG VWOTBc1ypilV6lP/eRj06PpPLL68ipFRT7ypbCHRVCEQurRC6VX79pZFcP+SmId8WmoN VpZg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=OzFZCYjp; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80903-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80903-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. 
[2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id e13-20020ac8490d000000b0042e4c1029fesi4552619qtq.539.2024.02.26.01.14.39 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Feb 2024 01:14:39 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-80903-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=OzFZCYjp; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80903-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80903-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 6C7551C23B09 for ; Mon, 26 Feb 2024 09:14:39 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 703E213540E; Mon, 26 Feb 2024 08:29:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="OzFZCYjp" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5966A1DDF2; Mon, 26 Feb 2024 08:29:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936192; cv=none; 
b=osA+uYsGnX6yOZ5hOhGr03LlXbmnIEEtHbbtI5Kn3nWT+hrmAsNxXZJlH89uKIxbakhl4UEAR9eFOCjiv0nIpRHSGsZMYekQvw30ETavHfpd+0v4GrNlGXxWngAt7a5lVhDtbs9YB193CnYyCCP+mMl/Z479SYQRYnY4xxhYtYg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936192; c=relaxed/simple; bh=AX82JcILUa46R1G0PfPXFjiVx0GFJf+AzOhgtIqj4R8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=TLzuW94I5f+r7CjxiG6WdcFw+Ttz0ipUzMzpi2FAYmysFE39ePR3yeAt+KGcAcEgJIDnhqH4L3fcNr2JRhtUYoMMzcjWikEa1DX4JOEPN3KO1vu8tM6THUq5VCNjIW3zGhGG26XVU5aneDGGG8B+l6fF4l66DnwstBQ8Wq3BaQc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=OzFZCYjp; arc=none smtp.client-ip=192.198.163.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708936190; x=1740472190; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=AX82JcILUa46R1G0PfPXFjiVx0GFJf+AzOhgtIqj4R8=; b=OzFZCYjp7MydseztlfNJWt+31fupV8S4VOh5t2Q+7eMogLtfz+kXIe1l XfwYugpG17bSyno6Ot0yG36xlaSWxv3OKShEJ1l4j8c0hf5Yef3a5FmbC nY2IGmNcgHho8b7Wu2CgJGwrq38S6e1Rg2RxkM3voT1PXzYX2/b/ptQJn Q7LbW8hA6pdL3xJL1bFeAYc7pXdU1/LJg0SS5PzVHmrSZDnQoDta51W1i /vKTdRdnn2ltowBpbuwVVwfP+prVIUjWviKGiaCChPkQLczEg35gsfLaL 5CFdQfMlv/DbOoXdhqgUfEyVf7QSSB0ZkliqdM3M4x/tetn6FS0inCuqE Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10995"; a="14623311" X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="14623311" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by fmvoesa104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: 
E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="6519410" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:33 -0800 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com Subject: [PATCH v8 08/14] KVM: x86/tdp_mmu: Allocate private page table for large page split Date: Mon, 26 Feb 2024 00:29:22 -0800 Message-Id: <657c4a403e63f2c7e1742b4cdb09ca94c6d5d9b8.1708933624.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791952294566638928 X-GMAIL-MSGID: 1791952294566638928

From: Isaku Yamahata

Make __tdp_mmu_alloc_sp_for_split() aware of the private page table: when the page role is private, allocate the private page table alongside the regular one.
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu/mmu_internal.h | 9 +++++++++ arch/x86/kvm/mmu/tdp_mmu.c | 8 ++++++-- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 9e2c7c6d85bf..9aa4c6ffa207 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -201,6 +201,15 @@ static inline void kvm_mmu_alloc_private_spt(struct kvm_vcpu *vcpu, struct kvm_m } } +static inline int kvm_alloc_private_spt_for_split(struct kvm_mmu_page *sp, gfp_t gfp) +{ + gfp &= ~__GFP_ZERO; + sp->private_spt = (void *)__get_free_page(gfp); + if (!sp->private_spt) + return -ENOMEM; + return 0; +} + static inline void kvm_mmu_free_private_spt(struct kvm_mmu_page *sp) { if (sp->private_spt) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 1a0e4baa8311..66de875d3de1 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1593,8 +1593,12 @@ static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp, union kvm_mm sp->role = role; sp->spt = (void *)__get_free_page(gfp); - /* TODO: large page support for private GPA. 
*/ - WARN_ON_ONCE(kvm_mmu_page_role_is_private(role)); + if (kvm_mmu_page_role_is_private(role)) { + if (kvm_alloc_private_spt_for_split(sp, gfp)) { + free_page((unsigned long)sp->spt); + sp->spt = NULL; + } + } if (!sp->spt) { kmem_cache_free(mmu_page_header_cache, sp); return NULL; From patchwork Mon Feb 26 08:29:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 206470 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp1954053dyb; Mon, 26 Feb 2024 01:14:57 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVsTCqGgn7gw8KocdhDelqY61VdLEEdBR8+yUe2XQKOO+icpZrrd0IGFr9ZW8W3XE4UbiwJy6RoianqYPToKPRSOQTd5w== X-Google-Smtp-Source: AGHT+IHbSF9LQX3NdcGe/DBieLtVHQ+dFcGayWxlFakwQL6CVGMyY/USsrmG6ZBBx+dK2QpI8ac5 X-Received: by 2002:a17:903:2342:b0:1d9:9c96:673c with SMTP id c2-20020a170903234200b001d99c96673cmr6337487plh.46.1708938896863; Mon, 26 Feb 2024 01:14:56 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708938896; cv=pass; d=google.com; s=arc-20160816; b=GF7Pw7WGJNnz+eunG7YehyyD+C4fCB5X8nn2SR0m6xioLxFoOxLyq6OszNGRjOoAyz UOUm1hq7uHDL7y9tHo2cYS4J/J/ZwjK2gdqiEi+3qNlCDq4F3GRtNxId26OWgDZ/OjY9 yXqpux5r+fpTob82hiPJSAV71it+ltlzsNKN2ZNZBzUN0ucTcxp335f/Cp8negnXa7eW zxCIEXGZ3W3Ii6yBo+SmHicfO52vhTq4pya0beYmSYghzvntlVWRVn6cq1YgA4jxEKoL fUC9MXV7Pbcr8AkmS10ApA39KfmS5J/4WlKvitCaFsEqaNNRrGewEeJHUZ7M+emKTpH7 AHcw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=1LLNCz6XZbi/c+YzYmTiEVqPPRHHnoIYpfYcRNUiGoo=; fh=sMdP/xP2j0mAr7mmqXKi/DmIqVqVXObmf3aqGFg9BJU=; b=iTPlHo/vhZ6jE3TpMTWbKkWCoShLRVSd+hJLPE0dG1FpFppvtv1WDjYQRosYUusUru sltcHevQwlkuzCboH39kK3LZJ7yS5SJHWafRqQmmovsq3QyV3ZLElXxVGEz98TqKitDB 
O8VcYHO2sje+YoLX35yA7j0Ysp1VgGILzINeK2pFnm6Mqe6XMaJHYuROlg2SC9knY5p6 cZxbBQ6cQS6U+Zit8I/pKA2EYpU/E5iAe6mimh0DEbQAjUIy/nKbdvkSet1rGfR44KNe juM8dnFdSGLm9T/fw9v6tnK3lLr2e1F8eWkpKMpEiFZytC+Dlx7KT7jyZBk4xLB3XVF4 hQFA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=LvseQIsL; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80904-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80904-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [139.178.88.99]) by mx.google.com with ESMTPS id me8-20020a170902fc4800b001dca8416577si834450plb.461.2024.02.26.01.14.56 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Feb 2024 01:14:56 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-80904-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=LvseQIsL; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-80904-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80904-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id 8E062284E77 for ; Mon, 26 Feb 2024 09:14:56 +0000 (UTC) Received: from 
localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 86031339BF; Mon, 26 Feb 2024 08:29:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="LvseQIsL" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E6252C869; Mon, 26 Feb 2024 08:29:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936193; cv=none; b=HQcbV6G+XCUqXuBVf+nbcfaJZs5LSC3zg30aWyQYm6Z8DbL1nfqmmojfjOxRetZMbMRoTbl2FfsM9+3ySk/qFKxIPjsJYoF/O68Ha6S8XZS1Letmz73E1LrkFMpegUqFmktvOePLy/ocixaWrzq7cPUScOLVElAODE3visjl2LQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708936193; c=relaxed/simple; bh=GFnsETiyt8aTwReB6nymWxtwd83xZREOfV3VdIkGX1o=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ejCR9iL6JettLRk+34A+unPpn5/BoSwCwGH6BZdfg4hROAy9qmo5NtVmKZsiruo10Tz7wW1SQqPgrVIrhvmPPMsnLJvR5WXdDpFGK4Z6oI153J9FlZ0ICr9Q0oYQqKEw5VYfJo+kADgFA4tuAEiGv9tm4QDYK/7pg2URLjBdJxI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=LvseQIsL; arc=none smtp.client-ip=192.198.163.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708936191; x=1740472191; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=GFnsETiyt8aTwReB6nymWxtwd83xZREOfV3VdIkGX1o=; b=LvseQIsLYZpxaD+f0I8wYIOZFWOShfE27jHlyEnm1mZpXsiNsZD+nLU1 GotXaCp6ZcrOYp6JEvKKEwgWiLcjj22Ts2ExQzceVHEggE0UR8dMDnAPg aSvDBjplpcReo9eATt7gswytpWBBCfXPt2rZ7LTbTakUn6BEzTRcoHYkj 80FSmGiw6oIwWzFzU9w+8o6YAQCnvlu+Z3ogNax8BO1Y0QpoiWLbxoGcW wMljCkb8Yf4o+eUV8OgZpC4xF/72gZ0yhk8zzt84vXFsDVUAmcDoXh6Yl HWDMPJh6nGpa4Mc02tEbIGAfBb2R3pBK0RZQDEJERWpUweHJVPZ5FQBNv Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10995"; a="14623317" X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="14623317" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by fmvoesa104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="6519414" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:33 -0800 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li Subject: [PATCH v8 09/14] KVM: x86/tdp_mmu: Split the large page when zap leaf Date: Mon, 26 Feb 2024 00:29:23 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791952312293730535 X-GMAIL-MSGID: 1791952312293730535

From: Xiaoyao Li

When TDX is enabled, a large page cannot be zapped if it contains mixed pages. In this case, the large page has to be split first.

Signed-off-by: Xiaoyao Li
---
v7:
- remove unnecessary TLB shootdown in tdp_mmu_zap_leafs() to free unused split_sp.
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/mmu.c          |  6 ++--
 arch/x86/kvm/mmu/mmu_internal.h |  9 +++++
 arch/x86/kvm/mmu/tdp_mmu.c      | 60 ++++++++++++++++++++++++++++++---
 3 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 081df7855065..fa7fabc410c4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -7473,8 +7473,8 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
 	return kvm_unmap_gfn_range(kvm, range);
 }
 
-static bool hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t gfn,
-				int level)
+bool kvm_hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t gfn,
+			     int level)
 {
 	return lpage_info_slot(gfn, slot, level)->disallow_lpage & KVM_LPAGE_MIXED_FLAG;
 }
@@ -7501,7 +7501,7 @@ static bool hugepage_has_attrs(struct kvm *kvm, struct kvm_memory_slot *slot,
 		return kvm_range_has_memory_attributes(kvm, start, end, attrs);
 
 	for (gfn = start; gfn < end; gfn += KVM_PAGES_PER_HPAGE(level - 1)) {
-		if (hugepage_test_mixed(slot, gfn, level - 1) ||
+		if (kvm_hugepage_test_mixed(slot, gfn, level - 1) ||
 		    attrs != kvm_get_memory_attributes(kvm, gfn))
 			return false;
 	}
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 9aa4c6ffa207..315c123affaf 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -430,4 +430,13 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
+#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES
+bool kvm_hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t gfn, int level);
+#else
+static inline bool kvm_hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t gfn, int level)
+{
+	return false;
+}
+#endif
+
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 66de875d3de1..e3682794adda 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -953,6 +953,14 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 	return true;
 }
 
+
+static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm,
+						       struct tdp_iter *iter,
+						       bool shared);
+
+static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
+				   struct kvm_mmu_page *sp, bool shared);
+
 /*
  * If can_yield is true, will release the MMU lock and reschedule if the
  * scheduler needs the CPU or there is contention on the MMU lock. If this
@@ -964,14 +972,16 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
 			      gfn_t start, gfn_t end, bool can_yield, bool flush,
 			      bool zap_private)
 {
+	bool is_private = is_private_sp(root);
+	struct kvm_mmu_page *split_sp = NULL;
 	struct tdp_iter iter;
 
 	end = min(end, tdp_mmu_max_gfn_exclusive());
 
 	lockdep_assert_held_write(&kvm->mmu_lock);
 
-	WARN_ON_ONCE(zap_private && !is_private_sp(root));
-	if (!zap_private && is_private_sp(root))
+	WARN_ON_ONCE(zap_private && !is_private);
+	if (!zap_private && is_private)
 		return false;
 
 	/*
@@ -995,12 +1005,56 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
 		    !is_last_spte(iter.old_spte, iter.level))
 			continue;
 
+		if (is_private && kvm_gfn_shared_mask(kvm) &&
+		    is_large_pte(iter.old_spte)) {
+			gfn_t gfn = iter.gfn & ~kvm_gfn_shared_mask(kvm);
+			gfn_t mask = KVM_PAGES_PER_HPAGE(iter.level) - 1;
+			struct kvm_memory_slot *slot;
+			struct kvm_mmu_page *sp;
+
+			slot = gfn_to_memslot(kvm, gfn);
+			if (kvm_hugepage_test_mixed(slot, gfn, iter.level) ||
+			    (gfn & mask) < start ||
+			    end < (gfn & mask) + KVM_PAGES_PER_HPAGE(iter.level)) {
+				WARN_ON_ONCE(!can_yield);
+				if (split_sp) {
+					sp = split_sp;
+					split_sp = NULL;
+					sp->role = tdp_iter_child_role(&iter);
+				} else {
+					WARN_ON(iter.yielded);
+					if (flush && can_yield) {
+						kvm_flush_remote_tlbs(kvm);
+						flush = false;
+					}
+					sp = tdp_mmu_alloc_sp_for_split(kvm, &iter, false);
+					if (iter.yielded) {
+						split_sp = sp;
+						continue;
+					}
+				}
+				KVM_BUG_ON(!sp, kvm);
+
+				tdp_mmu_init_sp(sp, iter.sptep, iter.gfn);
+				if (tdp_mmu_split_huge_page(kvm, &iter, sp, false)) {
+					/* force retry on this gfn. */
+					iter.yielded = true;
+					split_sp = sp;
+				} else
+					flush = true;
+				continue;
+			}
+		}
+
 		tdp_mmu_iter_set_spte(kvm, &iter, SHADOW_NONPRESENT_VALUE);
 		flush = true;
 	}
 
 	rcu_read_unlock();
 
+	if (split_sp)
+		tdp_mmu_free_sp(split_sp);
+
 	/*
 	 * Because this flow zaps _only_ leaf SPTEs, the caller doesn't need
 	 * to provide RCU protection as no 'struct kvm_mmu_page' will be freed.
@@ -1617,8 +1671,6 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm,
 	kvm_lockdep_assert_mmu_lock_held(kvm, shared);
 	KVM_BUG_ON(kvm_mmu_page_role_is_private(role) !=
 		   is_private_sptep(iter->sptep), kvm);
-	/* TODO: Large page isn't supported for private SPTE yet. */
-	KVM_BUG_ON(kvm_mmu_page_role_is_private(role), kvm);
 
 	/*
 	 * Since we are allocating while under the MMU lock we have to be

From patchwork Mon Feb 26 08:29:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206472
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li
Subject: [PATCH v8 10/14] KVM: x86/tdp_mmu, TDX: Split a large page when 4KB page within it converted to shared
Date: Mon, 26 Feb 2024 00:29:24 -0800
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Xiaoyao Li

When mapping a shared page for TDX, KVM needs to zap the private alias.

If the private page is mapped as a large page (2MB), it can be removed
directly only when the whole 2MB range is converted to shared.
Otherwise, the 2MB page has to be split into 512 4KB pages, and only the
pages that are converted to shared are removed.

When a present large leaf SPTE switches to a present non-leaf SPTE, TDX
needs to split the corresponding SEPT page to reflect it.
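As a rough illustration of the arithmetic above (one 2MB page is 512 4KB pages, and only the converted sub-pages go away after the split), the following user-space C sketch computes which 4KB entries of a 2MB mapping fall inside a converted GFN range. `subpages_to_remove()` and `PAGES_PER_2M` are hypothetical names introduced for this sketch only; they do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* A 2MB page covers 2MB / 4KB = 512 4KB pages. */
#define PAGES_PER_2M 512ULL

/*
 * Hypothetical helper (not part of the patch): for the 2MB page starting at
 * GFN base, return how many of its 4KB sub-pages lie inside the converted
 * range [start, end) and store the index of the first such sub-page in
 * *first. Returns 0 (and sets *first to 0) if the ranges do not overlap.
 */
static unsigned int subpages_to_remove(gfn_t base, gfn_t start, gfn_t end,
				       unsigned int *first)
{
	gfn_t lo, hi;

	/* base must be aligned to the 2MB page boundary. */
	assert((base & (PAGES_PER_2M - 1)) == 0);

	/* Clamp [start, end) to the GFNs covered by this 2MB page. */
	lo = start > base ? start : base;
	hi = end < base + PAGES_PER_2M ? end : base + PAGES_PER_2M;

	if (lo >= hi) {
		*first = 0;
		return 0;
	}

	*first = (unsigned int)(lo - base);
	return (unsigned int)(hi - lo);
}
```

A result of 512 corresponds to the case where the whole 2MB page is converted and no split is needed; any smaller non-zero count means the page has to be split first.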
Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
---
v7:
- catch up for tdx_seamcall() change
- typo in a comment of __set_private_spte_present()
- improved a comment in tdx_sept_split_private_spt()

v6:
- repeat TDH.MEM.PAGE.DEMOTE on TDX_INTERRUPTED_RESTARTABLE

Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/kvm/mmu/tdp_mmu.c         | 21 ++++++++++++++++-----
 arch/x86/kvm/vmx/tdx.c             | 27 +++++++++++++++++++++++++--
 arch/x86/kvm/vmx/tdx_arch.h        |  1 +
 arch/x86/kvm/vmx/tdx_errno.h       |  1 +
 arch/x86/kvm/vmx/tdx_ops.h         | 13 +++++++++++++
 7 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 3a7140129855..ada6865100ee 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -105,6 +105,7 @@ KVM_X86_OP_OPTIONAL_RET0(get_mt_mask)
 KVM_X86_OP(load_mmu_pgd)
 KVM_X86_OP_OPTIONAL(link_private_spt)
 KVM_X86_OP_OPTIONAL(free_private_spt)
+KVM_X86_OP_OPTIONAL(split_private_spt)
 KVM_X86_OP_OPTIONAL(set_private_spte)
 KVM_X86_OP_OPTIONAL(remove_private_spte)
 KVM_X86_OP_OPTIONAL(zap_private_spte)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c864a1ff2eb1..9c9742cc469c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1766,6 +1766,8 @@ struct kvm_x86_ops {
 				void *private_spt);
 	int (*free_private_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
 				void *private_spt);
+	int (*split_private_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
+				  void *private_spt);
 	int (*set_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
 				kvm_pfn_t pfn);
 	int (*remove_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e3682794adda..0ac2a4911fd1 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -588,23 +588,34 @@ static int __must_check __set_private_spte_present(struct kvm *kvm, tdp_ptep_t s
 {
 	bool was_present = is_shadow_present_pte(old_spte);
 	bool is_present = is_shadow_present_pte(new_spte);
+	bool was_leaf = was_present && is_last_spte(old_spte, level);
 	bool is_leaf = is_present && is_last_spte(new_spte, level);
 	kvm_pfn_t new_pfn = spte_to_pfn(new_spte);
+	void *private_spt;
 	int ret = 0;
 
 	lockdep_assert_held(&kvm->mmu_lock);
-	/* TDP MMU doesn't change present -> present */
-	KVM_BUG_ON(was_present, kvm);
 
 	/*
 	 * Use different call to either set up middle level
 	 * private page table, or leaf.
 	 */
-	if (is_leaf)
+	if (level > PG_LEVEL_4K && was_leaf && !is_leaf) {
+		/*
+		 * splitting large page into 4KB.
+		 * tdp_mmu_split_huge_page() => tdp_mmu_link_sp()
+		 */
+		private_spt = get_private_spt(gfn, new_spte, level);
+		KVM_BUG_ON(!private_spt, kvm);
+		ret = static_call(kvm_x86_zap_private_spte)(kvm, gfn, level);
+		kvm_flush_remote_tlbs(kvm);
+		if (!ret)
+			ret = static_call(kvm_x86_split_private_spt)(kvm, gfn,
+								     level, private_spt);
+	} else if (is_leaf)
 		ret = static_call(kvm_x86_set_private_spte)(kvm, gfn, level, new_pfn);
 	else {
-		void *private_spt = get_private_spt(gfn, new_spte, level);
-
+		private_spt = get_private_spt(gfn, new_spte, level);
 		KVM_BUG_ON(!private_spt, kvm);
 		ret = static_call(kvm_x86_link_private_spt)(kvm, gfn, level, private_spt);
 	}
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 6941e9483e7e..88af64658a9c 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1650,6 +1650,30 @@ static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
+static int tdx_sept_split_private_spt(struct kvm *kvm, gfn_t gfn,
+				      enum pg_level level, void *private_spt)
+{
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	gpa_t gpa = gfn_to_gpa(gfn) & KVM_HPAGE_MASK(level);
+	hpa_t hpa = __pa(private_spt);
+	struct tdx_module_args out;
+	u64 err;
+
+	/* See comment in tdx_sept_set_private_spte() to pin pages. */
+	do {
+		err = tdh_mem_page_demote(kvm_tdx->tdr_pa, gpa, tdx_level, hpa, &out);
+	} while (err == TDX_INTERRUPTED_RESTARTABLE);
+	if (unlikely(err == TDX_ERROR_SEPT_BUSY))
+		return -EAGAIN;
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_MEM_PAGE_DEMOTE, err, &out);
+		return -EIO;
+	}
+
+	return 0;
+}
+
 static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 				      enum pg_level level)
 {
@@ -1663,8 +1687,6 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (unlikely(!is_hkid_assigned(kvm_tdx)))
 		return 0;
 
-	/* For now large page isn't supported yet. */
-	WARN_ON_ONCE(level != PG_LEVEL_4K);
 	err = tdh_mem_range_block(kvm_tdx->tdr_pa, gpa, tdx_level, &out);
 	if (unlikely(err == TDX_ERROR_SEPT_BUSY))
 		return -EAGAIN;
@@ -3308,6 +3330,7 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
 
 	x86_ops->link_private_spt = tdx_sept_link_private_spt;
 	x86_ops->free_private_spt = tdx_sept_free_private_spt;
+	x86_ops->split_private_spt = tdx_sept_split_private_spt;
 	x86_ops->set_private_spte = tdx_sept_set_private_spte;
 	x86_ops->remove_private_spte = tdx_sept_remove_private_spte;
 	x86_ops->zap_private_spte = tdx_sept_zap_private_spte;
diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h
index 19f2deafde5b..bb324f744bbf 100644
--- a/arch/x86/kvm/vmx/tdx_arch.h
+++ b/arch/x86/kvm/vmx/tdx_arch.h
@@ -21,6 +21,7 @@
 #define TDH_MNG_CREATE			9
 #define TDH_VP_CREATE			10
 #define TDH_MNG_RD			11
+#define TDH_MEM_PAGE_DEMOTE		15
 #define TDH_MR_EXTEND			16
 #define TDH_MR_FINALIZE			17
 #define TDH_VP_FLUSH			18
diff --git a/arch/x86/kvm/vmx/tdx_errno.h b/arch/x86/kvm/vmx/tdx_errno.h
index 5366bf476d2c..416708e6cbb7 100644
--- a/arch/x86/kvm/vmx/tdx_errno.h
+++ b/arch/x86/kvm/vmx/tdx_errno.h
@@ -11,6 +11,7 @@
  */
 #define TDX_NON_RECOVERABLE_VCPU		0x4000000100000000ULL
 #define TDX_INTERRUPTED_RESUMABLE		0x8000000300000000ULL
+#define TDX_INTERRUPTED_RESTARTABLE		0x8000000400000000ULL
 #define TDX_OPERAND_INVALID			0xC000010000000000ULL
 #define TDX_OPERAND_BUSY			0x8000020000000000ULL
 #define TDX_PREVIOUS_TLB_EPOCH_BUSY		0x8000020100000000ULL
diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index ef4748943ac7..d8f0d9aa7439 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -241,6 +241,19 @@ static inline u64 tdh_mng_rd(hpa_t tdr, u64 field, struct tdx_module_args *out)
 	return tdx_seamcall(TDH_MNG_RD, &in, out);
 }
 
+static inline u64 tdh_mem_page_demote(hpa_t tdr, gpa_t gpa, int level, hpa_t page,
+				      struct tdx_module_args *out)
+{
+	struct tdx_module_args in = {
+		.rcx = gpa | level,
+		.rdx = tdr,
+		.r8 = page,
+	};
+
+	tdx_clflush_page(page, PG_LEVEL_4K);
+	return tdx_seamcall_sept(TDH_MEM_PAGE_DEMOTE, &in, out);
+}
+
 static inline u64 tdh_mr_extend(hpa_t tdr, gpa_t gpa,
 				struct tdx_module_args *out)
 {

From patchwork Mon Feb 26 08:29:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206476
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com
Subject: [PATCH v8 11/14] KVM: x86/tdp_mmu: Try to merge pages into a large page
Date: Mon, 26 Feb 2024 00:29:25 -0800
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Isaku Yamahata

When a large page is passed to the KVM page fault handler and some of
its sub pages are already populated, try to merge the sub pages into a
large page. This situation can happen when the guest converts small
pages to shared and then converts them back to private.

When a large page is passed to the KVM MMU page fault handler and the
SPTE corresponding to the page is non-leaf (one or more sub pages are
already populated at a lower page level), the current KVM MMU zaps the
non-leaf SPTE at the large page level and populates a leaf SPTE at that
level. Thus small pages are converted into a large page. However, this
doesn't work for TDX because zapping and re-populating results in
zeroing the page content. Instead, populate all small pages and merge
them into a large page.

Merging pages into a large page can fail when some sub pages are
accepted and some are not.
In such a case, on the assumption that the guest tries to accept at
large page size for performance when possible, don't try to be smart
about identifying which page is still pending; map all pages at the
lower page level and let the vCPU re-execute.

Signed-off-by: Isaku Yamahata
---
v7:
- typo freezed => frozen
- return 0 when page is merged into 2M large page instead of -EAGAIN

v5:
- Fix memory leak

Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |   2 +
 arch/x86/include/asm/kvm_host.h    |   4 +
 arch/x86/kvm/mmu/tdp_iter.c        |  37 ++++--
 arch/x86/kvm/mmu/tdp_iter.h        |   2 +
 arch/x86/kvm/mmu/tdp_mmu.c         | 176 ++++++++++++++++++++++++++++-
 5 files changed, 211 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index ada6865100ee..6741cc518dae 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -106,9 +106,11 @@ KVM_X86_OP(load_mmu_pgd)
 KVM_X86_OP_OPTIONAL(link_private_spt)
 KVM_X86_OP_OPTIONAL(free_private_spt)
 KVM_X86_OP_OPTIONAL(split_private_spt)
+KVM_X86_OP_OPTIONAL(merge_private_spt)
 KVM_X86_OP_OPTIONAL(set_private_spte)
 KVM_X86_OP_OPTIONAL(remove_private_spte)
 KVM_X86_OP_OPTIONAL(zap_private_spte)
+KVM_X86_OP_OPTIONAL(unzap_private_spte)
 KVM_X86_OP(has_wbinvd_exit)
 KVM_X86_OP(get_l2_tsc_offset)
 KVM_X86_OP(get_l2_tsc_multiplier)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9c9742cc469c..a02b14be186c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -147,6 +147,7 @@
 #define KVM_MAX_HUGEPAGE_LEVEL	PG_LEVEL_1G
 #define KVM_NR_PAGE_SIZES	(KVM_MAX_HUGEPAGE_LEVEL - PG_LEVEL_4K + 1)
 #define KVM_HPAGE_GFN_SHIFT(x)	(((x) - 1) * 9)
+#define KVM_HPAGE_GFN_MASK(x)	(~((1UL << KVM_HPAGE_GFN_SHIFT(x)) - 1))
 #define KVM_HPAGE_SHIFT(x)	(PAGE_SHIFT + KVM_HPAGE_GFN_SHIFT(x))
 #define KVM_HPAGE_SIZE(x)	(1UL << KVM_HPAGE_SHIFT(x))
 #define KVM_HPAGE_MASK(x)	(~(KVM_HPAGE_SIZE(x) - 1))
@@ -1768,11 +1769,14 @@ struct kvm_x86_ops {
 				void *private_spt);
 	int (*split_private_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
 				  void *private_spt);
+	int (*merge_private_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
+				  void *private_spt);
 	int (*set_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
 				kvm_pfn_t pfn);
 	int (*remove_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
 				   kvm_pfn_t pfn);
 	int (*zap_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level);
+	int (*unzap_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level);
 
 	bool (*has_wbinvd_exit)(void);
 
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index 04c247bfe318..c4a18703f88a 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -71,6 +71,14 @@ tdp_ptep_t spte_to_child_pt(u64 spte, int level)
 	return (tdp_ptep_t)__va(spte_to_pfn(spte) << PAGE_SHIFT);
 }
 
+static void step_down(struct tdp_iter *iter, tdp_ptep_t child_pt)
+{
+	iter->level--;
+	iter->pt_path[iter->level - 1] = child_pt;
+	iter->gfn = gfn_round_for_level(iter->next_last_level_gfn, iter->level);
+	tdp_iter_refresh_sptep(iter);
+}
+
 /*
  * Steps down one level in the paging structure towards the goal GFN. Returns
  * true if the iterator was able to step down a level, false otherwise.
@@ -92,14 +100,28 @@ static bool try_step_down(struct tdp_iter *iter)
 	if (!child_pt)
 		return false;
 
-	iter->level--;
-	iter->pt_path[iter->level - 1] = child_pt;
-	iter->gfn = gfn_round_for_level(iter->next_last_level_gfn, iter->level);
-	tdp_iter_refresh_sptep(iter);
-
+	step_down(iter, child_pt);
 	return true;
 }
 
+/* Steps down for frozen spte. Don't re-read sptep because it was frozen. */
+void tdp_iter_step_down(struct tdp_iter *iter, tdp_ptep_t child_pt)
+{
+	WARN_ON_ONCE(!child_pt);
+	WARN_ON_ONCE(iter->yielded);
+	WARN_ON_ONCE(iter->level == iter->min_level);
+
+	step_down(iter, child_pt);
+}
+
+void tdp_iter_step_side(struct tdp_iter *iter)
+{
+	iter->gfn += KVM_PAGES_PER_HPAGE(iter->level);
+	iter->next_last_level_gfn = iter->gfn;
+	iter->sptep++;
+	iter->old_spte = kvm_tdp_mmu_read_spte(iter->sptep);
+}
+
 /*
  * Steps to the next entry in the current page table, at the current page table
  * level. The next entry could point to a page backing guest memory or another
@@ -117,10 +139,7 @@ static bool try_step_side(struct tdp_iter *iter)
 	    (SPTE_ENT_PER_PAGE - 1))
 		return false;
 
-	iter->gfn += KVM_PAGES_PER_HPAGE(iter->level);
-	iter->next_last_level_gfn = iter->gfn;
-	iter->sptep++;
-	iter->old_spte = kvm_tdp_mmu_read_spte(iter->sptep);
+	tdp_iter_step_side(iter);
 
 	return true;
 }
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index a9c9cd0db20a..ca00db799a50 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -134,6 +134,8 @@ void tdp_iter_start(struct tdp_iter *iter, struct kvm_mmu_page *root,
 		    int min_level, gfn_t next_last_level_gfn);
 void tdp_iter_next(struct tdp_iter *iter);
 void tdp_iter_restart(struct tdp_iter *iter);
+void tdp_iter_step_side(struct tdp_iter *iter);
+void tdp_iter_step_down(struct tdp_iter *iter, tdp_ptep_t child_pt);
 
 static inline union kvm_mmu_page_role tdp_iter_child_role(struct tdp_iter *iter)
 {
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 0ac2a4911fd1..556974361d36 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1205,6 +1205,180 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm, bool skip_private)
 	}
 }
 
+static int tdp_mmu_iter_step_side(int i, struct tdp_iter *iter)
+{
+	i++;
+
+	/*
+	 * if i = SPTE_ENT_PER_PAGE, tdp_iter_step_side() results
+	 * in reading the entry beyond the last entry.
+	 */
+	if (i < SPTE_ENT_PER_PAGE)
+		tdp_iter_step_side(iter);
+
+	return i;
+}
+
+static int tdp_mmu_merge_private_spt(struct kvm_vcpu *vcpu,
+				     struct kvm_page_fault *fault,
+				     struct tdp_iter *iter, u64 new_spte)
+{
+	u64 *sptep = rcu_dereference(iter->sptep);
+	u64 old_spte = iter->old_spte;
+	struct kvm_mmu_page *child_sp;
+	struct kvm *kvm = vcpu->kvm;
+	struct tdp_iter child_iter;
+	int level = iter->level;
+	gfn_t gfn = iter->gfn;
+	tdp_ptep_t child_pt;
+	u64 child_spte;
+	int ret = 0;
+	int i;
+
+	/*
+	 * TDX KVM supports only 2MB large page. It's not supported to merge
+	 * 2MB pages into 1GB page at the moment.
+	 */
+	WARN_ON_ONCE(fault->goal_level != PG_LEVEL_2M);
+	WARN_ON_ONCE(iter->level != PG_LEVEL_2M);
+	WARN_ON_ONCE(!is_large_pte(new_spte));
+
+	/* Freeze the spte to prevent other threads from working spte. */
+	if (!try_cmpxchg64(sptep, &iter->old_spte, REMOVED_SPTE))
+		return -EBUSY;
+
+	/*
+	 * Step down to the child spte. Because tdp_iter_next() assumes the
+	 * parent spte isn't frozen, do it manually.
+	 */
+	child_pt = spte_to_child_pt(iter->old_spte, iter->level);
+	child_sp = sptep_to_sp(child_pt);
+	WARN_ON_ONCE(child_sp->role.level != PG_LEVEL_4K);
+	WARN_ON_ONCE(!kvm_mmu_page_role_is_private(child_sp->role));
+
+	/* Don't modify iter as the caller will use iter after this function. */
+	child_iter = *iter;
+	/* Adjust the target gfn to the head gfn of the large page. */
+	child_iter.next_last_level_gfn &= -KVM_PAGES_PER_HPAGE(level);
+	tdp_iter_step_down(&child_iter, child_pt);
+
+	/*
+	 * All child pages are required to be populated for merging them into a
+	 * large page. Populate all child spte.
+	 */
+	for (i = 0; i < SPTE_ENT_PER_PAGE; i = tdp_mmu_iter_step_side(i, &child_iter)) {
+		int tmp;
+
+		WARN_ON_ONCE(child_iter.level != PG_LEVEL_4K);
+
+		if (is_shadow_present_pte(child_iter.old_spte)) {
+			/* TODO: relocate page for huge page. */
+			if (WARN_ON_ONCE(spte_to_pfn(child_iter.old_spte) !=
+					 spte_to_pfn(new_spte) + i)) {
+				if (!ret)
+					ret = -EAGAIN;
+				continue;
+			}
+			/*
+			 * When SEPT_VE_DISABLE=true and the page state is
+			 * pending, this case can happen. Just resume the vcpu
+			 * again with the expectation for other vcpu to accept
+			 * this page.
+			 */
+			if (child_iter.gfn == fault->gfn) {
+				if (!ret)
+					ret = -EAGAIN;
+			}
+			continue;
+		}
+
+		child_spte = make_huge_page_split_spte(kvm, new_spte, child_sp->role, i);
+		/*
+		 * Because other thread may have started to operate on this spte
+		 * before freezing the parent spte, Use atomic version to
+		 * prevent race.
+		 */
+		tmp = tdp_mmu_set_spte_atomic(vcpu->kvm, &child_iter, child_spte);
+		if (tmp == -EBUSY || tmp == -EAGAIN) {
+			/*
+			 * There was a race condition. Populate remaining 4K
+			 * spte to resolve fault->gfn to guarantee the forward
+			 * progress.
+			 */
+			if (!ret)
+				ret = tmp;
+		} else if (tmp) {
+			ret = tmp;
+			goto out;
+		}
+	}
+	if (ret)
+		goto out;
+
+	/* Prevent the Secure-EPT entry from being used. */
+	ret = static_call(kvm_x86_zap_private_spte)(kvm, gfn, level);
+	if (ret)
+		goto out;
+	kvm_flush_remote_tlbs_range(kvm, gfn & KVM_HPAGE_GFN_MASK(level),
+				    KVM_PAGES_PER_HPAGE(level));
+
+	/* Merge pages into a large page. */
+	ret = static_call(kvm_x86_merge_private_spt)(kvm, gfn, level,
+						      kvm_mmu_private_spt(child_sp));
+	/*
+	 * Failed to merge pages because some pages are accepted and some are
+	 * pending. Since the child page was mapped above, let vcpu run.
+	 */
+	if (ret) {
+		if (static_call(kvm_x86_unzap_private_spte)(kvm, gfn, level))
+			old_spte = SHADOW_NONPRESENT_VALUE |
+				(spte_to_pfn(old_spte) << PAGE_SHIFT) |
+				PT_PAGE_SIZE_MASK;
+		goto out;
+	}
+
+	/* Update stats manually as we don't use tdp_mmu_set_spte{, _atomic}(). */
+	kvm_update_page_stats(kvm, level - 1, -SPTE_ENT_PER_PAGE);
+	kvm_update_page_stats(kvm, level, 1);
+
+	/* Unfreeze spte.
*/ + iter->old_spte = new_spte; + __kvm_tdp_mmu_write_spte(sptep, new_spte); + + /* + * Free unused child sp. Secure-EPT page was already freed at TDX level + * by kvm_x86_merge_private_spt(). + */ + tdp_unaccount_mmu_page(kvm, child_sp); + tdp_mmu_free_sp(child_sp); + return 0; + +out: + iter->old_spte = old_spte; + __kvm_tdp_mmu_write_spte(sptep, old_spte); + return ret; +} + +static int __tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, + struct kvm_page_fault *fault, + struct tdp_iter *iter, u64 new_spte) +{ + /* + * The private page has smaller-size pages. For example, the child + * pages was converted from shared to page, and now it can be mapped as + * a large page. Try to merge small pages into a large page. + */ + if (fault->slot && + kvm_gfn_shared_mask(vcpu->kvm) && + iter->level > PG_LEVEL_4K && + kvm_is_private_gpa(vcpu->kvm, fault->addr) && + is_shadow_present_pte(iter->old_spte) && + !is_large_pte(iter->old_spte)) + return tdp_mmu_merge_private_spt(vcpu, fault, iter, new_spte); + + return tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte); +} + /* * Installs a last-level SPTE to handle a TDP page fault. 
* (NPT/EPT violation/misconfiguration) @@ -1246,7 +1420,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, if (new_spte == iter->old_spte) ret = RET_PF_SPURIOUS; - else if (tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte)) + else if (__tdp_mmu_map_handle_target_level(vcpu, fault, iter, new_spte)) return RET_PF_RETRY; else if (is_shadow_present_pte(iter->old_spte) && !is_last_spte(iter->old_spte, iter->level)) From patchwork Mon Feb 26 08:29:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 206474 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:a81b:b0:108:e6aa:91d0 with SMTP id bq27csp1954619dyb; Mon, 26 Feb 2024 01:16:05 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCV8xLjM/+LVw0LQYQfnPS5RJrR5jxTMjI1a8MVWR9V4jO1goOJ+sAlCEs/wOvSlR02/1djp3FnyuYYnPxzP7yuvQ5zJlg== X-Google-Smtp-Source: AGHT+IHbLaTBeEq3Bycswp81htW72aIXz6U233xVCoVEXg2r04YzZOJRrqIu1SrcLFYUOIFkwV1E X-Received: by 2002:a1f:ccc6:0:b0:4c8:e834:6ce2 with SMTP id c189-20020a1fccc6000000b004c8e8346ce2mr2865376vkg.5.1708938965532; Mon, 26 Feb 2024 01:16:05 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708938965; cv=pass; d=google.com; s=arc-20160816; b=RioKLrsGpOBMAJlMLOxXElv5NjngANVYkdKgTv6WvBOPfn/LP4Y/EL+dgiig/sHQbW aUADznrSjpIG4SG6mH1QDqzcm664Hkl9D3jR4rYrAVAR0tvxZanvF2DSowaqzKKZbtrI az1cIgnqOYYzf/L61fJz4hNoGGxivzbHBIYHESjwJmWZKWwZHcdKNPOZb0UCKhg2MD+t G6hEI25QNaEn77YlacwEyHZmWe+ODywIwRgyTYhxBLR3ZJfypZ8uYf0NLgDoxMD6JC5/ EHytH2L8+5isBojaP1Vn4yicbie8CKqaD+YT3lhbs7L+z2DW17b/vWk98cKBA1RIcSV2 ji7A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=pxyvOJjXCSvCHlCnTrJwYXeOXUKtG5NNwiVL8zKi07o=; fh=Itbyk7CEvizIrzGEESCqq3I2tZgG1kc/GkVOa3S7Hsg=; 
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com
Subject: [PATCH v8 12/14] KVM: TDX: Implement merge pages into a large page
Date: Mon, 26 Feb 2024 00:29:26 -0800
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

Implement the merge_private_spt callback.

Signed-off-by: Isaku Yamahata
---
v7:
- Fix subject, x86/tdp_mmu => TDX
- comment: use unlink instead of free for clarity

v6:
- repeat TDH.MEM.PAGE.PROMOTE() on TDX_INTERRUPTED_RESTARTABLE

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c       | 74 ++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/tdx_arch.h  |  1 +
 arch/x86/kvm/vmx/tdx_errno.h |  2 +
 arch/x86/kvm/vmx/tdx_ops.h   | 11 ++++++
 4 files changed, 88 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 88af64658a9c..5b4d94a6c6e2 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1674,6 +1674,51 @@ static int tdx_sept_split_private_spt(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }

+static int tdx_sept_merge_private_spt(struct kvm *kvm, gfn_t gfn,
+				      enum pg_level level, void *private_spt)
+{
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	struct tdx_module_args out;
+	gpa_t gpa = gfn_to_gpa(gfn) & KVM_HPAGE_MASK(level);
+	u64 err;
+
+	/* See comment in tdx_sept_set_private_spte() */
+	do {
+		err = tdh_mem_page_promote(kvm_tdx->tdr_pa, gpa, tdx_level, &out);
+	} while (err == TDX_INTERRUPTED_RESTARTABLE);
+	if (unlikely(err == TDX_ERROR_SEPT_BUSY))
+		return -EAGAIN;
+	if (unlikely(err == (TDX_EPT_INVALID_PROMOTE_CONDITIONS |
+			     TDX_OPERAND_ID_RCX)))
+		/*
+		 * Some pages are accepted, some pending. Need to wait for TD
+		 * to accept all pages. Tell the caller.
+		 */
+		return -EAGAIN;
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_MEM_PAGE_PROMOTE, err, &out);
+		return -EIO;
+	}
+	WARN_ON_ONCE(out.rcx != __pa(private_spt));
+
+	/*
+	 * TDH.MEM.PAGE.PROMOTE unlinks the Secure-EPT page for the lower level.
+	 * Flush cache for reuse.
+	 */
+	do {
+		err = tdh_phymem_page_wbinvd(set_hkid_to_hpa(__pa(private_spt),
+							     to_kvm_tdx(kvm)->hkid));
+	} while (unlikely(err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX)));
+	if (WARN_ON_ONCE(err)) {
+		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL);
+		return -EIO;
+	}
+
+	tdx_clear_page(__pa(private_spt), PAGE_SIZE);
+	return 0;
+}
+
 static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 				      enum pg_level level)
 {
@@ -1770,6 +1815,33 @@ static void tdx_track(struct kvm *kvm)

 }

+static int tdx_sept_unzap_private_spte(struct kvm *kvm, gfn_t gfn,
+				       enum pg_level level)
+{
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	gpa_t gpa = gfn_to_gpa(gfn) & KVM_HPAGE_MASK(level);
+	struct tdx_module_args out;
+	u64 err;
+
+	do {
+		err = tdh_mem_range_unblock(kvm_tdx->tdr_pa, gpa, tdx_level, &out);
+
+		/*
+		 * tdh_mem_range_block() is accompanied by tdx_track() via the
+		 * kvm remote tlb flush. Wait for the caller of
+		 * tdh_mem_range_block() to complete TDX track.
+		 */
+	} while (err == (TDX_TLB_TRACKING_NOT_DONE | TDX_OPERAND_ID_SEPT));
+	if (unlikely(err == TDX_ERROR_SEPT_BUSY))
+		return -EAGAIN;
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_MEM_RANGE_UNBLOCK, err, &out);
+		return -EIO;
+	}
+	return 0;
+}
+
 static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 				      enum pg_level level, void *private_spt)
 {
@@ -3331,9 +3403,11 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
 	x86_ops->link_private_spt = tdx_sept_link_private_spt;
 	x86_ops->free_private_spt = tdx_sept_free_private_spt;
 	x86_ops->split_private_spt = tdx_sept_split_private_spt;
+	x86_ops->merge_private_spt = tdx_sept_merge_private_spt;
 	x86_ops->set_private_spte = tdx_sept_set_private_spte;
 	x86_ops->remove_private_spte = tdx_sept_remove_private_spte;
 	x86_ops->zap_private_spte = tdx_sept_zap_private_spte;
+	x86_ops->unzap_private_spte = tdx_sept_unzap_private_spte;

 	return 0;

diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h
index bb324f744bbf..a320f6d45731 100644
--- a/arch/x86/kvm/vmx/tdx_arch.h
+++ b/arch/x86/kvm/vmx/tdx_arch.h
@@ -29,6 +29,7 @@
 #define TDH_MNG_KEY_FREEID		20
 #define TDH_MNG_INIT			21
 #define TDH_VP_INIT			22
+#define TDH_MEM_PAGE_PROMOTE		23
 #define TDH_MEM_SEPT_RD			25
 #define TDH_VP_RD			26
 #define TDH_MNG_KEY_RECLAIMID		27
diff --git a/arch/x86/kvm/vmx/tdx_errno.h b/arch/x86/kvm/vmx/tdx_errno.h
index 416708e6cbb7..799d7166e69a 100644
--- a/arch/x86/kvm/vmx/tdx_errno.h
+++ b/arch/x86/kvm/vmx/tdx_errno.h
@@ -21,6 +21,8 @@
 #define TDX_KEY_CONFIGURED			0x0000081500000000ULL
 #define TDX_NO_HKID_READY_TO_WBCACHE		0x0000082100000000ULL
 #define TDX_FLUSHVP_NOT_DONE			0x8000082400000000ULL
+#define TDX_TLB_TRACKING_NOT_DONE		0xC0000B0800000000ULL
+#define TDX_EPT_INVALID_PROMOTE_CONDITIONS	0xC0000B0900000000ULL
 #define TDX_EPT_ENTRY_STATE_INCORRECT		0xC0000B0D00000000ULL

 /*
diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index d8f0d9aa7439..bf660eefa9e0 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -254,6 +254,17 @@ static inline u64 tdh_mem_page_demote(hpa_t tdr, gpa_t gpa, int level, hpa_t pag
 	return tdx_seamcall_sept(TDH_MEM_PAGE_DEMOTE, &in, out);
 }

+static inline u64 tdh_mem_page_promote(hpa_t tdr, gpa_t gpa, int level,
+				       struct tdx_module_args *out)
+{
+	struct tdx_module_args in = {
+		.rcx = gpa | level,
+		.rdx = tdr,
+	};
+
+	return tdx_seamcall_sept(TDH_MEM_PAGE_PROMOTE, &in, out);
+}
+
 static inline u64 tdh_mr_extend(hpa_t tdr, gpa_t gpa,
 				struct tdx_module_args *out)
 {

From patchwork Mon Feb 26 08:29:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206473
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com
Subject: [PATCH v8 13/14] KVM: x86/mmu: Make kvm fault handler aware of large page of private memslot
Date: Mon, 26 Feb 2024 00:29:27 -0800
Message-Id: <30209eb4d65d1de3e09dc9fdb3fc0d3d3c96dc7e.1708933625.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

struct kvm_page_fault.req_level is the page level that accounts for the
size of the faulted-in page. Currently it is calculated only for
conventional KVM memslots, by host_pfn_mapping_level(), which traverses
the host page table.

However, host_pfn_mapping_level() cannot be used for a private KVM
memslot because the private pages of a private memslot aren't mapped
into the user virtual address space. Instead, the page order is given
when getting the pfn. Remember it in struct kvm_page_fault and use it.
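
The level selection described above can be illustrated with a small
standalone sketch. It is not the kernel code: every sketch_* name is
hypothetical, the enum only mirrors the kernel's pg_level values, the
order thresholds (9 and 18) assume x86 with 4KB base pages, the host
page-table walk is replaced by a stub that always reports 2MB, and the
per-memslot lpage_info check is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's enum pg_level (assumption: same ordering). */
enum pg_level { PG_LEVEL_NONE, PG_LEVEL_4K, PG_LEVEL_2M, PG_LEVEL_1G };

/* Hypothetical: map a page order (log2 of 4KB pages) to a page level. */
static enum pg_level sketch_level_for_order(int order)
{
	if (order >= 18)	/* 2^18 * 4KB = 1GB */
		return PG_LEVEL_1G;
	if (order >= 9)		/* 2^9 * 4KB = 2MB */
		return PG_LEVEL_2M;
	return PG_LEVEL_4K;
}

/* Stub for the host page-table walk; pretends the gfn is mapped at 2MB. */
static enum pg_level sketch_host_pfn_mapping_level(void)
{
	return PG_LEVEL_2M;
}

/*
 * For a private fault the host level was recorded when the pfn was
 * obtained; for a conventional fault it is looked up from the host
 * page table. The result is the smaller of the host level and max_level.
 */
static enum pg_level sketch_max_mapping_level(enum pg_level max_level,
					      enum pg_level host_level,
					      bool is_private)
{
	if (max_level == PG_LEVEL_4K)
		return PG_LEVEL_4K;

	if (!is_private) {
		/* Conventional faults must not carry a recorded level. */
		assert(host_level == PG_LEVEL_NONE);
		host_level = sketch_host_pfn_mapping_level();
	}
	assert(host_level != PG_LEVEL_NONE);

	return host_level < max_level ? host_level : max_level;
}
```

With these stand-ins, a private fault whose pfn came with order 9 and
max_level 1GB resolves to 2MB without any host page-table walk.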

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/mmu.c          | 27 ++++++++++++++-------------
 arch/x86/kvm/mmu/mmu_internal.h | 12 +++++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index fa7fabc410c4..3c41861b4b3d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3154,10 +3154,10 @@ static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn,

 static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
 				       const struct kvm_memory_slot *slot,
-				       gfn_t gfn, int max_level, bool is_private)
+				       gfn_t gfn, int max_level, int host_level,
+				       bool is_private)
 {
 	struct kvm_lpage_info *linfo;
-	int host_level;

 	max_level = min(max_level, max_huge_page_level);
 	for ( ; max_level > PG_LEVEL_4K; max_level--) {
@@ -3166,24 +3166,23 @@ static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
 			break;
 	}

-	if (is_private)
-		return max_level;
-
 	if (max_level == PG_LEVEL_4K)
 		return PG_LEVEL_4K;

-	host_level = host_pfn_mapping_level(kvm, gfn, slot);
+	if (!is_private) {
+		WARN_ON_ONCE(host_level != PG_LEVEL_NONE);
+		host_level = host_pfn_mapping_level(kvm, gfn, slot);
+	}
+	WARN_ON_ONCE(host_level == PG_LEVEL_NONE);
 	return min(host_level, max_level);
 }

 int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      const struct kvm_memory_slot *slot, gfn_t gfn,
-			      int max_level)
+			      int max_level, bool faultin_private)
 {
-	bool is_private = kvm_slot_can_be_private(slot) &&
-			  kvm_mem_is_private(kvm, gfn);
-
-	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, max_level, is_private);
+	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, max_level,
+					   PG_LEVEL_NONE, faultin_private);
 }

 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
@@ -3208,7 +3207,8 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	 */
 	fault->req_level = __kvm_mmu_max_mapping_level(vcpu->kvm, slot,
 						       fault->gfn, fault->max_level,
-						       fault->is_private);
+						       fault->host_level,
+						       kvm_is_faultin_private(fault));
 	if (fault->req_level == PG_LEVEL_4K || fault->huge_page_disallowed)
 		return;

@@ -4349,6 +4349,7 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
 	}

 	max_level = kvm_max_level_for_order(max_order);
+	fault->host_level = max_level;
 	r = static_call(kvm_x86_gmem_max_level)(vcpu->kvm, fault->pfn, fault->gfn,
 						fault->is_private, &max_level);
@@ -6818,7 +6819,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 		 */
 		if (sp->role.direct &&
 		    sp->role.level < kvm_mmu_max_mapping_level(kvm, slot, sp->gfn,
-							       PG_LEVEL_NUM)) {
+							       PG_LEVEL_NUM, false)) {
 			kvm_zap_one_rmap_spte(kvm, rmap_head, sptep);

 			if (kvm_available_flush_remote_tlbs_range())
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 315c123affaf..9d56f9ab16f7 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -327,6 +327,9 @@ struct kvm_page_fault {
 	 * is changing its own translation in the guest page tables.
 	 */
 	bool write_fault_to_shadow_pgtable;
+
+	/* valid only for private memslot && private gfn */
+	enum pg_level host_level;
 };

 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
@@ -421,7 +424,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,

 int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      const struct kvm_memory_slot *slot, gfn_t gfn,
-			      int max_level);
+			      int max_level, bool faultin_private);
 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_level);

@@ -439,4 +442,11 @@ static inline bool kvm_hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t g
 }
 #endif

+static inline bool kvm_is_faultin_private(const struct kvm_page_fault *fault)
+{
+	if (IS_ENABLED(CONFIG_KVM_GENERIC_PRIVATE_MEM))
+		return fault->is_private && kvm_slot_can_be_private(fault->slot);
+	return false;
+}
+
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 556974361d36..d6ce8496803f 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -2183,7 +2183,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
 			continue;

 		max_mapping_level = kvm_mmu_max_mapping_level(kvm, slot,
-							      iter.gfn, PG_LEVEL_NUM);
+							      iter.gfn, PG_LEVEL_NUM, false);
 		if (max_mapping_level < iter.level)
 			continue;


From patchwork Mon Feb 26 08:29:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206475
d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708936196; x=1740472196; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dO1PBJiymwhAoqOQlpC9i+z46NWUa5VVHxrGTwyy1RQ=; b=m91SWsCtSnzLOtQWK2fkYgMAG6TOnLUc/4mvOZgtc/9I8gU+bEtPdm1m tFCDORSARgcdQ6ADMtcpzwRYBVtNyGfksoVKTul71VP9AkRXkVIGGGv5A DKRVmwkKujlzFbmpkrhAp8egZIxNFym5ZJ1z8rlScx0dAguvF9kqMSHtf +eS+M0fg7Cy9nHpJy9b19nkNDDyRgWZ6n3WCkBNTxhnm2bkWF8zkNxouV XqCBOoy54Q01omCvYJXn24dPVlSoFkhs4xqc6cG7gyyU/9NZJghvNbTeB BP5KOlrSFZD87f6JxxIIyuimEB3S0KDQPFwdiNeBpXzHp735MWk7HLbM0 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10995"; a="14623341" X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="14623341" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by fmvoesa104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:37 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,185,1705392000"; d="scan'208";a="6519439" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Feb 2024 00:29:37 -0800 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li Subject: [PATCH v8 14/14] KVM: TDX: Allow 2MB large page for TD GUEST Date: Mon, 26 Feb 2024 00:29:28 -0800 Message-Id: <2f5ea6d9ae5ce86c7eebd52de4a061ffb05eb420.1708933625.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791952385046162272 X-GMAIL-MSGID: 1791952385046162272 From: Xiaoyao Li Now that everything is there to support 2MB page for TD guest. 
Because the TDX module's TDH.MEM.PAGE.AUG supports 4KB and 2MB pages, set
struct kvm_arch.tdp_max_page_level to the 2MB page level.

Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/tdp_mmu.c | 9 ++-------
 arch/x86/kvm/vmx/tdx.c     | 7 +++++--
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index d6ce8496803f..8663e163427c 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1544,14 +1544,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 
 		sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
 
-		if (is_shadow_present_pte(iter.old_spte)) {
-			/*
-			 * TODO: large page support.
-			 * Doesn't support large page for TDX now
-			 */
-			KVM_BUG_ON(is_private_sptep(iter.sptep), vcpu->kvm);
+		if (is_shadow_present_pte(iter.old_spte))
 			r = tdp_mmu_split_huge_page(kvm, &iter, sp, true);
-		} else
+		else
 			r = tdp_mmu_link_sp(kvm, &iter, sp, true);
 
 		/*
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 5b4d94a6c6e2..1a94b1072068 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3109,8 +3109,11 @@ int tdx_gmem_max_level(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn,
 	if (!is_private)
 		return 0;
 
-	/* TODO: Enable 2mb and 1gb large page support. */
-	*max_level = min(*max_level, PG_LEVEL_4K);
+	/*
+	 * TDH.MEM.PAGE.AUG supports up to 2MB page.
+	 * TODO: Enable 1gb large page support.
+	 */
+	*max_level = min(*max_level, PG_LEVEL_2M);
 	return 0;
 }
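
For readers following the tdx.c hunk, the effect of raising the cap from PG_LEVEL_4K to PG_LEVEL_2M can be sketched in user space. This is only an illustration under stated assumptions: sketch_gmem_max_level() is a hypothetical stand-in, not the kernel's tdx_gmem_max_level(); it drops the kvm, pfn and gfn arguments, and the enum below copies only the ordering of the kernel's x86 enum pg_level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same ordering as the kernel's x86 enum pg_level (truncated here). */
enum pg_level {
	PG_LEVEL_NONE,
	PG_LEVEL_4K,
	PG_LEVEL_2M,
	PG_LEVEL_1G,
};

/*
 * Hypothetical user-space stand-in for the patched tdx_gmem_max_level().
 * Only the clamp on *max_level is modelled.
 */
static int sketch_gmem_max_level(bool is_private, uint8_t *max_level)
{
	assert(max_level != NULL);

	/* Shared (non-private) mappings are left untouched. */
	if (!is_private)
		return 0;

	/* Private mappings are capped at 2MB; smaller requests pass through. */
	if (*max_level > PG_LEVEL_2M)
		*max_level = PG_LEVEL_2M;
	return 0;
}
```

In this sketch, a private request for PG_LEVEL_1G comes back as PG_LEVEL_2M, a private request for PG_LEVEL_4K stays at PG_LEVEL_4K, and a shared request is returned unchanged.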