From patchwork Thu Feb 29 02:57:59 2024
X-Patchwork-Submitter: David Stevens
X-Patchwork-Id: 208185
From: David Stevens
To: Sean Christopherson, Paolo Bonzini
Cc: Yu Zhang, Isaku Yamahata, Zhi Wang, Maxim Levitsky,
	kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, David Stevens
Subject: [PATCH v11 8/8] KVM: x86/mmu: Handle non-refcounted pages
Date: Thu, 29 Feb 2024 11:57:59 +0900
Message-ID: <20240229025759.1187910-9-stevensd@google.com>
In-Reply-To: <20240229025759.1187910-1-stevensd@google.com>
References: <20240229025759.1187910-1-stevensd@google.com>
MIME-Version: 1.0

From: David Stevens

Handle non-refcounted pages in __kvm_faultin_pfn. This allows the host
to map memory into the guest that is backed by non-refcounted struct
pages - for example, the tail pages of higher order non-compound pages
allocated by the amdgpu driver via ttm_pool_alloc_page.

Signed-off-by: David Stevens
---
 arch/x86/kvm/mmu/mmu.c          | 24 +++++++++++++++++-------
 arch/x86/kvm/mmu/mmu_internal.h |  2 ++
 arch/x86/kvm/mmu/paging_tmpl.h  |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  3 ++-
 include/linux/kvm_host.h        |  6 ++++--
 virt/kvm/guest_memfd.c          |  8 ++++----
 virt/kvm/kvm_main.c             | 10 ++++++++--
 7 files changed, 38 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4936a8c5829b..f9046912bb43 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2924,6 +2924,11 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
 	bool host_writable = !fault || fault->map_writable;
 	bool prefetch = !fault || fault->prefetch;
 	bool write_fault = fault && fault->write;
+	/*
+	 * Prefetching uses gfn_to_page_many_atomic, which never gets
+	 * non-refcounted pages.
+	 */
+	bool is_refcounted = !fault || !!fault->accessed_page;
 
 	if (unlikely(is_noslot_pfn(pfn))) {
 		vcpu->stat.pf_mmio_spte_created++;
@@ -2951,7 +2956,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
 	}
 
 	wrprot = make_spte(vcpu, sp, slot, pte_access, gfn, pfn, *sptep, prefetch,
-			   true, host_writable, true, &spte);
+			   true, host_writable, is_refcounted, &spte);
 
 	if (*sptep == spte) {
 		ret = RET_PF_SPURIOUS;
@@ -4319,8 +4324,8 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
 		return -EFAULT;
 	}
 
-	r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn, &fault->pfn,
-			     &max_order);
+	r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn,
+			     &fault->pfn, &fault->accessed_page, &max_order);
 	if (r) {
 		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
 		return r;
@@ -4330,6 +4335,9 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
 					 fault->max_level);
 	fault->map_writable = !(fault->slot->flags & KVM_MEM_READONLY);
 
+	/* kvm_gmem_get_pfn takes a refcount, but accessed_page doesn't need it. */
+	put_page(fault->accessed_page);
+
 	return RET_PF_CONTINUE;
 }
 
@@ -4339,10 +4347,10 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	struct kvm_follow_pfn kfp = {
 		.slot = slot,
 		.gfn = fault->gfn,
-		.flags = FOLL_GET | (fault->write ? FOLL_WRITE : 0),
+		.flags = fault->write ? FOLL_WRITE : 0,
 		.try_map_writable = true,
 		.guarded_by_mmu_notifier = true,
-		.allow_non_refcounted_struct_page = false,
+		.allow_non_refcounted_struct_page = shadow_refcounted_mask,
 	};
 
 	/*
@@ -4359,6 +4367,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 		fault->slot = NULL;
 		fault->pfn = KVM_PFN_NOSLOT;
 		fault->map_writable = false;
+		fault->accessed_page = NULL;
 		return RET_PF_CONTINUE;
 	}
 
 	/*
@@ -4422,6 +4431,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 success:
 	fault->hva = kfp.hva;
 	fault->map_writable = kfp.writable;
+	fault->accessed_page = kfp.refcounted_page;
 	return RET_PF_CONTINUE;
 }
 
@@ -4510,8 +4520,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	r = direct_map(vcpu, fault);
 
 out_unlock:
+	kvm_set_page_accessed(fault->accessed_page);
 	write_unlock(&vcpu->kvm->mmu_lock);
-	kvm_release_pfn_clean(fault->pfn);
 	return r;
 }
 
@@ -4586,8 +4596,8 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 	r = kvm_tdp_mmu_map(vcpu, fault);
 
 out_unlock:
+	kvm_set_page_accessed(fault->accessed_page);
 	read_unlock(&vcpu->kvm->mmu_lock);
-	kvm_release_pfn_clean(fault->pfn);
 	return r;
 }
 #endif
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 0669a8a668ca..0b05183600af 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -240,6 +240,8 @@ struct kvm_page_fault {
 	kvm_pfn_t pfn;
 	hva_t hva;
 	bool map_writable;
+	/* Does NOT have an elevated refcount */
+	struct page *accessed_page;
 
 	/*
 	 * Indicates the guest is trying to write a gfn that contains one or
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index c965f77ac4d5..b39dce802394 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -847,8 +847,8 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	r = FNAME(fetch)(vcpu, fault, &walker);
 
 out_unlock:
+	kvm_set_page_accessed(fault->accessed_page);
 	write_unlock(&vcpu->kvm->mmu_lock);
-	kvm_release_pfn_clean(fault->pfn);
 	return r;
 }
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index ee497fb78d90..0524be7c0796 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -958,7 +958,8 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	else
 		wrprot = make_spte(vcpu, sp, fault->slot, ACC_ALL, iter->gfn,
 				   fault->pfn, iter->old_spte, fault->prefetch, true,
-				   fault->map_writable, true, &new_spte);
+				   fault->map_writable, !!fault->accessed_page,
+				   &new_spte);
 
 	if (new_spte == iter->old_spte)
 		ret = RET_PF_SPURIOUS;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d19a418df04b..ea34eae6cfa4 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2426,11 +2426,13 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 
 #ifdef CONFIG_KVM_PRIVATE_MEM
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order);
+		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+		     int *max_order);
 #else
 static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn,
-				   kvm_pfn_t *pfn, int *max_order)
+				   kvm_pfn_t *pfn, struct page **page,
+				   int *max_order)
 {
 	KVM_BUG_ON(1, kvm);
 	return -EIO;
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 0f4e0cf4f158..dabcca2ecc37 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -483,12 +483,12 @@ void kvm_gmem_unbind(struct kvm_memory_slot *slot)
 }
 
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+		     int *max_order)
 {
 	pgoff_t index = gfn - slot->base_gfn + slot->gmem.pgoff;
 	struct kvm_gmem *gmem;
 	struct folio *folio;
-	struct page *page;
 	struct file *file;
 	int r;
 
@@ -514,9 +514,9 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 		goto out_unlock;
 	}
 
-	page = folio_file_page(folio, index);
+	*page = folio_file_page(folio, index);
 
-	*pfn = page_to_pfn(page);
+	*pfn = page_to_pfn(*page);
 	if (max_order)
 		*max_order = 0;
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 235c92830cdc..1f5d2a1e63a9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3284,11 +3284,17 @@ void kvm_set_page_dirty(struct page *page)
 }
 EXPORT_SYMBOL_GPL(kvm_set_page_dirty);
 
-void kvm_set_page_accessed(struct page *page)
+static void __kvm_set_page_accessed(struct page *page)
 {
 	if (kvm_is_ad_tracked_page(page))
 		mark_page_accessed(page);
 }
+
+void kvm_set_page_accessed(struct page *page)
+{
+	if (page)
+		__kvm_set_page_accessed(page);
+}
 EXPORT_SYMBOL_GPL(kvm_set_page_accessed);
 
 void kvm_release_page_clean(struct page *page)
@@ -3298,7 +3304,7 @@ void kvm_release_page_clean(struct page *page)
 	if (!page)
 		return;
 
-	kvm_set_page_accessed(page);
+	__kvm_set_page_accessed(page);
 	put_page(page);
 }
 EXPORT_SYMBOL_GPL(kvm_release_page_clean);