From patchwork Mon Feb 26 08:29:27 2024
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 206473
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com
Subject: [PATCH v8 13/14] KVM: x86/mmu: Make kvm fault handler aware of large page of private memslot
Date: Mon, 26 Feb 2024 00:29:27 -0800
Message-Id: <30209eb4d65d1de3e09dc9fdb3fc0d3d3c96dc7e.1708933625.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

struct kvm_page_fault.req_level is the page level that accounts for the
faulted-in page size.  Today it is calculated only for conventional KVM
memslots, via host_pfn_mapping_level(), which walks the host page tables.
However, host_pfn_mapping_level() cannot be used for a private KVM memslot,
because the private pages of a private memslot are not mapped into the user
virtual address space.  Instead, the page order is provided when the pfn is
obtained.  Remember it in struct kvm_page_fault and use it.
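For context, that page order is translated into a KVM page level by
kvm_max_level_for_order(), which comes from the guest_memfd patches rather
than from this patch.  A minimal sketch of the idea, assuming the standard
KVM_HPAGE_GFN_SHIFT() definition from arch/x86 (order is log2 of the number
of 4K pages backing the gfn):

	/* Sketch only; see the guest_memfd patches for the real helper. */
	static u8 kvm_max_level_for_order(int order)
	{
		if (order >= KVM_HPAGE_GFN_SHIFT(PG_LEVEL_1G))	/* order >= 18 */
			return PG_LEVEL_1G;
		if (order >= KVM_HPAGE_GFN_SHIFT(PG_LEVEL_2M))	/* order >= 9 */
			return PG_LEVEL_2M;
		return PG_LEVEL_4K;
	}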
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/mmu.c          | 27 ++++++++++++++-------------
 arch/x86/kvm/mmu/mmu_internal.h | 12 +++++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index fa7fabc410c4..3c41861b4b3d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3154,10 +3154,10 @@ static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn,
 
 static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
 				       const struct kvm_memory_slot *slot,
-				       gfn_t gfn, int max_level, bool is_private)
+				       gfn_t gfn, int max_level, int host_level,
+				       bool is_private)
 {
 	struct kvm_lpage_info *linfo;
-	int host_level;
 
 	max_level = min(max_level, max_huge_page_level);
 	for ( ; max_level > PG_LEVEL_4K; max_level--) {
@@ -3166,24 +3166,23 @@ static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
 			break;
 	}
 
-	if (is_private)
-		return max_level;
-
 	if (max_level == PG_LEVEL_4K)
 		return PG_LEVEL_4K;
 
-	host_level = host_pfn_mapping_level(kvm, gfn, slot);
+	if (!is_private) {
+		WARN_ON_ONCE(host_level != PG_LEVEL_NONE);
+		host_level = host_pfn_mapping_level(kvm, gfn, slot);
+	}
+	WARN_ON_ONCE(host_level == PG_LEVEL_NONE);
 	return min(host_level, max_level);
 }
 
 int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      const struct kvm_memory_slot *slot, gfn_t gfn,
-			      int max_level)
+			      int max_level, bool faultin_private)
 {
-	bool is_private = kvm_slot_can_be_private(slot) &&
-			  kvm_mem_is_private(kvm, gfn);
-
-	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, max_level, is_private);
+	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, max_level,
+					   PG_LEVEL_NONE, faultin_private);
 }
 
 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
@@ -3208,7 +3207,8 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	 */
 	fault->req_level = __kvm_mmu_max_mapping_level(vcpu->kvm, slot,
 						       fault->gfn, fault->max_level,
-						       fault->is_private);
+						       fault->host_level,
+						       kvm_is_faultin_private(fault));
 	if (fault->req_level == PG_LEVEL_4K ||
 	    fault->huge_page_disallowed)
 		return;
@@ -4349,6 +4349,7 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
 	}
 
 	max_level = kvm_max_level_for_order(max_order);
+	fault->host_level = max_level;
 	r = static_call(kvm_x86_gmem_max_level)(vcpu->kvm, fault->pfn,
 						fault->gfn, fault->is_private,
 						&max_level);
@@ -6818,7 +6819,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 		 */
 		if (sp->role.direct &&
 		    sp->role.level < kvm_mmu_max_mapping_level(kvm, slot, sp->gfn,
-							       PG_LEVEL_NUM)) {
+							       PG_LEVEL_NUM, false)) {
 			kvm_zap_one_rmap_spte(kvm, rmap_head, sptep);
 
 			if (kvm_available_flush_remote_tlbs_range())
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 315c123affaf..9d56f9ab16f7 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -327,6 +327,9 @@ struct kvm_page_fault {
 	 * is changing its own translation in the guest page tables.
 	 */
 	bool write_fault_to_shadow_pgtable;
+
+	/* valid only for private memslot && private gfn */
+	enum pg_level host_level;
 };
 
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
@@ -421,7 +424,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 
 int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      const struct kvm_memory_slot *slot, gfn_t gfn,
-			      int max_level);
+			      int max_level, bool faultin_private);
 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 void disallowed_hugepage_adjust(struct kvm_page_fault *fault,
 				u64 spte, int cur_level);
@@ -439,4 +442,11 @@ static inline bool kvm_hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t g
 }
 #endif
 
+static inline bool kvm_is_faultin_private(const struct kvm_page_fault *fault)
+{
+	if (IS_ENABLED(CONFIG_KVM_GENERIC_PRIVATE_MEM))
+		return fault->is_private && kvm_slot_can_be_private(fault->slot);
+	return false;
+}
+
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 556974361d36..d6ce8496803f 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -2183,7 +2183,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
 			continue;
 
 		max_mapping_level = kvm_mmu_max_mapping_level(kvm, slot,
-							      iter.gfn, PG_LEVEL_NUM);
+							      iter.gfn, PG_LEVEL_NUM, false);
 		if (max_mapping_level < iter.level)
 			continue;
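
Putting the hunks together, the private fault-in path after this patch looks
roughly like the sketch below.  It is a simplified approximation assembled
from the kvm_faultin_pfn_private() hunk above and earlier patches in the
series, not verbatim code: kvm_gmem_get_pfn() is the guest_memfd accessor
that reports the backing page order, and the error handling plus the way
max_level finally feeds fault->max_level are approximated here.

	static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
					   struct kvm_page_fault *fault)
	{
		int max_order, max_level, r;

		/* guest_memfd returns the pfn together with its page order. */
		r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn,
				     &fault->pfn, &max_order);
		if (r)
			return r;

		max_level = kvm_max_level_for_order(max_order);
		/*
		 * New in this patch: remember the backing page level so that
		 * kvm_mmu_hugepage_adjust() can compute req_level without
		 * walking host page tables, which don't map private pages.
		 */
		fault->host_level = max_level;

		/* The backend (e.g. TDX) may further limit the level. */
		r = static_call(kvm_x86_gmem_max_level)(vcpu->kvm, fault->pfn,
							fault->gfn, fault->is_private,
							&max_level);
		if (r)
			return r;

		/*
		 * max_level (possibly lowered by the backend) caps
		 * fault->max_level, while fault->host_level keeps the raw
		 * backing size for __kvm_mmu_max_mapping_level().
		 */
		fault->max_level = min(max_level, fault->max_level);
		return RET_PF_CONTINUE;
	}

Callers outside the private fault path (the zap_collapsible_spte_range() and
kvm_mmu_zap_collapsible_spte() hunks above) have no fault context, so they
pass faultin_private = false and __kvm_mmu_max_mapping_level() falls back to
host_pfn_mapping_level() as before.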