From patchwork Sun Oct 30 06:22:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12886 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666374wru; Sat, 29 Oct 2022 23:29:59 -0700 (PDT) X-Google-Smtp-Source: AMsMyM685Q6CLLwR2soHHthwyoUDwQ6uQPHk4Kub2THmskJFFYIefrAP4vsB871rji1Kd/6uWgJs X-Received: by 2002:a17:902:a611:b0:186:9ba2:148b with SMTP id u17-20020a170902a61100b001869ba2148bmr7911831plq.164.1667111399307; Sat, 29 Oct 2022 23:29:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111399; cv=none; d=google.com; s=arc-20160816; b=NdZzk8ySXeccgqy8XadU7qGqOYTHLgoeGDRUKNGmr9XZKFAQypKubYFivrRz0PcxF/ o4bXNmKxfxGcgGfVSJNyx5/8n2NHaRd/1lESpFDFdcmf4gy19YE40d56P12aQhKahrOq JNjLflgAWhfE1rf/JGxTArlzymfEWnfhqhxETr7WgySMPIUoYCxho0dwZ8PUXYd204C+ 5INz3j2d/cuVrOD0Cdtcwu8j9+ggAv0WnqMMXXfsIdl+c+qthdP/k1tOJp2xg4gQzOmF 7gGjokAqh2+sB8pBqm569UPc1agxax7hFfO0IgwQwzHGJDJpytUSaTFeI0fh8TTJtVai Lnsw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=A6N/4ZLQ0zOrZyJQWu34ICLgvKLzFF9/SaRxRR5gbLA=; b=0xrWysc42csTgZzFbTE3uQMIQp6lv63PrEiVETDRPO+cOq80JKrVPmc+sbHefsfOqd AqN/KUQJu84kxQLh3abJZ4PC9yhDlnvRybRbVdp77WS4IAMPFid18s12JxRYWXJfCYmY bsmfNIBvHVStyXHNyZGnbAePD5gbtuIbyicl5xu+ahUtAut7yR4r6rVLPnMhIu9H5iy+ hgzeIDJuv+6bz2v9tfKxIi88th7JPxa7H9f9P00Asr93a0SujaTK85zWZRCZfmV+b6Pz ObbIVg1lzplI9tyo0PENVGt3zN27wQYaQJoXeQPj3uCwIpir1/iYbraelVI8zESQ61F9 NyOQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iKwYjh93; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b6-20020a056a00114600b00536bbfa4994si4675502pfm.345.2022.10.29.23.29.47; Sat, 29 Oct 2022 23:29:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iKwYjh93; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230303AbiJ3G1w (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229956AbiJ3GY1 (ORCPT ); Sun, 30 Oct 2022 02:24:27 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 46328194; Sat, 29 Oct 2022 23:24:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111052; x=1698647052; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/w066rUQeUR1aKt7Qslv/eTMH7Oknm4OPUaonW9aJSU=; b=iKwYjh93t6Z+ldoj2nqWRjBp+9dlY2fyTTGszWfy0Kh50kH3q4oUZgdy H0pZZ4mPWpKX2PQxrjYhSlyGalPb2SknZRncx8zYUjgFTyDC64WB1dVT6 RTwsVdYWD9jv2BRjexsOKhjnOpW7B/WZFINMOHk8v/SFEppDA4RPzgPiQ 0faJZaLPjowD0sNFxk1rlLVXIHPIF6CQcg8SvrZmifhpv5uwhy/iDehGn 2KnyiwJcvJQTUa4BLx2yPWexHw7GlAZSylUmxd0X+D2R/ZVRIOoyDLFQj +e4Rw2aXhNuKnNC5GGH+BUSjwuWG2fChMv51KMZZ0NDfTs7eJ8LkxZRLi w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037165" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037165" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393014" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393014" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:05 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 052/108] KVM: x86/tdp_mmu: Ignore unsupported mmu operation on private GFNs Date: Sat, 29 Oct 2022 23:22:53 -0700 Message-Id: <32e2f5f567e1af3858e2896d705b66f90a908ff0.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093002462752662?= X-GMAIL-MSGID: =?utf-8?q?1748093002462752662?= From: Isaku Yamahata Some KVM MMU operations (dirty page logging, page migration, aging page) aren't supported for private GFNs (yet) with the first generation of TDX. Silently return on unsupported TDX KVM MMU operations. Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu/mmu.c | 3 ++ arch/x86/kvm/mmu/tdp_mmu.c | 73 +++++++++++++++++++++++++++++++++++--- arch/x86/kvm/x86.c | 3 ++ 3 files changed, 74 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 02e7b5cf3231..efc3b3f2dd12 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6588,6 +6588,9 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm, for_each_rmap_spte(rmap_head, &iter, sptep) { sp = sptep_to_sp(sptep); + /* Private page dirty logging is not supported yet. */ + KVM_BUG_ON(is_private_sptep(sptep), kvm); + /* * We cannot do huge page mapping for indirect shadow pages, * which are found on the last rmap (level = 1) when not using diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 0e053b96444a..4b207ce83ffe 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1469,7 +1469,8 @@ typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter, static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm, struct kvm_gfn_range *range, - tdp_handler_t handler) + tdp_handler_t handler, + bool only_shared) { struct kvm_mmu_page *root; struct tdp_iter iter; @@ -1480,9 +1481,23 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm, * into this helper allow blocking; it'd be dead, wasteful code. */ for_each_tdp_mmu_root(kvm, root, range->slot->as_id) { + gfn_t start; + gfn_t end; + + if (only_shared && is_private_sp(root)) + continue; + rcu_read_lock(); - tdp_root_for_each_leaf_pte(iter, root, range->start, range->end) + /* + * For TDX shared mapping, set GFN shared bit to the range, + * so the handler() doesn't need to set it, to avoid duplicated + * code in multiple handler()s. + */ + start = kvm_gfn_for_root(kvm, root, range->start); + end = kvm_gfn_for_root(kvm, root, range->end); + + tdp_root_for_each_leaf_pte(iter, root, start, end) ret |= handler(kvm, &iter, range); rcu_read_unlock(); @@ -1526,7 +1541,12 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter, bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range); + /* + * First TDX generation doesn't support clearing A bit for private + * mapping, since there's no secure EPT API to support it. However + * it's a legitimate request for TDX guest. + */ + return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range, true); } static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter, @@ -1537,7 +1557,8 @@ static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter, bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn); + /* The first TDX generation doesn't support A bit. */ + return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn, true); } static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter, @@ -1582,8 +1603,11 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) * No need to handle the remote TLB flush under RCU protection, the * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a * shadow page. See the WARN on pfn_changed in __handle_changed_spte(). + * + * .change_pte() callback should not happen for private page, because + * for now TDX private pages are pinned during VM's life time. */ - return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn); + return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn, true); } /* @@ -1637,6 +1661,14 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, lockdep_assert_held_read(&kvm->mmu_lock); + /* + * Because first TDX generation doesn't support write protecting private + * mappings and kvm_arch_dirty_log_supported(kvm) = false, it's a bug + * to reach here for guest TD. + */ + if (WARN_ON_ONCE(!kvm_arch_dirty_log_supported(kvm))) + return false; + for_each_valid_tdp_mmu_root_yield_safe(kvm, root, slot->as_id, true) spte_set |= wrprot_gfn_range(kvm, root, slot->base_gfn, slot->base_gfn + slot->npages, min_level); @@ -1902,6 +1934,14 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm, lockdep_assert_held_read(&kvm->mmu_lock); + /* + * First TDX generation doesn't support clearing dirty bit, + * since there's no secure EPT API to support it. It is a + * bug to reach here for TDX guest. + */ + if (WARN_ON_ONCE(!kvm_arch_dirty_log_supported(kvm))) + return false; + for_each_valid_tdp_mmu_root_yield_safe(kvm, root, slot->as_id, true) spte_set |= clear_dirty_gfn_range(kvm, root, slot->base_gfn, slot->base_gfn + slot->npages); @@ -1968,6 +2008,13 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root; lockdep_assert_held_write(&kvm->mmu_lock); + /* + * First TDX generation doesn't support clearing dirty bit, + * since there's no secure EPT API to support it. For now silently + * ignore KVM_CLEAR_DIRTY_LOG. + */ + if (!kvm_arch_dirty_log_supported(kvm)) + return; for_each_tdp_mmu_root(kvm, root, slot->as_id) clear_dirty_pt_masked(kvm, root, gfn, mask, wrprot); } @@ -2034,6 +2081,13 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm, lockdep_assert_held_read(&kvm->mmu_lock); + /* + * This should only be reachable when diryt-log is supported. It's a + * bug to reach here. + */ + if (WARN_ON_ONCE(!kvm_arch_dirty_log_supported(kvm))) + return; + for_each_valid_tdp_mmu_root_yield_safe(kvm, root, slot->as_id, true) zap_collapsible_spte_range(kvm, root, slot); } @@ -2087,6 +2141,15 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, bool spte_set = false; lockdep_assert_held_write(&kvm->mmu_lock); + + /* + * First TDX generation doesn't support write protecting private + * mappings, silently ignore the request. KVM_GET_DIRTY_LOG etc + * can reach here, no warning. + */ + if (!kvm_arch_dirty_log_supported(kvm)) + return false; + for_each_tdp_mmu_root(kvm, root, slot->as_id) spte_set |= write_protect_gfn(kvm, root, gfn, min_level); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 24d9bfd4c582..3868605462ed 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12897,6 +12897,9 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm, u32 new_flags = new ? new->flags : 0; bool log_dirty_pages = new_flags & KVM_MEM_LOG_DIRTY_PAGES; + if (!kvm_arch_dirty_log_supported(kvm) && log_dirty_pages) + return; + /* * Update CPU dirty logging if dirty logging is being toggled. This * applies to all operations.