From patchwork Wed Oct 19 16:56:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson <seanjc@google.com>
X-Patchwork-Id: 5759
Reply-To: Sean Christopherson <seanjc@google.com>
Date: Wed, 19 Oct 2022 16:56:14 +0000
In-Reply-To: <20221019165618.927057-1-seanjc@google.com>
References: <20221019165618.927057-1-seanjc@google.com>
X-Mailer: git-send-email 2.38.0.413.g74048e4d9e-goog
Message-ID: <20221019165618.927057-5-seanjc@google.com>
Subject: [PATCH v6 4/8] KVM: x86/mmu: Set disallowed_nx_huge_page in TDP MMU before setting SPTE
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang,
 David Matlack, Yan Zhao, Ben Gardon

Set nx_huge_page_disallowed in TDP MMU shadow pages before making the SP
visible to other readers, i.e. before setting its SPTE.  This will allow
KVM to query the flag when determining if a shadow page can be replaced
by an NX huge page without violating the rules of the mitigation.

Note, the shadow/legacy MMU holds mmu_lock for write, so it's impossible
for another CPU to see a shadow page without an up-to-date
nx_huge_page_disallowed, i.e. only the TDP MMU needs the complicated
dance.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack
Reviewed-by: Yan Zhao
---
 arch/x86/kvm/mmu/mmu.c          | 28 +++++++++++++++++++---------
 arch/x86/kvm/mmu/mmu_internal.h |  5 ++---
 arch/x86/kvm/mmu/tdp_mmu.c      | 31 ++++++++++++++++++-------------
 3 files changed, 39 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 99086a684dd2..57c7c52d137a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -802,11 +802,8 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
-void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			  bool nx_huge_page_possible)
+void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	sp->nx_huge_page_disallowed = true;
-
 	/*
 	 * If it's possible to replace the shadow page with an NX huge page,
 	 * i.e. if the shadow page is the only thing currently preventing KVM
@@ -815,8 +812,7 @@ void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 	 * on the list if KVM is reusing an existing shadow page, i.e. if KVM
 	 * links a shadow page at multiple points.
 	 */
-	if (!nx_huge_page_possible ||
-	    !list_empty(&sp->possible_nx_huge_page_link))
+	if (!list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
 	++kvm->stat.nx_lpage_splits;
@@ -824,6 +820,15 @@ void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 		      &kvm->arch.possible_nx_huge_pages);
 }
 
+static void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+				 bool nx_huge_page_possible)
+{
+	sp->nx_huge_page_disallowed = true;
+
+	if (nx_huge_page_possible)
+		track_possible_nx_huge_page(kvm, sp);
+}
+
 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	struct kvm_memslots *slots;
@@ -841,10 +846,8 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 	kvm_mmu_gfn_allow_lpage(slot, gfn);
 }
 
-void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	sp->nx_huge_page_disallowed = false;
-
 	if (list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
@@ -852,6 +855,13 @@ void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 	list_del_init(&sp->possible_nx_huge_page_link);
 }
 
+static void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+	sp->nx_huge_page_disallowed = false;
+
+	untrack_possible_nx_huge_page(kvm, sp);
+}
+
 static struct kvm_memory_slot *
 gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn,
 			    bool no_dirty_log)

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 67879459a25c..22152241bd29 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -328,8 +328,7 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
-void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			  bool nx_huge_page_possible);
-void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 73eb28ed1f03..059231c82345 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -403,8 +403,11 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
 		lockdep_assert_held_write(&kvm->mmu_lock);
 
 	list_del(&sp->link);
-	if (sp->nx_huge_page_disallowed)
-		unaccount_nx_huge_page(kvm, sp);
+
+	if (sp->nx_huge_page_disallowed) {
+		sp->nx_huge_page_disallowed = false;
+		untrack_possible_nx_huge_page(kvm, sp);
+	}
 
 	if (shared)
 		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
@@ -1118,16 +1121,13 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
  * @kvm: kvm instance
  * @iter: a tdp_iter instance currently on the SPTE that should be set
  * @sp: The new TDP page table to install.
- * @account_nx: True if this page table is being installed to split a
- *		non-executable huge page.
  * @shared: This operation is running under the MMU lock in read mode.
  *
  * Returns: 0 if the new page table was installed. Non-0 if the page table
  *          could not be installed (e.g. the atomic compare-exchange failed).
  */
 static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
-			   struct kvm_mmu_page *sp, bool account_nx,
-			   bool shared)
+			   struct kvm_mmu_page *sp, bool shared)
 {
 	u64 spte = make_nonleaf_spte(sp->spt, !kvm_ad_enabled());
 	int ret = 0;
@@ -1142,8 +1142,6 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 
 	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
-	if (account_nx)
-		account_nx_huge_page(kvm, sp, true);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
 	tdp_account_mmu_page(kvm, sp);
@@ -1157,6 +1155,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	struct kvm *kvm = vcpu->kvm;
 	struct tdp_iter iter;
 	struct kvm_mmu_page *sp;
 	int ret;
@@ -1193,9 +1192,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		}
 
 		if (!is_shadow_present_pte(iter.old_spte)) {
-			bool account_nx = fault->huge_page_disallowed &&
-					  fault->req_level >= iter.level;
-
 			/*
 			 * If SPTE has been frozen by another thread, just
 			 * give up and retry, avoiding unnecessary page table
@@ -1207,10 +1203,19 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			sp = tdp_mmu_alloc_sp(vcpu);
 			tdp_mmu_init_child_sp(sp, &iter);
 
-			if (tdp_mmu_link_sp(vcpu->kvm, &iter, sp, account_nx, true)) {
+			sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
+
+			if (tdp_mmu_link_sp(kvm, &iter, sp, true)) {
 				tdp_mmu_free_sp(sp);
 				break;
 			}
+
+			if (fault->huge_page_disallowed &&
+			    fault->req_level >= iter.level) {
+				spin_lock(&kvm->arch.tdp_mmu_pages_lock);
+				track_possible_nx_huge_page(kvm, sp);
+				spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
+			}
 		}
 	}
 
@@ -1498,7 +1503,7 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
 	 * correctness standpoint since the translation will be the same either
 	 * way.
 	 */
-	ret = tdp_mmu_link_sp(kvm, iter, sp, false, shared);
+	ret = tdp_mmu_link_sp(kvm, iter, sp, shared);
 
 	if (ret)
 		goto out;
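
---

Editor's note: the ordering the changelog describes is the classic
"initialize, then publish" pattern.  Below is a minimal, self-contained
userspace sketch of that pattern for readers following along.  It is not
KVM code; every name in it is hypothetical, and KVM itself gets the
ordering from installing the SPTE (the atomic compare-exchange that makes
the page table reachable) rather than from an explicit release store.
The point is only that the flag is written before the object is made
visible, so any reader that can see the object sees an up-to-date flag.

/*
 * Illustrative sketch of init-before-publish, using C11 atomics.
 * All names here are hypothetical.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct page_table {
	bool nx_huge_page_disallowed;	/* must be set before publication */
	/* ... other fields ... */
};

static _Atomic(struct page_table *) published_pt;

/* Writer: mirrors setting sp->nx_huge_page_disallowed before the SPTE. */
static void publish_page_table(bool disallowed)
{
	struct page_table *pt = calloc(1, sizeof(*pt));

	pt->nx_huge_page_disallowed = disallowed;	/* initialize first... */
	atomic_store_explicit(&published_pt, pt,
			      memory_order_release);	/* ...then publish */
}

/* Reader: analogous to another task walking the paging structures. */
static bool reader_sees_flag(void)
{
	struct page_table *pt =
		atomic_load_explicit(&published_pt, memory_order_acquire);

	/*
	 * The acquire load pairs with the release store above: if pt is
	 * visible, so is its nx_huge_page_disallowed value.
	 */
	return pt && pt->nx_huge_page_disallowed;
}

int main(void)
{
	/* Single-threaded smoke test; the ordering matters across threads. */
	publish_page_table(true);
	return reader_sees_flag() ? 0 : 1;
}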