From patchwork Wed Oct 19 16:56:11 2022
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 5755
Reply-To: Sean Christopherson
Date: Wed, 19 Oct 2022 16:56:11 +0000
In-Reply-To: <20221019165618.927057-1-seanjc@google.com>
References: <20221019165618.927057-1-seanjc@google.com>
Message-ID: <20221019165618.927057-2-seanjc@google.com>
Subject: [PATCH v6 1/8] KVM: x86/mmu: Tag disallowed NX huge pages even if they're not tracked
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang,
 David Matlack, Yan Zhao, Ben Gardon

Tag shadow pages that cannot be replaced with an NX huge page regardless
of whether or not zapping the page would allow KVM to immediately create
a huge page, e.g. because something else prevents creating a huge page.
I.e. track pages that are disallowed from being NX huge pages regardless
of whether or not the page could have been huge at the time of fault.

KVM currently tracks pages that were disallowed from being huge due to
the NX workaround if and only if the page could otherwise be huge.  But
that fails to handle the scenario where whatever restriction prevented
KVM from installing a huge page goes away, e.g. if dirty logging is
disabled, the host mapping level changes, etc...

Failure to tag shadow pages appropriately could theoretically lead to
false negatives, e.g. if a fetch fault requests a small page and thus
isn't tracked, and a read/write fault later requests a huge page, KVM
will not reject the huge page as it should.

To avoid yet another flag, initialize the list_head and use list_empty()
to determine whether or not a page is on the list of NX huge pages that
should be recovered.

Note, the TDP MMU accounting is still flawed, as fixing the TDP MMU is
more involved due to mmu_lock being held for read.  This will be
addressed in a future commit.

Fixes: 5bcaf3e1715f ("KVM: x86/mmu: Account NX huge page disallowed iff huge page was requested")
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c          | 32 ++++++++++++++++++++++++--------
 arch/x86/kvm/mmu/mmu_internal.h | 10 +++++++++-
 arch/x86/kvm/mmu/paging_tmpl.h  |  6 +++---
 arch/x86/kvm/mmu/tdp_mmu.c      |  4 +++-
 4 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6f81539061d6..f1e089dfdd22 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -802,15 +802,25 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible)
 {
-	if (sp->lpage_disallowed)
+	sp->lpage_disallowed = true;
+
+	/*
+	 * If it's possible to replace the shadow page with an NX huge page,
+	 * i.e. if the shadow page is the only thing currently preventing KVM
+	 * from using a huge page, add the shadow page to the list of "to be
+	 * zapped for NX recovery" pages.  Note, the shadow page can already be
+	 * on the list if KVM is reusing an existing shadow page, i.e. if KVM
+	 * links a shadow page at multiple points.
+	 */
+	if (!nx_huge_page_possible || !list_empty(&sp->lpage_disallowed_link))
 		return;
 
 	++kvm->stat.nx_lpage_splits;
 	list_add_tail(&sp->lpage_disallowed_link,
 		      &kvm->arch.lpage_disallowed_mmu_pages);
-	sp->lpage_disallowed = true;
 }
 
 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
@@ -832,9 +842,13 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 
 void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	--kvm->stat.nx_lpage_splits;
 	sp->lpage_disallowed = false;
-	list_del(&sp->lpage_disallowed_link);
+
+	if (list_empty(&sp->lpage_disallowed_link))
+		return;
+
+	--kvm->stat.nx_lpage_splits;
+	list_del_init(&sp->lpage_disallowed_link);
 }
 
 static struct kvm_memory_slot *
@@ -2129,6 +2143,8 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm,
 
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
+	INIT_LIST_HEAD(&sp->lpage_disallowed_link);
+
 	/*
 	 * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages()
 	 * depends on valid pages being added to the head of the list.  See
@@ -3126,9 +3142,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->is_tdp && fault->huge_page_disallowed &&
-		    fault->req_level >= it.level)
-			account_huge_nx_page(vcpu->kvm, sp);
+		if (fault->is_tdp && fault->huge_page_disallowed)
+			account_huge_nx_page(vcpu->kvm, sp,
+					     fault->req_level >= it.level);
 	}
 
 	if (WARN_ON_ONCE(it.level != fault->goal_level))
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 582def531d4d..cca1ad75d096 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -100,6 +100,13 @@ struct kvm_mmu_page {
 		};
 	};
 
+	/*
+	 * Tracks shadow pages that, if zapped, would allow KVM to create an NX
+	 * huge page.  A shadow page will have lpage_disallowed set but not be
+	 * on the list if a huge page is disallowed for other reasons, e.g.
+	 * because KVM is shadowing a PTE at the same gfn, the memslot isn't
+	 * properly aligned, etc...
+	 */
 	struct list_head lpage_disallowed_link;
 #ifdef CONFIG_X86_32
 	/*
 	 * Used out of the mmu-lock to avoid reading spte values while an
@@ -315,7 +322,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible);
 void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 5ab5f94dcb6f..8fd0c4e1e575 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -713,9 +713,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->huge_page_disallowed &&
-		    fault->req_level >= it.level)
-			account_huge_nx_page(vcpu->kvm, sp);
+		if (fault->huge_page_disallowed)
+			account_huge_nx_page(vcpu->kvm, sp,
+					     fault->req_level >= it.level);
 	}
 
 	if (WARN_ON_ONCE(it.level != fault->goal_level))
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 672f0432d777..80a4a1a09131 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -284,6 +284,8 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
 static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
 			    gfn_t gfn, union kvm_mmu_page_role role)
 {
+	INIT_LIST_HEAD(&sp->lpage_disallowed_link);
+
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
 	sp->role = role;
@@ -1141,7 +1143,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
 	if (account_nx)
-		account_huge_nx_page(kvm, sp);
+		account_huge_nx_page(kvm, sp, true);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
 	tdp_account_mmu_page(kvm, sp);
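
The list_empty()-as-flag idiom the patch above relies on is easy to see in
isolation.  The following stand-alone sketch reimplements just enough of the
kernel's list helpers to demonstrate it; it is illustration only, not kernel
code, and the "page"/"recovery_list" names are invented for the example:

#include <assert.h>

struct list_head {
        struct list_head *next, *prev;
};

/* A list_head pointing at itself is "empty", exactly like INIT_LIST_HEAD(). */
static void init_list_head(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
        n->prev = h->prev;
        n->next = h;
        h->prev->next = n;
        h->prev = n;
}

/* list_del_init() re-initializes the entry so list_empty() is true again. */
static void list_del_init(struct list_head *n)
{
        n->prev->next = n->next;
        n->next->prev = n->prev;
        init_list_head(n);
}

struct page { struct list_head link; };

int main(void)
{
        struct list_head recovery_list;
        struct page page;

        init_list_head(&recovery_list);
        init_list_head(&page.link);        /* as in kvm_mmu_alloc_shadow_page() */
        assert(list_empty(&page.link));    /* "not tracked", no extra bool needed */

        list_add_tail(&page.link, &recovery_list);
        assert(!list_empty(&page.link));   /* "tracked" */

        list_del_init(&page.link);         /* plain list_del() would not */
        assert(list_empty(&page.link));    /*   restore the "not tracked" state */
        return 0;
}

This is also why the patch switches unaccount_huge_nx_page() from list_del()
to list_del_init(): list_del() poisons the entry instead of re-initializing
it, so list_empty() could no longer stand in for the "is tracked" flag.
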
From patchwork Wed Oct 19 16:56:12 2022
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 5756
Reply-To: Sean Christopherson
Date: Wed, 19 Oct 2022 16:56:12 +0000
In-Reply-To: <20221019165618.927057-1-seanjc@google.com>
References: <20221019165618.927057-1-seanjc@google.com>
Message-ID: <20221019165618.927057-3-seanjc@google.com>
Subject: [PATCH v6 2/8] KVM: x86/mmu: Rename NX huge pages fields/functions for consistency
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang,
 David Matlack, Yan Zhao, Ben Gardon

Rename most of the variables/functions involved in the NX huge page
mitigation to provide consistency, e.g. lpage vs. huge page, and NX huge
vs. huge NX, and also to provide clarity, e.g. to make it obvious the
flag applies only to the NX huge page mitigation, not to any condition
that prevents creating a huge page.

Add a comment explaining what the newly named "possible_nx_huge_pages"
tracks.

Leave the nx_lpage_splits stat alone as the name is ABI and thus set in
stone.

Signed-off-by: Sean Christopherson
Reviewed-by: Mingwei Zhang
---
 arch/x86/include/asm/kvm_host.h | 19 +++++++--
 arch/x86/kvm/mmu/mmu.c          | 71 +++++++++++++++++----------------
 arch/x86/kvm/mmu/mmu_internal.h | 22 ++++++----
 arch/x86/kvm/mmu/paging_tmpl.h  |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  8 ++--
 5 files changed, 71 insertions(+), 51 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7551b6f9c31c..0333dbb8ec85 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1151,7 +1151,18 @@ struct kvm_arch {
 	struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
 	struct list_head active_mmu_pages;
 	struct list_head zapped_obsolete_pages;
-	struct list_head lpage_disallowed_mmu_pages;
+	/*
+	 * A list of kvm_mmu_page structs that, if zapped, could possibly be
+	 * replaced by an NX huge page.  A shadow page is on this list if its
+	 * existence disallows an NX huge page (nx_huge_page_disallowed is set)
+	 * and there are no other conditions that prevent a huge page, e.g.
+	 * the backing host page is huge, dirtly logging is not enabled for its
+	 * memslot, etc...  Note, zapping shadow pages on this list doesn't
+	 * guarantee an NX huge page will be created in its stead, e.g. if the
+	 * guest attempts to execute from the region then KVM obviously can't
+	 * create an NX huge page (without hanging the guest).
+	 */
+	struct list_head possible_nx_huge_pages;
 	struct kvm_page_track_notifier_node mmu_sp_tracker;
 	struct kvm_page_track_notifier_head track_notifier_head;
 	/*
@@ -1267,7 +1278,7 @@ struct kvm_arch {
 	bool sgx_provisioning_allowed;
 
 	struct kvm_pmu_event_filter __rcu *pmu_event_filter;
-	struct task_struct *nx_lpage_recovery_thread;
+	struct task_struct *nx_huge_page_recovery_thread;
 
 #ifdef CONFIG_X86_64
 	/*
@@ -1312,8 +1323,8 @@ struct kvm_arch {
 	 *  - tdp_mmu_roots (above)
 	 *  - tdp_mmu_pages (above)
 	 *  - the link field of kvm_mmu_page structs used by the TDP MMU
-	 *  - lpage_disallowed_mmu_pages
-	 *  - the lpage_disallowed_link field of kvm_mmu_page structs used
+	 *  - possible_nx_huge_pages;
+	 *  - the possible_nx_huge_page_link field of kvm_mmu_page structs used
 	 *    by the TDP MMU
 	 * It is acceptable, but not necessary, to acquire this lock when
 	 * the thread holds the MMU lock in write mode.
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f1e089dfdd22..5dd98cdc5283 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -802,10 +802,10 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 			  bool nx_huge_page_possible)
 {
-	sp->lpage_disallowed = true;
+	sp->nx_huge_page_disallowed = true;
 
 	/*
 	 * If it's possible to replace the shadow page with an NX huge page,
@@ -815,12 +815,13 @@ void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 	 * on the list if KVM is reusing an existing shadow page, i.e. if KVM
 	 * links a shadow page at multiple points.
 	 */
-	if (!nx_huge_page_possible || !list_empty(&sp->lpage_disallowed_link))
+	if (!nx_huge_page_possible ||
+	    !list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
 	++kvm->stat.nx_lpage_splits;
-	list_add_tail(&sp->lpage_disallowed_link,
-		      &kvm->arch.lpage_disallowed_mmu_pages);
+	list_add_tail(&sp->possible_nx_huge_page_link,
+		      &kvm->arch.possible_nx_huge_pages);
 }
 
 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
@@ -840,15 +841,15 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 	kvm_mmu_gfn_allow_lpage(slot, gfn);
 }
 
-void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	sp->lpage_disallowed = false;
+	sp->nx_huge_page_disallowed = false;
 
-	if (list_empty(&sp->lpage_disallowed_link))
+	if (list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
 	--kvm->stat.nx_lpage_splits;
-	list_del_init(&sp->lpage_disallowed_link);
+	list_del_init(&sp->possible_nx_huge_page_link);
 }
 
 static struct kvm_memory_slot *
@@ -2143,7 +2144,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm,
 
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
-	INIT_LIST_HEAD(&sp->lpage_disallowed_link);
+	INIT_LIST_HEAD(&sp->possible_nx_huge_page_link);
 
 	/*
 	 * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages()
@@ -2502,8 +2503,8 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm,
 		zapped_root = !is_obsolete_sp(kvm, sp);
 	}
 
-	if (sp->lpage_disallowed)
-		unaccount_huge_nx_page(kvm, sp);
+	if (sp->nx_huge_page_disallowed)
+		unaccount_nx_huge_page(kvm, sp);
 
 	sp->role.invalid = 1;
 
@@ -3143,7 +3144,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 
 		link_shadow_page(vcpu, it.sptep, sp);
 		if (fault->is_tdp && fault->huge_page_disallowed)
-			account_huge_nx_page(vcpu->kvm, sp,
+			account_nx_huge_page(vcpu->kvm, sp,
 					     fault->req_level >= it.level);
 	}
 
@@ -5987,7 +5988,7 @@ int kvm_mmu_init_vm(struct kvm *kvm)
 	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
-	INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
+	INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages);
 	spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
 
 	r = kvm_mmu_init_tdp_mmu(kvm);
@@ -6672,7 +6673,7 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
 			kvm_mmu_zap_all_fast(kvm);
 			mutex_unlock(&kvm->slots_lock);
 
-			wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+			wake_up_process(kvm->arch.nx_huge_page_recovery_thread);
 		}
 		mutex_unlock(&kvm_lock);
 	}
@@ -6804,7 +6805,7 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel
 		mutex_lock(&kvm_lock);
 
 		list_for_each_entry(kvm, &vm_list, vm_list)
-			wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+			wake_up_process(kvm->arch.nx_huge_page_recovery_thread);
 
 		mutex_unlock(&kvm_lock);
 	}
@@ -6812,7 +6813,7 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel
 	return err;
 }
 
-static void kvm_recover_nx_lpages(struct kvm *kvm)
+static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 {
 	unsigned long nx_lpage_splits = kvm->stat.nx_lpage_splits;
 	int rcu_idx;
@@ -6835,23 +6836,25 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
 	ratio = READ_ONCE(nx_huge_pages_recovery_ratio);
 	to_zap = ratio ? DIV_ROUND_UP(nx_lpage_splits, ratio) : 0;
 	for ( ; to_zap; --to_zap) {
-		if (list_empty(&kvm->arch.lpage_disallowed_mmu_pages))
+		if (list_empty(&kvm->arch.possible_nx_huge_pages))
 			break;
 
 		/*
 		 * We use a separate list instead of just using active_mmu_pages
-		 * because the number of lpage_disallowed pages is expected to
-		 * be relatively small compared to the total.
+		 * because the number of shadow pages that be replaced with an
+		 * NX huge page is expected to be relatively small compared to
+		 * the total number of shadow pages.  And because the TDP MMU
+		 * doesn't use active_mmu_pages.
 		 */
-		sp = list_first_entry(&kvm->arch.lpage_disallowed_mmu_pages,
+		sp = list_first_entry(&kvm->arch.possible_nx_huge_pages,
 				      struct kvm_mmu_page,
-				      lpage_disallowed_link);
-		WARN_ON_ONCE(!sp->lpage_disallowed);
+				      possible_nx_huge_page_link);
+		WARN_ON_ONCE(!sp->nx_huge_page_disallowed);
 		if (is_tdp_mmu_page(sp)) {
 			flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
 		} else {
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
-			WARN_ON_ONCE(sp->lpage_disallowed);
+			WARN_ON_ONCE(sp->nx_huge_page_disallowed);
 		}
 
 		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
@@ -6872,7 +6875,7 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
 	srcu_read_unlock(&kvm->srcu, rcu_idx);
 }
 
-static long get_nx_lpage_recovery_timeout(u64 start_time)
+static long get_nx_huge_page_recovery_timeout(u64 start_time)
 {
 	bool enabled;
 	uint period;
@@ -6883,19 +6886,19 @@ static long get_nx_lpage_recovery_timeout(u64 start_time)
 		       : MAX_SCHEDULE_TIMEOUT;
 }
 
-static int kvm_nx_lpage_recovery_worker(struct kvm *kvm, uintptr_t data)
+static int kvm_nx_huge_page_recovery_worker(struct kvm *kvm, uintptr_t data)
 {
 	u64 start_time;
 	long remaining_time;
 
 	while (true) {
 		start_time = get_jiffies_64();
-		remaining_time = get_nx_lpage_recovery_timeout(start_time);
+		remaining_time = get_nx_huge_page_recovery_timeout(start_time);
 
 		set_current_state(TASK_INTERRUPTIBLE);
 		while (!kthread_should_stop() && remaining_time > 0) {
 			schedule_timeout(remaining_time);
-			remaining_time = get_nx_lpage_recovery_timeout(start_time);
+			remaining_time = get_nx_huge_page_recovery_timeout(start_time);
 			set_current_state(TASK_INTERRUPTIBLE);
 		}
 
@@ -6904,7 +6907,7 @@ static int kvm_nx_lpage_recovery_worker(struct kvm *kvm, uintptr_t data)
 		if (kthread_should_stop())
 			return 0;
 
-		kvm_recover_nx_lpages(kvm);
+		kvm_recover_nx_huge_pages(kvm);
 	}
 }
 
@@ -6912,17 +6915,17 @@ int kvm_mmu_post_init_vm(struct kvm *kvm)
 {
 	int err;
 
-	err = kvm_vm_create_worker_thread(kvm, kvm_nx_lpage_recovery_worker, 0,
+	err = kvm_vm_create_worker_thread(kvm, kvm_nx_huge_page_recovery_worker, 0,
 					  "kvm-nx-lpage-recovery",
-					  &kvm->arch.nx_lpage_recovery_thread);
+					  &kvm->arch.nx_huge_page_recovery_thread);
 	if (!err)
-		kthread_unpark(kvm->arch.nx_lpage_recovery_thread);
+		kthread_unpark(kvm->arch.nx_huge_page_recovery_thread);
 
 	return err;
 }
 
 void kvm_mmu_pre_destroy_vm(struct kvm *kvm)
 {
-	if (kvm->arch.nx_lpage_recovery_thread)
-		kthread_stop(kvm->arch.nx_lpage_recovery_thread);
+	if (kvm->arch.nx_huge_page_recovery_thread)
+		kthread_stop(kvm->arch.nx_huge_page_recovery_thread);
 }
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index cca1ad75d096..67879459a25c 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -57,7 +57,13 @@ struct kvm_mmu_page {
 	bool tdp_mmu_page;
 	bool unsync;
 	u8 mmu_valid_gen;
-	bool lpage_disallowed; /* Can't be replaced by an equiv large page */
+
+	/*
+	 * The shadow page can't be replaced by an equivalent huge page
+	 * because it is being used to map an executable page in the guest
+	 * and the NX huge page mitigation is enabled.
+	 */
+	bool nx_huge_page_disallowed;
 
 	/*
 	 * The following two entries are used to key the shadow page in the
@@ -102,12 +108,12 @@ struct kvm_mmu_page {
 
 	/*
 	 * Tracks shadow pages that, if zapped, would allow KVM to create an NX
-	 * huge page.  A shadow page will have lpage_disallowed set but not be
-	 * on the list if a huge page is disallowed for other reasons, e.g.
-	 * because KVM is shadowing a PTE at the same gfn, the memslot isn't
-	 * properly aligned, etc...
+	 * huge page.  A shadow page will have nx_huge_page_disallowed set but
+	 * not be on the list if a huge page is disallowed for other reasons,
+	 * e.g. because KVM is shadowing a PTE at the same gfn, the memslot
+	 * isn't properly aligned, etc...
 	 */
-	struct list_head lpage_disallowed_link;
+	struct list_head possible_nx_huge_page_link;
 #ifdef CONFIG_X86_32
 	/*
 	 * Used out of the mmu-lock to avoid reading spte values while an
@@ -322,8 +328,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 			  bool nx_huge_page_possible);
-void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 8fd0c4e1e575..0f6455072055 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -714,7 +714,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 
 		link_shadow_page(vcpu, it.sptep, sp);
 		if (fault->huge_page_disallowed)
-			account_huge_nx_page(vcpu->kvm, sp,
+			account_nx_huge_page(vcpu->kvm, sp,
 					     fault->req_level >= it.level);
 	}
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 80a4a1a09131..73eb28ed1f03 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -284,7 +284,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
 static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
 			    gfn_t gfn, union kvm_mmu_page_role role)
 {
-	INIT_LIST_HEAD(&sp->lpage_disallowed_link);
+	INIT_LIST_HEAD(&sp->possible_nx_huge_page_link);
 
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
@@ -403,8 +403,8 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
 	lockdep_assert_held_write(&kvm->mmu_lock);
 
 	list_del(&sp->link);
-	if (sp->lpage_disallowed)
-		unaccount_huge_nx_page(kvm, sp);
+	if (sp->nx_huge_page_disallowed)
+		unaccount_nx_huge_page(kvm, sp);
 
 	if (shared)
 		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
@@ -1143,7 +1143,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
 	if (account_nx)
-		account_huge_nx_page(kvm, sp, true);
+		account_nx_huge_page(kvm, sp, true);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
 	tdp_account_mmu_page(kvm, sp);
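
For a concrete feel for how kvm_recover_nx_huge_pages() paces itself, here
is the to_zap computation from the hunk above as a stand-alone program; the
stat value is invented for the example, and 60 mirrors the upstream default
of the nx_huge_pages_recovery_ratio module parameter:

#include <stdio.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
        unsigned long nx_lpage_splits = 1000;  /* hypothetical stat value */
        unsigned int ratio = 60;               /* recovery ratio, default 60 */

        /* Same expression as in kvm_recover_nx_huge_pages(). */
        unsigned long to_zap = ratio ? DIV_ROUND_UP(nx_lpage_splits, ratio) : 0;

        printf("zap %lu of %lu possible NX huge pages this period\n",
               to_zap, nx_lpage_splits);
        return 0;
}

With the default ratio, each recovery period zaps roughly 1/60th of the
shadow pages currently blocking NX huge pages, so the per-period work stays
bounded even when the list is long.
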
From patchwork Wed Oct 19 16:56:13 2022
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 5757
Reply-To: Sean Christopherson
Date: Wed, 19 Oct 2022 16:56:13 +0000
In-Reply-To: <20221019165618.927057-1-seanjc@google.com>
References: <20221019165618.927057-1-seanjc@google.com>
Message-ID: <20221019165618.927057-4-seanjc@google.com>
Subject: [PATCH v6 3/8] KVM: x86/mmu: Properly account NX huge page workaround for nonpaging MMUs
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang,
 David Matlack, Yan Zhao, Ben Gardon

Account and track NX huge pages for nonpaging MMUs so that a future
enhancement to precisely check if a shadow page can't be replaced by an
NX huge page doesn't get false positives.  Without correct tracking, KVM
can get stuck in a loop if an instruction is fetching and writing data
on the same huge page, e.g. KVM installs a small executable page on the
fetch fault, replaces it with an NX huge page on the write fault, and
faults again on the fetch.

Alternatively, and perhaps ideally, KVM would simply not enforce the
workaround for nonpaging MMUs.  The guest has no page tables to abuse
and KVM is guaranteed to switch to a different MMU on CR0.PG being
toggled, so there are no security or performance concerns.  However,
getting make_spte() to play nice now and in the future is unnecessarily
complex.

In the current code base, make_spte() can enforce the mitigation if TDP
is enabled or the MMU is indirect, but make_spte() may not always have a
vCPU/MMU to work with, e.g. if KVM were to support in-line huge page
promotion when disabling dirty logging.  Without a vCPU/MMU, KVM could
either pass in the correct information and/or derive it from the shadow
page, but the former is ugly and the latter subtly non-trivial due to
the possibility of direct shadow pages in indirect MMUs.

Given that using shadow paging with an unpaged guest is far from top
priority _and_ has been subjected to the workaround since its inception,
keep it simple and just fix the accounting glitch.
Signed-off-by: Sean Christopherson
Reviewed-by: David Matlack
Reviewed-by: Mingwei Zhang
---
 arch/x86/kvm/mmu/mmu.c  |  2 +-
 arch/x86/kvm/mmu/spte.c | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5dd98cdc5283..99086a684dd2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3143,7 +3143,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->is_tdp && fault->huge_page_disallowed)
+		if (fault->huge_page_disallowed)
 			account_nx_huge_page(vcpu->kvm, sp,
 					     fault->req_level >= it.level);
 	}
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 2e08b2a45361..c0fd7e049b4e 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -161,6 +161,18 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	if (!prefetch)
 		spte |= spte_shadow_accessed_mask(spte);
 
+	/*
+	 * For simplicity, enforce the NX huge page mitigation even if not
+	 * strictly necessary.  KVM could ignore the mitigation if paging is
+	 * disabled in the guest, as the guest doesn't have an page tables to
+	 * abuse.  But to safely ignore the mitigation, KVM would have to
+	 * ensure a new MMU is loaded (or all shadow pages zapped) when CR0.PG
+	 * is toggled on, and that's a net negative for performance when TDP is
+	 * enabled.  When TDP is disabled, KVM will always switch to a new MMU
+	 * when CR0.PG is toggled, but leveraging that to ignore the mitigation
+	 * would tie make_spte() further to vCPU/MMU state, and add complexity
+	 * just to optimize a mode that is anything but performance critical.
+	 */
 	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
 	    is_nx_huge_page_enabled(vcpu->kvm)) {
 		pte_access &= ~ACC_EXEC_MASK;
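
The check make_spte() is enforcing can be restated outside the kernel.  A
loose, stand-alone sketch of the quoted logic, with invented constants
standing in for KVM's access mask and page-level definitions (this is not
the KVM implementation, only the shape of it):

#include <stdbool.h>
#include <stdio.h>

#define ACC_EXEC_MASK 0x1u  /* invented value, stands in for KVM's mask */
#define PG_LEVEL_4K   1

/*
 * A would-be executable mapping above 4KiB loses its exec permission
 * while the mitigation is on, so guest instruction fetches must go
 * through 4KiB mappings instead.
 */
static unsigned int nx_mitigate(int level, unsigned int pte_access,
                                bool nx_huge_page_enabled)
{
        if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
            nx_huge_page_enabled)
                pte_access &= ~ACC_EXEC_MASK;
        return pte_access;
}

int main(void)
{
        /* A 2MiB-level executable mapping with the mitigation enabled. */
        unsigned int access = nx_mitigate(2, ACC_EXEC_MASK, true);

        printf("exec allowed after mitigation: %d\n",
               !!(access & ACC_EXEC_MASK));
        return 0;
}

Note how the check depends only on the level, the requested access, and the
global mitigation knob, which is why this patch can apply it uniformly to
nonpaging MMUs as well.
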
From patchwork Wed Oct 19 16:56:14 2022
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 5759
Reply-To: Sean Christopherson
Date: Wed, 19 Oct 2022 16:56:14 +0000
In-Reply-To: <20221019165618.927057-1-seanjc@google.com>
References: <20221019165618.927057-1-seanjc@google.com>
Message-ID: <20221019165618.927057-5-seanjc@google.com>
Subject: [PATCH v6 4/8] KVM: x86/mmu: Set disallowed_nx_huge_page in TDP MMU before setting SPTE
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang,
 David Matlack, Yan Zhao, Ben Gardon

Set nx_huge_page_disallowed in TDP MMU shadow pages before making the SP
visible to other readers, i.e. before setting its SPTE.  This will allow
KVM to query the flag when determining if a shadow page can be replaced
by an NX huge page without violating the rules of the mitigation.

Note, the shadow/legacy MMU holds mmu_lock for write, so it's impossible
for another CPU to see a shadow page without an up-to-date
nx_huge_page_disallowed, i.e. only the TDP MMU needs the complicated
dance.

Signed-off-by: Sean Christopherson
Reviewed-by: David Matlack
Reviewed-by: Yan Zhao
---
 arch/x86/kvm/mmu/mmu.c          | 28 +++++++++++++++++++---------
 arch/x86/kvm/mmu/mmu_internal.h |  5 ++---
 arch/x86/kvm/mmu/tdp_mmu.c      | 31 ++++++++++++++++++------------
 3 files changed, 39 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 99086a684dd2..57c7c52d137a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -802,11 +802,8 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
-void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			  bool nx_huge_page_possible)
+void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	sp->nx_huge_page_disallowed = true;
-
 	/*
 	 * If it's possible to replace the shadow page with an NX huge page,
 	 * i.e. if the shadow page is the only thing currently preventing KVM
@@ -815,8 +812,7 @@ void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 	 * on the list if KVM is reusing an existing shadow page, i.e. if KVM
 	 * links a shadow page at multiple points.
 	 */
-	if (!nx_huge_page_possible ||
-	    !list_empty(&sp->possible_nx_huge_page_link))
+	if (!list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
 	++kvm->stat.nx_lpage_splits;
@@ -824,6 +820,15 @@ void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 		      &kvm->arch.possible_nx_huge_pages);
 }
 
+static void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+				 bool nx_huge_page_possible)
+{
+	sp->nx_huge_page_disallowed = true;
+
+	if (nx_huge_page_possible)
+		track_possible_nx_huge_page(kvm, sp);
+}
+
 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	struct kvm_memslots *slots;
@@ -841,10 +846,8 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 	kvm_mmu_gfn_allow_lpage(slot, gfn);
 }
 
-void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
-	sp->nx_huge_page_disallowed = false;
-
 	if (list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
@@ -852,6 +855,13 @@ void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 	list_del_init(&sp->possible_nx_huge_page_link);
 }
 
+static void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+	sp->nx_huge_page_disallowed = false;
+
+	untrack_possible_nx_huge_page(kvm, sp);
+}
+
 static struct kvm_memory_slot *
 gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn,
 			    bool no_dirty_log)
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 67879459a25c..22152241bd29 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -328,8 +328,7 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
-void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			  bool nx_huge_page_possible);
-void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 73eb28ed1f03..059231c82345 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -403,8 +403,11 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
 	lockdep_assert_held_write(&kvm->mmu_lock);
 
 	list_del(&sp->link);
-	if (sp->nx_huge_page_disallowed)
-		unaccount_nx_huge_page(kvm, sp);
+
+	if (sp->nx_huge_page_disallowed) {
+		sp->nx_huge_page_disallowed = false;
+		untrack_possible_nx_huge_page(kvm, sp);
+	}
 
 	if (shared)
 		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
@@ -1118,16 +1121,13 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
  * @kvm: kvm instance
  * @iter: a tdp_iter instance currently on the SPTE that should be set
  * @sp: The new TDP page table to install.
- * @account_nx: True if this page table is being installed to split a
- *              non-executable huge page.
 * @shared: This operation is running under the MMU lock in read mode.
 *
 * Returns: 0 if the new page table was installed. Non-0 if the page table
 *          could not be installed (e.g. the atomic compare-exchange failed).
 */
 static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
-			   struct kvm_mmu_page *sp, bool account_nx,
-			   bool shared)
+			   struct kvm_mmu_page *sp, bool shared)
 {
 	u64 spte = make_nonleaf_spte(sp->spt, !kvm_ad_enabled());
 	int ret = 0;
@@ -1142,8 +1142,6 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 
 	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
-	if (account_nx)
-		account_nx_huge_page(kvm, sp, true);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
 	tdp_account_mmu_page(kvm, sp);
@@ -1157,6 +1155,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	struct kvm *kvm = vcpu->kvm;
 	struct tdp_iter iter;
 	struct kvm_mmu_page *sp;
 	int ret;
@@ -1193,9 +1192,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		}
 
 		if (!is_shadow_present_pte(iter.old_spte)) {
-			bool account_nx = fault->huge_page_disallowed &&
-					  fault->req_level >= iter.level;
-
 			/*
 			 * If SPTE has been frozen by another thread, just
 			 * give up and retry, avoiding unnecessary page table
@@ -1207,10 +1203,19 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			sp = tdp_mmu_alloc_sp(vcpu);
 			tdp_mmu_init_child_sp(sp, &iter);
 
-			if (tdp_mmu_link_sp(vcpu->kvm, &iter, sp, account_nx, true)) {
+			sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
+
+			if (tdp_mmu_link_sp(kvm, &iter, sp, true)) {
 				tdp_mmu_free_sp(sp);
 				break;
 			}
+
+			if (fault->huge_page_disallowed &&
+			    fault->req_level >= iter.level) {
+				spin_lock(&kvm->arch.tdp_mmu_pages_lock);
+				track_possible_nx_huge_page(kvm, sp);
+				spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
+			}
 		}
 	}
 
@@ -1498,7 +1503,7 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
 	 * correctness standpoint since the translation will be the same either
 	 * way.
 	 */
-	ret = tdp_mmu_link_sp(kvm, iter, sp, false, shared);
+	ret = tdp_mmu_link_sp(kvm, iter, sp, shared);
 	if (ret)
 		goto out;
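
The ordering rule this patch enforces, i.e. fully initialize
nx_huge_page_disallowed before the SPTE store makes the shadow page visible
to readers holding mmu_lock for read, is the classic init-before-publish
pattern.  A stand-alone C11 sketch with invented names, where an atomic
pointer store stands in for tdp_mmu_link_sp()'s SPTE store:

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sp {
        bool nx_huge_page_disallowed;
};

static _Atomic(struct sp *) visible_sp;

static void writer(struct sp *sp, bool disallowed)
{
        sp->nx_huge_page_disallowed = disallowed;  /* init first... */
        /*
         * ...then publish.  The release store guarantees that any reader
         * that observes the pointer also observes the flag.
         */
        atomic_store_explicit(&visible_sp, sp, memory_order_release);
}

static bool reader(void)
{
        struct sp *sp = atomic_load_explicit(&visible_sp,
                                             memory_order_acquire);
        return sp ? sp->nx_huge_page_disallowed : false;
}

int main(void)
{
        static struct sp sp;

        writer(&sp, true);
        return reader() ? 0 : 1;
}

If the flag were set after the publish, a concurrent reader could find the
shadow page with a stale nx_huge_page_disallowed, which is exactly the race
the commit message describes for the TDP MMU.
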
Subject: [PATCH v6 5/8] KVM: x86/mmu: Track the number of TDP MMU pages, but not the actual pages
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang, David Matlack, Yan Zhao, Ben Gardon
Date: Wed, 19 Oct 2022 16:56:15 +0000
Message-ID: <20221019165618.927057-6-seanjc@google.com>

Track the number of TDP MMU "shadow" pages instead of tracking the pages
themselves. With the NX huge page list manipulation moved out of the common
linking flow, eliminating the list-based tracking means the happy path of
adding a shadow page doesn't need to acquire a spinlock and can instead
inc/dec an atomic.

Keep the tracking as the WARN during TDP MMU teardown on leaked shadow pages
is very, very useful for detecting KVM bugs.

Tracking the number of pages will also make it trivial to expose the counter
to userspace as a stat in the future, which may or may not be desirable.

Note, the TDP MMU needs to use a separate counter (and stat if that ever
comes to be) from the existing n_used_mmu_pages. The TDP MMU doesn't bother
supporting the shrinker nor does it honor KVM_SET_NR_MMU_PAGES (because the
TDP MMU consumes so few pages relative to shadow paging), and including TDP
MMU pages in that counter would break both the shrinker and shadow MMUs,
e.g. if a VM is using nested TDP.

Cc: Yan Zhao
Reviewed-by: Mingwei Zhang
Reviewed-by: David Matlack
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
---
 arch/x86/include/asm/kvm_host.h | 11 +++--------
 arch/x86/kvm/mmu/tdp_mmu.c      | 20 +++++++++-----------
 2 files changed, 12 insertions(+), 19 deletions(-)
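The accounting model the patch adopts can be sketched in standalone C11; this
is illustrative only, not KVM code, and the names are stand-ins (the real
helpers are tdp_account_mmu_page()/tdp_unaccount_mmu_page() in the diff
below):

	#include <assert.h>
	#include <stdatomic.h>

	static atomic_llong tdp_mmu_pages; /* pages across all roots */

	/* Happy path: a lock-free inc/dec instead of a locked list op. */
	static void account_page(void)   { atomic_fetch_add(&tdp_mmu_pages, 1); }
	static void unaccount_page(void) { atomic_fetch_sub(&tdp_mmu_pages, 1); }

	/* At VM teardown, a non-zero count means a shadow page leaked. */
	static void check_for_leaks(void)
	{
		assert(atomic_load(&tdp_mmu_pages) == 0);
	}

	int main(void)
	{
		account_page();
		unaccount_page();
		check_for_leaks(); /* passes: every inc was paired with a dec */
		return 0;
	}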
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0333dbb8ec85..bbd2cecd34cb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1290,6 +1290,9 @@ struct kvm_arch {
 	 */
 	bool tdp_mmu_enabled;
 
+	/* The number of TDP MMU pages across all roots. */
+	atomic64_t tdp_mmu_pages;
+
 	/*
 	 * List of kvm_mmu_page structs being used as roots.
 	 * All kvm_mmu_page structs in the list should have
@@ -1310,18 +1313,10 @@ struct kvm_arch {
 	 */
 	struct list_head tdp_mmu_roots;
 
-	/*
-	 * List of kvm_mmu_page structs not being used as roots.
-	 * All kvm_mmu_page structs in the list should have
-	 * tdp_mmu_page set and a tdp_mmu_root_count of 0.
-	 */
-	struct list_head tdp_mmu_pages;
-
 	/*
 	 * Protects accesses to the following fields when the MMU lock
 	 * is held in read mode:
 	 * - tdp_mmu_roots (above)
-	 * - tdp_mmu_pages (above)
 	 * - the link field of kvm_mmu_page structs used by the TDP MMU
 	 * - possible_nx_huge_pages;
 	 * - the possible_nx_huge_page_link field of kvm_mmu_page structs used
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 059231c82345..4e5b3ae824c1 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -29,7 +29,6 @@ int kvm_mmu_init_tdp_mmu(struct kvm *kvm)
 	kvm->arch.tdp_mmu_enabled = true;
 	INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
 	spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
-	INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages);
 	kvm->arch.tdp_mmu_zap_wq = wq;
 	return 1;
 }
@@ -54,7 +53,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
 	/* Also waits for any queued work items. */
 	destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
 
-	WARN_ON(!list_empty(&kvm->arch.tdp_mmu_pages));
+	WARN_ON(atomic64_read(&kvm->arch.tdp_mmu_pages));
 	WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
 
 	/*
@@ -377,11 +376,13 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
 static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	kvm_account_pgtable_pages((void *)sp->spt, +1);
+	atomic64_inc(&kvm->arch.tdp_mmu_pages);
 }
 
 static void tdp_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	kvm_account_pgtable_pages((void *)sp->spt, -1);
+	atomic64_dec(&kvm->arch.tdp_mmu_pages);
 }
 
 /**
@@ -397,17 +398,17 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
 			      bool shared)
 {
 	tdp_unaccount_mmu_page(kvm, sp);
+
+	if (!sp->nx_huge_page_disallowed)
+		return;
+
 	if (shared)
 		spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	else
 		lockdep_assert_held_write(&kvm->mmu_lock);
 
-	list_del(&sp->link);
-
-	if (sp->nx_huge_page_disallowed) {
-		sp->nx_huge_page_disallowed = false;
-		untrack_possible_nx_huge_page(kvm, sp);
-	}
+	sp->nx_huge_page_disallowed = false;
+	untrack_possible_nx_huge_page(kvm, sp);
 
 	if (shared)
 		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
@@ -1140,9 +1141,6 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 		tdp_mmu_set_spte(kvm, iter, spte);
 	}
 
-	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
-	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
-	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 	tdp_account_mmu_page(kvm, sp);
 
 	return 0;
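The payoff in the unlink path is easiest to see condensed. A sketch only;
names mirror the diff above, and the lockdep assertion on the non-shared path
is elided:

	/*
	 * Sketch: after this patch, pages that were never tracked for the
	 * NX mitigation (the common case) unlink without ever touching
	 * tdp_mmu_pages_lock; only tracked pages pay for the spinlock.
	 */
	static void example_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
				      bool shared)
	{
		tdp_unaccount_mmu_page(kvm, sp);	/* atomic64_dec, no lock */

		if (!sp->nx_huge_page_disallowed)	/* common case: done */
			return;

		if (shared)
			spin_lock(&kvm->arch.tdp_mmu_pages_lock);
		sp->nx_huge_page_disallowed = false;
		untrack_possible_nx_huge_page(kvm, sp);
		if (shared)
			spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
	}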
From patchwork Wed Oct 19 16:56:16 2022
Subject: [PATCH v6 6/8] KVM: x86/mmu: Add helper to convert SPTE value to its shadow page
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang, David Matlack, Yan Zhao, Ben Gardon
Date: Wed, 19 Oct 2022 16:56:16 +0000
Message-ID: <20221019165618.927057-7-seanjc@google.com>

Add a helper to convert a SPTE to its shadow page to deduplicate a variety
of flows and hopefully avoid future bugs, e.g. if KVM attempts to get the
shadow page for a SPTE without dropping high bits.

Opportunistically add a comment in mmu_free_root_page() documenting why it
treats the root HPA as a SPTE.

No functional change intended.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c          | 17 ++++++++++-------
 arch/x86/kvm/mmu/mmu_internal.h | 12 ------------
 arch/x86/kvm/mmu/spte.h         | 17 +++++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.h      |  2 ++
 4 files changed, 29 insertions(+), 19 deletions(-)
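The reason the masking matters can be modeled in a few lines of standalone C.
This is illustrative only; the mask value, the toy lookup table, and the
PAGE_SHIFT of 12 are simplified stand-ins for KVM's real machinery (shown in
the diff below):

	#include <stdint.h>
	#include <stdio.h>

	#define SPTE_BASE_ADDR_MASK 0x000ffffffffff000ULL /* simplified PA bits */
	#define PAGE_SHIFT 12

	struct mmu_page { uint64_t pa; };

	static struct mmu_page table[16]; /* toy pfn -> metadata table */

	static struct mmu_page *to_shadow_page(uint64_t pa)
	{
		return &table[(pa >> PAGE_SHIFT) % 16];
	}

	/*
	 * The helper's job: a SPTE also encodes permissions, memtype, and
	 * software-available bits, so strip everything but the base address
	 * before looking up the backing metadata.  Feeding the raw value to
	 * a pfn lookup would be a bug.
	 */
	static struct mmu_page *spte_to_child_sp(uint64_t spte)
	{
		return to_shadow_page(spte & SPTE_BASE_ADDR_MASK);
	}

	int main(void)
	{
		uint64_t spte = 0x8000000000003007ULL; /* PA 0x3000 + flag bits */
		printf("index %zu\n", (size_t)(spte_to_child_sp(spte) - table));
		return 0;
	}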
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 57c7c52d137a..f4f1b1591a02 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1818,7 +1818,7 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 			continue;
 		}
 
-		child = to_shadow_page(ent & SPTE_BASE_ADDR_MASK);
+		child = spte_to_child_sp(ent);
 
 		if (child->unsync_children) {
 			if (mmu_pages_add(pvec, child, i))
@@ -2377,7 +2377,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		 * so we should update the spte at this point to get
 		 * a new sp with the correct access.
 		 */
-		child = to_shadow_page(*sptep & SPTE_BASE_ADDR_MASK);
+		child = spte_to_child_sp(*sptep);
 		if (child->role.access == direct_access)
 			return;
 
@@ -2398,7 +2398,7 @@ static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
 		if (is_last_spte(pte, sp->role.level)) {
 			drop_spte(kvm, spte);
 		} else {
-			child = to_shadow_page(pte & SPTE_BASE_ADDR_MASK);
+			child = spte_to_child_sp(pte);
 			drop_parent_pte(child, spte);
 
 			/*
@@ -2837,7 +2837,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
 			struct kvm_mmu_page *child;
 			u64 pte = *sptep;
 
-			child = to_shadow_page(pte & SPTE_BASE_ADDR_MASK);
+			child = spte_to_child_sp(pte);
 			drop_parent_pte(child, sptep);
 			flush = true;
 		} else if (pfn != spte_to_pfn(*sptep)) {
@@ -3449,7 +3449,11 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
 	if (!VALID_PAGE(*root_hpa))
 		return;
 
-	sp = to_shadow_page(*root_hpa & SPTE_BASE_ADDR_MASK);
+	/*
+	 * The "root" may be a special root, e.g. a PAE entry, treat it as a
+	 * SPTE to ensure any non-PA bits are dropped.
+	 */
+	sp = spte_to_child_sp(*root_hpa);
 	if (WARN_ON(!sp))
 		return;
 
@@ -3934,8 +3938,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 			hpa_t root = vcpu->arch.mmu->pae_root[i];
 
 			if (IS_VALID_PAE_ROOT(root)) {
-				root &= SPTE_BASE_ADDR_MASK;
-				sp = to_shadow_page(root);
+				sp = spte_to_child_sp(root);
 				mmu_sync_children(vcpu, sp, true);
 			}
 		}
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 22152241bd29..dbaf6755c5a7 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -133,18 +133,6 @@ struct kvm_mmu_page {
 
 extern struct kmem_cache *mmu_page_header_cache;
 
-static inline struct kvm_mmu_page *to_shadow_page(hpa_t shadow_page)
-{
-	struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT);
-
-	return (struct kvm_mmu_page *)page_private(page);
-}
-
-static inline struct kvm_mmu_page *sptep_to_sp(u64 *sptep)
-{
-	return to_shadow_page(__pa(sptep));
-}
-
 static inline int kvm_mmu_role_as_id(union kvm_mmu_page_role role)
 {
 	return role.smm ? 1 : 0;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 7670c13ce251..7e5343339b90 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -219,6 +219,23 @@ static inline int spte_index(u64 *sptep)
  */
 extern u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
 
+static inline struct kvm_mmu_page *to_shadow_page(hpa_t shadow_page)
+{
+	struct page *page = pfn_to_page((shadow_page) >> PAGE_SHIFT);
+
+	return (struct kvm_mmu_page *)page_private(page);
+}
+
+static inline struct kvm_mmu_page *spte_to_child_sp(u64 spte)
+{
+	return to_shadow_page(spte & SPTE_BASE_ADDR_MASK);
+}
+
+static inline struct kvm_mmu_page *sptep_to_sp(u64 *sptep)
+{
+	return to_shadow_page(__pa(sptep));
+}
+
 static inline bool is_mmio_spte(u64 spte)
 {
 	return (spte & shadow_mmio_mask) == shadow_mmio_value &&
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index c163f7cc23ca..d3714200b932 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -5,6 +5,8 @@
 
 #include <linux/kvm_host.h>
 
+#include "spte.h"
+
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
 
 __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)

From patchwork Wed Oct 19 16:56:17 2022
Subject: [PATCH v6 7/8] KVM: x86/mmu: explicitly check nx_hugepage in disallowed_hugepage_adjust()
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang, David Matlack, Yan Zhao, Ben Gardon
Date: Wed, 19 Oct 2022 16:56:17 +0000
Message-ID: <20221019165618.927057-8-seanjc@google.com>

From: Mingwei Zhang

Explicitly check if a NX huge page is disallowed when determining if a page
fault needs to be forced to use a smaller sized page. KVM currently assumes
that the NX huge page mitigation is the only scenario where KVM will force a
shadow page instead of a huge page, and so unnecessarily keeps an existing
shadow page instead of replacing it with a huge page.

Any scenario that causes KVM to zap leaf SPTEs may result in having a SP
that can be made huge without violating the NX huge page mitigation. E.g.
prior to commit 5ba7c4c6d1c7 ("KVM: x86/MMU: Zap non-leaf SPTEs when
disabling dirty logging"), KVM would keep shadow pages after disabling
dirty logging due to a live migration being canceled, resulting in degraded
performance due to running with 4KiB pages instead of huge pages.

Although the dirty logging case is "fixed", that fix is coincidental, i.e.
is an implementation detail, and there are other scenarios where KVM will
zap leaf SPTEs. E.g. zapping leaf SPTEs in response to a host page migration
(mmu_notifier invalidation) to create a huge page would yield a similar
result; KVM would see the shadow-present non-leaf SPTE and assume a huge
page is disallowed.

Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Reviewed-by: Ben Gardon
Reviewed-by: David Matlack
Signed-off-by: Mingwei Zhang
[sean: use spte_to_child_sp(), massage changelog, fold into if-statement]
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
---
 arch/x86/kvm/mmu/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
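The new predicate can be condensed into a standalone, compilable sketch; the
types and the function name are simplified stand-ins, not KVM's (the real
change is the one-clause addition in the diff below):

	#include <stdbool.h>
	#include <stdio.h>

	enum { PG_LEVEL_4K = 1 }; /* mirrors KVM's page-level numbering */

	struct sp { bool nx_huge_page_disallowed; };

	/*
	 * Back off to the existing small page only when the shadow page
	 * blocking the huge page was created for the NX mitigation.
	 * Before the patch the last clause was missing, so any
	 * shadow-present, non-large SPTE forced a smaller page.
	 */
	static bool must_back_off_to_small_page(int cur_level, int goal_level,
						bool spte_present,
						bool spte_large,
						const struct sp *child)
	{
		return cur_level > PG_LEVEL_4K &&
		       cur_level == goal_level &&
		       spte_present && !spte_large &&
		       child->nx_huge_page_disallowed;
	}

	int main(void)
	{
		struct sp child = { .nx_huge_page_disallowed = false };

		/* Pre-patch, the missing clause made this case return true. */
		printf("%d\n", must_back_off_to_small_page(2, 2, true, false,
							   &child));
		return 0;
	}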
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f4f1b1591a02..14674c9e10f7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3111,7 +3111,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 	if (cur_level > PG_LEVEL_4K &&
 	    cur_level == fault->goal_level &&
 	    is_shadow_present_pte(spte) &&
-	    !is_large_pte(spte)) {
+	    !is_large_pte(spte) &&
+	    spte_to_child_sp(spte)->nx_huge_page_disallowed) {
 		/*
 		 * A small SPTE exists for this pfn, but FNAME(fetch)
 		 * and __direct_map would like to create a large PTE

From patchwork Wed Oct 19 16:56:18 2022
Subject: [PATCH v6 8/8] KVM: x86/mmu: WARN if TDP MMU SP disallows hugepage after being zapped
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang, David Matlack, Yan Zhao, Ben Gardon
Date: Wed, 19 Oct 2022 16:56:18 +0000
Message-ID: <20221019165618.927057-9-seanjc@google.com>

Extend the accounting sanity check in kvm_recover_nx_huge_pages() to the
TDP MMU, i.e. verify that zapping a shadow page unaccounts the disallowed
NX huge page regardless of the MMU type. Recovery runs while holding
mmu_lock for write and so it should be impossible to get false positives
on the WARN.

Suggested-by: Yan Zhao
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
---
 arch/x86/kvm/mmu/mmu.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 14674c9e10f7..dfd1656232ad 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6864,12 +6864,11 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 				      struct kvm_mmu_page,
 				      possible_nx_huge_page_link);
 		WARN_ON_ONCE(!sp->nx_huge_page_disallowed);
-		if (is_tdp_mmu_page(sp)) {
+		if (is_tdp_mmu_page(sp))
 			flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
-		} else {
+		else
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
-			WARN_ON_ONCE(sp->nx_huge_page_disallowed);
-		}
+		WARN_ON_ONCE(sp->nx_huge_page_disallowed);
 
 		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
 			kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);