From patchwork Tue Aug 8 08:53:06 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 132656
From: Yan Zhao <yan.y.zhao@intel.com>
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, Yan Zhao <yan.y.zhao@intel.com>
Subject: [PATCH 1/2] KVM: x86/mmu: Remove dead code in .change_pte() handler in x86 TDP MMU
Date: Tue, 8 Aug 2023 16:53:06 +0800
Message-Id: <20230808085306.14742-1-yan.y.zhao@intel.com>
In-Reply-To: <20230808085056.14644-1-yan.y.zhao@intel.com>
References: <20230808085056.14644-1-yan.y.zhao@intel.com>

Remove the dead code set_spte_gfn() from the x86 TDP MMU's .change_pte()
handler to save CPU cycles and to prepare for the optimization in the
next patch.
As explained in commit c13fda237f08 ("KVM: Assert that notifier count is
elevated in .change_pte()"), when .change_pte() was added by commit
828502d30073 ("ksm: add mmu_notifier set_pte_at_notify()"), .change_pte()
was invoked without any surrounding notifications. However, since commit
6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with
invalidate_range_start and invalidate_range_end"), all calls to
.change_pte() are guaranteed to be surrounded by an
.invalidate_range_start() and .invalidate_range_end() pair.

Since .invalidate_range_start() always causes KVM to zap the related
SPTEs, and the page fault path cannot successfully install new SPTEs
before .invalidate_range_end(), kvm_set_spte_gfn() cannot find any
shadow-present leaf entries to operate on, and therefore set_spte_gfn()
is never called anymore.

So, in the TDP MMU, drop set_spte_gfn() and keep only the warning on
huge pages.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
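(Note for reviewers, illustration only: the ordering guarantee above comes
from the primary MMU side. A minimal sketch of a set_pte_at_notify() caller
such as the COW path after commit 6bdb913f0a70, with locking, error
handling and the actual wp_page_copy() details elided, looks roughly like:

	/*
	 * KVM zaps SPTEs in the range and elevates the notifier count,
	 * so concurrent page faults treat the range as in-flux.
	 */
	mmu_notifier_invalidate_range_start(&range);

	/*
	 * Fires .change_pte() -> kvm_set_spte_gfn(); no shadow-present
	 * leaf SPTE can exist at this point.
	 */
	set_pte_at_notify(mm, addr, ptep, new_pte);

	/* Only after this can the page fault path install SPTEs again. */
	mmu_notifier_invalidate_range_end(&range);

Hence kvm_set_spte_gfn() can never see a shadow-present leaf entry.)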
 arch/x86/kvm/mmu/tdp_mmu.c | 40 ++++----------------------------------
 1 file changed, 4 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 6250bd3d20c1..89a1f222e823 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1235,36 +1235,6 @@ bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn);
 }
 
-static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
-			 struct kvm_gfn_range *range)
-{
-	u64 new_spte;
-
-	/* Huge pages aren't expected to be modified without first being zapped. */
-	WARN_ON(pte_huge(range->arg.pte) || range->start + 1 != range->end);
-
-	if (iter->level != PG_LEVEL_4K ||
-	    !is_shadow_present_pte(iter->old_spte))
-		return false;
-
-	/*
-	 * Note, when changing a read-only SPTE, it's not strictly necessary to
-	 * zero the SPTE before setting the new PFN, but doing so preserves the
-	 * invariant that the PFN of a present
-	 * leaf SPTE can never change. See handle_changed_spte().
-	 */
-	tdp_mmu_iter_set_spte(kvm, iter, 0);
-
-	if (!pte_write(range->arg.pte)) {
-		new_spte = kvm_mmu_changed_pte_notifier_make_spte(iter->old_spte,
-								  pte_pfn(range->arg.pte));
-
-		tdp_mmu_iter_set_spte(kvm, iter, new_spte);
-	}
-
-	return true;
-}
-
 /*
  * Handle the changed_pte MMU notifier for the TDP MMU.
  * data is a pointer to the new pte_t mapping the HVA specified by the MMU
@@ -1273,12 +1243,10 @@ static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
  */
 bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
-	/*
-	 * No need to handle the remote TLB flush under RCU protection, the
-	 * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
-	 * shadow page. See the WARN on pfn_changed in handle_changed_spte().
-	 */
-	return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn);
+	/* Huge pages aren't expected to be modified */
+	WARN_ON(pte_huge(range->arg.pte) || range->start + 1 != range->end);
+
+	return false;
 }
 
 /*


From patchwork Tue Aug 8 08:54:31 2023
X-Patchwork-Submitter: Yan Zhao
X-Patchwork-Id: 132718
From: Yan Zhao <yan.y.zhao@intel.com>
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, Yan Zhao <yan.y.zhao@intel.com>
Subject: [PATCH 2/2] KVM: x86/mmu: prefetch SPTE directly in x86 TDP MMU's change_pte() handler
Date: Tue, 8 Aug 2023 16:54:31 +0800
Message-Id: <20230808085431.14814-1-yan.y.zhao@intel.com>
In-Reply-To: <20230808085056.14644-1-yan.y.zhao@intel.com>
References: <20230808085056.14644-1-yan.y.zhao@intel.com>

Optimize the TDP MMU's .change_pte() handler to prefetch SPTEs directly
in the handler, using the PFN info carried by .change_pte(), so that a
vCPU write that triggers .change_pte() no longer has to undergo two
VMExits and two TDP page faults.
When there is a vCPU running on the current pCPU, .change_pte() is most
likely caused by a vCPU write to a guest page that was previously faulted
in by a vCPU read. The detailed sequence is as below:

1. vCPU reads a guest page. Though the page is in a RW memslot, both the
   primary MMU and KVM's secondary MMU map it with read-only PTEs during
   the page fault.
2. vCPU writes to this guest page.
3. VMExit, and kvm_tdp_mmu_page_fault() calls GUP. COW is triggered, so
   .invalidate_range_start(), .change_pte() and .invalidate_range_end()
   are called successively.
4. kvm_tdp_mmu_page_fault() returns retry, because it always finds the
   current page fault stale due to the mmu_invalidate_seq increased in
   .invalidate_range_end().
5. VMExit and page fault again.
6. The writable SPTE is mapped successfully.

That is, each guest write to a COW page triggers VMExit and KVM TDP page
fault twice, even though .change_pte() has already notified KVM of the
new PTE to be mapped. Since .change_pte() is called at a point where the
primary MMU is ensured to succeed, prefetching the new PFN directly in
the .change_pte() handler on the secondary MMU (KVM MMU) saves KVM the
second VMExit and TDP page fault.

In tests on my environment with 8 vCPUs and 16G memory and no assigned
devices, around 8000+ (with OVMF) and 17000+ (with Seabios) TDP page
faults were saved during each VM boot-up, and around 44000+ TDP page
faults were saved while booting an L2 VM with 2G memory.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
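(Note for reviewers, illustration only: step 4 above relies on KVM's
stale-fault detection. A condensed sketch of that logic, simplified from
kvm_faultin_pfn()/is_page_fault_stale() and not an exact copy, is:

	fault->mmu_seq = kvm->mmu_invalidate_seq;	/* snapshot */
	smp_rmb();
	/* GUP triggers COW; .invalidate_range_end() bumps the sequence */
	fault->pfn = __gfn_to_pfn_memslot(...);

	write_lock(&kvm->mmu_lock);
	if (fault->mmu_seq != kvm->mmu_invalidate_seq)
		return RET_PF_RETRY;	/* step 4: the fault is stale */

so the faulting vCPU always retries once COW has run, which is the second
VMExit + TDP page fault this patch avoids.)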
 arch/x86/kvm/mmu/tdp_mmu.c | 69 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 89a1f222e823..672a1e333c92 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1243,10 +1243,77 @@ bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
  */
 bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
+	struct kvm_mmu_page *root;
+	struct kvm_mmu_page *sp;
+	bool wrprot, writable;
+	struct kvm_vcpu *vcpu;
+	struct tdp_iter iter;
+	bool flush = false;
+	kvm_pfn_t pfn;
+	u64 new_spte;
+
 	/* Huge pages aren't expected to be modified */
 	WARN_ON(pte_huge(range->arg.pte) || range->start + 1 != range->end);
 
-	return false;
+	/*
+	 * Get the current running vCPU to be used in the prefetch in
+	 * make_spte() below. If there is no running vCPU, .change_pte() is
+	 * probably not triggered by vCPU writes, so drop prefetching SPTEs
+	 * in that case. Also, only prefetch for L1 vCPUs.
+	 * If the vCPU is scheduled out later, it's still all right to
+	 * prefetch with the same vCPU, except that the prefetched SPTE may
+	 * not be accessed immediately.
+	 */
+	vcpu = kvm_get_running_vcpu();
+	if (!vcpu || vcpu->kvm != kvm || is_guest_mode(vcpu))
+		return flush;
+
+	writable = !(range->slot->flags & KVM_MEM_READONLY) &&
+		   pte_write(range->arg.pte);
+	pfn = pte_pfn(range->arg.pte);
+
+	/* Do not allow rescheduling, just as in kvm_tdp_mmu_handle_gfn() */
+	for_each_tdp_mmu_root(kvm, root, range->slot->as_id) {
+		rcu_read_lock();
+
+		tdp_root_for_each_pte(iter, root, range->start, range->end) {
+			if (iter.level > PG_LEVEL_4K)
+				continue;
+
+			sp = sptep_to_sp(rcu_dereference(iter.sptep));
+
+			/* Make the SPTE as a prefetched one */
+			wrprot = make_spte(vcpu, sp, range->slot, ACC_ALL,
+					   iter.gfn, pfn, iter.old_spte, true,
+					   true, writable, &new_spte);
+			/*
+			 * Do not prefetch the new PFN for a page-tracked GFN,
+			 * as we want the page fault handler to be triggered
+			 * later.
+			 */
+			if (wrprot)
+				continue;
+
+			/*
+			 * Warn if an existing SPTE is found, because that must
+			 * not happen: .change_pte() must be surrounded by
+			 * .invalidate_range_{start,end}(), so (1)
+			 * kvm_unmap_gfn_range() should have zapped the old
+			 * SPTE, and (2) the page fault handler should not be
+			 * able to install a new SPTE until
+			 * .invalidate_range_end() completes.
+			 *
+			 * Even if the warning is hit and flush is true (which
+			 * would indicate a bug in an mmu notifier handler),
+			 * there's no need to handle the remote TLB flush under
+			 * RCU protection: the target SPTE _must_ be a leaf
+			 * SPTE, i.e. it cannot result in freeing a shadow
+			 * page.
+			 */
+			flush = WARN_ON(is_shadow_present_pte(iter.old_spte));
+			tdp_mmu_iter_set_spte(kvm, &iter, new_spte);
+		}
+
+		rcu_read_unlock();
+	}
+
+	return flush;
 }
 
 /*
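(Note for reviewers, illustration only: for context, the handler above is
reached from the common x86 .change_pte() glue, roughly as below; abridged
from arch/x86/kvm/mmu/mmu.c, and the rmap leg is untouched by this series:

	bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
	{
		bool flush = false;

		/* Legacy/shadow MMU leg, handled via rmaps */
		if (kvm_memslots_have_rmaps(kvm))
			flush = kvm_handle_gfn_range(kvm, range, kvm_set_pte_rmap);

		/* TDP MMU leg, the handler modified by this series */
		if (tdp_mmu_enabled)
			flush |= kvm_tdp_mmu_set_spte_gfn(kvm, range);

		return flush;
	}

A true return makes the notifier caller issue a remote TLB flush, which is
why the WARN path above still propagates flush.)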