From patchwork Sat Sep 16 00:39:14 2023
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 140962
Date: Fri, 15 Sep 2023 17:39:14 -0700
In-Reply-To: <20230916003916.2545000-1-seanjc@google.com>
References: <20230916003916.2545000-1-seanjc@google.com>
Message-ID: <20230916003916.2545000-2-seanjc@google.com>
Subject: [PATCH 1/3] KVM: x86/mmu: Open code walking TDP MMU roots for mmu_notifier's zap SPTEs
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pattara Teerapong, David Stevens, Yiwei Zhang, Paul Hsia

Use the "inner" TDP MMU root walker when zapping SPTEs in response to an
mmu_notifier invalidation instead of invoking kvm_tdp_mmu_zap_leafs().
This will allow reworking for_each_tdp_mmu_root_yield_safe() to do more
work, and to make it usable in more places, without increasing the
number of params to the point where it adds no value.  The mmu_notifier
path is a bit of a special snowflake, e.g. it zaps only a single address
space (because it's per-slot), and can't always yield.

Drop the @can_yield param from tdp_mmu_zap_leafs() as its sole remaining
caller unconditionally passes "true".

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c     |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c | 13 +++++++++----
 arch/x86/kvm/mmu/tdp_mmu.h |  4 ++--
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e1d011c67cc6..59f5e40b8f55 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6260,7 +6260,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 	if (tdp_mmu_enabled) {
 		for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
 			flush = kvm_tdp_mmu_zap_leafs(kvm, i, gfn_start,
-						      gfn_end, true, flush);
+						      gfn_end, flush);
 	}
 
 	if (flush)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 6c63f2d1675f..89aaa2463373 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -878,12 +878,12 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
  * more SPTEs were zapped since the MMU lock was last acquired.
 */
 bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, gfn_t end,
-			   bool can_yield, bool flush)
+			   bool flush)
 {
 	struct kvm_mmu_page *root;
 
 	for_each_tdp_mmu_root_yield_safe(kvm, root, as_id)
-		flush = tdp_mmu_zap_leafs(kvm, root, start, end, can_yield, flush);
+		flush = tdp_mmu_zap_leafs(kvm, root, start, end, true, flush);
 
 	return flush;
 }
@@ -1146,8 +1146,13 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
 				 bool flush)
 {
-	return kvm_tdp_mmu_zap_leafs(kvm, range->slot->as_id, range->start,
-				     range->end, range->may_block, flush);
+	struct kvm_mmu_page *root;
+
+	__for_each_tdp_mmu_root_yield_safe(kvm, root, range->slot->as_id, false, false)
+		flush = tdp_mmu_zap_leafs(kvm, root, range->start, range->end,
+					  range->may_block, flush);
+
+	return flush;
 }
 
 typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter,
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 0a63b1afabd3..eb4fa345d3a4 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -20,8 +20,8 @@ __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)
 void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
 			  bool shared);
 
-bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start,
-			   gfn_t end, bool can_yield, bool flush);
+bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, gfn_t end,
+			   bool flush);
 bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
 void kvm_tdp_mmu_zap_all(struct kvm *kvm);
 void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm);
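[For readers following along: below is a minimal, standalone C model of the
"yield-safe" walk that for_each_tdp_mmu_root_yield_safe() and its
double-underscore variant rely on, i.e. pinning the *next* root with a
reference before putting the current one, so the loop body may drop the lock
(yield) without its list position being freed out from under it.  All names
here (struct root, next_root(), put_root()) are illustrative stand-ins, not
KVM's actual types; this is a sketch of the pattern, not the kernel code.]

#include <stdio.h>

struct root {
	int id;
	int refcount;
	struct root *next;	/* stand-in for the kernel's list_head */
};

static void put_root(struct root *r)
{
	/* Freeing only happens once the list's own reference is dropped. */
	if (--r->refcount == 0)
		printf("root %d freed\n", r->id);
}

/* Pin the next root, then release the previous one, as tdp_mmu_next_root() does. */
static struct root *next_root(struct root *head, struct root *prev)
{
	struct root *next = prev ? prev->next : head;

	if (next)
		next->refcount++;
	if (prev)
		put_root(prev);
	return next;
}

#define for_each_root_yield_safe(_head, _root)			\
	for (_root = next_root(_head, NULL); _root;		\
	     _root = next_root(_head, _root))

int main(void)
{
	struct root c = { 3, 1, NULL }, b = { 2, 1, &c }, a = { 1, 1, &b };
	struct root *r;

	/*
	 * The body may temporarily "yield" (drop the lock); the reference
	 * taken by next_root() keeps *r valid across that window.
	 */
	for_each_root_yield_safe(&a, r)
		printf("zapping leafs under root %d\n", r->id);

	return 0;
}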
From patchwork Sat Sep 16 00:39:15 2023
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 140940
Date: Fri, 15 Sep 2023 17:39:15 -0700
In-Reply-To: <20230916003916.2545000-1-seanjc@google.com>
References: <20230916003916.2545000-1-seanjc@google.com>
Message-ID: <20230916003916.2545000-3-seanjc@google.com>
Subject: [PATCH 2/3] KVM: x86/mmu: Take "shared" instead of "as_id" in TDP MMU's yield-safe iterator
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pattara Teerapong, David Stevens, Yiwei Zhang, Paul Hsia

Replace the address space ID in for_each_tdp_mmu_root_yield_safe() with
a shared (vs. exclusive) param, and have the walker iterate over all
address spaces, as all callers want to process all address spaces.  Drop
the @as_id param as well as the manual address space iteration in
callers.

Add the @shared param even though the two current callers pass "false"
unconditionally, as the main reason for refactoring the walker is to
simplify using it to zap invalid TDP MMU roots, which is done with
mmu_lock held for read.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c     |  8 ++------
 arch/x86/kvm/mmu/tdp_mmu.c | 20 ++++++++++----------
 arch/x86/kvm/mmu/tdp_mmu.h |  3 +--
 3 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 59f5e40b8f55..54f94f644b42 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6246,7 +6246,6 @@ static bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_e
 void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 {
 	bool flush;
-	int i;
 
 	if (WARN_ON_ONCE(gfn_end <= gfn_start))
 		return;
@@ -6257,11 +6256,8 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 
 	flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end);
 
-	if (tdp_mmu_enabled) {
-		for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
-			flush = kvm_tdp_mmu_zap_leafs(kvm, i, gfn_start,
-						      gfn_end, flush);
-	}
+	if (tdp_mmu_enabled)
+		flush = kvm_tdp_mmu_zap_leafs(kvm, gfn_start, gfn_end, flush);
 
 	if (flush)
 		kvm_flush_remote_tlbs_range(kvm, gfn_start, gfn_end - gfn_start);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 89aaa2463373..7cb1902ae032 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -211,8 +211,12 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
 #define for_each_valid_tdp_mmu_root_yield_safe(_kvm, _root, _as_id, _shared)	\
 	__for_each_tdp_mmu_root_yield_safe(_kvm, _root, _as_id, _shared, true)
 
-#define for_each_tdp_mmu_root_yield_safe(_kvm, _root, _as_id)			\
-	__for_each_tdp_mmu_root_yield_safe(_kvm, _root, _as_id, false, false)
+#define for_each_tdp_mmu_root_yield_safe(_kvm, _root, _shared)			\
+	for (_root = tdp_mmu_next_root(_kvm, NULL, _shared, false);		\
+	     _root;								\
+	     _root = tdp_mmu_next_root(_kvm, _root, _shared, false))		\
+		if (!kvm_lockdep_assert_mmu_lock_held(_kvm, _shared)) {		\
+		} else
 
 /*
  * Iterate over all TDP MMU roots.  Requires that mmu_lock be held for write,
@@ -877,12 +881,11 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
  * true if a TLB flush is needed before releasing the MMU lock, i.e. if one or
  * more SPTEs were zapped since the MMU lock was last acquired.
  */
-bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, gfn_t end,
-			   bool flush)
+bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, gfn_t start, gfn_t end, bool flush)
 {
 	struct kvm_mmu_page *root;
 
-	for_each_tdp_mmu_root_yield_safe(kvm, root, as_id)
+	for_each_tdp_mmu_root_yield_safe(kvm, root, false)
 		flush = tdp_mmu_zap_leafs(kvm, root, start, end, true, flush);
 
 	return flush;
@@ -891,7 +894,6 @@ bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, gfn_t end,
 void kvm_tdp_mmu_zap_all(struct kvm *kvm)
 {
 	struct kvm_mmu_page *root;
-	int i;
 
 	/*
 	 * Zap all roots, including invalid roots, as all SPTEs must be dropped
@@ -905,10 +907,8 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm)
 	 * is being destroyed or the userspace VMM has exited.  In both cases,
 	 * KVM_RUN is unreachable, i.e. no vCPUs will ever service the request.
 	 */
-	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
-		for_each_tdp_mmu_root_yield_safe(kvm, root, i)
-			tdp_mmu_zap_root(kvm, root, false);
-	}
+	for_each_tdp_mmu_root_yield_safe(kvm, root, false)
+		tdp_mmu_zap_root(kvm, root, false);
 }
 
 /*
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index eb4fa345d3a4..bc088953f929 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -20,8 +20,7 @@ __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)
 void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
 			  bool shared);
 
-bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, gfn_t end,
-			   bool flush);
+bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, gfn_t start, gfn_t end, bool flush);
 bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
 void kvm_tdp_mmu_zap_all(struct kvm *kvm);
 void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm);
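[The reworked macro above tacks "if (!kvm_lockdep_assert_mmu_lock_held(...))
{ } else" onto the for-loop.  Below is a standalone sketch of that inverted
if/else idiom: it lets a for-each macro run a check before every iteration of
the body while the whole macro still expands to a single statement, so
brace-less callers work and a trailing "else" in the caller can't dangle.
lock_is_held() is an illustrative stand-in for the kernel's lockdep helper.]

#include <stdbool.h>
#include <stdio.h>

/* Arbitrarily returns true so that it may be used in an if statement. */
static bool lock_is_held(bool shared)
{
	/* A real implementation would assert the lock state via lockdep. */
	(void)shared;
	return true;
}

#define for_each_even(_i, _n, _shared)				\
	for (_i = 0; _i < (_n); _i += 2)			\
		if (!lock_is_held(_shared)) {			\
		} else

int main(void)
{
	int i;

	/* A brace-less body still works: the macro is one statement. */
	for_each_even(i, 10, true)
		printf("%d\n", i);

	return 0;
}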
From patchwork Sat Sep 16 00:39:16 2023
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 140994
Date: Fri, 15 Sep 2023 17:39:16 -0700
In-Reply-To: <20230916003916.2545000-1-seanjc@google.com>
References: <20230916003916.2545000-1-seanjc@google.com>
Message-ID: <20230916003916.2545000-4-seanjc@google.com>
Subject: [PATCH 3/3] KVM: x86/mmu: Stop zapping invalidated TDP MMU roots asynchronously
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pattara Teerapong, David Stevens, Yiwei Zhang, Paul Hsia

Stop zapping invalidated TDP MMU roots via work queue now that KVM
preserves TDP MMU roots until they are explicitly invalidated.  Zapping
roots asynchronously was effectively a workaround to avoid stalling a
vCPU for an extended duration if a vCPU unloaded a root, which at the
time happened whenever the guest toggled CR0.WP (a frequent operation
for some guest kernels).

While a clever hack, zapping roots via an unbound worker had subtle,
unintended consequences on host scheduling, especially when zapping
multiple roots, e.g. as part of a memslot deletion.  Because the work of
zapping a root is no longer bound to the task that initiated the zap,
things like the CPU affinity and priority of the original task get lost.
Losing the affinity and priority can be especially problematic if
unbound workqueues aren't affined to a small number of CPUs, as zapping
multiple roots can cause KVM to heavily utilize the majority of CPUs in
the system, *beyond* the CPUs KVM is already using to run vCPUs.

When deleting a memslot via KVM_SET_USER_MEMORY_REGION, the async root
zap can result in KVM occupying all logical CPUs for ~8ms, and result in
high priority tasks not being scheduled in a timely manner.  In v5.15,
which doesn't preserve unloaded roots, the issues were even more
noticeable as KVM would zap roots more frequently and could occupy all
CPUs for 50ms+.

Consuming all CPUs for an extended duration can lead to significant
jitter throughout the system, e.g. on ChromeOS with virtio-gpu, deleting
memslots is a semi-frequent operation as memslots are deleted and
recreated with different host virtual addresses to react to host GPU
drivers allocating and freeing GPU blobs.  On ChromeOS, the jitter
manifests as audio blips during games due to the audio server's tasks
not getting scheduled in promptly, despite the tasks having a high
realtime priority.

Deleting memslots isn't exactly a fast path and should be avoided when
possible, and ChromeOS is working towards utilizing MAP_FIXED to avoid
the memslot shenanigans, but KVM is squarely in the wrong.  Not to
mention that removing the async zapping eliminates a non-trivial amount
of complexity.

Note, one of the subtle behaviors hidden behind the async zapping is
that KVM would zap invalidated roots only once (ignoring partial zaps
from things like mmu_notifier events).  Preserve this behavior by adding
a flag to identify roots that are scheduled to be zapped versus roots
that have already been zapped but not yet freed.
Add a comment calling out why kvm_tdp_mmu_invalidate_all_roots() can
encounter invalid roots, as it's not at all obvious why zapping
invalidated roots shouldn't simply zap all invalid roots.

Reported-by: Pattara Teerapong
Cc: David Stevens
Cc: Yiwei Zhang
Cc: Paul Hsia
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |   3 +-
 arch/x86/kvm/mmu/mmu.c          |  13 +---
 arch/x86/kvm/mmu/mmu_internal.h |  13 ++--
 arch/x86/kvm/mmu/tdp_mmu.c      | 116 +++++++++++++-------------------
 arch/x86/kvm/mmu/tdp_mmu.h      |   2 +-
 arch/x86/kvm/x86.c              |   5 +-
 6 files changed, 59 insertions(+), 93 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1a4def36d5bb..17715cb8731d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1419,7 +1419,6 @@ struct kvm_arch {
 	 * the thread holds the MMU lock in write mode.
 	 */
 	spinlock_t tdp_mmu_pages_lock;
-	struct workqueue_struct *tdp_mmu_zap_wq;
 #endif /* CONFIG_X86_64 */
 
 	/*
@@ -1835,7 +1834,7 @@ void kvm_mmu_vendor_module_exit(void);
 
 void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_mmu_create(struct kvm_vcpu *vcpu);
-int kvm_mmu_init_vm(struct kvm *kvm);
+void kvm_mmu_init_vm(struct kvm *kvm);
 void kvm_mmu_uninit_vm(struct kvm *kvm);
 void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 54f94f644b42..f7901cb4d2fa 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6167,20 +6167,15 @@ static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
 	return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
 }
 
-int kvm_mmu_init_vm(struct kvm *kvm)
+void kvm_mmu_init_vm(struct kvm *kvm)
 {
-	int r;
-
 	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
 	INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages);
 	spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
 
-	if (tdp_mmu_enabled) {
-		r = kvm_mmu_init_tdp_mmu(kvm);
-		if (r < 0)
-			return r;
-	}
+	if (tdp_mmu_enabled)
+		kvm_mmu_init_tdp_mmu(kvm);
 
 	kvm->arch.split_page_header_cache.kmem_cache = mmu_page_header_cache;
 	kvm->arch.split_page_header_cache.gfp_zero = __GFP_ZERO;
@@ -6189,8 +6184,6 @@ int kvm_mmu_init_vm(struct kvm *kvm)
 
 	kvm->arch.split_desc_cache.kmem_cache = pte_list_desc_cache;
 	kvm->arch.split_desc_cache.gfp_zero = __GFP_ZERO;
-
-	return 0;
 }
 
 static void mmu_free_vm_memory_caches(struct kvm *kvm)
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index b102014e2c60..93b9d50c24ad 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -58,7 +58,10 @@ struct kvm_mmu_page {
 
 	bool tdp_mmu_page;
 	bool unsync;
-	u8 mmu_valid_gen;
+	union {
+		u8 mmu_valid_gen;
+		bool tdp_mmu_scheduled_root_to_zap;
+	};
 
 	/*
 	 * The shadow page can't be replaced by an equivalent huge page
@@ -100,13 +103,7 @@ struct kvm_mmu_page {
 		struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
 		tdp_ptep_t ptep;
 	};
-	union {
-		DECLARE_BITMAP(unsync_child_bitmap, 512);
-		struct {
-			struct work_struct tdp_mmu_async_work;
-			void *tdp_mmu_async_data;
-		};
-	};
+	DECLARE_BITMAP(unsync_child_bitmap, 512);
 
 	/*
 	 * Tracks shadow pages that, if zapped, would allow KVM to create an NX
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 7cb1902ae032..ca3304c2c00c 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -12,18 +12,10 @@
 #include
 
 /* Initializes the TDP MMU for the VM, if enabled. */
-int kvm_mmu_init_tdp_mmu(struct kvm *kvm)
+void kvm_mmu_init_tdp_mmu(struct kvm *kvm)
 {
-	struct workqueue_struct *wq;
-
-	wq = alloc_workqueue("kvm", WQ_UNBOUND|WQ_MEM_RECLAIM|WQ_CPU_INTENSIVE, 0);
-	if (!wq)
-		return -ENOMEM;
-
 	INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
 	spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
-	kvm->arch.tdp_mmu_zap_wq = wq;
-	return 1;
 }
 
 /* Arbitrarily returns true so that this may be used in if statements. */
@@ -46,20 +38,15 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
 	 * ultimately frees all roots.
 	 */
 	kvm_tdp_mmu_invalidate_all_roots(kvm);
-
-	/*
-	 * Destroying a workqueue also first flushes the workqueue, i.e. no
-	 * need to invoke kvm_tdp_mmu_zap_invalidated_roots().
-	 */
-	destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
+	kvm_tdp_mmu_zap_invalidated_roots(kvm);
 
 	WARN_ON(atomic64_read(&kvm->arch.tdp_mmu_pages));
 	WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
 
 	/*
 	 * Ensure that all the outstanding RCU callbacks to free shadow pages
-	 * can run before the VM is torn down.  Work items on tdp_mmu_zap_wq
-	 * can call kvm_tdp_mmu_put_root and create new callbacks.
+	 * can run before the VM is torn down.  Putting the last reference to
+	 * zapped roots will create new callbacks.
 	 */
 	rcu_barrier();
 }
@@ -89,43 +76,6 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
 static void tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
 			     bool shared);
 
-static void tdp_mmu_zap_root_work(struct work_struct *work)
-{
-	struct kvm_mmu_page *root = container_of(work, struct kvm_mmu_page,
-						 tdp_mmu_async_work);
-	struct kvm *kvm = root->tdp_mmu_async_data;
-
-	read_lock(&kvm->mmu_lock);
-
-	/*
-	 * A TLB flush is not necessary as KVM performs a local TLB flush when
-	 * allocating a new root (see kvm_mmu_load()), and when migrating vCPU
-	 * to a different pCPU.  Note, the local TLB flush on reuse also
-	 * invalidates any paging-structure-cache entries, i.e. TLB entries for
-	 * intermediate paging structures, that may be zapped, as such entries
-	 * are associated with the ASID on both VMX and SVM.
-	 */
-	tdp_mmu_zap_root(kvm, root, true);
-
-	/*
-	 * Drop the refcount using kvm_tdp_mmu_put_root() to test its logic for
-	 * avoiding an infinite loop.  By design, the root is reachable while
-	 * it's being asynchronously zapped, thus a different task can put its
-	 * last reference, i.e. flowing through kvm_tdp_mmu_put_root() for an
-	 * asynchronously zapped root is unavoidable.
-	 */
-	kvm_tdp_mmu_put_root(kvm, root, true);
-
-	read_unlock(&kvm->mmu_lock);
-}
-
-static void tdp_mmu_schedule_zap_root(struct kvm *kvm, struct kvm_mmu_page *root)
-{
-	root->tdp_mmu_async_data = kvm;
-	INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
-	queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
-}
-
 void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
 			  bool shared)
 {
@@ -917,18 +867,47 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm)
  */
 void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
 {
-	flush_workqueue(kvm->arch.tdp_mmu_zap_wq);
+	struct kvm_mmu_page *root;
+
+	read_lock(&kvm->mmu_lock);
+
+	for_each_tdp_mmu_root_yield_safe(kvm, root, true) {
+		if (!root->tdp_mmu_scheduled_root_to_zap)
+			continue;
+
+		root->tdp_mmu_scheduled_root_to_zap = false;
+		KVM_BUG_ON(!root->role.invalid, kvm);
+
+		/*
+		 * A TLB flush is not necessary as KVM performs a local TLB
+		 * flush when allocating a new root (see kvm_mmu_load()), and
+		 * when migrating a vCPU to a different pCPU.  Note, the local
+		 * TLB flush on reuse also invalidates paging-structure-cache
+		 * entries, i.e. TLB entries for intermediate paging
+		 * structures, that may be zapped, as such entries are
+		 * associated with the ASID on both VMX and SVM.
+		 */
+		tdp_mmu_zap_root(kvm, root, true);
+
+		/*
+		 * The reference needs to be put *after* zapping the root, as
+		 * the root must be reachable by mmu_notifiers while it's
+		 * being zapped.
+		 */
+		kvm_tdp_mmu_put_root(kvm, root, true);
+	}
+
+	read_unlock(&kvm->mmu_lock);
 }
 
 /*
  * Mark each TDP MMU root as invalid to prevent vCPUs from reusing a root that
  * is about to be zapped, e.g. in response to a memslots update.  The actual
- * zapping is performed asynchronously.  Using a separate workqueue makes it
- * easy to ensure that the destruction is performed before the "fast zap"
- * completes, without keeping a separate list of invalidated roots; the list is
- * effectively the list of work items in the workqueue.
+ * zapping is done separately so that it happens with mmu_lock held for read,
+ * whereas invalidating roots must be done with mmu_lock held for write (unless
+ * the VM is being destroyed).
  *
- * Note, the asynchronous worker is gifted the TDP MMU's reference.
+ * Note, kvm_tdp_mmu_zap_invalidated_roots() is gifted the TDP MMU's reference.
  * See kvm_tdp_mmu_get_vcpu_root_hpa().
  */
 void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
@@ -953,19 +932,20 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
 	/*
 	 * As above, mmu_lock isn't held when destroying the VM!  There can't
 	 * be other references to @kvm, i.e. nothing else can invalidate roots
-	 * or be consuming roots, but walking the list of roots does need to be
-	 * guarded against roots being deleted by the asynchronous zap worker.
+	 * or get/put references to roots.
 	 */
-	rcu_read_lock();
-
-	list_for_each_entry_rcu(root, &kvm->arch.tdp_mmu_roots, link) {
+	list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) {
+		/*
+		 * Note, invalid roots can outlive a memslot update!  Invalid
+		 * roots must be *zapped* before the memslot update completes,
+		 * but a different task can acquire a reference and keep the
+		 * root alive after it's been zapped.
+		 */
 		if (!root->role.invalid) {
+			root->tdp_mmu_scheduled_root_to_zap = true;
 			root->role.invalid = true;
-			tdp_mmu_schedule_zap_root(kvm, root);
 		}
 	}
-
-	rcu_read_unlock();
 }
 
 /*
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index bc088953f929..733a3aef3a96 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -7,7 +7,7 @@
 
 #include "spte.h"
 
-int kvm_mmu_init_tdp_mmu(struct kvm *kvm);
+void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
 void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
 
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6c9c81e82e65..9f18b06bbda6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12308,9 +12308,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (ret)
 		goto out;
 
-	ret = kvm_mmu_init_vm(kvm);
-	if (ret)
-		goto out_page_track;
+	kvm_mmu_init_vm(kvm);
 
 	ret = static_call(kvm_x86_vm_init)(kvm);
 	if (ret)
@@ -12355,7 +12353,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 out_uninit_mmu:
 	kvm_mmu_uninit_vm(kvm);
-out_page_track:
 	kvm_page_track_cleanup(kvm);
 out:
 	return ret;
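[A standalone model (not kernel code, no locking, single-threaded stand-ins)
of the mark-then-drain pattern this patch adopts: invalidation only flips a
"scheduled to zap" flag on each root, and a later synchronous drain pass,
run in the context of the task that triggered the invalidation, does the
actual zapping.  Separating the two preserves the old behavior of zapping
each invalidated root exactly once, without the workqueue.  All names here
are illustrative.]

#include <stdbool.h>
#include <stdio.h>

#define NR_ROOTS 3

struct root {
	int id;
	bool invalid;
	bool scheduled_to_zap;
};

/* Cheap pass, analogous to running with mmu_lock held for write. */
static void invalidate_all_roots(struct root *roots)
{
	for (int i = 0; i < NR_ROOTS; i++) {
		/* Roots that are already invalid were zapped earlier; skip
		 * them so each root is zapped at most once. */
		if (!roots[i].invalid) {
			roots[i].scheduled_to_zap = true;
			roots[i].invalid = true;
		}
	}
}

/* Slow pass, analogous to running with mmu_lock held for read. */
static void zap_invalidated_roots(struct root *roots)
{
	for (int i = 0; i < NR_ROOTS; i++) {
		if (!roots[i].scheduled_to_zap)
			continue;
		roots[i].scheduled_to_zap = false;
		printf("zapping root %d\n", roots[i].id);
	}
}

int main(void)
{
	struct root roots[NR_ROOTS] = { {1}, {2}, {3} };

	roots[2].invalid = true;	/* invalidated and zapped earlier */

	invalidate_all_roots(roots);
	zap_invalidated_roots(roots);	/* zaps roots 1 and 2, once each */
	return 0;
}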