From patchwork Tue Dec 13 03:30:26 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 32616
Reply-To: Sean Christopherson
Date: Tue, 13 Dec 2022 03:30:26 +0000
In-Reply-To: <20221213033030.83345-1-seanjc@google.com>
References: <20221213033030.83345-1-seanjc@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <20221213033030.83345-2-seanjc@google.com>
Subject: [PATCH 1/5] KVM: x86/mmu: Don't attempt to map leaf if target TDP MMU SPTE is frozen
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Robert Hoo, Greg Thelen, David Matlack, Ben Gardon, Mingwei Zhang
X-Mailing-List: linux-kernel@vger.kernel.org

Hoist the is_removed_spte() check above the "level == goal_level" check
when walking SPTEs during a TDP MMU page fault to avoid attempting to map
a leaf entry if said entry is frozen by a different task/vCPU.

  ------------[ cut here ]------------
  WARNING: CPU: 3 PID: 939 at arch/x86/kvm/mmu/tdp_mmu.c:653 kvm_tdp_mmu_map+0x269/0x4b0
  Modules linked in: kvm_intel
  CPU: 3 PID: 939 Comm: nx_huge_pages_t Not tainted 6.1.0-rc4+ #67
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_tdp_mmu_map+0x269/0x4b0
  RSP: 0018:ffffc9000068fba8 EFLAGS: 00010246
  RAX: 00000000000005a0 RBX: ffffc9000068fcc0 RCX: 0000000000000005
  RDX: ffff88810741f000 RSI: ffff888107f04600 RDI: ffffc900006a3000
  RBP: 060000010b000bf3 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 000ffffffffff000 R12: 0000000000000005
  R13: ffff888113670000 R14: ffff888107464958 R15: 0000000000000000
  FS: 00007f01c942c740(0000) GS:ffff888277cc0000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000117013006 CR4: 0000000000172ea0
  Call Trace:
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  ---[ end trace 0000000000000000 ]---

Fixes: 63d28a25e04c ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry")
Cc: Robert Hoo
Signed-off-by: Sean Christopherson
Reviewed-by: Robert Hoo
---
 arch/x86/kvm/mmu/tdp_mmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 764f7c87286f..b740f38fedcc 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1162,9 +1162,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		if (fault->nx_huge_page_workaround_enabled)
 			disallowed_hugepage_adjust(fault, iter.old_spte, iter.level);
 
-		if (iter.level == fault->goal_level)
-			break;
-
 		/*
 		 * If SPTE has been frozen by another thread, just give up and
 		 * retry, avoiding unnecessary page table allocation and free.
@@ -1172,6 +1169,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		if (is_removed_spte(iter.old_spte))
 			goto retry;
 
+		if (iter.level == fault->goal_level)
+			break;
+
 		/* Step down into the lower level page table if it exists. */
 		if (is_shadow_present_pte(iter.old_spte) &&
 		    !is_large_pte(iter.old_spte))

From patchwork Tue Dec 13 03:30:27 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 32617
Reply-To: Sean Christopherson
Date: Tue, 13 Dec 2022 03:30:27 +0000
In-Reply-To: <20221213033030.83345-1-seanjc@google.com>
References: <20221213033030.83345-1-seanjc@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <20221213033030.83345-3-seanjc@google.com>
Subject: [PATCH 2/5] KVM: x86/mmu: Map TDP MMU leaf SPTE iff target level is reached
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Robert Hoo, Greg Thelen, David Matlack, Ben Gardon, Mingwei Zhang
X-Mailing-List: linux-kernel@vger.kernel.org

Map the leaf SPTE when handling a TDP MMU page fault if and only if the
target level is reached. A recent commit reworked the retry logic and
incorrectly assumed that walking SPTEs would never "fail", as the loop
either bails (retries) or installs parent SPs. However, the iterator
itself will bail early if it detects a frozen (REMOVED) SPTE when
stepping down. The TDP iterator also rereads the current SPTE before
stepping down specifically to avoid walking into a part of the tree that
is being removed, which means it's possible to terminate the loop without
the guts of the loop observing the frozen SPTE, e.g. if a different task
zaps a parent SPTE between the initial read and try_step_down()'s
refresh.

Mapping a leaf SPTE at the wrong level results in all kinds of badness as
page table walkers interpret the SPTE as a page table, not a leaf, and
walk into the weeds.

  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 1025 at arch/x86/kvm/mmu/tdp_mmu.c:1070 kvm_tdp_mmu_map+0x481/0x510
  Modules linked in: kvm_intel
  CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_tdp_mmu_map+0x481/0x510
  RSP: 0018:ffffc9000072fba8 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffffc9000072fcc0 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
  RBP: ffff888107d45a10 R08: ffff888277c5b4c0 R09: ffffc9000072fa48
  R10: 0000000000000001 R11: 0000000000000001 R12: ffffc9000073a0e0
  R13: ffff88810fc54800 R14: ffff888107d1ae60 R15: ffff88810fc54f90
  FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
  Call Trace:
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  ---[ end trace 0000000000000000 ]---
  Invalid SPTE change: cannot replace a present leaf
  SPTE with another present leaf SPTE mapping a
  different PFN!
  as_id: 0 gfn: 100200 old_spte: 600000112400bf3 new_spte: 6000001126009f3 level: 2
  ------------[ cut here ]------------
  kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559!
  invalid opcode: 0000 [#1] SMP
  CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__handle_changed_spte.cold+0x95/0x9c
  RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246
  RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027
  RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
  RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0
  R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3
  R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01
  FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
  Call Trace:
   kvm_tdp_mmu_map+0x3b0/0x510
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  Modules linked in: kvm_intel
  ---[ end trace 0000000000000000 ]---

Fixes: 63d28a25e04c ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry")
Cc: Robert Hoo
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/tdp_mmu.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index b740f38fedcc..e2e197d41780 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1170,7 +1170,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			goto retry;
 
 		if (iter.level == fault->goal_level)
-			break;
+			goto map_target_level;
 
 		/* Step down into the lower level page table if it exists. */
 		if (is_shadow_present_pte(iter.old_spte) &&
@@ -1192,8 +1192,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			r = tdp_mmu_link_sp(kvm, &iter, sp, true);
 
 		/*
-		 * Also force the guest to retry the access if the upper level SPTEs
-		 * aren't in place.
+		 * Force the guest to retry if installing an upper level SPTE
+		 * failed, e.g. because a different task modified the SPTE.
 		 */
 		if (r) {
 			tdp_mmu_free_sp(sp);
@@ -1208,6 +1208,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		}
 	}
 
+	/*
+	 * The walk aborted before reaching the target level, e.g. because the
+	 * iterator detected an upper level SPTE was frozen during traversal.
+	 */
+	WARN_ON_ONCE(iter.level == fault->goal_level);
+	goto retry;
+
+map_target_level:
 	ret = tdp_mmu_map_handle_target_level(vcpu, fault, &iter);
 
 retry:

From patchwork Tue Dec 13 03:30:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 32618
Reply-To: Sean Christopherson
Date: Tue, 13 Dec 2022 03:30:28 +0000
In-Reply-To: <20221213033030.83345-1-seanjc@google.com>
References: <20221213033030.83345-1-seanjc@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <20221213033030.83345-4-seanjc@google.com>
Subject: [PATCH 3/5] KVM: x86/mmu: Re-check under lock that TDP MMU SP hugepage is disallowed
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Robert Hoo, Greg Thelen, David Matlack, Ben Gardon, Mingwei Zhang
X-Mailing-List: linux-kernel@vger.kernel.org

Re-check sp->nx_huge_page_disallowed under the tdp_mmu_pages_lock spinlock
when adding a new shadow page in the TDP MMU. To ensure the NX reclaim
kthread can't see a not-yet-linked shadow page, the page fault path links
the new page table prior to adding the page to possible_nx_huge_pages.

If the page is zapped by a different task, e.g. because dirty logging is
disabled, between linking the page and adding it to the list, KVM can end
up triggering a use-after-free by adding the zapped SP to the
aforementioned list, as the zapped SP's memory is scheduled for removal
via RCU callback. The bug is detected by the sanity checks guarded by
CONFIG_DEBUG_LIST=y, i.e. the below splat is just one possible signature.

  ------------[ cut here ]------------
  list_add corruption. prev->next should be next (ffffc9000071fa70), but was ffff88811125ee38. (prev=ffff88811125ee38).
  WARNING: CPU: 1 PID: 953 at lib/list_debug.c:30 __list_add_valid+0x79/0xa0
  Modules linked in: kvm_intel
  CPU: 1 PID: 953 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #71
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__list_add_valid+0x79/0xa0
  RSP: 0018:ffffc900006efb68 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff888116cae8a0 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 0000000100001872 RDI: ffff888277c5b4c8
  RBP: ffffc90000717000 R08: ffff888277c5b4c0 R09: ffffc900006efa08
  R10: 0000000000199998 R11: 0000000000199a20 R12: ffff888116cae930
  R13: ffff88811125ee38 R14: ffffc9000071fa70 R15: ffff88810b794f90
  FS: 00007fc0415d2740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000115201006 CR4: 0000000000172ea0
  Call Trace:
   track_possible_nx_huge_page+0x53/0x80
   kvm_tdp_mmu_map+0x242/0x2c0
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  ---[ end trace 0000000000000000 ]---

Fixes: 61f94478547b ("KVM: x86/mmu: Set disallowed_nx_huge_page in TDP MMU before setting SPTE")
Reported-by: Greg Thelen
Analyzed-by: David Matlack
Cc: David Matlack
Cc: Ben Gardon
Cc: Mingwei Zhang
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/tdp_mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e2e197d41780..fd4ae99790d7 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1203,7 +1203,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		if (fault->huge_page_disallowed &&
 		    fault->req_level >= iter.level) {
 			spin_lock(&kvm->arch.tdp_mmu_pages_lock);
-			track_possible_nx_huge_page(kvm, sp);
+			if (sp->nx_huge_page_disallowed)
+				track_possible_nx_huge_page(kvm, sp);
 			spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 		}
 	}

From patchwork Tue Dec 13 03:30:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 32619
Reply-To: Sean Christopherson
Date: Tue, 13 Dec 2022 03:30:29 +0000
In-Reply-To: <20221213033030.83345-1-seanjc@google.com>
References: <20221213033030.83345-1-seanjc@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <20221213033030.83345-5-seanjc@google.com>
Subject: [PATCH 4/5] KVM: x86/mmu: Don't install TDP MMU SPTE if SP has unexpected level
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Robert Hoo, Greg Thelen, David Matlack, Ben Gardon, Mingwei Zhang
X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL
autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752068211408682841?= X-GMAIL-MSGID: =?utf-8?q?1752068211408682841?= Don't install a leaf TDP MMU SPTE if the parent page's level doesn't match the target level of the fault, and instead have the vCPU retry the faulting instruction after warning. Continuing on is completely unnecessary as the absolute worst case scenario of retrying is DoSing the vCPU, whereas continuing on all but guarantees bigger explosions, e.g. ------------[ cut here ]------------ kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:__handle_changed_spte.cold+0x95/0x9c RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246 RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027 RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8 RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0 R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3 R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01 FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0 Call Trace: kvm_tdp_mmu_map+0x3b0/0x510 kvm_tdp_page_fault+0x10c/0x130 kvm_mmu_page_fault+0x103/0x680 vmx_handle_exit+0x132/0x5a0 [kvm_intel] vcpu_enter_guest+0x60c/0x16f0 kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0 kvm_vcpu_ioctl+0x271/0x660 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Modules linked in: kvm_intel ---[ end trace 0000000000000000 ]--- Signed-off-by: Sean Christopherson --- 
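Illustration only, not part of the patch: a minimal userspace sketch of
the "warn once, then bail with a retry" pattern described in the
changelog.  All names below (warn_on_once(), map_target_level(),
enum pf_result, nr_warnings) are hypothetical stand-ins, not KVM's
helpers, and the single static flag only approximates the kernel's
per-call-site WARN_ON_ONCE().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for KVM's RET_PF_* return codes. */
enum pf_result { PF_RETRY, PF_FIXED };

/* Number of warnings actually printed, exposed so callers can inspect it. */
static int nr_warnings;

/*
 * Rough userspace approximation of WARN_ON_ONCE(): print a warning the
 * first time the condition is true, and always return the condition so
 * the caller can branch on it.
 */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		nr_warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Install a leaf mapping only if the page table level matches the level
 * the fault wants.  On a mismatch, return before touching any state so
 * the caller simply retries.
 */
static enum pf_result map_target_level(int sp_level, int goal_level,
				       int *installed_level)
{
	if (warn_on_once(sp_level != goal_level, "sp level != goal level"))
		return PF_RETRY;

	*installed_level = goal_level;
	return PF_FIXED;
}
```

In this sketch a mismatched level leaves *installed_level untouched, so
the worst outcome of a persistent mismatch is an endless retry by the
caller rather than a corrupted mapping.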
 arch/x86/kvm/mmu/tdp_mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index fd4ae99790d7..cc1fb9a65620 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1063,7 +1063,9 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	int ret = RET_PF_FIXED;
 	bool wrprot = false;
 
-	WARN_ON(sp->role.level != fault->goal_level);
+	if (WARN_ON_ONCE(sp->role.level != fault->goal_level))
+		return RET_PF_RETRY;
+
 	if (unlikely(!fault->slot))
 		new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
 	else

From patchwork Tue Dec 13 03:30:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 32620
Reply-To: Sean Christopherson
Date: Tue, 13 Dec 2022 03:30:30 +0000
In-Reply-To: <20221213033030.83345-1-seanjc@google.com>
References: <20221213033030.83345-1-seanjc@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <20221213033030.83345-6-seanjc@google.com>
Subject: [PATCH 5/5] KVM: x86/mmu: Move kvm_tdp_mmu_map()'s prolog and epilog to its caller
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Robert Hoo,
 Greg Thelen, David Matlack, Ben Gardon, Mingwei Zhang
X-Mailing-List: linux-kernel@vger.kernel.org

Move the hugepage adjust, tracepoint, and RCU (un)lock logic out of
kvm_tdp_mmu_map() and into its sole caller, kvm_tdp_mmu_page_fault(), to
eliminate the gotos used to bounce through rcu_read_unlock() when bailing
from the walk.

Opportunistically mark kvm_mmu_hugepage_adjust() as static as
kvm_tdp_mmu_map() was the only external user.

No functional change intended.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c          |  9 ++++++++-
 arch/x86/kvm/mmu/mmu_internal.h |  1 -
 arch/x86/kvm/mmu/tdp_mmu.c      | 22 ++++------------------
 3 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 254bc46234e0..99c40617d325 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3085,7 +3085,8 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
 	return min(host_level, max_level);
 }
 
-void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+static void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu,
+				    struct kvm_page_fault *fault)
 {
 	struct kvm_memory_slot *slot = fault->slot;
 	kvm_pfn_t mask;
@@ -4405,7 +4406,13 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 	if (is_page_fault_stale(vcpu, fault))
 		goto out_unlock;
 
+	kvm_mmu_hugepage_adjust(vcpu, fault);
+
+	trace_kvm_mmu_spte_requested(fault);
+
+	rcu_read_lock();
 	r = kvm_tdp_mmu_map(vcpu, fault);
+	rcu_read_unlock();
 
 out_unlock:
 	read_unlock(&vcpu->kvm->mmu_lock);
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index ac00bfbf32f6..66c294d67641 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -317,7 +317,6 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      const struct kvm_memory_slot *slot, gfn_t gfn,
 			      int max_level);
-void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_level);
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index cc1fb9a65620..78f47eb74544 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1150,13 +1150,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	struct kvm *kvm = vcpu->kvm;
 	struct tdp_iter iter;
 	struct kvm_mmu_page *sp;
-	int ret = RET_PF_RETRY;
-
-	kvm_mmu_hugepage_adjust(vcpu, fault);
-
-	trace_kvm_mmu_spte_requested(fault);
-
-	rcu_read_lock();
 
 	tdp_mmu_for_each_pte(iter, mmu, fault->gfn, fault->gfn + 1) {
 		int r;
@@ -1169,10 +1162,10 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		 * retry, avoiding unnecessary page table allocation and free.
 		 */
 		if (is_removed_spte(iter.old_spte))
-			goto retry;
+			return RET_PF_RETRY;
 
 		if (iter.level == fault->goal_level)
-			goto map_target_level;
+			return tdp_mmu_map_handle_target_level(vcpu, fault, &iter);
 
 		/* Step down into the lower level page table if it exists. */
 		if (is_shadow_present_pte(iter.old_spte) &&
@@ -1199,7 +1192,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		 */
 		if (r) {
 			tdp_mmu_free_sp(sp);
-			goto retry;
+			return RET_PF_RETRY;
 		}
 
 		if (fault->huge_page_disallowed &&
@@ -1216,14 +1209,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	 * iterator detected an upper level SPTE was frozen during traversal.
 	 */
 	WARN_ON_ONCE(iter.level == fault->goal_level);
-	goto retry;
-
-map_target_level:
-	ret = tdp_mmu_map_handle_target_level(vcpu, fault, &iter);
-
-retry:
-	rcu_read_unlock();
-	return ret;
+	return RET_PF_RETRY;
 }
 
 bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
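
Illustration only, not part of the patch: a small userspace sketch of
why hoisting the lock/unlock pair into the caller lets the walk return
directly instead of jumping to a common unlock label.  The names
(fake_read_lock(), walk_levels(), handle_fault(), enum walk_result,
lock_depth) are hypothetical, and the counter merely stands in for an
RCU read-side critical section.

```c
#include <stddef.h>

/* Hypothetical result codes, loosely modeled on RET_PF_RETRY/RET_PF_FIXED. */
enum walk_result { WALK_RETRY, WALK_DONE };

/* Stand-in for read-side lock nesting; must be 0 whenever no walk runs. */
static int lock_depth;

static void fake_read_lock(void)   { lock_depth++; }
static void fake_read_unlock(void) { lock_depth--; }

/*
 * The walk owns no lock state, so every exit is a plain return.
 * A negative entry models an entry that is being removed.
 */
static enum walk_result walk_levels(const int *levels, size_t n, int goal)
{
	for (size_t i = 0; i < n; i++) {
		if (levels[i] < 0)
			return WALK_RETRY;	/* bail, caller will retry */
		if (levels[i] == goal)
			return WALK_DONE;	/* reached the target level */
	}
	return WALK_RETRY;			/* never reached the target */
}

/* The caller brackets the walk, so one lock/unlock pair covers all exits. */
static enum walk_result handle_fault(const int *levels, size_t n, int goal)
{
	enum walk_result r;

	fake_read_lock();
	r = walk_levels(levels, n, goal);
	fake_read_unlock();

	return r;
}
```

Whichever path walk_levels() takes, handle_fault() leaves lock_depth
balanced, which is the property the gotos existed to preserve when the
unlock lived inside the walk.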