Message ID | 20221128002043.1555543-3-mizhang@google.com |
---|---|
State | New |
Headers | From: Mingwei Zhang <mizhang@google.com>; To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>; Cc: "H. Peter Anvin" <hpa@zytor.com>, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang <mizhang@google.com>, Nagareddy Reddy <nspreddy@google.com>, Jim Mattson <jmattson@google.com>, David Matlack <dmatlack@google.com>; Date: Mon, 28 Nov 2022 00:20:43 +0000; Subject: [RFC PATCH v3 2/2] KVM: x86/mmu: replace BUG() with KVM_BUG() in shadow mmu; In-Reply-To: <20221128002043.1555543-1-mizhang@google.com> |
Series | Deprecate BUG() in pte_list_remove() in shadow mmu |
Commit Message
Mingwei Zhang
Nov. 28, 2022, 12:20 a.m. UTC
Replace BUG() in pte_list_remove() with KVM_BUG() to avoid crashing the
host. MMU bugs are difficult to discover due to various race conditions and
corner cases, and are thus extremely hard to debug. The situation gets much
worse when such a bug triggers the shutdown of a host: a host machine crash
eliminates everything, including the potential clues for debugging.

From a cloud computing service perspective, BUG() or BUG_ON() is probably no
longer appropriate, as host reliability is the top priority. Crashing the
physical machine is almost never a good option, as it eliminates innocent
VMs and causes a service outage in a larger scope. Even worse, if an attacker
can reliably trigger this code by diverting the control flow or corrupting
memory, then this becomes a VM-of-death attack. This is a huge attack vector
for cloud providers, as the death of one single host machine is not the end
of the story. Without manual intervention, a failed cloud job may be
dispatched to other hosts and continue crashing hosts until all of them are
dead.

For the above reasons, shrink the scope of the crash to the target VM
only. KVM_BUG() and KVM_BUG_ON() require a valid struct kvm, which requires
extra plumbing. Avoid it in this version by just using
kvm_get_running_vcpu()->kvm instead.
Cc: Nagareddy Reddy <nspreddy@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
---
arch/x86/kvm/mmu/mmu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
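
To make the intent of the commit message concrete, here is a minimal userspace sketch of the "shrink the crash to one VM" idea. It is not kernel code: `struct model_kvm`, `model_kvm_bug()` and `model_vcpu_run()` are hypothetical names invented for this illustration, and the behaviour (warn on the first violation, mark only that VM as bugged, return the condition to the caller) is an assumption about the general shape of KVM_BUG(), not a copy of its implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct kvm: only the state this sketch needs. */
struct model_kvm {
	const char *name;
	bool vm_bugged;		/* set once an invariant violation is seen */
	unsigned int warnings;	/* number of warnings printed for this VM */
};

/*
 * Userspace model of the KVM_BUG() idea: report the violation the first
 * time it is seen for a VM, mark only that VM as unusable, and hand the
 * condition back so the caller can bail out. The "host" process keeps
 * running, which is the contrast with BUG().
 */
bool model_kvm_bug(bool cond, struct model_kvm *kvm, const char *msg)
{
	if (cond && !kvm->vm_bugged) {
		fprintf(stderr, "WARN: VM %s: %s\n", kvm->name, msg);
		kvm->warnings++;
		kvm->vm_bugged = true;
	}
	return cond;
}

/* A bugged VM refuses further work (-1 here); other VMs are unaffected. */
int model_vcpu_run(struct model_kvm *kvm)
{
	return kvm->vm_bugged ? -1 : 0;
}
```

In this model, tripping the check for one VM leaves every other `struct model_kvm` runnable, and repeated violations on an already-bugged VM do not print again.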
Comments
On Mon, Nov 28, 2022, Mingwei Zhang wrote:
> Replace BUG() in pte_list_remove() with KVM_BUG() to avoid crashing the
> host. MMU bugs are difficult to discover due to various race conditions and
> corner cases, and are thus extremely hard to debug. The situation gets much
> worse when such a bug triggers the shutdown of a host: a host machine crash
> eliminates everything, including the potential clues for debugging.
>
> From a cloud computing service perspective, BUG() or BUG_ON() is probably no
> longer appropriate, as host reliability is the top priority.

I don't think we need to bring "cloud computing" into this. Linus has made it
clear over and over and over that BUG() / BUG_ON() need to be avoided unless the
alternative is worse. E.g. the BUG() in __handle_changed_spte() is warranted
because the alternative is silent corruption of guest data.

> Crashing the physical machine is almost never a good option, as it eliminates
> innocent VMs and causes a service outage in a larger scope. Even worse, if an
> attacker can reliably trigger this code by diverting the control flow or
> corrupting memory,

Or if there's a KVM bug, which is waaaaay more likely.

> then this becomes a VM-of-death attack.

This is true of any BUG(), and really of any unexpected fault while holding a
spinlock, e.g. NULL pointer derefs in the MMU are almost always fatal as well.

> This is a huge attack vector for cloud providers, as the death of one single
> host machine is not the end of the story. Without manual intervention, a
> failed cloud job may be dispatched to other hosts and continue crashing hosts
> until all of them are dead.
>
> For the above reasons, shrink the scope of the crash to the target VM
> only. KVM_BUG() and KVM_BUG_ON() require a valid struct kvm, which requires
> extra plumbing. Avoid it in this version by just using
> kvm_get_running_vcpu()->kvm instead.

Stale comment.

> Cc: Nagareddy Reddy <nspreddy@google.com>
> Cc: Jim Mattson <jmattson@google.com>
> Cc: David Matlack <dmatlack@google.com>
> Signed-off-by: Mingwei Zhang <mizhang@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index b5a44b8f5f7b..e132d82ab4c0 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -956,12 +956,12 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte,
>
>  	if (!rmap_head->val) {
>  		pr_err("%s: %p 0->BUG\n", __func__, spte);

These probably need to be ratelimited (or "once"). Bugging the VM will prevent
doing anything useful with the VM, but KVM still needs to destroy the VM, which
means zapping SPTEs and purging the rmaps. Theoretically, there could be
thousands of broken rmaps.

> -		BUG();
> +		KVM_BUG(true, kvm, "");

If you don't want to provide a message, use KVM_BUG_ON(), not an empty message.
Though my vote would be to fold the existing pr_err() messages into KVM_BUG(),
which would make the WARN much more helpful and would address the pr_err() issue
above. The __func__ printing can also go away in that case because the stack
trace will provide all the necessary info. The only reason not to drop the
pr_err() entirely is if a ratelimited message is helpful for debugging failures
that occur in production, which I doubt is true.

And rather than pass "true", wrap the actual check with the KVM_BUG().

>  	} else if (!(rmap_head->val & 1)) {
>  		rmap_printk("%p 1->0\n", spte);
>  		if ((u64 *)rmap_head->val != spte) {
>  			pr_err("%s: %p 1->BUG\n", __func__, spte);
> -			BUG();
> +			KVM_BUG(true, kvm, "");

KVM needs to return here, otherwise KVM is knowingly writing a garbage pointer,
e.g. will corrupt memory or trigger a fault.

>  		}
>  		rmap_head->val = 0;
>  	} else {

Something like this?

---
 arch/x86/kvm/mmu/mmu.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b5a44b8f5f7b..12790ccb8731 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -954,15 +954,16 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte,
 	struct pte_list_desc *prev_desc;
 	int i;
 
-	if (!rmap_head->val) {
-		pr_err("%s: %p 0->BUG\n", __func__, spte);
-		BUG();
-	} else if (!(rmap_head->val & 1)) {
+	if (KVM_BUG(!rmap_head->val, kvm, "rmap for %p is empty", spte))
+		return;
+
+	if (!(rmap_head->val & 1)) {
 		rmap_printk("%p 1->0\n", spte);
-		if ((u64 *)rmap_head->val != spte) {
-			pr_err("%s: %p 1->BUG\n", __func__, spte);
-			BUG();
-		}
+
+		if (KVM_BUG((u64 *)rmap_head->val != spte, kvm,
+			    "single rmap for %p doesn't match", spte))
+			return;
+
 		rmap_head->val = 0;
 	} else {
 		rmap_printk("%p many->many\n", spte);
@@ -979,8 +980,7 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte,
 			prev_desc = desc;
 			desc = desc->more;
 		}
-		pr_err("%s: %p many->many\n", __func__, spte);
-		BUG();
+		KVM_BUG(true, kvm, "no rmap for %p (many->many)", spte);
 	}
 }
 

base-commit: d74237e747db7f9f27e821e6683d58185e846378
--
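
The early-return pattern suggested in the review can be tried outside the kernel with a small model. The sketch below is an assumption-laden illustration, not the kernel implementation: `struct model_kvm`, `struct model_rmap_head`, `model_kvm_bug()` and `model_pte_list_remove()` are hypothetical names, the head encoding (0 is empty, bit 0 clear is a single spte pointer, bit 0 set is a descriptor list) is taken from the checks visible in the diff, and the descriptor-list case is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct kvm and the rmap head. */
struct model_kvm {
	bool vm_bugged;
};

struct model_rmap_head {
	/* 0: empty; bit 0 clear: one spte pointer; bit 0 set: descriptor list */
	uintptr_t val;
};

/* Model of KVM_BUG(): flag the VM, warn on first hit, return the condition. */
bool model_kvm_bug(bool cond, struct model_kvm *kvm, const char *msg)
{
	if (cond && !kvm->vm_bugged) {
		fprintf(stderr, "WARN: %s\n", msg);
		kvm->vm_bugged = true;
	}
	return cond;
}

/*
 * Simplified removal covering only the empty and single-entry encodings.
 * Returns true if the entry was removed. The descriptor-list case is not
 * modelled in this sketch.
 */
bool model_pte_list_remove(struct model_kvm *kvm, uint64_t *spte,
			   struct model_rmap_head *head)
{
	if (model_kvm_bug(!head->val, kvm, "rmap is empty"))
		return false;

	if (!(head->val & 1)) {
		/* Bail out before touching the head if it tracks another spte. */
		if (model_kvm_bug((uint64_t *)head->val != spte, kvm,
				  "single rmap doesn't match"))
			return false;

		head->val = 0;
		return true;
	}

	return false; /* descriptor list: not modelled */
}
```

The point of wrapping the check itself, rather than passing `true`, is that the same expression both triggers the warning and decides whether to return, so a mismatching head is left untouched instead of being overwritten.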

> Something like this?
>
> ---
>  arch/x86/kvm/mmu/mmu.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index b5a44b8f5f7b..12790ccb8731 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -954,15 +954,16 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte,
>  	struct pte_list_desc *prev_desc;
>  	int i;
>
> -	if (!rmap_head->val) {
> -		pr_err("%s: %p 0->BUG\n", __func__, spte);
> -		BUG();
> -	} else if (!(rmap_head->val & 1)) {
> +	if (KVM_BUG(!rmap_head->val, kvm, "rmap for %p is empty", spte))
> +		return;
> +
> +	if (!(rmap_head->val & 1)) {
>  		rmap_printk("%p 1->0\n", spte);
> -		if ((u64 *)rmap_head->val != spte) {
> -			pr_err("%s: %p 1->BUG\n", __func__, spte);
> -			BUG();
> -		}
> +
> +		if (KVM_BUG((u64 *)rmap_head->val != spte, kvm,
> +			    "single rmap for %p doesn't match", spte))
> +			return;
> +
>  		rmap_head->val = 0;
>  	} else {
>  		rmap_printk("%p many->many\n", spte);
> @@ -979,8 +980,7 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte,
>  			prev_desc = desc;
>  			desc = desc->more;
>  		}
> -		pr_err("%s: %p many->many\n", __func__, spte);
> -		BUG();
> +		KVM_BUG(true, kvm, "no rmap for %p (many->many)", spte);
>  	}
>  }
>
>
> base-commit: d74237e747db7f9f27e821e6683d58185e846378
> --
>

Makes sense, will update that in the next version.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b5a44b8f5f7b..e132d82ab4c0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -956,12 +956,12 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte,
 
 	if (!rmap_head->val) {
 		pr_err("%s: %p 0->BUG\n", __func__, spte);
-		BUG();
+		KVM_BUG(true, kvm, "");
 	} else if (!(rmap_head->val & 1)) {
 		rmap_printk("%p 1->0\n", spte);
 		if ((u64 *)rmap_head->val != spte) {
 			pr_err("%s: %p 1->BUG\n", __func__, spte);
-			BUG();
+			KVM_BUG(true, kvm, "");
 		}
 		rmap_head->val = 0;
 	} else {
@@ -980,7 +980,7 @@ static void pte_list_remove(struct kvm *kvm, u64 *spte,
 			desc = desc->more;
 		}
 		pr_err("%s: %p many->many\n", __func__, spte);
-		BUG();
+		KVM_BUG(true, kvm, "");
 	}
 }
 