Message ID | 20240228024147.41573-9-seanjc@google.com |
---|---|
State | New |
Date | Tue, 27 Feb 2024 18:41:39 -0800 |
From | Sean Christopherson <seanjc@google.com> |
To | Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com> |
Cc | kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yan Zhao <yan.y.zhao@intel.com>, Isaku Yamahata <isaku.yamahata@intel.com>, Michael Roth <michael.roth@amd.com>, Yu Zhang <yu.c.zhang@linux.intel.com>, Chao Peng <chao.p.peng@linux.intel.com>, Fuad Tabba <tabba@google.com>, David Matlack <dmatlack@google.com> |
Subject | [PATCH 08/16] KVM: x86/mmu: WARN and skip MMIO cache on private, reserved page faults |
Series | KVM: x86/mmu: Page fault and MMIO cleanups |
Commit Message
Sean Christopherson
Feb. 28, 2024, 2:41 a.m. UTC
WARN and skip the emulated MMIO fastpath if a private, reserved page fault
is encountered, as private+reserved should be an impossible combination
(KVM should never create an MMIO SPTE for a private access).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Comments
On 28/02/2024 3:41 pm, Sean Christopherson wrote:
> WARN and skip the emulated MMIO fastpath if a private, reserved page fault
> is encountered, as private+reserved should be an impossible combination
> (KVM should never create an MMIO SPTE for a private access).
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index bd342ebd0809..9206cfa58feb 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5866,7 +5866,8 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
>                 error_code |= PFERR_PRIVATE_ACCESS;
>
>         r = RET_PF_INVALID;
> -       if (unlikely(error_code & PFERR_RSVD_MASK)) {
> +       if (unlikely((error_code & PFERR_RSVD_MASK) &&
> +                    !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
>                 r = handle_mmio_page_fault(vcpu, cr2_or_gpa, direct);
>                 if (r == RET_PF_EMULATE)
>                         goto emulate;

It seems this will make KVM continue to call kvm_mmu_do_page_fault() when such a
private+reserved error code actually happens (e.g., due to a bug), because @r is
still RET_PF_INVALID in that case.

Is it better to just return an error, e.g., -EINVAL, and give up?
On Fri, Mar 01, 2024, Kai Huang wrote:
> On 28/02/2024 3:41 pm, Sean Christopherson wrote:
> > WARN and skip the emulated MMIO fastpath if a private, reserved page fault
> > is encountered, as private+reserved should be an impossible combination
> > (KVM should never create an MMIO SPTE for a private access).
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index bd342ebd0809..9206cfa58feb 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -5866,7 +5866,8 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
> >                 error_code |= PFERR_PRIVATE_ACCESS;
> >
> >         r = RET_PF_INVALID;
> > -       if (unlikely(error_code & PFERR_RSVD_MASK)) {
> > +       if (unlikely((error_code & PFERR_RSVD_MASK) &&
> > +                    !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
> >                 r = handle_mmio_page_fault(vcpu, cr2_or_gpa, direct);
> >                 if (r == RET_PF_EMULATE)
> >                         goto emulate;
>
> It seems this will make KVM continue to call kvm_mmu_do_page_fault() when such a
> private+reserved error code actually happens (e.g., due to a bug), because @r is
> still RET_PF_INVALID in that case.

Yep.

> Is it better to just return an error, e.g., -EINVAL, and give up?

As long as there is no obvious/immediate danger to the host, no obvious way for
the "bad" behavior to cause data corruption for the guest, and continuing on has
a plausible chance of working, then KVM should generally try to continue on and
not terminate the VM.

E.g. in this case, KVM will just skip various fast paths because of the RSVD flag,
and treat the fault like a PRIVATE fault.  Hmm, but page_fault_handle_page_track()
would skip write tracking, which could theoretically cause data corruption, so I
guess arguably it would be safer to bail?

Anyone else have an opinion?  This type of bug should never escape development,
so I'm a-ok effectively killing the VM.  Unless someone has a good argument for
continuing on, I'll go with Kai's suggestion and squash this:

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index cedacb1b89c5..d796a162b2da 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5892,8 +5892,10 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
                error_code |= PFERR_PRIVATE_ACCESS;

        r = RET_PF_INVALID;
-       if (unlikely((error_code & PFERR_RSVD_MASK) &&
-                    !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
+       if (unlikely(error_code & PFERR_RSVD_MASK)) {
+               if (WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))
+                       return -EFAULT;
+
                r = handle_mmio_page_fault(vcpu, cr2_or_gpa, direct);
                if (r == RET_PF_EMULATE)
                        goto emulate;
On 1/03/2024 12:06 pm, Sean Christopherson wrote:
> On Fri, Mar 01, 2024, Kai Huang wrote:
>> On 28/02/2024 3:41 pm, Sean Christopherson wrote:
>>> WARN and skip the emulated MMIO fastpath if a private, reserved page fault
>>> is encountered, as private+reserved should be an impossible combination
>>> (KVM should never create an MMIO SPTE for a private access).
>>>
>>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>>> ---
>>>  arch/x86/kvm/mmu/mmu.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>>> index bd342ebd0809..9206cfa58feb 100644
>>> --- a/arch/x86/kvm/mmu/mmu.c
>>> +++ b/arch/x86/kvm/mmu/mmu.c
>>> @@ -5866,7 +5866,8 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
>>>                 error_code |= PFERR_PRIVATE_ACCESS;
>>>
>>>         r = RET_PF_INVALID;
>>> -       if (unlikely(error_code & PFERR_RSVD_MASK)) {
>>> +       if (unlikely((error_code & PFERR_RSVD_MASK) &&
>>> +                    !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
>>>                 r = handle_mmio_page_fault(vcpu, cr2_or_gpa, direct);
>>>                 if (r == RET_PF_EMULATE)
>>>                         goto emulate;
>>
>> It seems this will make KVM continue to call kvm_mmu_do_page_fault() when such a
>> private+reserved error code actually happens (e.g., due to a bug), because @r is
>> still RET_PF_INVALID in that case.
>
> Yep.
>
>> Is it better to just return an error, e.g., -EINVAL, and give up?
>
> As long as there is no obvious/immediate danger to the host, no obvious way for
> the "bad" behavior to cause data corruption for the guest, and continuing on has
> a plausible chance of working, then KVM should generally try to continue on and
> not terminate the VM.

Agreed.  But I think sometimes it is hard to tell whether there are any dangerous
things waiting to happen, because that means we have to sanity check a lot of
code, and when new patches arrive we need to keep that in mind too, which could
be a nightmare in terms of maintenance.

> E.g. in this case, KVM will just skip various fast paths because of the RSVD flag,
> and treat the fault like a PRIVATE fault.  Hmm, but page_fault_handle_page_track()
> would skip write tracking, which could theoretically cause data corruption, so I
> guess arguably it would be safer to bail?
>
> Anyone else have an opinion?  This type of bug should never escape development,
> so I'm a-ok effectively killing the VM.  Unless someone has a good argument for
> continuing on, I'll go with Kai's suggestion and squash this:
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index cedacb1b89c5..d796a162b2da 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5892,8 +5892,10 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
>                 error_code |= PFERR_PRIVATE_ACCESS;
>
>         r = RET_PF_INVALID;
> -       if (unlikely((error_code & PFERR_RSVD_MASK) &&
> -                    !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
> +       if (unlikely(error_code & PFERR_RSVD_MASK)) {
> +               if (WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))
> +                       return -EFAULT;

-EFAULT is part of the guest_memfd() memory fault ABI.  I didn't think this over
thoroughly, but do you want to return -EFAULT here?
On Fri, Mar 01, 2024, Kai Huang wrote:
> On 1/03/2024 12:06 pm, Sean Christopherson wrote:
> > E.g. in this case, KVM will just skip various fast paths because of the RSVD flag,
> > and treat the fault like a PRIVATE fault.  Hmm, but page_fault_handle_page_track()
> > would skip write tracking, which could theoretically cause data corruption, so I
> > guess arguably it would be safer to bail?
> >
> > Anyone else have an opinion?  This type of bug should never escape development,
> > so I'm a-ok effectively killing the VM.  Unless someone has a good argument for
> > continuing on, I'll go with Kai's suggestion and squash this:
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index cedacb1b89c5..d796a162b2da 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -5892,8 +5892,10 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
> >                 error_code |= PFERR_PRIVATE_ACCESS;
> >
> >         r = RET_PF_INVALID;
> > -       if (unlikely((error_code & PFERR_RSVD_MASK) &&
> > -                    !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
> > +       if (unlikely(error_code & PFERR_RSVD_MASK)) {
> > +               if (WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))
> > +                       return -EFAULT;
>
> -EFAULT is part of the guest_memfd() memory fault ABI.  I didn't think this over
> thoroughly, but do you want to return -EFAULT here?

Yes, I/we do.  There are many existing paths that can return -EFAULT from KVM_RUN
without setting run->exit_reason to KVM_EXIT_MEMORY_FAULT.  Userspace is
responsible for checking run->exit_reason on -EFAULT (and -EHWPOISON), i.e. must
be prepared to handle a "bare" -EFAULT, where for all intents and purposes
"handle" means "terminate the guest".

That's actually one of the reasons why KVM_EXIT_MEMORY_FAULT exists: it'd require
an absurd amount of work and churn in KVM to *safely* return useful information
on *all* -EFAULTs.  FWIW, I had hopes and dreams of actually doing exactly this,
but have long since abandoned those dreams.

In other words, KVM_EXIT_MEMORY_FAULT essentially communicates to userspace that
(a) userspace can likely fix whatever badness triggered the -EFAULT, and (b) that
KVM is in a state where fixing the underlying problem and resuming the guest is
safe, e.g. won't corrupt the guest (because KVM is not in a half-baked state).
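As a minimal sketch of the userspace contract described above (assumptions: Linux 6.8+ UAPI headers providing KVM_EXIT_MEMORY_FAULT and kvm_run.memory_fault; resolve_memory_fault() is a hypothetical VMM helper, e.g. one that flips the GPA range between shared and private attributes; vcpu and kvm_run mmap() setup are elided), a VMM run loop would look roughly like:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical VMM helper that repairs the faulting GPA range so the
 * guest can be resumed, e.g. by converting it between shared/private. */
extern void resolve_memory_fault(__u64 gpa, __u64 size, __u64 flags);

static void vcpu_run_loop(int vcpu_fd, struct kvm_run *run)
{
	for (;;) {
		if (ioctl(vcpu_fd, KVM_RUN, NULL) < 0) {
			/*
			 * On -EFAULT (and -EHWPOISON), run->exit_reason is
			 * still valid: KVM_EXIT_MEMORY_FAULT means KVM filled
			 * run->memory_fault and the fault is likely fixable.
			 */
			if (errno == EFAULT &&
			    run->exit_reason == KVM_EXIT_MEMORY_FAULT) {
				resolve_memory_fault(run->memory_fault.gpa,
						     run->memory_fault.size,
						     run->memory_fault.flags);
				continue;
			}
			/* A "bare" -EFAULT: nothing to fix, kill the guest. */
			perror("KVM_RUN");
			exit(1);
		}
		/* ... dispatch the other run->exit_reason values ... */
	}
}

In this sketch, KVM_MEMORY_EXIT_FLAG_PRIVATE in run->memory_fault.flags is what would typically tell the helper whether the guest accessed the GPA as private or shared.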
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bd342ebd0809..9206cfa58feb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5866,7 +5866,8 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
                error_code |= PFERR_PRIVATE_ACCESS;

        r = RET_PF_INVALID;
-       if (unlikely(error_code & PFERR_RSVD_MASK)) {
+       if (unlikely((error_code & PFERR_RSVD_MASK) &&
+                    !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
                r = handle_mmio_page_fault(vcpu, cr2_or_gpa, direct);
                if (r == RET_PF_EMULATE)
                        goto emulate;