Message ID | 20221019150333.1047423-1-pgonda@google.com
State | New
Series | virt: Prevent AES-GCM IV reuse in SNP guest driver
Commit Message
Peter Gonda
Oct. 19, 2022, 3:03 p.m. UTC
The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
communicate securely with each other. The IV to this scheme is a
sequence number that both the ASP and the guest track. Currently this
sequence number in a guest request must exactly match the sequence
number tracked by the ASP. This means that if the guest sees an error
from the host during a request it can only retry that exact request or
disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
reuse; see:
https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
Signed-off-by: Peter Gonda <pgonda@google.com>
Reported-by: Peter Gonda <pgonda@google.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Haowen Bai <baihaowen@meizu.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Marc Orr <marcorr@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ashish Kalra <Ashish.Kalra@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
---
drivers/virt/coco/sev-guest/sev-guest.c | 45 ++++++++++++++++++-------
1 file changed, 32 insertions(+), 13 deletions(-)
Comments
On 10/19/22 10:03, Peter Gonda wrote:
> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> communicate securely with each other. The IV to this scheme is a
> sequence number that both the ASP and the guest track. Currently this
> sequence number in a guest request must exactly match the sequence
> number tracked by the ASP. This means that if the guest sees an error
> from the host during a request it can only retry that exact request or
> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> reuse see:
> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf

I wonder if we can at least still support the extended report length
query by having the kernel allocate the required pages when the error is
SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there
are no errors on the second request, the sequence numbers can be safely
updated, but the kernel returns the original error (which will provide
the caller with the number of pages required).

For the rate-limiting patch series [1], the rate-limiting will have to
be performed within the kernel, while the mutex is held, and then retry
the exact request again. Otherwise, that error will require disabling
the VMPCK. Either that, or the hypervisor must provide the rate
limiting.

Thoughts?

[1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/

> Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
> Signed-off-by: Peter Gonda <pgonda@google.com>
> [...]
>
> ---
>  drivers/virt/coco/sev-guest/sev-guest.c | 45 ++++++++++++++++++-------
>  1 file changed, 32 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index f422f9c58ba7..227ae6a10ef2 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -67,8 +67,27 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
>  	return true;
>  }
>  
> +/*
> + * If we receive an error from the host or ASP we have two options. We can
> + * either retry the exact same encrypted request or we can discontinue using the
> + * VMPCK.
> + *
> + * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
> + * encrypt the requests. The IV for this scheme is the sequence number. GCM
> + * cannot tolerate IV reuse.
> + *
> + * The ASP FW v1.51 only increments the sequence numbers on a successful
> + * guest<->ASP back and forth and only accepts messages at its exact sequence
> + * number.
> + *
> + * So if we were to reuse the sequence number the encryption scheme is
> + * vulnerable. If we encrypt the sequence number for a fresh IV the ASP will
> + * reject our request.
> + */
>  static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
>  {
> +	dev_alert(snp_dev->dev, "Disabling vmpck_id: %d to prevent IV reuse.\n",
> +		  vmpck_id);
>  	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
>  	snp_dev->vmpck = NULL;
>  }
> @@ -326,29 +345,29 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
>  	if (fw_err)
>  		*fw_err = err;
>  
> -	if (rc)
> -		return rc;
> +	if (rc) {
> +		dev_alert(snp_dev->dev,
> +			  "Detected error from ASP request. rc: %d, fw_err: %lu\n",
> +			  rc, fw_err);
> +		goto disable_vmpck;
> +	}
>  
> -	/*
> -	 * The verify_and_dec_payload() will fail only if the hypervisor is
> -	 * actively modifying the message header or corrupting the encrypted payload.
> -	 * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
> -	 * the key cannot be used for any communication. The key is disabled to ensure
> -	 * that AES-GCM does not use the same IV while encrypting the request payload.
> -	 */
>  	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
>  	if (rc) {
>  		dev_alert(snp_dev->dev,
> -			  "Detected unexpected decode failure, disabling the vmpck_id %d\n",
> -			  vmpck_id);
> -		snp_disable_vmpck(snp_dev);
> -		return rc;
> +			  "Detected unexpected decode failure from ASP. rc: %d\n",
> +			  rc);
> +		goto disable_vmpck;
>  	}
>  
>  	/* Increment to new message sequence after payload decryption was successful. */
>  	snp_inc_msg_seqno(snp_dev);
>  
>  	return 0;
> +
> +disable_vmpck:
> +	snp_disable_vmpck(snp_dev);
> +	return rc;
>  }
>  
>  static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 10:03, Peter Gonda wrote:
> > The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> > communicate securely with each other. [...]
>
> I wonder if we can at least still support the extended report length query
> by having the kernel allocate the required pages when the error is
> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
> no errors on the second request, the sequence numbers can be safely
> updated, but the kernel returns the original error (which will provide the
> caller with the number of pages required).

I think we can, but I thought fixing the security bug could come first,
then the usability fix after. Dionna was planning on working on that fix.

In that flow how does userspace get the data? It called the ioctl with
not enough output buffer space. What if userspace calls the ioctl with no
buffer space allocated, so it's trying to query the length? We just send
the host the request without any encrypted data.

> For the rate-limiting patch series [1], the rate-limiting will have to be
> performed within the kernel, while the mutex is held, and then retry the
> exact request again. Otherwise, that error will require disabling the
> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>
> Thoughts?
>
> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/

Yes, I think if the host rate limits the guest, the guest kernel should
retry the exact message. Which mutex are you referring to?
On 10/19/22 12:40, Peter Gonda wrote:
> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 10/19/22 10:03, Peter Gonda wrote:
>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>> communicate securely with each other. [...]

I think I've wrapped my head around this now. Any non-zero return code
from the hypervisor for an SNP Guest Request is either a hypervisor error
or an sev-guest driver error, and so the VMPCK should be disabled. The
sev-guest driver is really doing everything (message headers, performing
the encryption, etc.) and is only using userspace data that will be part
of the response message and can't result in a non-zero hypervisor return
code.

For the SNP Extended Guest Request, we only need to special case a return
code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.

>> I wonder if we can at least still support the extended report length query
>> by having the kernel allocate the required pages when the error is
>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
>> no errors on the second request, the sequence numbers can be safely
>> updated, but the kernel returns the original error (which will provide the
>> caller with the number of pages required).
>
> I think we can but I thought fixing the security bug could come first,
> then the usability fix after. Dionna was planning on working on that
> fix.
>
> In that flow how does userspace get the data? Its called the ioctl
> with not enough output buffer space. What if the userspace calls the
> ioctl with no buffers space allocated, so its trying to query the
> length. We just send the host the request without any encrypted data.

In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
if it hasn't supplied enough buffer space. But, the sev-guest driver can
supply enough buffer space and invoke the SNP Extended Guest Request again
in order to successfully complete the call and update the sequence
numbers. The sev-guest driver would just discard the data in this case,
but pass back the original "not enough buffer space" error to the caller,
who could now allocate space and retry. This then allows the sequence
numbers to be bumped properly.

>> For the rate-limiting patch series [1], the rate-limiting will have to be
>> performed within the kernel, while the mutex is held, and then retry the
>> exact request again. Otherwise, that error will require disabling the
>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>
>> Thoughts?
>>
>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>
> Yes I think if the host rate limits the guest. The guest kernel should
> retry the exact message. Which mutex are you referring too?

Or the host waits and then submits the request and the guest kernel
doesn't have to do anything. The mutex I'm referring to is the
snp_cmd_mutex that is taken in snp_guest_ioctl().

Thanks,
Tom
On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 12:40, Peter Gonda wrote:
> > [...]
>
> I think I've wrapped my head around this now. Any non-zero return code
> from the hypervisor for an SNP Guest Request is either a hypervisor error
> or an sev-guest driver error, and so the VMPCK should be disabled. [...]
>
> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
> if it hasn't supplied enough buffer space. But, the sev-guest driver can
> supply enough buffer space and invoke the SNP Extended Guest Request again
> in order to successfully complete the call and update the sequence
> numbers. The sev-guest driver would just discard the data in this case,
> but pass back the original "not enough buffer space" error to the caller,
> who could now allocate space and retry. This then allows the sequence
> numbers to be bumped properly.

The way I thought to solve this was to make certificate length querying a
part of the specified protocol.

The first ext_guest_request command /must/ query the certificate buffer
length with req.certs_len == 0.

By making this part of the protocol, the sev-guest driver can check if
the certificate length has been requested before. If so, emulate the
host's VMM error code for invalid length without sending an encrypted
message.

If not, then send an all zeroes request buffer with the req.certs_len = 0
values to the VMM.

The VMM will respond with the size if indeed the expected_pages are > 0.
In the case that the host has not set the certificate buffer yet, then
the host will inspect the header of the request page for a zero sequence
number. If so, then we know that we don't have a valid request. We treat
this also as the INVALID_LEN case but still return the size of 0. The
driver will have the expected pages value stored as 0 at this point, so
subsequent calls will not have this behavior.

The way /dev/sev-guest user code has been written, I don't think this
will break any existing software package.

> Or the host waits and then submits the request and the guest kernel
> doesn't have to do anything. The mutex I'm referring to is the
> snp_cmd_mutex that is taken in snp_guest_ioctl().

I think that either the host kernel or guest kernel waiting can lead to
unacceptable delays. I would recommend that we add a zero argument ioctl
to /dev/sev-guest specifically for retrying the last request.

We can know what the last request is due to the sev_cmd_mutex
serialization. The driver will just keep a scratch buffer for this. Any
other request that comes in without resolving the retry will get an
-EBUSY error code. Calling the retry ioctl without a pending command will
result in -EINVAL.

Let me know what you think.
On 10/19/22 14:17, Dionna Amalie Glaze wrote:
> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>> [...]
>
> The way I thought to solve this was to make certificate length
> querying a part of the specified protocol.
>
> The first ext_guest_request command /must/ query the certificate
> buffer length with req.certs_len == 0.

This becomes an incompatible change to the GHCB specification.

> By making this part of the protocol, the sev-guest driver can check if
> the certificate length has been requested before.
> If so, emulate the host's VMM error code for invalid length without
> sending an encrypted message.

On the hypervisor side, the certificate blob can be replaced at any time
with a new blob that is larger. So you may still have to handle the case
where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked
before.

> If not, then send an all zeroes request buffer with the req.certs_len
> = 0 values to the VMM.
>
> The VMM will respond with the size if indeed the expected_pages are >
> 0. In the case that the host has not set the certificate buffer yet,
> then the host will inspect the header of the request page for a zero
> sequence number. If so, then we know that we don't have a valid
> request. We treat this also as the INVALID_LEN case but still return
> the size of 0. The driver will have the expected pages value stored as
> 0 at this point, so subsequent calls will not have this behavior.
>
> The way /dev/sev-guest user code has been written, I don't think this
> will break any existing software package.

I think having the sev-guest driver re-issue the request with the
internal buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better
way to go. You could still cache the size request and always return that
to user-space when a request is received with a 0 length. The user-space
program must be able to handle receiving multiple
SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
the hypervisor can be updating the certs asynchronously. And if you get a
request that is not 0 length, then you issue it as such and re-use the
logic of the first 0 length request that was received if you get an
SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.

Peter, is this something you could change the patch to do?

> I think that either the host kernel or guest kernel waiting can lead
> to unacceptable delays.
> I would recommend that we add a zero argument ioctl to /dev/sev-guest
> specifically for retrying the last request.
>
> We can know what the last request is due to the sev_cmd_mutex serialization.
> The driver will just keep a scratch buffer for this. Any other request
> that comes in without resolving the retry will get an -EBUSY error
> code.

And the first caller will have received an -EAGAIN in order to
differentiate between the two situations?

> Calling the retry ioctl without a pending command will result in -EINVAL.
>
> Let me know what you think.

I think that sounds reasonable, but there are some catches. You will need
to ensure that the caller that is supposed to retry does actually retry
and that a caller that does retry is the same caller that was told to
retry.

Thanks,
Tom
On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: > > On 10/19/22 14:17, Dionna Amalie Glaze wrote: > > On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >> > >> On 10/19/22 12:40, Peter Gonda wrote: > >>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >>>> > >>>> On 10/19/22 10:03, Peter Gonda wrote: > >>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to > >>>>> communicate securely with each other. The IV to this scheme is a > >>>>> sequence number that both the ASP and the guest track. Currently this > >>>>> sequence number in a guest request must exactly match the sequence > >>>>> number tracked by the ASP. This means that if the guest sees an error > >>>>> from the host during a request it can only retry that exact request or > >>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV > >>>>> reuse see: > >>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf > >>>> > >> > >> I think I've wrapped my head around this now. Any non-zero return code > >> from the hypervisor for an SNP Guest Request is either a hypervisor error > >> or an sev-guest driver error, and so the VMPCK should be disabled. The > >> sev-guest driver is really doing everything (message headers, performing > >> the encryption, etc.) and is only using userspace data that will be part > >> of the response message and can't result in a non-zero hypervisor return code. > >> > >> For the SNP Extended Guest Request, we only need to special case a return > >> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that. > >> > >> > >>>> I wonder if we can at least still support the extended report length query > >>>> by having the kernel allocate the required pages when the error is > >>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. 
If there are > >>>> no errors on the second request, the sequence numbers can be safely > >>>> updated, but the kernel returns the original error (which will provide the > >>>> caller with the number of pages required). > >>> > >>> I think we can but I thought fixing the security bug could come first, > >>> then the usability fix after. Dionna was planning on working on that > >>> fix. > >>> > >>> In that flow how does userspace get the data? Its called the ioctl > >>> with not enough output buffer space. What if the userspace calls the > >>> ioctl with no buffers space allocated, so its trying to query the > >>> length. We just send the host the request without any encrypted data. > >> > >> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data > >> if it hasn't supplied enough buffer space. But, the sev-guest driver can > >> supply enough buffer space and invoke the SNP Extended Guest Request again > >> in order to successfully complete the call and update the sequence > >> numbers. The sev-guest driver would just discard the data in this case, > >> but pass back the original "not enough buffer space" error to the caller, > >> who could now allocate space and retry. This then allows the sequence > >> numbers to be bumped properly. > >> > > > > The way I thought to solve this was to make certificate length > > querying a part of the specified protocol. > > > > The first ext_guest_request command /must/ query the certificate > > buffer length with req.certs_len == 0. > > This becomes an incompatible change to the GHCB specification. > > > By making this part of the protocol, the sev-guest driver can check if > > the certificate length has been requested before. > > If so, emulate the host's VMM error code for invalid length without > > sending an encrypted message. > > On the hypervisor side, the certificate blob can be replaced at any time > with a new blob that is larger. 
So you may still have to handle the case
> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.

Ah, I forgot the host could keep changing the size of this data.

>
> > If not, then send an all zeroes request buffer with the req.certs_len
> > = 0 values to the VMM.
> >
> > The VMM will respond with the size if indeed the expected_pages are >
> > 0. In the case that the host has not set the certificate buffer yet,
> > then the host will inspect the header of the request page for a zero
> > sequence number. If so, then we know that we don't have a valid
> > request. We treat this also as the INVALID_LEN case but still return
> > the size of 0. The driver will have the expected pages value stored as
> > 0 at this point, so subsequent calls will not have this behavior.
> >
> > The way /dev/sev-guest user code has been written, I don't think this
> > will break any existing software package.
>
> I think having the sev-guest driver re-issue the request with the internal
> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.
> You could still cache the size request and always return that to
> user-space when a request is received with a 0 length. The user-space
> program must be able to handle receiving multiple
> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
> the hypervisor can be updating the certs asynchronously. And if you get a
> request that is not 0 length, then you issue it as such and re-use the
> logic of the first 0 length request that was received if you get an
> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
>
> Peter, is this something you could change the patch to do?

OK so the guest retries with the same request when it gets an
SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
hold the certificates. When it finally gets a successful request with
certs, do we want to return the attestation bits to userspace but
leave out the certificate data?
Or just error out the ioctl completely?

I can do that in this series.

>
> >
> >>>
> >>>>
> >>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>> performed within the kernel, while the mutex is held, and then retry the
> >>>> exact request again. Otherwise, that error will require disabling the
> >>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> >>>
> >>> Yes I think if the host rate limits the guest. The guest kernel should
> >>> retry the exact message. Which mutex are you referring too?
> >>
> >> Or the host waits and then submits the request and the guest kernel
> >> doesn't have to do anything. The mutex I'm referring to is the
> >> snp_cmd_mutex that is taken in snp_guest_ioctl().
> >
> > I think that either the host kernel or guest kernel waiting can lead
> > to unacceptable delays.
> > I would recommend that we add a zero argument ioctl to /dev/sev-guest
> > specifically for retrying the last request.
> >
> > We can know what the last request is due to the sev_cmd_mutex serialization.
> > The driver will just keep a scratch buffer for this. Any other request
> > that comes in without resolving the retry will get an -EBUSY error
> > code.
>
> And the first caller will have received an -EAGAIN in order to
> differentiate between the two situations?
>
> >
> > Calling the retry ioctl without a pending command will result in -EINVAL.
> >
> > Let me know what you think.
>
> I think that sounds reasonable, but there are some catches. You will need
> to ensure that the caller that is supposed to retry does actually retry
> and that a caller that does retry is the same caller that was told to retry.

What's the issue with the guest driver taking some time?

This sounds complex because there may be many users of the driver. How
do multiple users coordinate when they need to use the retry ioctl?
> > Thanks, > Tom > > >> > >> Thanks, > >> Tom > > > > > >
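The SNP_GUEST_REQ_INVALID_LEN flow Tom describes above — the driver re-issues the request with its own sufficiently large buffer so the sequence numbers still advance, then hands the original "not enough buffer space" error back to user space — can be modeled roughly as follows. This is a toy model, not the real driver or GHCB interface; the error value and all function names are illustrative:

```python
SNP_GUEST_REQ_INVALID_LEN = 0x100000000  # illustrative error code, not the spec value

class HostModel:
    """Toy host: holds a cert blob and tracks the message sequence number."""

    def __init__(self, certs_len):
        self.certs_len = certs_len
        self.seqno = 0

    def ext_guest_request(self, buf_len):
        if buf_len < self.certs_len:
            # Request NOT consumed: guest must retry the exact message.
            return SNP_GUEST_REQ_INVALID_LEN, self.certs_len
        self.seqno += 1  # request consumed, sequence number advances
        return 0, self.certs_len

def driver_ext_request(host, user_buf_len, internal_buf_len):
    """Model of the driver path: on INVALID_LEN, re-issue with the
    internal buffer so seqnos advance, but return the original error."""
    rc, needed = host.ext_guest_request(user_buf_len)
    if rc == SNP_GUEST_REQ_INVALID_LEN:
        rc2, _ = host.ext_guest_request(internal_buf_len)
        assert rc2 == 0, "internal buffer assumed large enough"
        # Certs from the internal buffer are discarded; the caller
        # retries with `needed` bytes of its own buffer space.
        return SNP_GUEST_REQ_INVALID_LEN, needed
    return rc, needed
```

The key property the model shows: even when user space gets the length error, the sequence numbers have been bumped, so the next (properly sized) request is a fresh message rather than a forced exact retry.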
On Wed, Oct 19, 2022 at 12:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: > > On 10/19/22 14:17, Dionna Amalie Glaze wrote: > > On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >> > >> On 10/19/22 12:40, Peter Gonda wrote: > >>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >>>> > >>>> On 10/19/22 10:03, Peter Gonda wrote: > >>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to > >>>>> communicate securely with each other. The IV to this scheme is a > >>>>> sequence number that both the ASP and the guest track. Currently this > >>>>> sequence number in a guest request must exactly match the sequence > >>>>> number tracked by the ASP. This means that if the guest sees an error > >>>>> from the host during a request it can only retry that exact request or > >>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV > >>>>> reuse see: > >>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf > >>>> > >> > >> I think I've wrapped my head around this now. Any non-zero return code > >> from the hypervisor for an SNP Guest Request is either a hypervisor error > >> or an sev-guest driver error, and so the VMPCK should be disabled. The > >> sev-guest driver is really doing everything (message headers, performing > >> the encryption, etc.) and is only using userspace data that will be part > >> of the response message and can't result in a non-zero hypervisor return code. > >> > >> For the SNP Extended Guest Request, we only need to special case a return > >> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that. > >> > >> > >>>> I wonder if we can at least still support the extended report length query > >>>> by having the kernel allocate the required pages when the error is > >>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. 
If there are > >>>> no errors on the second request, the sequence numbers can be safely > >>>> updated, but the kernel returns the original error (which will provide the > >>>> caller with the number of pages required). > >>> > >>> I think we can but I thought fixing the security bug could come first, > >>> then the usability fix after. Dionna was planning on working on that > >>> fix. > >>> > >>> In that flow how does userspace get the data? Its called the ioctl > >>> with not enough output buffer space. What if the userspace calls the > >>> ioctl with no buffers space allocated, so its trying to query the > >>> length. We just send the host the request without any encrypted data. > >> > >> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data > >> if it hasn't supplied enough buffer space. But, the sev-guest driver can > >> supply enough buffer space and invoke the SNP Extended Guest Request again > >> in order to successfully complete the call and update the sequence > >> numbers. The sev-guest driver would just discard the data in this case, > >> but pass back the original "not enough buffer space" error to the caller, > >> who could now allocate space and retry. This then allows the sequence > >> numbers to be bumped properly. > >> > > > > The way I thought to solve this was to make certificate length > > querying a part of the specified protocol. > > > > The first ext_guest_request command /must/ query the certificate > > buffer length with req.certs_len == 0. > > This becomes an incompatible change to the GHCB specification. > > > By making this part of the protocol, the sev-guest driver can check if > > the certificate length has been requested before. > > If so, emulate the host's VMM error code for invalid length without > > sending an encrypted message. > > On the hypervisor side, the certificate blob can be replaced at any time > with a new blob that is larger. 
So you may still have to handle the case > where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before. > > > If not, then send an all zeroes request buffer with the req.certs_len > > = 0 values to the VMM. > > > > The VMM will respond with the size if indeed the expected_pages are > > > 0. In the case that the host has not set the certificate buffer yet, > > then the host will inspect the header of the request page for a zero > > sequence number. If so, then we know that we don't have a valid > > request. We treat this also as the INVALID_LEN case but still return > > the size of 0. The driver will have the expected pages value stored as > > 0 at this point, so subsequent calls will not have this behavior. > > > > The way /dev/sev-guest user code has been written, I don't think this > > will break any existing software package. > > I think having the sev-guest driver re-issue the request with the internal > buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go. I take it you mean in the case that the host's certs_len == 0? > You could still cache the size request and always return that to > user-space when a request is received with a 0 length. The user-space > program must be able to handle receiving multiple > SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that > the hypervisor can be updating the certs asynchronously. And if you get a > request that is not 0 length, then you issue it as such and re-use the > logic of the first 0 length request that was received if you get an > SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value. > A request that gets SNP_GUEST_REQ_INVALID_LEN when the guest expects that it is providing a sufficiently sized certificate buffer means that the guest has encrypted its report request. We then have a harder problem than throttling because not only do we have to reissue the same request, it must be with different certificate arguments provided from user space. 
> Peter, is this something you could change the patch to do? > > > > >>> > >>>> > >>>> For the rate-limiting patch series [1], the rate-limiting will have to be > >>>> performed within the kernel, while the mutex is held, and then retry the > >>>> exact request again. Otherwise, that error will require disabling the > >>>> VMPCK. Either that, or the hypervisor must provide the rate limiting. > >>>> > >>>> Thoughts? > >>>> > >>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/ > >>> > >>> Yes I think if the host rate limits the guest. The guest kernel should > >>> retry the exact message. Which mutex are you referring too? > >> > >> Or the host waits and then submits the request and the guest kernel > >> doesn't have to do anything. The mutex I'm referring to is the > >> snp_cmd_mutex that is taken in snp_guest_ioctl(). > > > > I think that either the host kernel or guest kernel waiting can lead > > to unacceptable delays. > > I would recommend that we add a zero argument ioctl to /dev/sev-guest > > specifically for retrying the last request. > > > > We can know what the last request is due to the sev_cmd_mutex serialization. > > The driver will just keep a scratch buffer for this. Any other request > > that comes in without resolving the retry will get an -EBUSY error > > code. > > And the first caller will have received an -EAGAIN in order to > differentiate between the two situations? Yes, the throttled caller gets -EAGAIN, and other ioctls other than retry after that get -EBUSY. > > I think that sounds reasonable, but there are some catches. You will need > to ensure that the caller that is supposed to retry does actually retry > and that a caller that does retry is the same caller that was told to retry. > I think that constitutes a change to task_struct, the way that there's a buffer for interrupted system calls. That seems a bit much. Do we have to model for protocol-breaking user tasks that have access to /dev/sev-guest? 
The caller that gets -EAGAIN knows to retry. There's no reason for other tasks to retry due to command serialization and the -EBUSY behavior.
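The constraint underlying this whole thread — the AES-GCM IV is the VMPCK message sequence number, so a request may only ever be retried byte-for-byte identical, never re-encrypted under the same number — can be stated as an abstract model. No real cryptography here, just the reuse rule from the quoted patch description; the class and its fields are invented for illustration:

```python
class VMPCKChannel:
    """Abstract model of the guest side of a VMPCK channel.

    Tracks which plaintext was bound to each sequence number (the
    AES-GCM IV). Encrypting *different* data under a previously used
    seqno is the fatal IV-reuse case; the only safe options are an
    exact retry or disabling the key entirely.
    """

    def __init__(self):
        self.sent = {}      # seqno -> message bytes already encrypted
        self.seqno = 1
        self.disabled = False

    def send(self, msg):
        if self.disabled:
            raise RuntimeError("VMPCK disabled, no further use")
        prev = self.sent.get(self.seqno)
        if prev is not None and prev != msg:
            # Would encrypt different data under the same IV: forbidden.
            self.disabled = True
            raise RuntimeError("IV reuse: disable VMPCK")
        self.sent[self.seqno] = msg

    def host_accepted(self):
        # Request and response each consume a sequence number.
        self.seqno += 2
```

An exact retry (same bytes, same seqno) is safe; changing the request before the host has consumed the old one forces the key to be thrown away, which is why the retry-ioctl and wait-in-driver designs above both insist on re-sending the identical message.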
On 10/19/22 15:39, Peter Gonda wrote: > On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: >> >> On 10/19/22 14:17, Dionna Amalie Glaze wrote: >>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>> >>>> On 10/19/22 12:40, Peter Gonda wrote: >>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>>>> >>>>>> On 10/19/22 10:03, Peter Gonda wrote: >>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to >>>>>>> communicate securely with each other. The IV to this scheme is a >>>>>>> sequence number that both the ASP and the guest track. Currently this >>>>>>> sequence number in a guest request must exactly match the sequence >>>>>>> number tracked by the ASP. This means that if the guest sees an error >>>>>>> from the host during a request it can only retry that exact request or >>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV >>>>>>> reuse see: >>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf >>>>>> >>>> >>>> I think I've wrapped my head around this now. Any non-zero return code >>>> from the hypervisor for an SNP Guest Request is either a hypervisor error >>>> or an sev-guest driver error, and so the VMPCK should be disabled. The >>>> sev-guest driver is really doing everything (message headers, performing >>>> the encryption, etc.) and is only using userspace data that will be part >>>> of the response message and can't result in a non-zero hypervisor return code. >>>> >>>> For the SNP Extended Guest Request, we only need to special case a return >>>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that. 
>>>> >>>> >>>>>> I wonder if we can at least still support the extended report length query >>>>>> by having the kernel allocate the required pages when the error is >>>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are >>>>>> no errors on the second request, the sequence numbers can be safely >>>>>> updated, but the kernel returns the original error (which will provide the >>>>>> caller with the number of pages required). >>>>> >>>>> I think we can but I thought fixing the security bug could come first, >>>>> then the usability fix after. Dionna was planning on working on that >>>>> fix. >>>>> >>>>> In that flow how does userspace get the data? Its called the ioctl >>>>> with not enough output buffer space. What if the userspace calls the >>>>> ioctl with no buffers space allocated, so its trying to query the >>>>> length. We just send the host the request without any encrypted data. >>>> >>>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data >>>> if it hasn't supplied enough buffer space. But, the sev-guest driver can >>>> supply enough buffer space and invoke the SNP Extended Guest Request again >>>> in order to successfully complete the call and update the sequence >>>> numbers. The sev-guest driver would just discard the data in this case, >>>> but pass back the original "not enough buffer space" error to the caller, >>>> who could now allocate space and retry. This then allows the sequence >>>> numbers to be bumped properly. >>>> >>> >>> The way I thought to solve this was to make certificate length >>> querying a part of the specified protocol. >>> >>> The first ext_guest_request command /must/ query the certificate >>> buffer length with req.certs_len == 0. >> >> This becomes an incompatible change to the GHCB specification. >> >>> By making this part of the protocol, the sev-guest driver can check if >>> the certificate length has been requested before. 
>>> If so, emulate the host's VMM error code for invalid length without >>> sending an encrypted message. >> >> On the hypervisor side, the certificate blob can be replaced at any time >> with a new blob that is larger. So you may still have to handle the case >> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before. > > Ah, I forgot the host could keep changing the size of this data. > >> >>> If not, then send an all zeroes request buffer with the req.certs_len >>> = 0 values to the VMM. >>> >>> The VMM will respond with the size if indeed the expected_pages are > >>> 0. In the case that the host has not set the certificate buffer yet, >>> then the host will inspect the header of the request page for a zero >>> sequence number. If so, then we know that we don't have a valid >>> request. We treat this also as the INVALID_LEN case but still return >>> the size of 0. The driver will have the expected pages value stored as >>> 0 at this point, so subsequent calls will not have this behavior. >>> >>> The way /dev/sev-guest user code has been written, I don't think this >>> will break any existing software package. >> >> I think having the sev-guest driver re-issue the request with the internal >> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go. >> You could still cache the size request and always return that to >> user-space when a request is received with a 0 length. The user-space >> program must be able to handle receiving multiple >> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that >> the hypervisor can be updating the certs asynchronously. And if you get a >> request that is not 0 length, then you issue it as such and re-use the >> logic of the first 0 length request that was received if you get an >> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value. >> >> Peter, is this something you could change the patch to do? 
>
> OK so the guest retires with the same request when it gets an
> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to

It would just use the pre-allocated snp_dev->certs_data buffer with
npages set to the full size of that buffer.

> hold the certificates. When it finally gets a successful request w/
> certs. Do we want to return the attestation bits to userspace, but
> leave out the certificate data. Or just error out the ioctl
> completely?

We need to be able to return the attestation bits that came back with the
extra certs. So just error out of the ioctl with the length error and let
user-space retry with the recommended number of pages.

>
> I can do that in this series.

Thanks!

>
>>
>>>
>>>>>
>>>>>>
>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>>>> performed within the kernel, while the mutex is held, and then retry the
>>>>>> exact request again. Otherwise, that error will require disabling the
>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>>>
>>>>> Yes I think if the host rate limits the guest. The guest kernel should
>>>>> retry the exact message. Which mutex are you referring too?
>>>>
>>>> Or the host waits and then submits the request and the guest kernel
>>>> doesn't have to do anything. The mutex I'm referring to is the
>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
>>>
>>> I think that either the host kernel or guest kernel waiting can lead
>>> to unacceptable delays.
>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
>>> specifically for retrying the last request.
>>>
>>> We can know what the last request is due to the sev_cmd_mutex serialization.
>>> The driver will just keep a scratch buffer for this. Any other request
>>> that comes in without resolving the retry will get an -EBUSY error
>>> code.
>> >> And the first caller will have received an -EAGAIN in order to >> differentiate between the two situations? >> >>> >>> Calling the retry ioctl without a pending command will result in -EINVAL. >>> >>> Let me know what you think. >> >> I think that sounds reasonable, but there are some catches. You will need >> to ensure that the caller that is supposed to retry does actually retry >> and that a caller that does retry is the same caller that was told to retry. > > Whats the issue with the guest driver taking some time? > > This sounds complex because there may be many users of the driver. How > do multiple users coordinate when they need to use the retry ioctl? > >> >> Thanks, >> Tom >> >>>> >>>> Thanks, >>>> Tom >>> >>> >>>
On 10/19/22 15:40, Dionna Amalie Glaze wrote: > On Wed, Oct 19, 2022 at 12:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: >> >> On 10/19/22 14:17, Dionna Amalie Glaze wrote: >>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>> >>>> On 10/19/22 12:40, Peter Gonda wrote: >>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>>>> >>>>>> On 10/19/22 10:03, Peter Gonda wrote: >>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to >>>>>>> communicate securely with each other. The IV to this scheme is a >>>>>>> sequence number that both the ASP and the guest track. Currently this >>>>>>> sequence number in a guest request must exactly match the sequence >>>>>>> number tracked by the ASP. This means that if the guest sees an error >>>>>>> from the host during a request it can only retry that exact request or >>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV >>>>>>> reuse see: >>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf >>>>>> >>>> >>>> I think I've wrapped my head around this now. Any non-zero return code >>>> from the hypervisor for an SNP Guest Request is either a hypervisor error >>>> or an sev-guest driver error, and so the VMPCK should be disabled. The >>>> sev-guest driver is really doing everything (message headers, performing >>>> the encryption, etc.) and is only using userspace data that will be part >>>> of the response message and can't result in a non-zero hypervisor return code. >>>> >>>> For the SNP Extended Guest Request, we only need to special case a return >>>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that. 
>>>> >>>> >>>>>> I wonder if we can at least still support the extended report length query >>>>>> by having the kernel allocate the required pages when the error is >>>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are >>>>>> no errors on the second request, the sequence numbers can be safely >>>>>> updated, but the kernel returns the original error (which will provide the >>>>>> caller with the number of pages required). >>>>> >>>>> I think we can but I thought fixing the security bug could come first, >>>>> then the usability fix after. Dionna was planning on working on that >>>>> fix. >>>>> >>>>> In that flow how does userspace get the data? Its called the ioctl >>>>> with not enough output buffer space. What if the userspace calls the >>>>> ioctl with no buffers space allocated, so its trying to query the >>>>> length. We just send the host the request without any encrypted data. >>>> >>>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data >>>> if it hasn't supplied enough buffer space. But, the sev-guest driver can >>>> supply enough buffer space and invoke the SNP Extended Guest Request again >>>> in order to successfully complete the call and update the sequence >>>> numbers. The sev-guest driver would just discard the data in this case, >>>> but pass back the original "not enough buffer space" error to the caller, >>>> who could now allocate space and retry. This then allows the sequence >>>> numbers to be bumped properly. >>>> >>> >>> The way I thought to solve this was to make certificate length >>> querying a part of the specified protocol. >>> >>> The first ext_guest_request command /must/ query the certificate >>> buffer length with req.certs_len == 0. >> >> This becomes an incompatible change to the GHCB specification. >> >>> By making this part of the protocol, the sev-guest driver can check if >>> the certificate length has been requested before. 
>>> If so, emulate the host's VMM error code for invalid length without >>> sending an encrypted message. >> >> On the hypervisor side, the certificate blob can be replaced at any time >> with a new blob that is larger. So you may still have to handle the case >> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before. >> >>> If not, then send an all zeroes request buffer with the req.certs_len >>> = 0 values to the VMM. >>> >>> The VMM will respond with the size if indeed the expected_pages are > >>> 0. In the case that the host has not set the certificate buffer yet, >>> then the host will inspect the header of the request page for a zero >>> sequence number. If so, then we know that we don't have a valid >>> request. We treat this also as the INVALID_LEN case but still return >>> the size of 0. The driver will have the expected pages value stored as >>> 0 at this point, so subsequent calls will not have this behavior. >>> >>> The way /dev/sev-guest user code has been written, I don't think this >>> will break any existing software package. > >> >> I think having the sev-guest driver re-issue the request with the internal >> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go. > > I take it you mean in the case that the host's certs_len == 0? Not sure what you mean. The sev-guest driver has an internal buffer for receiving the certs, snp_dev->certs_data, and it would use that whenever it receives an SNP_GUEST_REQ_INVALID_LEN return code. > >> You could still cache the size request and always return that to >> user-space when a request is received with a 0 length. The user-space >> program must be able to handle receiving multiple >> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that >> the hypervisor can be updating the certs asynchronously. 
And if you get a >> request that is not 0 length, then you issue it as such and re-use the >> logic of the first 0 length request that was received if you get an >> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value. >> > > A request that gets SNP_GUEST_REQ_INVALID_LEN when the guest expects > that it is providing a sufficiently sized certificate buffer means > that the guest has encrypted its report request. Correct. > We then have a harder problem than throttling because not only do we > have to reissue the same request, it must be with different > certificate arguments provided from user space. Correct. But before returning the error to userspace, the sev-guest driver will issue the request again with its internal buffer so that the sequence numbers are updated and a new request can be issued. > >> Peter, is this something you could change the patch to do? >> >>> >>>>> >>>>>> >>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be >>>>>> performed within the kernel, while the mutex is held, and then retry the >>>>>> exact request again. Otherwise, that error will require disabling the >>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting. >>>>>> >>>>>> Thoughts? >>>>>> >>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/ >>>>> >>>>> Yes I think if the host rate limits the guest. The guest kernel should >>>>> retry the exact message. Which mutex are you referring too? >>>> >>>> Or the host waits and then submits the request and the guest kernel >>>> doesn't have to do anything. The mutex I'm referring to is the >>>> snp_cmd_mutex that is taken in snp_guest_ioctl(). >>> >>> I think that either the host kernel or guest kernel waiting can lead >>> to unacceptable delays. >>> I would recommend that we add a zero argument ioctl to /dev/sev-guest >>> specifically for retrying the last request. >>> >>> We can know what the last request is due to the sev_cmd_mutex serialization. 
>>> The driver will just keep a scratch buffer for this. Any other request
>>> that comes in without resolving the retry will get an -EBUSY error
>>> code.
>>
>> And the first caller will have received an -EAGAIN in order to
>> differentiate between the two situations?
>
> Yes, the throttled caller gets -EAGAIN, and other ioctls other than
> retry after that get -EBUSY.
>
>>
>> I think that sounds reasonable, but there are some catches. You will need
>> to ensure that the caller that is supposed to retry does actually retry
>> and that a caller that does retry is the same caller that was told to retry.
>>
>
> I think that constitutes a change to task_struct, the way that there's
> a buffer for interrupted system calls.
> That seems a bit much. Do we have to model for protocol-breaking user
> tasks that have access to /dev/sev-guest?
> The caller that gets -EAGAIN knows to retry. There's no reason for
> other tasks to retry due to command serialization and the -EBUSY
> behavior.

Maybe for well-behaved user-space applications, but that's not guaranteed.
I agree with Peter and think waiting in the sev-guest driver is the
simplest and fairest thing to do in the case of throttling.

Thanks,
Tom

>
>
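The conclusion Tom lands on — have the driver itself wait and re-issue the identical request when the host throttles, so user space never has to coordinate retries — might look like the following loop in spirit. A sketch only, with an invented "throttled" return code and a toy host; the real driver would wait under snp_cmd_mutex:

```python
import time

SNP_GUEST_REQ_THROTTLED = 2  # illustrative "rate limited" return code

def issue_with_wait(send_fn, msg, max_tries=5, delay_s=0.01):
    """Re-issue the *exact* same message while the host throttles us.

    Waiting inside the driver (with the command mutex held) keeps the
    seqno/IV pairing intact and hides the throttling from user space.
    """
    for _ in range(max_tries):
        rc = send_fn(msg)
        if rc != SNP_GUEST_REQ_THROTTLED:
            return rc
        time.sleep(delay_s)  # modest backoff before the identical retry
    return SNP_GUEST_REQ_THROTTLED

def make_throttling_host(fail_times):
    """Toy host that throttles the first `fail_times` attempts."""
    state = {"n": 0}
    def send(msg):
        state["n"] += 1
        return SNP_GUEST_REQ_THROTTLED if state["n"] <= fail_times else 0
    return send
```

Because the message bytes never change between attempts, this loop never risks IV reuse, which is the property that makes waiting in the driver safe where a user-space-visible retry protocol needs the -EAGAIN/-EBUSY bookkeeping discussed above.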
On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: > > On 10/19/22 15:39, Peter Gonda wrote: > > On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >> > >> On 10/19/22 14:17, Dionna Amalie Glaze wrote: > >>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >>>> > >>>> On 10/19/22 12:40, Peter Gonda wrote: > >>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >>>>>> > >>>>>> On 10/19/22 10:03, Peter Gonda wrote: > >>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to > >>>>>>> communicate securely with each other. The IV to this scheme is a > >>>>>>> sequence number that both the ASP and the guest track. Currently this > >>>>>>> sequence number in a guest request must exactly match the sequence > >>>>>>> number tracked by the ASP. This means that if the guest sees an error > >>>>>>> from the host during a request it can only retry that exact request or > >>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV > >>>>>>> reuse see: > >>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf > >>>>>> > >>>> > >>>> I think I've wrapped my head around this now. Any non-zero return code > >>>> from the hypervisor for an SNP Guest Request is either a hypervisor error > >>>> or an sev-guest driver error, and so the VMPCK should be disabled. The > >>>> sev-guest driver is really doing everything (message headers, performing > >>>> the encryption, etc.) and is only using userspace data that will be part > >>>> of the response message and can't result in a non-zero hypervisor return code. > >>>> > >>>> For the SNP Extended Guest Request, we only need to special case a return > >>>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that. 
> >>>>
> >>>>
> >>>>>> I wonder if we can at least still support the extended report length query
> >>>>>> by having the kernel allocate the required pages when the error is
> >>>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
> >>>>>> no errors on the second request, the sequence numbers can be safely
> >>>>>> updated, but the kernel returns the original error (which will provide the
> >>>>>> caller with the number of pages required).
> >>>>>
> >>>>> I think we can, but I thought fixing the security bug could come first and
> >>>>> the usability fix after. Dionna was planning on working on that fix.
> >>>>>
> >>>>> In that flow how does userspace get the data? It called the ioctl
> >>>>> without enough output buffer space. What if userspace calls the
> >>>>> ioctl with no buffer space allocated, so it's trying to query the
> >>>>> length? We just send the host the request without any encrypted data.
> >>>>
> >>>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
> >>>> if it hasn't supplied enough buffer space. But the sev-guest driver can
> >>>> supply enough buffer space and invoke the SNP Extended Guest Request again
> >>>> in order to successfully complete the call and update the sequence
> >>>> numbers. The sev-guest driver would just discard the data in this case,
> >>>> but pass back the original "not enough buffer space" error to the caller,
> >>>> who could now allocate space and retry. This then allows the sequence
> >>>> numbers to be bumped properly.
> >>>>
> >>>
> >>> The way I thought to solve this was to make certificate length
> >>> querying a part of the specified protocol.
> >>>
> >>> The first ext_guest_request command /must/ query the certificate
> >>> buffer length with req.certs_len == 0.
> >>
> >> This becomes an incompatible change to the GHCB specification.
> >>
> >>> By making this part of the protocol, the sev-guest driver can check if
> >>> the certificate length has been requested before.
> >>> If so, emulate the host's VMM error code for invalid length without
> >>> sending an encrypted message.
> >>
> >> On the hypervisor side, the certificate blob can be replaced at any time
> >> with a new blob that is larger. So you may still have to handle the case
> >> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked.
> >
> > Ah, I forgot the host could keep changing the size of this data.
> >
> >>
> >>> If not, then send an all-zeroes request buffer with the req.certs_len
> >>> = 0 values to the VMM.
> >>>
> >>> The VMM will respond with the size if indeed the expected_pages are > 0.
> >>> In the case that the host has not set the certificate buffer yet,
> >>> then the host will inspect the header of the request page for a zero
> >>> sequence number. If so, then we know that we don't have a valid
> >>> request. We treat this also as the INVALID_LEN case but still return
> >>> the size of 0. The driver will have the expected pages value stored as
> >>> 0 at this point, so subsequent calls will not have this behavior.
> >>>
> >>> The way /dev/sev-guest user code has been written, I don't think this
> >>> will break any existing software package.
> >>
> >> I think having the sev-guest driver re-issue the request with the internal
> >> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.
> >> You could still cache the size request and always return that to
> >> user-space when a request is received with a 0 length. The user-space
> >> program must be able to handle receiving multiple
> >> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because the
> >> hypervisor can be updating the certs asynchronously. And if you get a
> >> request that is not 0 length, then you issue it as such and re-use the
> >> logic of the first 0-length request that was received if you get an
> >> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
> >>
> >> Peter, is this something you could change the patch to do?
> >
> > OK so the guest retries with the same request when it gets an
> > SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
>
> It would just use the pre-allocated snp_dev->certs_data buffer with npages
> set to the full size of that buffer.

Actually we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe
we want to just keep the allocation we think is sufficient and never
increase it?

I see the size of
https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is
3200 bytes. Assuming the VCEK cert is about the same size (which it
should be, since this .cert is 2 certificates), 16K seems to leave
enough room even for some vendor certificates?

>
> > hold the certificates. When it finally gets a successful request w/
> > certs, do we want to return the attestation bits to userspace but
> > leave out the certificate data, or just error out the ioctl
> > completely?
>
> We need to be able to return the attestation bits that came back with the
> extra certs. So just error out of the ioctl with the length error and let
> user-space retry with the recommended number of pages.

That sounded simpler to me. Will do.

>
> >
> > I can do that in this series.
>
> Thanks!
>
> >
> >>
> >>>
> >>>>>
> >>>>>>
> >>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>>>> performed within the kernel, while the mutex is held, and then retry the
> >>>>>> exact request again. Otherwise, that error will require disabling the
> >>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> >>>>>
> >>>>> Yes, I think if the host rate limits the guest, the guest kernel should
> >>>>> retry the exact message. Which mutex are you referring to?
> >>>>
> >>>> Or the host waits and then submits the request and the guest kernel
> >>>> doesn't have to do anything. The mutex I'm referring to is the
> >>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
> >>>
> >>> I think that either the host kernel or guest kernel waiting can lead
> >>> to unacceptable delays.
> >>> I would recommend that we add a zero-argument ioctl to /dev/sev-guest
> >>> specifically for retrying the last request.
> >>>
> >>> We can know what the last request is due to the sev_cmd_mutex serialization.
> >>> The driver will just keep a scratch buffer for this. Any other request
> >>> that comes in without resolving the retry will get an -EBUSY error
> >>> code.
> >>
> >> And the first caller will have received an -EAGAIN in order to
> >> differentiate between the two situations?
> >>
> >>>
> >>> Calling the retry ioctl without a pending command will result in -EINVAL.
> >>>
> >>> Let me know what you think.
> >>
> >> I think that sounds reasonable, but there are some catches. You will need
> >> to ensure that the caller that is supposed to retry does actually retry
> >> and that a caller that does retry is the same caller that was told to retry.
> >
> > What's the issue with the guest driver taking some time?
> >
> > This sounds complex because there may be many users of the driver. How
> > do multiple users coordinate when they need to use the retry ioctl?
> >
> >>
> >> Thanks,
> >> Tom
> >>
> >>>>
> >>>> Thanks,
> >>>> Tom
> >>>
> >>>
> >>>
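The retry-ioctl bookkeeping proposed above can be sketched as a toy state machine. All names here are hypothetical illustrations of the proposal, not the actual driver: a throttled request marks a pending retry, only the zero-argument retry ioctl may resolve it, any other command sees -EBUSY, and a retry with nothing pending is -EINVAL.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the proposed retry serialization. */
struct retry_state {
	bool pending;		/* a throttled request awaits its retry */
};

static int dispatch_cmd(struct retry_state *s, bool is_retry, bool throttled)
{
	if (s->pending && !is_retry)
		return -EBUSY;	/* unrelated command while a retry is owed */
	if (!s->pending && is_retry)
		return -EINVAL;	/* nothing to retry */
	if (throttled) {
		s->pending = true;
		return -EAGAIN;	/* caller must come back via the retry ioctl */
	}
	s->pending = false;	/* request completed; any pending retry resolved */
	return 0;
}
```

Even in this tiny model, the catch Tom raises is visible: nothing ties the `pending` flag to a particular caller, so enforcing "the same caller retries" would need per-task state.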
On 10/19/22 16:47, Peter Gonda wrote: > On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: >> On 10/19/22 15:39, Peter Gonda wrote: >>> On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>> On 10/19/22 14:17, Dionna Amalie Glaze wrote: >>>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>>>> On 10/19/22 12:40, Peter Gonda wrote: >>>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>>>>>> On 10/19/22 10:03, Peter Gonda wrote: >>>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to >>>>>>>>> communicate securely with each other. The IV to this scheme is a >>>>>>>>> sequence number that both the ASP and the guest track. Currently this >>>>>>>>> sequence number in a guest request must exactly match the sequence >>>>>>>>> number tracked by the ASP. This means that if the guest sees an error >>>>>>>>> from the host during a request it can only retry that exact request or >>>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV >>>>>>>>> reuse see: >>>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf >>> OK so the guest retires with the same request when it gets an >>> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to >> >> It would just use the pre-allocated snp_dev->certs_data buffer with npages >> set to the full size of that buffer. > > Actually we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe > we want to just allocate this buffer which we think is sufficient and > never increase the allocation? > > I see the size of > https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is > 3200 bytes. Assuming the VCEK cert is the same size (which it should > be since this .cert is 2 certificates). 16K seems to leave enough room > even for some vendor certificates? 
I think just using the 16K buffer (4 pages) as it is allocated today is
ok. If we get a SNP_GUEST_REQ_INVALID_LEN error asking for more than 4
pages, then we won't ever be able to pull the certs given how the driver
is coded today. In that case, disabling the VMPCK is in order.

A separate patch could be submitted later to improve this overall aspect
of the certs buffer if needed.

Thanks,
Tom

>
>>
>>> hold the certificates. When it finally gets a successful request w/
>>> certs, do we want to return the attestation bits to userspace but
>>> leave out the certificate data, or just error out the ioctl
>>> completely?
>>
>> We need to be able to return the attestation bits that came back with the
>> extra certs. So just error out of the ioctl with the length error and let
>> user-space retry with the recommended number of pages.
>
> That sounded simpler to me. Will do.
>
>>
>>>
>>> I can do that in this series.
>>
>> Thanks!
>>
>>>
>>>>
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>>>>>> performed within the kernel, while the mutex is held, and then retry the
>>>>>>>> exact request again. Otherwise, that error will require disabling the
>>>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>>>>>
>>>>>>> Yes, I think if the host rate limits the guest, the guest kernel should
>>>>>>> retry the exact message. Which mutex are you referring to?
>>>>>>
>>>>>> Or the host waits and then submits the request and the guest kernel
>>>>>> doesn't have to do anything. The mutex I'm referring to is the
>>>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
>>>>>
>>>>> I think that either the host kernel or guest kernel waiting can lead
>>>>> to unacceptable delays.
>>>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest >>>>> specifically for retrying the last request. >>>>> >>>>> We can know what the last request is due to the sev_cmd_mutex serialization. >>>>> The driver will just keep a scratch buffer for this. Any other request >>>>> that comes in without resolving the retry will get an -EBUSY error >>>>> code. >>>> >>>> And the first caller will have received an -EAGAIN in order to >>>> differentiate between the two situations? >>>> >>>>> >>>>> Calling the retry ioctl without a pending command will result in -EINVAL. >>>>> >>>>> Let me know what you think. >>>> >>>> I think that sounds reasonable, but there are some catches. You will need >>>> to ensure that the caller that is supposed to retry does actually retry >>>> and that a caller that does retry is the same caller that was told to retry. >>> >>> Whats the issue with the guest driver taking some time? >>> >>> This sounds complex because there may be many users of the driver. How >>> do multiple users coordinate when they need to use the retry ioctl? >>> >>>> >>>> Thanks, >>>> Tom >>>> >>>>>> >>>>>> Thanks, >>>>>> Tom >>>>> >>>>> >>>>>
On Thu, Oct 20, 2022 at 8:02 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > > On 10/19/22 16:47, Peter Gonda wrote: > > On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >> On 10/19/22 15:39, Peter Gonda wrote: > >>> On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >>>> On 10/19/22 14:17, Dionna Amalie Glaze wrote: > >>>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >>>>>> On 10/19/22 12:40, Peter Gonda wrote: > >>>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: > >>>>>>>> On 10/19/22 10:03, Peter Gonda wrote: > >>>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to > >>>>>>>>> communicate securely with each other. The IV to this scheme is a > >>>>>>>>> sequence number that both the ASP and the guest track. Currently this > >>>>>>>>> sequence number in a guest request must exactly match the sequence > >>>>>>>>> number tracked by the ASP. This means that if the guest sees an error > >>>>>>>>> from the host during a request it can only retry that exact request or > >>>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV > >>>>>>>>> reuse see: > >>>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf > > >>> OK so the guest retires with the same request when it gets an > >>> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to > >> > >> It would just use the pre-allocated snp_dev->certs_data buffer with npages > >> set to the full size of that buffer. > > > > Actually we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe > > we want to just allocate this buffer which we think is sufficient and > > never increase the allocation? > > > > I see the size of > > https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is > > 3200 bytes. 
> > Assuming the VCEK cert is about the same size (which it should
> > be, since this .cert is 2 certificates), 16K seems to leave enough room
> > even for some vendor certificates?
>
> I think just using the 16K buffer (4 pages) as it is allocated today is
> ok. If we get a SNP_GUEST_REQ_INVALID_LEN error asking for more than 4
> pages, then we won't ever be able to pull the certs given how the driver
> is coded today. In that case, disabling the VMPCK is in order.
>
> A separate patch could be submitted later to improve this overall aspect
> of the certs buffer if needed.

If that sounds OK I'd prefer that. This keeps the driver's current limit:

static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
...
	if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
	    !IS_ALIGNED(req.certs_len, PAGE_SIZE))
		return -EINVAL;

I'd prefer not to add extra features during the bug fix. But I'm happy to
make this work with buffers greater than SEV_FW_BLOB_MAX_SIZE as a
follow-up if you want.

>
> Thanks,
> Tom
>
> >
> >>
> >>> hold the certificates. When it finally gets a successful request w/
> >>> certs, do we want to return the attestation bits to userspace but
> >>> leave out the certificate data, or just error out the ioctl
> >>> completely?
> >>
> >> We need to be able to return the attestation bits that came back with the
> >> extra certs. So just error out of the ioctl with the length error and let
> >> user-space retry with the recommended number of pages.
> >
> > That sounded simpler to me. Will do.
> >
> >>
> >>>
> >>> I can do that in this series.
> >>
> >> Thanks!
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>>>>>> performed within the kernel, while the mutex is held, and then retry the
> >>>>>>>> exact request again. Otherwise, that error will require disabling the
> >>>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>>>>> > >>>>>>>> Thoughts? > >>>>>>>> > >>>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/ > >>>>>>> > >>>>>>> Yes I think if the host rate limits the guest. The guest kernel should > >>>>>>> retry the exact message. Which mutex are you referring too? > >>>>>> > >>>>>> Or the host waits and then submits the request and the guest kernel > >>>>>> doesn't have to do anything. The mutex I'm referring to is the > >>>>>> snp_cmd_mutex that is taken in snp_guest_ioctl(). > >>>>> > >>>>> I think that either the host kernel or guest kernel waiting can lead > >>>>> to unacceptable delays. > >>>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest > >>>>> specifically for retrying the last request. > >>>>> > >>>>> We can know what the last request is due to the sev_cmd_mutex serialization. > >>>>> The driver will just keep a scratch buffer for this. Any other request > >>>>> that comes in without resolving the retry will get an -EBUSY error > >>>>> code. > >>>> > >>>> And the first caller will have received an -EAGAIN in order to > >>>> differentiate between the two situations? > >>>> > >>>>> > >>>>> Calling the retry ioctl without a pending command will result in -EINVAL. > >>>>> > >>>>> Let me know what you think. > >>>> > >>>> I think that sounds reasonable, but there are some catches. You will need > >>>> to ensure that the caller that is supposed to retry does actually retry > >>>> and that a caller that does retry is the same caller that was told to retry. > >>> > >>> Whats the issue with the guest driver taking some time? > >>> > >>> This sounds complex because there may be many users of the driver. How > >>> do multiple users coordinate when they need to use the retry ioctl? > >>> > >>>> > >>>> Thanks, > >>>> Tom > >>>> > >>>>>> > >>>>>> Thanks, > >>>>>> Tom > >>>>> > >>>>> > >>>>>
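The SNP_GUEST_REQ_INVALID_LEN handshake the thread settles on (probe, learn the required size, retry once, and give up beyond the 16K cap) can be modeled in a few lines. This is a toy simulation under assumed page math, not the real guest/host interface; `toy_host_certs` and `toy_guest_fetch` are invented names:

```c
#include <errno.h>
#include <stddef.h>

#define TOY_PAGE_SIZE		4096
#define TOY_CERTS_MAX_PAGES	4	/* SEV_FW_BLOB_MAX_SIZE (16K) / page size */

/* Host side: if the guest offers too few pages, report how many are needed. */
static int toy_host_certs(size_t offered_pages, size_t cert_bytes, size_t *required)
{
	size_t need = (cert_bytes + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

	if (offered_pages < need) {
		*required = need;
		return -EIO;	/* stands in for SNP_GUEST_REQ_INVALID_LEN */
	}
	return 0;
}

/* Guest side: probe with 0 pages, then retry with the reported size,
 * refusing anything beyond the driver's current 16K buffer. */
static int toy_guest_fetch(size_t cert_bytes, size_t *used_pages)
{
	size_t need = 0;

	if (toy_host_certs(0, cert_bytes, &need) == 0) {
		*used_pages = 0;
		return 0;	/* no certs published */
	}
	if (need > TOY_CERTS_MAX_PAGES)
		return -EIO;	/* can never satisfy with today's buffer */
	*used_pages = need;
	return toy_host_certs(need, cert_bytes, &need);
}
```

Note that because the host may swap in a larger blob between the probe and the retry, real userspace must loop on the length error rather than assume one retry suffices.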
Hi Peter,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.1-rc2 next-20221024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Gonda/virt-Prevent-AES-GCM-IV-reuse-in-SNP-guest-driver/20221020-100950
patch link:    https://lore.kernel.org/r/20221019150333.1047423-1-pgonda%40google.com
patch subject: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver
config: x86_64-buildonly-randconfig-r006-20221024
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/3bb420ef25bf00e5fa26d6146446b11e6ebfd255
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Peter-Gonda/virt-Prevent-AES-GCM-IV-reuse-in-SNP-guest-driver/20221020-100950
        git checkout 3bb420ef25bf00e5fa26d6146446b11e6ebfd255
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/virt/coco/sev-guest/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/virt/coco/sev-guest/sev-guest.c:351:10: warning: format specifies type 'unsigned long' but the argument has type '__u64 *' (aka 'unsigned long long *') [-Wformat]
                             rc, fw_err);
                                 ^~~~~~
   include/linux/dev_printk.h:142:69: note: expanded from macro 'dev_alert'
           dev_printk_index_wrap(_dev_alert, KERN_ALERT, dev, dev_fmt(fmt), ##__VA_ARGS__)
                                                                    ~~~     ^~~~~~~~~~~
   include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
                   _p_func(dev, fmt, ##__VA_ARGS__);                       \
                                ~~~    ^~~~~~~~~~~
   1 warning generated.


vim +351 drivers/virt/coco/sev-guest/sev-guest.c

   322	
   323	static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
   324					u8 type, void *req_buf, size_t req_sz, void *resp_buf,
   325					u32 resp_sz, __u64 *fw_err)
   326	{
   327		unsigned long err;
   328		u64 seqno;
   329		int rc;
   330	
   331		/* Get message sequence and verify that its a non-zero */
   332		seqno = snp_get_msg_seqno(snp_dev);
   333		if (!seqno)
   334			return -EIO;
   335	
   336		memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
   337	
   338		/* Encrypt the userspace provided payload */
   339		rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
   340		if (rc)
   341			return rc;
   342	
   343		/* Call firmware to process the request */
   344		rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
   345		if (fw_err)
   346			*fw_err = err;
   347	
   348		if (rc) {
   349			dev_alert(snp_dev->dev,
   350				  "Detected error from ASP request. rc: %d, fw_err: %lu\n",
 > 351				  rc, fw_err);
   352			goto disable_vmpck;
   353		}
   354	
   355		rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
   356		if (rc) {
   357			dev_alert(snp_dev->dev,
   358				  "Detected unexpected decode failure from ASP. rc: %d\n",
   359				  rc);
   360			goto disable_vmpck;
   361		}
   362	
   363		/* Increment to new message sequence after payload decryption was successful. */
   364		snp_inc_msg_seqno(snp_dev);
   365	
   366		return 0;
   367	
   368	disable_vmpck:
   369		snp_disable_vmpck(snp_dev);
   370		return rc;
   371	}
   372	
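The warning fires because the pointer `fw_err` is passed where the `%lu` conversion expects an integer; the intended value is the pointed-to firmware error (or the local `err`). A small userspace illustration of a matching format, with the helper name invented for this example:

```c
#include <stdio.h>

/*
 * The -Wformat warning at sev-guest.c:351 is because `fw_err` is a
 * `__u64 *` while the format string says %lu. Dereferencing the pointer
 * and matching the conversion to the 64-bit type resolves it. This is a
 * userspace sketch; format_asp_error() is not a kernel function.
 */
static int format_asp_error(char *buf, size_t len, int rc,
			    const unsigned long long *fw_err)
{
	return snprintf(buf, len, "rc: %d, fw_err: %llu", rc, *fw_err);
}
```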
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index f422f9c58ba7..227ae6a10ef2 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -67,8 +67,27 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
 	return true;
 }
 
+/*
+ * If we receive an error from the host or ASP we have two options. We can
+ * either retry the exact same encrypted request or we can discontinue using the
+ * VMPCK.
+ *
+ * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
+ * encrypt the requests. The IV for this scheme is the sequence number. GCM
+ * cannot tolerate IV reuse.
+ *
+ * The ASP FW v1.51 only increments the sequence numbers on a successful
+ * guest<->ASP back and forth and only accepts messages at its exact sequence
+ * number.
+ *
+ * So if we were to reuse the sequence number the encryption scheme is
+ * vulnerable. If we encrypt the sequence number for a fresh IV the ASP will
+ * reject our request.
+ */
 static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
 {
+	dev_alert(snp_dev->dev, "Disabling vmpck_id: %d to prevent IV reuse.\n",
+		  vmpck_id);
 	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
 	snp_dev->vmpck = NULL;
 }
@@ -326,29 +345,29 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
 	if (fw_err)
 		*fw_err = err;
 
-	if (rc)
-		return rc;
+	if (rc) {
+		dev_alert(snp_dev->dev,
+			  "Detected error from ASP request. rc: %d, fw_err: %lu\n",
+			  rc, fw_err);
+		goto disable_vmpck;
+	}
 
-	/*
-	 * The verify_and_dec_payload() will fail only if the hypervisor is
-	 * actively modifying the message header or corrupting the encrypted payload.
-	 * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
-	 * the key cannot be used for any communication. The key is disabled to ensure
-	 * that AES-GCM does not use the same IV while encrypting the request payload.
-	 */
 	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
 	if (rc) {
 		dev_alert(snp_dev->dev,
-			  "Detected unexpected decode failure, disabling the vmpck_id %d\n",
-			  vmpck_id);
-		snp_disable_vmpck(snp_dev);
-		return rc;
+			  "Detected unexpected decode failure from ASP. rc: %d\n",
+			  rc);
+		goto disable_vmpck;
 	}
 
 	/* Increment to new message sequence after payload decryption was successful. */
 	snp_inc_msg_seqno(snp_dev);
 
 	return 0;
+
+disable_vmpck:
+	snp_disable_vmpck(snp_dev);
+	return rc;
 }
 
 static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
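The invariant this patch enforces is that the sequence number, which doubles as the AES-GCM IV, advances only after a fully successful round trip, and any failure permanently disables the key instead of ever risking IV reuse. A toy model of that discipline follows; the names and the -ENOTTY choice for a disabled key are assumptions for illustration, not the driver's exact behavior:

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model of the sequence-number discipline in the patch above. */
struct toy_channel {
	unsigned long long seqno;	/* the AES-GCM IV for the next message */
	bool vmpck_valid;
};

static int toy_guest_request(struct toy_channel *c, bool round_trip_ok)
{
	if (!c->vmpck_valid)
		return -ENOTTY;		/* key already thrown away */
	/* ...request would be encrypted here with IV = c->seqno... */
	if (!round_trip_ok) {
		c->vmpck_valid = false;	/* never risk reusing this IV */
		return -EIO;
	}
	c->seqno += 2;			/* request and response each used an IV */
	return 0;
}
```

Bumping by two mirrors the real driver's counter, since each round trip consumes one sequence number for the request and one for the response.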