virt: Prevent AES-GCM IV reuse in SNP guest driver

Message ID 20221019150333.1047423-1-pgonda@google.com
State New
Headers
Series virt: Prevent AES-GCM IV reuse in SNP guest driver |

Commit Message

Peter Gonda Oct. 19, 2022, 3:03 p.m. UTC
  The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
communicate securely with each other. The IV to this scheme is a
sequence number that both the ASP and the guest track. Currently this
sequence number in a guest request must exactly match the sequence
number tracked by the ASP. This means that if the guest sees an error
from the host during a request it can only retry that exact request or
disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
reuse see:
https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf

Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
Signed-off-by: Peter Gonda <pgonda@google.com>
Reported-by: Peter Gonda <pgonda@google.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Haowen Bai <baihaowen@meizu.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Marc Orr <marcorr@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ashish Kalra <Ashish.Kalra@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org

---
 drivers/virt/coco/sev-guest/sev-guest.c | 45 ++++++++++++++++++-------
 1 file changed, 32 insertions(+), 13 deletions(-)
  

Comments

Tom Lendacky Oct. 19, 2022, 5:03 p.m. UTC | #1
On 10/19/22 10:03, Peter Gonda wrote:
> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> communicate securely with each other. The IV to this scheme is a
> sequence number that both the ASP and the guest track. Currently this
> sequence number in a guest request must exactly match the sequence
> number tracked by the ASP. This means that if the guest sees an error
> from the host during a request it can only retry that exact request or
> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> reuse see:
> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf

I wonder if we can at least still support the extended report length query
by having the kernel allocate the required pages when the error is
SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
no errors on the second request, the sequence numbers can be safely
updated, but the kernel returns the original error (which will provide the
caller with the number of pages required).

For the rate-limiting patch series [1], the rate-limiting will have to be
performed within the kernel, while the mutex is held, and then retry the
exact request again. Otherwise, that error will require disabling the
VMPCK. Either that, or the hypervisor must provide the rate limiting.

Thoughts?

[1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/

> 
> Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
> Signed-off-by: Peter Gonda <pgonda@google.com>
> Reported-by: Peter Gonda <pgonda@google.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Michael Roth <michael.roth@amd.com>
> Cc: Haowen Bai <baihaowen@meizu.com>
> Cc: Yang Yingliang <yangyingliang@huawei.com>
> Cc: Marc Orr <marcorr@google.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Ashish Kalra <Ashish.Kalra@amd.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: kvm@vger.kernel.org
> 
> ---
>   drivers/virt/coco/sev-guest/sev-guest.c | 45 ++++++++++++++++++-------
>   1 file changed, 32 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index f422f9c58ba7..227ae6a10ef2 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -67,8 +67,27 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
>   	return true;
>   }
>   
> +/*
> + * If we receive an error from the host or ASP we have two options. We can
> + * either retry the exact same encrypted request or we can discontinue using the
> + * VMPCK.
> + *
> + * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
> + * encrypt the requests. The IV for this scheme is the sequence number. GCM
> + * cannot tolerate IV reuse.
> + *
> + * The ASP FW v1.51 only increments the sequence numbers on a successful
> + * guest<->ASP back and forth and only accepts messages at its exact sequence
> + * number.
> + *
> + * So if we were to reuse the sequence number the encryption scheme is
> + * vulnerable. If we encrypt the sequence number for a fresh IV the ASP will
> + * reject our request.
> + */
>   static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
>   {
> +	dev_alert(snp_dev->dev, "Disabling vmpck_id: %d to prevent IV reuse.\n",
> +		  vmpck_id);
>   	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
>   	snp_dev->vmpck = NULL;
>   }
> @@ -326,29 +345,29 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
>   	if (fw_err)
>   		*fw_err = err;
>   
> -	if (rc)
> -		return rc;
> +	if (rc) {
> +		dev_alert(snp_dev->dev,
> +			  "Detected error from ASP request. rc: %d, fw_err: %lu\n",
> +			  rc, fw_err);
> +		goto disable_vmpck;
> +	}
>   
> -	/*
> -	 * The verify_and_dec_payload() will fail only if the hypervisor is
> -	 * actively modifying the message header or corrupting the encrypted payload.
> -	 * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
> -	 * the key cannot be used for any communication. The key is disabled to ensure
> -	 * that AES-GCM does not use the same IV while encrypting the request payload.
> -	 */
>   	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
>   	if (rc) {
>   		dev_alert(snp_dev->dev,
> -			  "Detected unexpected decode failure, disabling the vmpck_id %d\n",
> -			  vmpck_id);
> -		snp_disable_vmpck(snp_dev);
> -		return rc;
> +			  "Detected unexpected decode failure from ASP. rc: %d\n",
> +			  rc);
> +		goto disable_vmpck;
>   	}
>   
>   	/* Increment to new message sequence after payload decryption was successful. */
>   	snp_inc_msg_seqno(snp_dev);
>   
>   	return 0;
> +
> +disable_vmpck:
> +	snp_disable_vmpck(snp_dev);
> +	return rc;
>   }
>   
>   static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
  
Peter Gonda Oct. 19, 2022, 5:40 p.m. UTC | #2
On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 10:03, Peter Gonda wrote:
> > The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> > communicate securely with each other. The IV to this scheme is a
> > sequence number that both the ASP and the guest track. Currently this
> > sequence number in a guest request must exactly match the sequence
> > number tracked by the ASP. This means that if the guest sees an error
> > from the host during a request it can only retry that exact request or
> > disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> > reuse see:
> > https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>
> I wonder if we can at least still support the extended report length query
> by having the kernel allocate the required pages when the error is
> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
> no errors on the second request, the sequence numbers can be safely
> updated, but the kernel returns the original error (which will provide the
> caller with the number of pages required).

I think we can but I thought fixing the security bug could come first,
then the usability fix after. Dionna was planning on working on that
fix.

In that flow how does userspace get the data? Its called the ioctl
with not enough output buffer space. What if the userspace calls the
ioctl with no buffers space allocated, so its trying to query the
length. We just send the host the request without any encrypted data.

>
> For the rate-limiting patch series [1], the rate-limiting will have to be
> performed within the kernel, while the mutex is held, and then retry the
> exact request again. Otherwise, that error will require disabling the
> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>
> Thoughts?
>
> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/

Yes I think if the host rate limits the guest. The guest kernel should
retry the exact message. Which mutex are you referring too?
  
Tom Lendacky Oct. 19, 2022, 6:43 p.m. UTC | #3
On 10/19/22 12:40, Peter Gonda wrote:
> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 10/19/22 10:03, Peter Gonda wrote:
>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>> communicate securely with each other. The IV to this scheme is a
>>> sequence number that both the ASP and the guest track. Currently this
>>> sequence number in a guest request must exactly match the sequence
>>> number tracked by the ASP. This means that if the guest sees an error
>>> from the host during a request it can only retry that exact request or
>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
>>> reuse see:
>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>>

I think I've wrapped my head around this now. Any non-zero return code 
from the hypervisor for an SNP Guest Request is either a hypervisor error 
or an sev-guest driver error, and so the VMPCK should be disabled. The 
sev-guest driver is really doing everything (message headers, performing 
the encryption, etc.) and is only using userspace data that will be part 
of the response message and can't result in a non-zero hypervisor return code.

For the SNP Extended Guest Request, we only need to special case a return 
code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.


>> I wonder if we can at least still support the extended report length query
>> by having the kernel allocate the required pages when the error is
>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
>> no errors on the second request, the sequence numbers can be safely
>> updated, but the kernel returns the original error (which will provide the
>> caller with the number of pages required).
> 
> I think we can but I thought fixing the security bug could come first,
> then the usability fix after. Dionna was planning on working on that
> fix.
> 
> In that flow how does userspace get the data? Its called the ioctl
> with not enough output buffer space. What if the userspace calls the
> ioctl with no buffers space allocated, so its trying to query the
> length. We just send the host the request without any encrypted data.

In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data 
if it hasn't supplied enough buffer space. But, the sev-guest driver can 
supply enough buffer space and invoke the SNP Extended Guest Request again 
in order to successfully complete the call and update the sequence 
numbers. The sev-guest driver would just discard the data in this case, 
but pass back the original "not enough buffer space" error to the caller, 
who could now allocate space and retry. This then allows the sequence 
numbers to be bumped properly.

> 
>>
>> For the rate-limiting patch series [1], the rate-limiting will have to be
>> performed within the kernel, while the mutex is held, and then retry the
>> exact request again. Otherwise, that error will require disabling the
>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>
>> Thoughts?
>>
>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> 
> Yes I think if the host rate limits the guest. The guest kernel should
> retry the exact message. Which mutex are you referring too?

Or the host waits and then submits the request and the guest kernel 
doesn't have to do anything. The mutex I'm referring to is the 
snp_cmd_mutex that is taken in snp_guest_ioctl().

Thanks,
Tom
  
Dionna Amalie Glaze Oct. 19, 2022, 7:17 p.m. UTC | #4
On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 12:40, Peter Gonda wrote:
> > On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>
> >> On 10/19/22 10:03, Peter Gonda wrote:
> >>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> >>> communicate securely with each other. The IV to this scheme is a
> >>> sequence number that both the ASP and the guest track. Currently this
> >>> sequence number in a guest request must exactly match the sequence
> >>> number tracked by the ASP. This means that if the guest sees an error
> >>> from the host during a request it can only retry that exact request or
> >>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> >>> reuse see:
> >>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
> >>
>
> I think I've wrapped my head around this now. Any non-zero return code
> from the hypervisor for an SNP Guest Request is either a hypervisor error
> or an sev-guest driver error, and so the VMPCK should be disabled. The
> sev-guest driver is really doing everything (message headers, performing
> the encryption, etc.) and is only using userspace data that will be part
> of the response message and can't result in a non-zero hypervisor return code.
>
> For the SNP Extended Guest Request, we only need to special case a return
> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
>
>
> >> I wonder if we can at least still support the extended report length query
> >> by having the kernel allocate the required pages when the error is
> >> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
> >> no errors on the second request, the sequence numbers can be safely
> >> updated, but the kernel returns the original error (which will provide the
> >> caller with the number of pages required).
> >
> > I think we can but I thought fixing the security bug could come first,
> > then the usability fix after. Dionna was planning on working on that
> > fix.
> >
> > In that flow how does userspace get the data? Its called the ioctl
> > with not enough output buffer space. What if the userspace calls the
> > ioctl with no buffers space allocated, so its trying to query the
> > length. We just send the host the request without any encrypted data.
>
> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
> if it hasn't supplied enough buffer space. But, the sev-guest driver can
> supply enough buffer space and invoke the SNP Extended Guest Request again
> in order to successfully complete the call and update the sequence
> numbers. The sev-guest driver would just discard the data in this case,
> but pass back the original "not enough buffer space" error to the caller,
> who could now allocate space and retry. This then allows the sequence
> numbers to be bumped properly.
>

The way I thought to solve this was to make certificate length
querying a part of the specified protocol.

The first ext_guest_request command /must/ query the certificate
buffer length with req.certs_len == 0.
By making this part of the protocol, the sev-guest driver can check if
the certificate length has been requested before.
If so, emulate the host's VMM error code for invalid length without
sending an encrypted message.
If not, then send an all zeroes request buffer with the req.certs_len
= 0 values to the VMM.

The VMM will respond with the size if indeed the expected_pages are >
0. In the case that the host has not set the certificate buffer yet,
then the host will inspect the header of the request page for a zero
sequence number. If so, then we know that we don't have a valid
request. We treat this also as the INVALID_LEN case but still return
the size of 0. The driver will have the expected pages value stored as
0 at this point, so subsequent calls will not have this behavior.

The way /dev/sev-guest user code has been written, I don't think this
will break any existing software package.

> >
> >>
> >> For the rate-limiting patch series [1], the rate-limiting will have to be
> >> performed within the kernel, while the mutex is held, and then retry the
> >> exact request again. Otherwise, that error will require disabling the
> >> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>
> >> Thoughts?
> >>
> >> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> >
> > Yes I think if the host rate limits the guest. The guest kernel should
> > retry the exact message. Which mutex are you referring too?
>
> Or the host waits and then submits the request and the guest kernel
> doesn't have to do anything. The mutex I'm referring to is the
> snp_cmd_mutex that is taken in snp_guest_ioctl().

I think that either the host kernel or guest kernel waiting can lead
to unacceptable delays.
I would recommend that we add a zero argument ioctl to /dev/sev-guest
specifically for retrying the last request.

We can know what the last request is due to the sev_cmd_mutex serialization.
The driver will just keep a scratch buffer for this. Any other request
that comes in without resolving the retry will get an -EBUSY error
code.

Calling the retry ioctl without a pending command will result in -EINVAL.

Let me know what you think.
>
> Thanks,
> Tom
  
Tom Lendacky Oct. 19, 2022, 7:56 p.m. UTC | #5
On 10/19/22 14:17, Dionna Amalie Glaze wrote:
> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 10/19/22 12:40, Peter Gonda wrote:
>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>
>>>> On 10/19/22 10:03, Peter Gonda wrote:
>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>>>> communicate securely with each other. The IV to this scheme is a
>>>>> sequence number that both the ASP and the guest track. Currently this
>>>>> sequence number in a guest request must exactly match the sequence
>>>>> number tracked by the ASP. This means that if the guest sees an error
>>>>> from the host during a request it can only retry that exact request or
>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
>>>>> reuse see:
>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>>>>
>>
>> I think I've wrapped my head around this now. Any non-zero return code
>> from the hypervisor for an SNP Guest Request is either a hypervisor error
>> or an sev-guest driver error, and so the VMPCK should be disabled. The
>> sev-guest driver is really doing everything (message headers, performing
>> the encryption, etc.) and is only using userspace data that will be part
>> of the response message and can't result in a non-zero hypervisor return code.
>>
>> For the SNP Extended Guest Request, we only need to special case a return
>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
>>
>>
>>>> I wonder if we can at least still support the extended report length query
>>>> by having the kernel allocate the required pages when the error is
>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
>>>> no errors on the second request, the sequence numbers can be safely
>>>> updated, but the kernel returns the original error (which will provide the
>>>> caller with the number of pages required).
>>>
>>> I think we can but I thought fixing the security bug could come first,
>>> then the usability fix after. Dionna was planning on working on that
>>> fix.
>>>
>>> In that flow how does userspace get the data? Its called the ioctl
>>> with not enough output buffer space. What if the userspace calls the
>>> ioctl with no buffers space allocated, so its trying to query the
>>> length. We just send the host the request without any encrypted data.
>>
>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
>> if it hasn't supplied enough buffer space. But, the sev-guest driver can
>> supply enough buffer space and invoke the SNP Extended Guest Request again
>> in order to successfully complete the call and update the sequence
>> numbers. The sev-guest driver would just discard the data in this case,
>> but pass back the original "not enough buffer space" error to the caller,
>> who could now allocate space and retry. This then allows the sequence
>> numbers to be bumped properly.
>>
> 
> The way I thought to solve this was to make certificate length
> querying a part of the specified protocol.
> 
> The first ext_guest_request command /must/ query the certificate
> buffer length with req.certs_len == 0.

This becomes an incompatible change to the GHCB specification.

> By making this part of the protocol, the sev-guest driver can check if
> the certificate length has been requested before.
> If so, emulate the host's VMM error code for invalid length without
> sending an encrypted message.

On the hypervisor side, the certificate blob can be replaced at any time 
with a new blob that is larger. So you may still have to handle the case 
where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.

> If not, then send an all zeroes request buffer with the req.certs_len
> = 0 values to the VMM.
> 
> The VMM will respond with the size if indeed the expected_pages are >
> 0. In the case that the host has not set the certificate buffer yet,
> then the host will inspect the header of the request page for a zero
> sequence number. If so, then we know that we don't have a valid
> request. We treat this also as the INVALID_LEN case but still return
> the size of 0. The driver will have the expected pages value stored as
> 0 at this point, so subsequent calls will not have this behavior.
> 
> The way /dev/sev-guest user code has been written, I don't think this
> will break any existing software package.

I think having the sev-guest driver re-issue the request with the internal 
buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go. 
You could still cache the size request and always return that to 
user-space when a request is received with a 0 length. The user-space 
program must be able to handle receiving multiple 
SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that 
the hypervisor can be updating the certs asynchronously. And if you get a 
request that is not 0 length, then you issue it as such and re-use the 
logic of the first 0 length request that was received if you get an 
SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.

Peter, is this something you could change the patch to do?

> 
>>>
>>>>
>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>> performed within the kernel, while the mutex is held, and then retry the
>>>> exact request again. Otherwise, that error will require disabling the
>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>
>>>> Thoughts?
>>>>
>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>
>>> Yes I think if the host rate limits the guest. The guest kernel should
>>> retry the exact message. Which mutex are you referring too?
>>
>> Or the host waits and then submits the request and the guest kernel
>> doesn't have to do anything. The mutex I'm referring to is the
>> snp_cmd_mutex that is taken in snp_guest_ioctl().
> 
> I think that either the host kernel or guest kernel waiting can lead
> to unacceptable delays.
> I would recommend that we add a zero argument ioctl to /dev/sev-guest
> specifically for retrying the last request.
> 
> We can know what the last request is due to the sev_cmd_mutex serialization.
> The driver will just keep a scratch buffer for this. Any other request
> that comes in without resolving the retry will get an -EBUSY error
> code.

And the first caller will have received an -EAGAIN in order to 
differentiate between the two situations?

> 
> Calling the retry ioctl without a pending command will result in -EINVAL.
> 
> Let me know what you think.

I think that sounds reasonable, but there are some catches. You will need 
to ensure that the caller that is supposed to retry does actually retry 
and that a caller that does retry is the same caller that was told to retry.

Thanks,
Tom

>>
>> Thanks,
>> Tom
> 
> 
>
  
Peter Gonda Oct. 19, 2022, 8:39 p.m. UTC | #6
On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
> > On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>
> >> On 10/19/22 12:40, Peter Gonda wrote:
> >>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>>
> >>>> On 10/19/22 10:03, Peter Gonda wrote:
> >>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> >>>>> communicate securely with each other. The IV to this scheme is a
> >>>>> sequence number that both the ASP and the guest track. Currently this
> >>>>> sequence number in a guest request must exactly match the sequence
> >>>>> number tracked by the ASP. This means that if the guest sees an error
> >>>>> from the host during a request it can only retry that exact request or
> >>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> >>>>> reuse see:
> >>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
> >>>>
> >>
> >> I think I've wrapped my head around this now. Any non-zero return code
> >> from the hypervisor for an SNP Guest Request is either a hypervisor error
> >> or an sev-guest driver error, and so the VMPCK should be disabled. The
> >> sev-guest driver is really doing everything (message headers, performing
> >> the encryption, etc.) and is only using userspace data that will be part
> >> of the response message and can't result in a non-zero hypervisor return code.
> >>
> >> For the SNP Extended Guest Request, we only need to special case a return
> >> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
> >>
> >>
> >>>> I wonder if we can at least still support the extended report length query
> >>>> by having the kernel allocate the required pages when the error is
> >>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
> >>>> no errors on the second request, the sequence numbers can be safely
> >>>> updated, but the kernel returns the original error (which will provide the
> >>>> caller with the number of pages required).
> >>>
> >>> I think we can but I thought fixing the security bug could come first,
> >>> then the usability fix after. Dionna was planning on working on that
> >>> fix.
> >>>
> >>> In that flow how does userspace get the data? Its called the ioctl
> >>> with not enough output buffer space. What if the userspace calls the
> >>> ioctl with no buffers space allocated, so its trying to query the
> >>> length. We just send the host the request without any encrypted data.
> >>
> >> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
> >> if it hasn't supplied enough buffer space. But, the sev-guest driver can
> >> supply enough buffer space and invoke the SNP Extended Guest Request again
> >> in order to successfully complete the call and update the sequence
> >> numbers. The sev-guest driver would just discard the data in this case,
> >> but pass back the original "not enough buffer space" error to the caller,
> >> who could now allocate space and retry. This then allows the sequence
> >> numbers to be bumped properly.
> >>
> >
> > The way I thought to solve this was to make certificate length
> > querying a part of the specified protocol.
> >
> > The first ext_guest_request command /must/ query the certificate
> > buffer length with req.certs_len == 0.
>
> This becomes an incompatible change to the GHCB specification.
>
> > By making this part of the protocol, the sev-guest driver can check if
> > the certificate length has been requested before.
> > If so, emulate the host's VMM error code for invalid length without
> > sending an encrypted message.
>
> On the hypervisor side, the certificate blob can be replaced at any time
> with a new blob that is larger. So you may still have to handle the case
> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.

Ah, I forgot the host could keep changing the size of this data.

>
> > If not, then send an all zeroes request buffer with the req.certs_len
> > = 0 values to the VMM.
> >
> > The VMM will respond with the size if indeed the expected_pages are >
> > 0. In the case that the host has not set the certificate buffer yet,
> > then the host will inspect the header of the request page for a zero
> > sequence number. If so, then we know that we don't have a valid
> > request. We treat this also as the INVALID_LEN case but still return
> > the size of 0. The driver will have the expected pages value stored as
> > 0 at this point, so subsequent calls will not have this behavior.
> >
> > The way /dev/sev-guest user code has been written, I don't think this
> > will break any existing software package.
>
> I think having the sev-guest driver re-issue the request with the internal
> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.
> You could still cache the size request and always return that to
> user-space when a request is received with a 0 length. The user-space
> program must be able to handle receiving multiple
> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
> the hypervisor can be updating the certs asynchronously. And if you get a
> request that is not 0 length, then you issue it as such and re-use the
> logic of the first 0 length request that was received if you get an
> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
>
> Peter, is this something you could change the patch to do?

OK so the guest retires with the same request when it gets an
SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
hold the certificates. When it finally gets a successful request w/
certs. Do we want to return the attestation bits to userspace, but
leave out the certificate data. Or just error out the ioctl
completely?

I can do that in this series.

>
> >
> >>>
> >>>>
> >>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>> performed within the kernel, while the mutex is held, and then retry the
> >>>> exact request again. Otherwise, that error will require disabling the
> >>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> >>>
> >>> Yes I think if the host rate limits the guest. The guest kernel should
> >>> retry the exact message. Which mutex are you referring too?
> >>
> >> Or the host waits and then submits the request and the guest kernel
> >> doesn't have to do anything. The mutex I'm referring to is the
> >> snp_cmd_mutex that is taken in snp_guest_ioctl().
> >
> > I think that either the host kernel or guest kernel waiting can lead
> > to unacceptable delays.
> > I would recommend that we add a zero argument ioctl to /dev/sev-guest
> > specifically for retrying the last request.
> >
> > We can know what the last request is due to the sev_cmd_mutex serialization.
> > The driver will just keep a scratch buffer for this. Any other request
> > that comes in without resolving the retry will get an -EBUSY error
> > code.
>
> And the first caller will have received an -EAGAIN in order to
> differentiate between the two situations?
>
> >
> > Calling the retry ioctl without a pending command will result in -EINVAL.
> >
> > Let me know what you think.
>
> I think that sounds reasonable, but there are some catches. You will need
> to ensure that the caller that is supposed to retry does actually retry
> and that a caller that does retry is the same caller that was told to retry.

Whats the issue with the guest driver taking some time?

This sounds complex because there may be many users of the driver. How
do multiple users coordinate when they need to use the retry ioctl?

>
> Thanks,
> Tom
>
> >>
> >> Thanks,
> >> Tom
> >
> >
> >
  
Dionna Amalie Glaze Oct. 19, 2022, 8:40 p.m. UTC | #7
On Wed, Oct 19, 2022 at 12:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
> > On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>
> >> On 10/19/22 12:40, Peter Gonda wrote:
> >>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>>
> >>>> On 10/19/22 10:03, Peter Gonda wrote:
> >>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> >>>>> communicate securely with each other. The IV to this scheme is a
> >>>>> sequence number that both the ASP and the guest track. Currently this
> >>>>> sequence number in a guest request must exactly match the sequence
> >>>>> number tracked by the ASP. This means that if the guest sees an error
> >>>>> from the host during a request it can only retry that exact request or
> >>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> >>>>> reuse see:
> >>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
> >>>>
> >>
> >> I think I've wrapped my head around this now. Any non-zero return code
> >> from the hypervisor for an SNP Guest Request is either a hypervisor error
> >> or an sev-guest driver error, and so the VMPCK should be disabled. The
> >> sev-guest driver is really doing everything (message headers, performing
> >> the encryption, etc.) and is only using userspace data that will be part
> >> of the response message and can't result in a non-zero hypervisor return code.
> >>
> >> For the SNP Extended Guest Request, we only need to special case a return
> >> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
> >>
> >>
> >>>> I wonder if we can at least still support the extended report length query
> >>>> by having the kernel allocate the required pages when the error is
> >>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
> >>>> no errors on the second request, the sequence numbers can be safely
> >>>> updated, but the kernel returns the original error (which will provide the
> >>>> caller with the number of pages required).
> >>>
> >>> I think we can but I thought fixing the security bug could come first,
> >>> then the usability fix after. Dionna was planning on working on that
> >>> fix.
> >>>
> >>> In that flow how does userspace get the data? Its called the ioctl
> >>> with not enough output buffer space. What if the userspace calls the
> >>> ioctl with no buffers space allocated, so its trying to query the
> >>> length. We just send the host the request without any encrypted data.
> >>
> >> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
> >> if it hasn't supplied enough buffer space. But, the sev-guest driver can
> >> supply enough buffer space and invoke the SNP Extended Guest Request again
> >> in order to successfully complete the call and update the sequence
> >> numbers. The sev-guest driver would just discard the data in this case,
> >> but pass back the original "not enough buffer space" error to the caller,
> >> who could now allocate space and retry. This then allows the sequence
> >> numbers to be bumped properly.
> >>
> >
> > The way I thought to solve this was to make certificate length
> > querying a part of the specified protocol.
> >
> > The first ext_guest_request command /must/ query the certificate
> > buffer length with req.certs_len == 0.
>
> This becomes an incompatible change to the GHCB specification.
>
> > By making this part of the protocol, the sev-guest driver can check if
> > the certificate length has been requested before.
> > If so, emulate the host's VMM error code for invalid length without
> > sending an encrypted message.
>
> On the hypervisor side, the certificate blob can be replaced at any time
> with a new blob that is larger. So you may still have to handle the case
> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.
>
> > If not, then send an all zeroes request buffer with the req.certs_len
> > = 0 values to the VMM.
> >
> > The VMM will respond with the size if indeed the expected_pages are >
> > 0. In the case that the host has not set the certificate buffer yet,
> > then the host will inspect the header of the request page for a zero
> > sequence number. If so, then we know that we don't have a valid
> > request. We treat this also as the INVALID_LEN case but still return
> > the size of 0. The driver will have the expected pages value stored as
> > 0 at this point, so subsequent calls will not have this behavior.
> >
> > The way /dev/sev-guest user code has been written, I don't think this
> > will break any existing software package.

>
> I think having the sev-guest driver re-issue the request with the internal
> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.

I take it you mean in the case that the host's certs_len == 0?

> You could still cache the size request and always return that to
> user-space when a request is received with a 0 length. The user-space
> program must be able to handle receiving multiple
> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
> the hypervisor can be updating the certs asynchronously. And if you get a
> request that is not 0 length, then you issue it as such and re-use the
> logic of the first 0 length request that was received if you get an
> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
>

A request that gets SNP_GUEST_REQ_INVALID_LEN when the guest expects
that it is providing a sufficiently sized certificate buffer means
that the guest has encrypted its report request.
We then have a harder problem than throttling because not only do we
have to reissue the same request, it must be with different
certificate arguments provided from user space.

> Peter, is this something you could change the patch to do?
>
> >
> >>>
> >>>>
> >>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>> performed within the kernel, while the mutex is held, and then retry the
> >>>> exact request again. Otherwise, that error will require disabling the
> >>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> >>>
> >>> Yes I think if the host rate limits the guest. The guest kernel should
> >>> retry the exact message. Which mutex are you referring too?
> >>
> >> Or the host waits and then submits the request and the guest kernel
> >> doesn't have to do anything. The mutex I'm referring to is the
> >> snp_cmd_mutex that is taken in snp_guest_ioctl().
> >
> > I think that either the host kernel or guest kernel waiting can lead
> > to unacceptable delays.
> > I would recommend that we add a zero argument ioctl to /dev/sev-guest
> > specifically for retrying the last request.
> >
> > We can know what the last request is due to the sev_cmd_mutex serialization.
> > The driver will just keep a scratch buffer for this. Any other request
> > that comes in without resolving the retry will get an -EBUSY error
> > code.
>
> And the first caller will have received an -EAGAIN in order to
> differentiate between the two situations?

Yes, the throttled caller gets -EAGAIN, and other ioctls other than
retry after that get -EBUSY.

>
> I think that sounds reasonable, but there are some catches. You will need
> to ensure that the caller that is supposed to retry does actually retry
> and that a caller that does retry is the same caller that was told to retry.
>

I think that constitutes a change to task_struct, the way that there's
a buffer for interrupted system calls.
That seems a bit much. Do we have to model for protocol-breaking user
tasks that have access to /dev/sev-guest?
The caller that gets -EAGAIN knows to retry. There's no reason for
other tasks to retry due to command serialization and the -EBUSY
behavior.
  
Tom Lendacky Oct. 19, 2022, 8:58 p.m. UTC | #8
On 10/19/22 15:39, Peter Gonda wrote:
> On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>
>>>> On 10/19/22 12:40, Peter Gonda wrote:
>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>>>
>>>>>> On 10/19/22 10:03, Peter Gonda wrote:
>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>>>>>> communicate securely with each other. The IV to this scheme is a
>>>>>>> sequence number that both the ASP and the guest track. Currently this
>>>>>>> sequence number in a guest request must exactly match the sequence
>>>>>>> number tracked by the ASP. This means that if the guest sees an error
>>>>>>> from the host during a request it can only retry that exact request or
>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
>>>>>>> reuse see:
>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>>>>>>
>>>>
>>>> I think I've wrapped my head around this now. Any non-zero return code
>>>> from the hypervisor for an SNP Guest Request is either a hypervisor error
>>>> or an sev-guest driver error, and so the VMPCK should be disabled. The
>>>> sev-guest driver is really doing everything (message headers, performing
>>>> the encryption, etc.) and is only using userspace data that will be part
>>>> of the response message and can't result in a non-zero hypervisor return code.
>>>>
>>>> For the SNP Extended Guest Request, we only need to special case a return
>>>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
>>>>
>>>>
>>>>>> I wonder if we can at least still support the extended report length query
>>>>>> by having the kernel allocate the required pages when the error is
>>>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
>>>>>> no errors on the second request, the sequence numbers can be safely
>>>>>> updated, but the kernel returns the original error (which will provide the
>>>>>> caller with the number of pages required).
>>>>>
>>>>> I think we can but I thought fixing the security bug could come first,
>>>>> then the usability fix after. Dionna was planning on working on that
>>>>> fix.
>>>>>
>>>>> In that flow how does userspace get the data? Its called the ioctl
>>>>> with not enough output buffer space. What if the userspace calls the
>>>>> ioctl with no buffers space allocated, so its trying to query the
>>>>> length. We just send the host the request without any encrypted data.
>>>>
>>>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
>>>> if it hasn't supplied enough buffer space. But, the sev-guest driver can
>>>> supply enough buffer space and invoke the SNP Extended Guest Request again
>>>> in order to successfully complete the call and update the sequence
>>>> numbers. The sev-guest driver would just discard the data in this case,
>>>> but pass back the original "not enough buffer space" error to the caller,
>>>> who could now allocate space and retry. This then allows the sequence
>>>> numbers to be bumped properly.
>>>>
>>>
>>> The way I thought to solve this was to make certificate length
>>> querying a part of the specified protocol.
>>>
>>> The first ext_guest_request command /must/ query the certificate
>>> buffer length with req.certs_len == 0.
>>
>> This becomes an incompatible change to the GHCB specification.
>>
>>> By making this part of the protocol, the sev-guest driver can check if
>>> the certificate length has been requested before.
>>> If so, emulate the host's VMM error code for invalid length without
>>> sending an encrypted message.
>>
>> On the hypervisor side, the certificate blob can be replaced at any time
>> with a new blob that is larger. So you may still have to handle the case
>> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.
> 
> Ah, I forgot the host could keep changing the size of this data.
> 
>>
>>> If not, then send an all zeroes request buffer with the req.certs_len
>>> = 0 values to the VMM.
>>>
>>> The VMM will respond with the size if indeed the expected_pages are >
>>> 0. In the case that the host has not set the certificate buffer yet,
>>> then the host will inspect the header of the request page for a zero
>>> sequence number. If so, then we know that we don't have a valid
>>> request. We treat this also as the INVALID_LEN case but still return
>>> the size of 0. The driver will have the expected pages value stored as
>>> 0 at this point, so subsequent calls will not have this behavior.
>>>
>>> The way /dev/sev-guest user code has been written, I don't think this
>>> will break any existing software package.
>>
>> I think having the sev-guest driver re-issue the request with the internal
>> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.
>> You could still cache the size request and always return that to
>> user-space when a request is received with a 0 length. The user-space
>> program must be able to handle receiving multiple
>> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
>> the hypervisor can be updating the certs asynchronously. And if you get a
>> request that is not 0 length, then you issue it as such and re-use the
>> logic of the first 0 length request that was received if you get an
>> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
>>
>> Peter, is this something you could change the patch to do?
> 
> OK so the guest retires with the same request when it gets an
> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to

It would just use the pre-allocated snp_dev->certs_data buffer with npages 
set to the full size of that buffer.

> hold the certificates. When it finally gets a successful request w/
> certs. Do we want to return the attestation bits to userspace, but
> leave out the certificate data. Or just error out the ioctl
> completely?

We need to be able to return the attestation bits that came back with the 
extra certs. So just error out of the ioctl with the length error and let 
user-space retry with the recommended number of pages.

> 
> I can do that in this series.

Thanks!

> 
>>
>>>
>>>>>
>>>>>>
>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>>>> performed within the kernel, while the mutex is held, and then retry the
>>>>>> exact request again. Otherwise, that error will require disabling the
>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>>>
>>>>> Yes I think if the host rate limits the guest. The guest kernel should
>>>>> retry the exact message. Which mutex are you referring too?
>>>>
>>>> Or the host waits and then submits the request and the guest kernel
>>>> doesn't have to do anything. The mutex I'm referring to is the
>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
>>>
>>> I think that either the host kernel or guest kernel waiting can lead
>>> to unacceptable delays.
>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
>>> specifically for retrying the last request.
>>>
>>> We can know what the last request is due to the sev_cmd_mutex serialization.
>>> The driver will just keep a scratch buffer for this. Any other request
>>> that comes in without resolving the retry will get an -EBUSY error
>>> code.
>>
>> And the first caller will have received an -EAGAIN in order to
>> differentiate between the two situations?
>>
>>>
>>> Calling the retry ioctl without a pending command will result in -EINVAL.
>>>
>>> Let me know what you think.
>>
>> I think that sounds reasonable, but there are some catches. You will need
>> to ensure that the caller that is supposed to retry does actually retry
>> and that a caller that does retry is the same caller that was told to retry.
> 
> Whats the issue with the guest driver taking some time?
> 
> This sounds complex because there may be many users of the driver. How
> do multiple users coordinate when they need to use the retry ioctl?
> 
>>
>> Thanks,
>> Tom
>>
>>>>
>>>> Thanks,
>>>> Tom
>>>
>>>
>>>
  
Tom Lendacky Oct. 19, 2022, 9:05 p.m. UTC | #9
On 10/19/22 15:40, Dionna Amalie Glaze wrote:
> On Wed, Oct 19, 2022 at 12:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>
>>>> On 10/19/22 12:40, Peter Gonda wrote:
>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>>>
>>>>>> On 10/19/22 10:03, Peter Gonda wrote:
>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>>>>>> communicate securely with each other. The IV to this scheme is a
>>>>>>> sequence number that both the ASP and the guest track. Currently this
>>>>>>> sequence number in a guest request must exactly match the sequence
>>>>>>> number tracked by the ASP. This means that if the guest sees an error
>>>>>>> from the host during a request it can only retry that exact request or
>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
>>>>>>> reuse see:
>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>>>>>>
>>>>
>>>> I think I've wrapped my head around this now. Any non-zero return code
>>>> from the hypervisor for an SNP Guest Request is either a hypervisor error
>>>> or an sev-guest driver error, and so the VMPCK should be disabled. The
>>>> sev-guest driver is really doing everything (message headers, performing
>>>> the encryption, etc.) and is only using userspace data that will be part
>>>> of the response message and can't result in a non-zero hypervisor return code.
>>>>
>>>> For the SNP Extended Guest Request, we only need to special case a return
>>>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
>>>>
>>>>
>>>>>> I wonder if we can at least still support the extended report length query
>>>>>> by having the kernel allocate the required pages when the error is
>>>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
>>>>>> no errors on the second request, the sequence numbers can be safely
>>>>>> updated, but the kernel returns the original error (which will provide the
>>>>>> caller with the number of pages required).
>>>>>
>>>>> I think we can but I thought fixing the security bug could come first,
>>>>> then the usability fix after. Dionna was planning on working on that
>>>>> fix.
>>>>>
>>>>> In that flow how does userspace get the data? Its called the ioctl
>>>>> with not enough output buffer space. What if the userspace calls the
>>>>> ioctl with no buffers space allocated, so its trying to query the
>>>>> length. We just send the host the request without any encrypted data.
>>>>
>>>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
>>>> if it hasn't supplied enough buffer space. But, the sev-guest driver can
>>>> supply enough buffer space and invoke the SNP Extended Guest Request again
>>>> in order to successfully complete the call and update the sequence
>>>> numbers. The sev-guest driver would just discard the data in this case,
>>>> but pass back the original "not enough buffer space" error to the caller,
>>>> who could now allocate space and retry. This then allows the sequence
>>>> numbers to be bumped properly.
>>>>
>>>
>>> The way I thought to solve this was to make certificate length
>>> querying a part of the specified protocol.
>>>
>>> The first ext_guest_request command /must/ query the certificate
>>> buffer length with req.certs_len == 0.
>>
>> This becomes an incompatible change to the GHCB specification.
>>
>>> By making this part of the protocol, the sev-guest driver can check if
>>> the certificate length has been requested before.
>>> If so, emulate the host's VMM error code for invalid length without
>>> sending an encrypted message.
>>
>> On the hypervisor side, the certificate blob can be replaced at any time
>> with a new blob that is larger. So you may still have to handle the case
>> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.
>>
>>> If not, then send an all zeroes request buffer with the req.certs_len
>>> = 0 values to the VMM.
>>>
>>> The VMM will respond with the size if indeed the expected_pages are >
>>> 0. In the case that the host has not set the certificate buffer yet,
>>> then the host will inspect the header of the request page for a zero
>>> sequence number. If so, then we know that we don't have a valid
>>> request. We treat this also as the INVALID_LEN case but still return
>>> the size of 0. The driver will have the expected pages value stored as
>>> 0 at this point, so subsequent calls will not have this behavior.
>>>
>>> The way /dev/sev-guest user code has been written, I don't think this
>>> will break any existing software package.
> 
>>
>> I think having the sev-guest driver re-issue the request with the internal
>> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.
> 
> I take it you mean in the case that the host's certs_len == 0?

Not sure what you mean. The sev-guest driver has an internal buffer for 
receiving the certs, snp_dev->certs_data, and it would use that whenever 
it receives an SNP_GUEST_REQ_INVALID_LEN return code.

> 
>> You could still cache the size request and always return that to
>> user-space when a request is received with a 0 length. The user-space
>> program must be able to handle receiving multiple
>> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
>> the hypervisor can be updating the certs asynchronously. And if you get a
>> request that is not 0 length, then you issue it as such and re-use the
>> logic of the first 0 length request that was received if you get an
>> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
>>
> 
> A request that gets SNP_GUEST_REQ_INVALID_LEN when the guest expects
> that it is providing a sufficiently sized certificate buffer means
> that the guest has encrypted its report request.

Correct.

> We then have a harder problem than throttling because not only do we
> have to reissue the same request, it must be with different
> certificate arguments provided from user space.

Correct. But before returning the error to userspace, the sev-guest driver 
will issue the request again with its internal buffer so that the sequence 
numbers are updated and a new request can be issued.

> 
>> Peter, is this something you could change the patch to do?
>>
>>>
>>>>>
>>>>>>
>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>>>> performed within the kernel, while the mutex is held, and then retry the
>>>>>> exact request again. Otherwise, that error will require disabling the
>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>>>
>>>>> Yes I think if the host rate limits the guest. The guest kernel should
>>>>> retry the exact message. Which mutex are you referring too?
>>>>
>>>> Or the host waits and then submits the request and the guest kernel
>>>> doesn't have to do anything. The mutex I'm referring to is the
>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
>>>
>>> I think that either the host kernel or guest kernel waiting can lead
>>> to unacceptable delays.
>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
>>> specifically for retrying the last request.
>>>
>>> We can know what the last request is due to the sev_cmd_mutex serialization.
>>> The driver will just keep a scratch buffer for this. Any other request
>>> that comes in without resolving the retry will get an -EBUSY error
>>> code.
>>
>> And the first caller will have received an -EAGAIN in order to
>> differentiate between the two situations?
> 
> Yes, the throttled caller gets -EAGAIN, and other ioctls other than
> retry after that get -EBUSY.
> 
>>
>> I think that sounds reasonable, but there are some catches. You will need
>> to ensure that the caller that is supposed to retry does actually retry
>> and that a caller that does retry is the same caller that was told to retry.
>>
> 
> I think that constitutes a change to task_struct, the way that there's
> a buffer for interrupted system calls.
> That seems a bit much. Do we have to model for protocol-breaking user
> tasks that have access to /dev/sev-guest?
> The caller that gets -EAGAIN knows to retry. There's no reason for
> other tasks to retry due to command serialization and the -EBUSY
> behavior.

Maybe for well-behaving user-space applications, but that's not 
guaranteed. I agree with Peter and think waiting in the sev-guest driver 
is the simplest and fairest thing to do in the case of throttling.

Thanks,
Tom

> 
>
  
Peter Gonda Oct. 19, 2022, 9:47 p.m. UTC | #10
On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 15:39, Peter Gonda wrote:
> > On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>
> >> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
> >>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>>
> >>>> On 10/19/22 12:40, Peter Gonda wrote:
> >>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>>>>
> >>>>>> On 10/19/22 10:03, Peter Gonda wrote:
> >>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> >>>>>>> communicate securely with each other. The IV to this scheme is a
> >>>>>>> sequence number that both the ASP and the guest track. Currently this
> >>>>>>> sequence number in a guest request must exactly match the sequence
> >>>>>>> number tracked by the ASP. This means that if the guest sees an error
> >>>>>>> from the host during a request it can only retry that exact request or
> >>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> >>>>>>> reuse see:
> >>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
> >>>>>>
> >>>>
> >>>> I think I've wrapped my head around this now. Any non-zero return code
> >>>> from the hypervisor for an SNP Guest Request is either a hypervisor error
> >>>> or an sev-guest driver error, and so the VMPCK should be disabled. The
> >>>> sev-guest driver is really doing everything (message headers, performing
> >>>> the encryption, etc.) and is only using userspace data that will be part
> >>>> of the response message and can't result in a non-zero hypervisor return code.
> >>>>
> >>>> For the SNP Extended Guest Request, we only need to special case a return
> >>>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
> >>>>
> >>>>
> >>>>>> I wonder if we can at least still support the extended report length query
> >>>>>> by having the kernel allocate the required pages when the error is
> >>>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
> >>>>>> no errors on the second request, the sequence numbers can be safely
> >>>>>> updated, but the kernel returns the original error (which will provide the
> >>>>>> caller with the number of pages required).
> >>>>>
> >>>>> I think we can but I thought fixing the security bug could come first,
> >>>>> then the usability fix after. Dionna was planning on working on that
> >>>>> fix.
> >>>>>
> >>>>> In that flow how does userspace get the data? Its called the ioctl
> >>>>> with not enough output buffer space. What if the userspace calls the
> >>>>> ioctl with no buffers space allocated, so its trying to query the
> >>>>> length. We just send the host the request without any encrypted data.
> >>>>
> >>>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
> >>>> if it hasn't supplied enough buffer space. But, the sev-guest driver can
> >>>> supply enough buffer space and invoke the SNP Extended Guest Request again
> >>>> in order to successfully complete the call and update the sequence
> >>>> numbers. The sev-guest driver would just discard the data in this case,
> >>>> but pass back the original "not enough buffer space" error to the caller,
> >>>> who could now allocate space and retry. This then allows the sequence
> >>>> numbers to be bumped properly.
> >>>>
> >>>
> >>> The way I thought to solve this was to make certificate length
> >>> querying a part of the specified protocol.
> >>>
> >>> The first ext_guest_request command /must/ query the certificate
> >>> buffer length with req.certs_len == 0.
> >>
> >> This becomes an incompatible change to the GHCB specification.
> >>
> >>> By making this part of the protocol, the sev-guest driver can check if
> >>> the certificate length has been requested before.
> >>> If so, emulate the host's VMM error code for invalid length without
> >>> sending an encrypted message.
> >>
> >> On the hypervisor side, the certificate blob can be replaced at any time
> >> with a new blob that is larger. So you may still have to handle the case
> >> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.
> >
> > Ah, I forgot the host could keep changing the size of this data.
> >
> >>
> >>> If not, then send an all zeroes request buffer with the req.certs_len
> >>> = 0 values to the VMM.
> >>>
> >>> The VMM will respond with the size if indeed the expected_pages are >
> >>> 0. In the case that the host has not set the certificate buffer yet,
> >>> then the host will inspect the header of the request page for a zero
> >>> sequence number. If so, then we know that we don't have a valid
> >>> request. We treat this also as the INVALID_LEN case but still return
> >>> the size of 0. The driver will have the expected pages value stored as
> >>> 0 at this point, so subsequent calls will not have this behavior.
> >>>
> >>> The way /dev/sev-guest user code has been written, I don't think this
> >>> will break any existing software package.
> >>
> >> I think having the sev-guest driver re-issue the request with the internal
> >> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.
> >> You could still cache the size request and always return that to
> >> user-space when a request is received with a 0 length. The user-space
> >> program must be able to handle receiving multiple
> >> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
> >> the hypervisor can be updating the certs asynchronously. And if you get a
> >> request that is not 0 length, then you issue it as such and re-use the
> >> logic of the first 0 length request that was received if you get an
> >> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
> >>
> >> Peter, is this something you could change the patch to do?
> >
> > OK so the guest retires with the same request when it gets an
> > SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
>
> It would just use the pre-allocated snp_dev->certs_data buffer with npages
> set to the full size of that buffer.

Actually we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe
we want to just allocate this buffer which we think is sufficient and
never increase the allocation?

I see the size of
https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is
3200 bytes. Assuming the VCEK cert is the same size (which it should
be since this .cert is 2 certificates). 16K seems to leave enough room
even for some vendor certificates?

>
> > hold the certificates. When it finally gets a successful request w/
> > certs. Do we want to return the attestation bits to userspace, but
> > leave out the certificate data. Or just error out the ioctl
> > completely?
>
> We need to be able to return the attestation bits that came back with the
> extra certs. So just error out of the ioctl with the length error and let
> user-space retry with the recommended number of pages.

That sounded simpler to me. Will do.

>
> >
> > I can do that in this series.
>
> Thanks!
>
> >
> >>
> >>>
> >>>>>
> >>>>>>
> >>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>>>> performed within the kernel, while the mutex is held, and then retry the
> >>>>>> exact request again. Otherwise, that error will require disabling the
> >>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> >>>>>
> >>>>> Yes I think if the host rate limits the guest. The guest kernel should
> >>>>> retry the exact message. Which mutex are you referring too?
> >>>>
> >>>> Or the host waits and then submits the request and the guest kernel
> >>>> doesn't have to do anything. The mutex I'm referring to is the
> >>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
> >>>
> >>> I think that either the host kernel or guest kernel waiting can lead
> >>> to unacceptable delays.
> >>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
> >>> specifically for retrying the last request.
> >>>
> >>> We can know what the last request is due to the sev_cmd_mutex serialization.
> >>> The driver will just keep a scratch buffer for this. Any other request
> >>> that comes in without resolving the retry will get an -EBUSY error
> >>> code.
> >>
> >> And the first caller will have received an -EAGAIN in order to
> >> differentiate between the two situations?
> >>
> >>>
> >>> Calling the retry ioctl without a pending command will result in -EINVAL.
> >>>
> >>> Let me know what you think.
> >>
> >> I think that sounds reasonable, but there are some catches. You will need
> >> to ensure that the caller that is supposed to retry does actually retry
> >> and that a caller that does retry is the same caller that was told to retry.
> >
> > Whats the issue with the guest driver taking some time?
> >
> > This sounds complex because there may be many users of the driver. How
> > do multiple users coordinate when they need to use the retry ioctl?
> >
> >>
> >> Thanks,
> >> Tom
> >>
> >>>>
> >>>> Thanks,
> >>>> Tom
> >>>
> >>>
> >>>
  
Tom Lendacky Oct. 20, 2022, 2:02 p.m. UTC | #11
On 10/19/22 16:47, Peter Gonda wrote:
> On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>> On 10/19/22 15:39, Peter Gonda wrote:
>>> On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
>>>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>>> On 10/19/22 12:40, Peter Gonda wrote:
>>>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>>>>> On 10/19/22 10:03, Peter Gonda wrote:
>>>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>>>>>>>> communicate securely with each other. The IV to this scheme is a
>>>>>>>>> sequence number that both the ASP and the guest track. Currently this
>>>>>>>>> sequence number in a guest request must exactly match the sequence
>>>>>>>>> number tracked by the ASP. This means that if the guest sees an error
>>>>>>>>> from the host during a request it can only retry that exact request or
>>>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
>>>>>>>>> reuse see:
>>>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf

>>> OK so the guest retires with the same request when it gets an
>>> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
>>
>> It would just use the pre-allocated snp_dev->certs_data buffer with npages
>> set to the full size of that buffer.
> 
> Actually we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe
> we want to just allocate this buffer which we think is sufficient and
> never increase the allocation?
> 
> I see the size of
> https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is
> 3200 bytes. Assuming the VCEK cert is the same size (which it should
> be since this .cert is 2 certificates). 16K seems to leave enough room
> even for some vendor certificates?

I think just using the 16K buffer (4 pages) as it is allocated today is 
ok. If we get a SNP_GUEST_REQ_INVALID_LEN error that is larger than 4 
pages, then we won't ever be able to pull the certs given how the driver 
is coded today. In that case, disabling the VMPCK is in order.

A separate patch could be submitted later to improve this overall aspect 
of the certs buffer if needed.

Thanks,
Tom

> 
>>
>>> hold the certificates. When it finally gets a successful request w/
>>> certs. Do we want to return the attestation bits to userspace, but
>>> leave out the certificate data. Or just error out the ioctl
>>> completely?
>>
>> We need to be able to return the attestation bits that came back with the
>> extra certs. So just error out of the ioctl with the length error and let
>> user-space retry with the recommended number of pages.
> 
> That sounded simpler to me. Will do.
> 
>>
>>>
>>> I can do that in this series.
>>
>> Thanks!
>>
>>>
>>>>
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>>>>>> performed within the kernel, while the mutex is held, and then retry the
>>>>>>>> exact request again. Otherwise, that error will require disabling the
>>>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>>>>>
>>>>>>> Yes I think if the host rate limits the guest. The guest kernel should
>>>>>>> retry the exact message. Which mutex are you referring too?
>>>>>>
>>>>>> Or the host waits and then submits the request and the guest kernel
>>>>>> doesn't have to do anything. The mutex I'm referring to is the
>>>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
>>>>>
>>>>> I think that either the host kernel or guest kernel waiting can lead
>>>>> to unacceptable delays.
>>>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
>>>>> specifically for retrying the last request.
>>>>>
>>>>> We can know what the last request is due to the sev_cmd_mutex serialization.
>>>>> The driver will just keep a scratch buffer for this. Any other request
>>>>> that comes in without resolving the retry will get an -EBUSY error
>>>>> code.
>>>>
>>>> And the first caller will have received an -EAGAIN in order to
>>>> differentiate between the two situations?
>>>>
>>>>>
>>>>> Calling the retry ioctl without a pending command will result in -EINVAL.
>>>>>
>>>>> Let me know what you think.
>>>>
>>>> I think that sounds reasonable, but there are some catches. You will need
>>>> to ensure that the caller that is supposed to retry does actually retry
>>>> and that a caller that does retry is the same caller that was told to retry.
>>>
>>> Whats the issue with the guest driver taking some time?
>>>
>>> This sounds complex because there may be many users of the driver. How
>>> do multiple users coordinate when they need to use the retry ioctl?
>>>
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Tom
>>>>>
>>>>>
>>>>>
  
Peter Gonda Oct. 20, 2022, 2:46 p.m. UTC | #12
On Thu, Oct 20, 2022 at 8:02 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 10/19/22 16:47, Peter Gonda wrote:
> > On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >> On 10/19/22 15:39, Peter Gonda wrote:
> >>> On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
> >>>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>>>> On 10/19/22 12:40, Peter Gonda wrote:
> >>>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>>>>>> On 10/19/22 10:03, Peter Gonda wrote:
> >>>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> >>>>>>>>> communicate securely with each other. The IV to this scheme is a
> >>>>>>>>> sequence number that both the ASP and the guest track. Currently this
> >>>>>>>>> sequence number in a guest request must exactly match the sequence
> >>>>>>>>> number tracked by the ASP. This means that if the guest sees an error
> >>>>>>>>> from the host during a request it can only retry that exact request or
> >>>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> >>>>>>>>> reuse see:
> >>>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>
> >>> OK so the guest retires with the same request when it gets an
> >>> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
> >>
> >> It would just use the pre-allocated snp_dev->certs_data buffer with npages
> >> set to the full size of that buffer.
> >
> > Actually we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe
> > we want to just allocate this buffer which we think is sufficient and
> > never increase the allocation?
> >
> > I see the size of
> > https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is
> > 3200 bytes. Assuming the VCEK cert is the same size (which it should
> > be since this .cert is 2 certificates). 16K seems to leave enough room
> > even for some vendor certificates?
>
> I think just using the 16K buffer (4 pages) as it is allocated today is
> ok. If we get a SNP_GUEST_REQ_INVALID_LEN error that is larger than 4
> pages, then we won't ever be able to pull the certs given how the driver
> is coded today. In that case, disabling the VMPCK is in order.
>
> A separate patch could be submitted later to improve this overall aspect
> of the certs buffer if needed.

If that sounds OK I'd prefer that. This keeps the drivers current limit:

static int get_ext_report(struct snp_guest_dev *snp_dev, struct
snp_guest_request_ioctl *arg)
...
if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
    !IS_ALIGNED(req.certs_len, PAGE_SIZE))
return -EINVAL;

I'd prefer not to add extra features during the bug fix. But happy to
make this work with buffers greater than SEV_FW_BLOB_MAX_SIZE as
follow up if you want.

>
> Thanks,
> Tom
>
> >
> >>
> >>> hold the certificates. When it finally gets a successful request w/
> >>> certs. Do we want to return the attestation bits to userspace, but
> >>> leave out the certificate data. Or just error out the ioctl
> >>> completely?
> >>
> >> We need to be able to return the attestation bits that came back with the
> >> extra certs. So just error out of the ioctl with the length error and let
> >> user-space retry with the recommended number of pages.
> >
> > That sounded simpler to me. Will do.
> >
> >>
> >>>
> >>> I can do that in this series.
> >>
> >> Thanks!
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>>>>>> performed within the kernel, while the mutex is held, and then retry the
> >>>>>>>> exact request again. Otherwise, that error will require disabling the
> >>>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>>>>>
> >>>>>>>> Thoughts?
> >>>>>>>>
> >>>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
> >>>>>>>
> >>>>>>> Yes I think if the host rate limits the guest. The guest kernel should
> >>>>>>> retry the exact message. Which mutex are you referring too?
> >>>>>>
> >>>>>> Or the host waits and then submits the request and the guest kernel
> >>>>>> doesn't have to do anything. The mutex I'm referring to is the
> >>>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
> >>>>>
> >>>>> I think that either the host kernel or guest kernel waiting can lead
> >>>>> to unacceptable delays.
> >>>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
> >>>>> specifically for retrying the last request.
> >>>>>
> >>>>> We can know what the last request is due to the sev_cmd_mutex serialization.
> >>>>> The driver will just keep a scratch buffer for this. Any other request
> >>>>> that comes in without resolving the retry will get an -EBUSY error
> >>>>> code.
> >>>>
> >>>> And the first caller will have received an -EAGAIN in order to
> >>>> differentiate between the two situations?
> >>>>
> >>>>>
> >>>>> Calling the retry ioctl without a pending command will result in -EINVAL.
> >>>>>
> >>>>> Let me know what you think.
> >>>>
> >>>> I think that sounds reasonable, but there are some catches. You will need
> >>>> to ensure that the caller that is supposed to retry does actually retry
> >>>> and that a caller that does retry is the same caller that was told to retry.
> >>>
> >>> Whats the issue with the guest driver taking some time?
> >>>
> >>> This sounds complex because there may be many users of the driver. How
> >>> do multiple users coordinate when they need to use the retry ioctl?
> >>>
> >>>>
> >>>> Thanks,
> >>>> Tom
> >>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Tom
> >>>>>
> >>>>>
> >>>>>
  
kernel test robot Oct. 24, 2022, 5:32 p.m. UTC | #13
Hi Peter,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.1-rc2 next-20221024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Gonda/virt-Prevent-AES-GCM-IV-reuse-in-SNP-guest-driver/20221020-100950
patch link:    https://lore.kernel.org/r/20221019150333.1047423-1-pgonda%40google.com
patch subject: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver
config: x86_64-buildonly-randconfig-r006-20221024
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/3bb420ef25bf00e5fa26d6146446b11e6ebfd255
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Peter-Gonda/virt-Prevent-AES-GCM-IV-reuse-in-SNP-guest-driver/20221020-100950
        git checkout 3bb420ef25bf00e5fa26d6146446b11e6ebfd255
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/virt/coco/sev-guest/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/virt/coco/sev-guest/sev-guest.c:351:10: warning: format specifies type 'unsigned long' but the argument has type '__u64 *' (aka 'unsigned long long *') [-Wformat]
                             rc, fw_err);
                                 ^~~~~~
   include/linux/dev_printk.h:142:69: note: expanded from macro 'dev_alert'
           dev_printk_index_wrap(_dev_alert, KERN_ALERT, dev, dev_fmt(fmt), ##__VA_ARGS__)
                                                                      ~~~     ^~~~~~~~~~~
   include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
                   _p_func(dev, fmt, ##__VA_ARGS__);                       \
                                ~~~    ^~~~~~~~~~~
   1 warning generated.


vim +351 drivers/virt/coco/sev-guest/sev-guest.c

   322	
   323	static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
   324					u8 type, void *req_buf, size_t req_sz, void *resp_buf,
   325					u32 resp_sz, __u64 *fw_err)
   326	{
   327		unsigned long err;
   328		u64 seqno;
   329		int rc;
   330	
   331		/* Get message sequence and verify that its a non-zero */
   332		seqno = snp_get_msg_seqno(snp_dev);
   333		if (!seqno)
   334			return -EIO;
   335	
   336		memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
   337	
   338		/* Encrypt the userspace provided payload */
   339		rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
   340		if (rc)
   341			return rc;
   342	
   343		/* Call firmware to process the request */
   344		rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
   345		if (fw_err)
   346			*fw_err = err;
   347	
   348		if (rc) {
   349			dev_alert(snp_dev->dev,
   350				  "Detected error from ASP request. rc: %d, fw_err: %lu\n",
 > 351				  rc, fw_err);
   352			goto disable_vmpck;
   353		}
   354	
   355		rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
   356		if (rc) {
   357			dev_alert(snp_dev->dev,
   358				  "Detected unexpected decode failure from ASP. rc: %d\n",
   359				  rc);
   360			goto disable_vmpck;
   361		}
   362	
   363		/* Increment to new message sequence after payload decryption was successful. */
   364		snp_inc_msg_seqno(snp_dev);
   365	
   366		return 0;
   367	
   368	disable_vmpck:
   369		snp_disable_vmpck(snp_dev);
   370		return rc;
   371	}
   372
  

Patch

diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index f422f9c58ba7..227ae6a10ef2 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -67,8 +67,27 @@  static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
 	return true;
 }
 
+/*
+ * If we receive an error from the host or ASP we have two options. We can
+ * either retry the exact same encrypted request or we can discontinue using the
+ * VMPCK.
+ *
+ * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
+ * encrypt the requests. The IV for this scheme is the sequence number. GCM
+ * cannot tolerate IV reuse.
+ *
+ * The ASP FW v1.51 only increments the sequence numbers on a successful
+ * guest<->ASP back and forth and only accepts messages at its exact sequence
+ * number.
+ *
+ * So if we were to reuse the sequence number the encryption scheme is
+ * vulnerable. If we encrypt the sequence number for a fresh IV the ASP will
+ * reject our request.
+ */
 static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
 {
+	dev_alert(snp_dev->dev, "Disabling vmpck_id: %d to prevent IV reuse.\n",
+		  vmpck_id);
 	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
 	snp_dev->vmpck = NULL;
 }
@@ -326,29 +345,29 @@  static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
 	if (fw_err)
 		*fw_err = err;
 
-	if (rc)
-		return rc;
+	if (rc) {
+		dev_alert(snp_dev->dev,
+			  "Detected error from ASP request. rc: %d, fw_err: %lu\n",
+			  rc, fw_err);
+		goto disable_vmpck;
+	}
 
-	/*
-	 * The verify_and_dec_payload() will fail only if the hypervisor is
-	 * actively modifying the message header or corrupting the encrypted payload.
-	 * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
-	 * the key cannot be used for any communication. The key is disabled to ensure
-	 * that AES-GCM does not use the same IV while encrypting the request payload.
-	 */
 	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
 	if (rc) {
 		dev_alert(snp_dev->dev,
-			  "Detected unexpected decode failure, disabling the vmpck_id %d\n",
-			  vmpck_id);
-		snp_disable_vmpck(snp_dev);
-		return rc;
+			  "Detected unexpected decode failure from ASP. rc: %d\n",
+			  rc);
+		goto disable_vmpck;
 	}
 
 	/* Increment to new message sequence after payload decryption was successful. */
 	snp_inc_msg_seqno(snp_dev);
 
 	return 0;
+
+disable_vmpck:
+	snp_disable_vmpck(snp_dev);
+	return rc;
 }
 
 static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)