[v4,6/6] virt: tdx-guest: Add Quote generation support using TSM_REPORTS

Message ID 169570185411.596431.3153227201984845022.stgit@dwillia2-xfh.jf.intel.com
State New
Headers
Series configfs-tsm: Attestation Report ABI |

Commit Message

Dan Williams Sept. 26, 2023, 4:17 a.m. UTC
  From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

In TDX guest, the attestation process is used to verify the TDX guest
trustworthiness to other entities before provisioning secrets to the
guest. The first step in the attestation process is TDREPORT
generation, which involves getting the guest measurement data in the
format of TDREPORT, which is further used to validate the authenticity
of the TDX guest. TDREPORT by design is integrity-protected and can
only be verified on the local machine.

To support remote verification of the TDREPORT in a SGX-based
attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
(QE) to convert it to a remotely verifiable Quote. SGX QE by design can
only run outside of the TDX guest (i.e. in a host process or in a
normal VM) and guest can use communication channels like vsock or
TCP/IP to send the TDREPORT to the QE. But for security concerns, the
TDX guest may not support these communication channels. To handle such
cases, TDX defines a GetQuote hypercall which can be used by the guest
to request the host VMM to communicate with the SGX QE. More details
about GetQuote hypercall can be found in TDX Guest-Host Communication
Interface (GHCI) for Intel TDX 1.0, section titled
"TDG.VP.VMCALL<GetQuote>".

Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
Computing Guest platforms to get the measurement data via ConfigFS.
Extend the TSM framework and add support to allow an attestation agent
to get the TDX Quote data (included usage example below).

  report=/sys/kernel/config/tsm/report/report0
  mkdir $report
  dd if=/dev/urandom bs=64 count=1 > $report/inblob
  hexdump -C $report/outblob
  rmdir $report

GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
with TDREPORT data as input, which is further used by the VMM to copy
the TD Quote result after successful Quote generation. To create the
shared buffer, allocate a large enough memory and mark it shared using
set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
for GetQuote requests in the TDX TSM handler.

Although this method reserves a fixed chunk of memory for GetQuote
requests, such one time allocation can help avoid memory fragmentation
related allocation failures later in the uptime of the guest.

Since the Quote generation process is not time-critical or frequently
used, the current version uses a polling model for Quote requests and
it also does not support parallel GetQuote requests.

Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Erdem Aktas <erdemaktas@google.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/coco/tdx/tdx.c                 |   21 +++
 arch/x86/include/asm/shared/tdx.h       |    1 
 arch/x86/include/asm/tdx.h              |    2 
 drivers/virt/coco/tdx-guest/Kconfig     |    1 
 drivers/virt/coco/tdx-guest/tdx-guest.c |  229 +++++++++++++++++++++++++++++++
 5 files changed, 253 insertions(+), 1 deletion(-)
  

Comments

Peter Gonda Sept. 27, 2023, 4:14 p.m. UTC | #1
On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>
> In TDX guest, the attestation process is used to verify the TDX guest
> trustworthiness to other entities before provisioning secrets to the
> guest. The first step in the attestation process is TDREPORT
> generation, which involves getting the guest measurement data in the
> format of TDREPORT, which is further used to validate the authenticity
> of the TDX guest. TDREPORT by design is integrity-protected and can
> only be verified on the local machine.
>
> To support remote verification of the TDREPORT in a SGX-based
> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> only run outside of the TDX guest (i.e. in a host process or in a
> normal VM) and guest can use communication channels like vsock or
> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> TDX guest may not support these communication channels. To handle such
> cases, TDX defines a GetQuote hypercall which can be used by the guest
> to request the host VMM to communicate with the SGX QE. More details
> about GetQuote hypercall can be found in TDX Guest-Host Communication
> Interface (GHCI) for Intel TDX 1.0, section titled
> "TDG.VP.VMCALL<GetQuote>".
>
> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> Computing Guest platforms to get the measurement data via ConfigFS.
> Extend the TSM framework and add support to allow an attestation agent
> to get the TDX Quote data (included usage example below).
>
>   report=/sys/kernel/config/tsm/report/report0
>   mkdir $report
>   dd if=/dev/urandom bs=64 count=1 > $report/inblob
>   hexdump -C $report/outblob
>   rmdir $report
>
> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> with TDREPORT data as input, which is further used by the VMM to copy
> the TD Quote result after successful Quote generation. To create the
> shared buffer, allocate a large enough memory and mark it shared using
> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> for GetQuote requests in the TDX TSM handler.
>
> Although this method reserves a fixed chunk of memory for GetQuote
> requests, such one time allocation can help avoid memory fragmentation
> related allocation failures later in the uptime of the guest.
>
> Since the Quote generation process is not time-critical or frequently
> used, the current version uses a polling model for Quote requests and
> it also does not support parallel GetQuote requests.
>
> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Hey Dan,

I tried running your test commands on an SNP enabled guest. To build
the kernel I just checked out linus/master and applied your series. I
haven't done any debugging yet, so I will update with what I find.

root@Ubuntu2004:~#   hexdump -C $report/outblob
[  219.871875] ------------[ cut here ]------------
[  219.876642] kernel BUG at include/linux/scatterlist.h:187!
[  219.882280] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  219.887628] CPU: 0 PID: 1317 Comm: hexdump Not tainted 6.6.0-rc3-xfstests-000
44-gf636850ddfc7 #1
[  219.896530] Hardware name: Google Google Compute Engine/Google Compute Engine
, BIOS Google 09/26/2023
[  219.905859] RIP: 0010:enc_dec_message+0x4ed/0x570
[  219.910673] Code: c7 c0 00 00 00 80 48 2b 05 b8 4f 99 00 e9 dc fd ff ff 0f 0b
 bb f4 ff ff ff eb b5 0f 0b 0f 0b 0f 0b e8 e7 47 ce ff 89 c3 eb 94 <0f> 0b 0f 0b
 0f 0b 0f 0b 48 8d 7c 24 38 e8 11 6b 23 00 8b 5c 24 58
[  219.929547] RSP: 0018:ffffc90000e27a18 EFLAGS: 00010246
[  219.934893] RAX: 0000000000000000 RBX: ffffc90000e27bf8 RCX: 0000000000081000
[  219.942134] RDX: 0000000000000000 RSI: 0000000000080000 RDI: ffffc90080e27bf8
[  219.949378] RBP: ffff8881018980a0 R08: 0000000000000000 R09: ffffc90000e27a78
[  219.956621] R10: 0000000000026680 R11: 0000000000000008 R12: ffff888111a3c400
[  219.963864] R13: ffff8881018980d0 R14: ffff8881003e7000 R15: 0000000000000060
[  219.971106] FS:  00007fd7e75f5740(0000) GS:ffff888237c00000(0000) knlGS:00000
00000000000
[  219.979303] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  219.985160] CR2: 00005563500e8808 CR3: 0008000111986001 CR4: 0000000000370ef0
[  219.992401] Call Trace:
[  219.994955]  <TASK>
[  219.997160]  ? die+0x36/0x80
[  220.000149]  ? do_trap+0xf4/0x100
[  220.003574]  ? enc_dec_message+0x4ed/0x570
[  220.007777]  ? do_error_trap+0x65/0x80
[  220.011632]  ? enc_dec_message+0x4ed/0x570
[  220.015842]  ? exc_invalid_op+0x50/0x70
[  220.019788]  ? enc_dec_message+0x4ed/0x570
[  220.023994]  ? asm_exc_invalid_op+0x1a/0x20
[  220.028288]  ? enc_dec_message+0x4ed/0x570
[  220.032505]  ? enc_dec_message+0x16f/0x570
[  220.036711]  ? srso_alias_return_thunk+0x5/0x7f
[  220.041352]  ? srso_alias_return_thunk+0x5/0x7f
[  220.045994]  handle_guest_request+0xc6/0x330
[  220.050375]  get_ext_report+0x1e0/0x3d0
[  220.054323]  sev_report_new+0x159/0x460
[  220.058267]  tsm_report_read.part.0+0x96/0x120
[  220.062818]  configfs_bin_read_iter+0xe1/0x1e0
[  220.067377]  vfs_read+0x1db/0x310
[  220.070813]  ksys_read+0x6f/0xf0
[  220.074152]  do_syscall_64+0x3f/0x90
[  220.077843]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  220.083007] RIP: 0033:0x7fd7e7705fd2
[  220.086695] Code: c0 e9 c2 fe ff ff 50 48 8d 3d aa cb 0a 00 e8 d5 1a 02 00 0f
 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0
 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[  220.105572] RSP: 002b:00007fff9bc8fc18 EFLAGS: 00000246 ORIG_RAX: 00000000000
00000
[  220.113252] RAX: ffffffffffffffda RBX: 00007fd7e77e4980 RCX: 00007fd7e7705fd2
[  220.120496] RDX: 0000000000001000 RSI: 00005563500e7800 RDI: 0000000000000000
[  220.127742] RBP: 00007fd7e77e14a0 R08: 0000000000000000 R09: 000000000000007c
[  220.134988] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[  220.142232] R13: 00007fd7e77e08a0 R14: 0000000000000d68 R15: 0000000000000d68
[  220.149479]  </TASK>
[  220.151865] ---[ end trace 0000000000000000 ]---
[  220.156629] RIP: 0010:enc_dec_message+0x4ed/0x570
[  220.161479] Code: c7 c0 00 00 00 80 48 2b 05 b8 4f 99 00 e9 dc fd ff ff 0f 0b
 bb f4 ff ff ff eb b5 0f 0b 0f 0b 0f 0b e8 e7 47 ce ff 89 c3 eb 94 <0f> 0b 0f 0b
 0f 0b 0f 0b 48 8d 7c 24 38 e8 11 6b 23 00 8b 5c 24 58
[  220.180379] RSP: 0018:ffffc90000e27a18 EFLAGS: 00010246
[  220.185743] RAX: 0000000000000000 RBX: ffffc90000e27bf8 RCX: 0000000000081000
[  220.193012] RDX: 0000000000000000 RSI: 0000000000080000 RDI: ffffc90080e27bf8
[  220.200280] RBP: ffff8881018980a0 R08: 0000000000000000 R09: ffffc90000e27a78
[  220.207551] R10: 0000000000026680 R11: 0000000000000008 R12: ffff888111a3c400
[  220.214822] R13: ffff8881018980d0 R14: ffff8881003e7000 R15: 0000000000000060
[  220.222094] FS:  00007fd7e75f5740(0000) GS:ffff888237c00000(0000) knlGS:00000
00000000000
[  220.230329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  220.236210] CR2: 00005563500e8808 CR3: 0008000111986001 CR4: 0000000000370ef0
  
Dan Williams Sept. 27, 2023, 4:53 p.m. UTC | #2
Peter Gonda wrote:
> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >
> > In TDX guest, the attestation process is used to verify the TDX guest
> > trustworthiness to other entities before provisioning secrets to the
> > guest. The first step in the attestation process is TDREPORT
> > generation, which involves getting the guest measurement data in the
> > format of TDREPORT, which is further used to validate the authenticity
> > of the TDX guest. TDREPORT by design is integrity-protected and can
> > only be verified on the local machine.
> >
> > To support remote verification of the TDREPORT in a SGX-based
> > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > only run outside of the TDX guest (i.e. in a host process or in a
> > normal VM) and guest can use communication channels like vsock or
> > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > TDX guest may not support these communication channels. To handle such
> > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > to request the host VMM to communicate with the SGX QE. More details
> > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > Interface (GHCI) for Intel TDX 1.0, section titled
> > "TDG.VP.VMCALL<GetQuote>".
> >
> > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > Computing Guest platforms to get the measurement data via ConfigFS.
> > Extend the TSM framework and add support to allow an attestation agent
> > to get the TDX Quote data (included usage example below).
> >
> >   report=/sys/kernel/config/tsm/report/report0
> >   mkdir $report
> >   dd if=/dev/urandom bs=64 count=1 > $report/inblob
> >   hexdump -C $report/outblob
> >   rmdir $report
> >
> > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > with TDREPORT data as input, which is further used by the VMM to copy
> > the TD Quote result after successful Quote generation. To create the
> > shared buffer, allocate a large enough memory and mark it shared using
> > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > for GetQuote requests in the TDX TSM handler.
> >
> > Although this method reserves a fixed chunk of memory for GetQuote
> > requests, such one time allocation can help avoid memory fragmentation
> > related allocation failures later in the uptime of the guest.
> >
> > Since the Quote generation process is not time-critical or frequently
> > used, the current version uses a polling model for Quote requests and
> > it also does not support parallel GetQuote requests.
> >
> > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> Hey Dan,
> 
> I tried running your test commands on an SNP enabled guest. To build
> the kernel I just checked out linus/master and applied your series. I
> haven't done any debugging yet, so I will update with what I find.

Ok, not surprised just because I made major changes to the sev-guest interface
and did not have a way to test.

> root@Ubuntu2004:~#   hexdump -C $report/outblob
> [  219.871875] ------------[ cut here ]------------
> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> [  219.882280] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [  219.887628] CPU: 0 PID: 1317 Comm: hexdump Not tainted 6.6.0-rc3-xfstests-000
> 44-gf636850ddfc7 #1
> [  219.896530] Hardware name: Google Google Compute Engine/Google Compute Engine

Might you be able to send me info offlist about how I can spin this up so I can
test without waiting for the community to try it out?

In any event, thanks for testing! Appreciate it.
  
Dan Williams Sept. 28, 2023, 10:49 p.m. UTC | #3
Peter Gonda wrote:
> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >
> > In TDX guest, the attestation process is used to verify the TDX guest
> > trustworthiness to other entities before provisioning secrets to the
> > guest. The first step in the attestation process is TDREPORT
> > generation, which involves getting the guest measurement data in the
> > format of TDREPORT, which is further used to validate the authenticity
> > of the TDX guest. TDREPORT by design is integrity-protected and can
> > only be verified on the local machine.
> >
> > To support remote verification of the TDREPORT in a SGX-based
> > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > only run outside of the TDX guest (i.e. in a host process or in a
> > normal VM) and guest can use communication channels like vsock or
> > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > TDX guest may not support these communication channels. To handle such
> > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > to request the host VMM to communicate with the SGX QE. More details
> > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > Interface (GHCI) for Intel TDX 1.0, section titled
> > "TDG.VP.VMCALL<GetQuote>".
> >
> > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > Computing Guest platforms to get the measurement data via ConfigFS.
> > Extend the TSM framework and add support to allow an attestation agent
> > to get the TDX Quote data (included usage example below).
> >
> >   report=/sys/kernel/config/tsm/report/report0
> >   mkdir $report
> >   dd if=/dev/urandom bs=64 count=1 > $report/inblob
> >   hexdump -C $report/outblob
> >   rmdir $report
> >
> > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > with TDREPORT data as input, which is further used by the VMM to copy
> > the TD Quote result after successful Quote generation. To create the
> > shared buffer, allocate a large enough memory and mark it shared using
> > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > for GetQuote requests in the TDX TSM handler.
> >
> > Although this method reserves a fixed chunk of memory for GetQuote
> > requests, such one time allocation can help avoid memory fragmentation
> > related allocation failures later in the uptime of the guest.
> >
> > Since the Quote generation process is not time-critical or frequently
> > used, the current version uses a polling model for Quote requests and
> > it also does not support parallel GetQuote requests.
> >
> > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> Hey Dan,
> 
> I tried running your test commands on an SNP enabled guest. To build
> the kernel I just checked out linus/master and applied your series. I
> haven't done any debugging yet, so I will update with what I find.
> 
> root@Ubuntu2004:~#   hexdump -C $report/outblob
> [  219.871875] ------------[ cut here ]------------
> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!

Ok, it does not like virtual address of one of the buffers, but my
changes "should" not have affected that as get_ext_report() internally
uses snp_dev->certs_data and snp_dev->response for bounce buffering the
actual request / response memory. First test I want to try once I can
get on an SNP system is compare this to the ioctl path just make sure
that succeeds.
  
Peter Gonda Sept. 29, 2023, 5:26 p.m. UTC | #4
On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Peter Gonda wrote:
> > On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > >
> > > In TDX guest, the attestation process is used to verify the TDX guest
> > > trustworthiness to other entities before provisioning secrets to the
> > > guest. The first step in the attestation process is TDREPORT
> > > generation, which involves getting the guest measurement data in the
> > > format of TDREPORT, which is further used to validate the authenticity
> > > of the TDX guest. TDREPORT by design is integrity-protected and can
> > > only be verified on the local machine.
> > >
> > > To support remote verification of the TDREPORT in a SGX-based
> > > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > > only run outside of the TDX guest (i.e. in a host process or in a
> > > normal VM) and guest can use communication channels like vsock or
> > > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > > TDX guest may not support these communication channels. To handle such
> > > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > > to request the host VMM to communicate with the SGX QE. More details
> > > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > > Interface (GHCI) for Intel TDX 1.0, section titled
> > > "TDG.VP.VMCALL<GetQuote>".
> > >
> > > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > > Computing Guest platforms to get the measurement data via ConfigFS.
> > > Extend the TSM framework and add support to allow an attestation agent
> > > to get the TDX Quote data (included usage example below).
> > >
> > >   report=/sys/kernel/config/tsm/report/report0
> > >   mkdir $report
> > >   dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > >   hexdump -C $report/outblob
> > >   rmdir $report
> > >
> > > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > > with TDREPORT data as input, which is further used by the VMM to copy
> > > the TD Quote result after successful Quote generation. To create the
> > > shared buffer, allocate a large enough memory and mark it shared using
> > > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > > for GetQuote requests in the TDX TSM handler.
> > >
> > > Although this method reserves a fixed chunk of memory for GetQuote
> > > requests, such one time allocation can help avoid memory fragmentation
> > > related allocation failures later in the uptime of the guest.
> > >
> > > Since the Quote generation process is not time-critical or frequently
> > > used, the current version uses a polling model for Quote requests and
> > > it also does not support parallel GetQuote requests.
> > >
> > > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >
> > Hey Dan,
> >
> > I tried running your test commands on an SNP enabled guest. To build
> > the kernel I just checked out linus/master and applied your series. I
> > haven't done any debugging yet, so I will update with what I find.
> >
> > root@Ubuntu2004:~#   hexdump -C $report/outblob
> > [  219.871875] ------------[ cut here ]------------
> > [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
>
> Ok, it does not like virtual address of one of the buffers, but my
> changes "should" not have affected that as get_ext_report() internally
> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> actual request / response memory. First test I want to try once I can
> get on an SNP system is compare this to the ioctl path just make sure
> that succeeds.

I tried the IOCTL with our attestation test and it seems to be working
fine. I was hoping to debug further next week.

I also mailed you offlist to discuss getting access to SNP for testing.
  
Peter Gonda Oct. 3, 2023, 6:37 p.m. UTC | #5
On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <pgonda@google.com> wrote:
>
> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > Peter Gonda wrote:
> > > On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > > >
> > > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > >
> > > > In TDX guest, the attestation process is used to verify the TDX guest
> > > > trustworthiness to other entities before provisioning secrets to the
> > > > guest. The first step in the attestation process is TDREPORT
> > > > generation, which involves getting the guest measurement data in the
> > > > format of TDREPORT, which is further used to validate the authenticity
> > > > of the TDX guest. TDREPORT by design is integrity-protected and can
> > > > only be verified on the local machine.
> > > >
> > > > To support remote verification of the TDREPORT in a SGX-based
> > > > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > > > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > > > only run outside of the TDX guest (i.e. in a host process or in a
> > > > normal VM) and guest can use communication channels like vsock or
> > > > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > > > TDX guest may not support these communication channels. To handle such
> > > > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > > > to request the host VMM to communicate with the SGX QE. More details
> > > > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > > > Interface (GHCI) for Intel TDX 1.0, section titled
> > > > "TDG.VP.VMCALL<GetQuote>".
> > > >
> > > > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > > > Computing Guest platforms to get the measurement data via ConfigFS.
> > > > Extend the TSM framework and add support to allow an attestation agent
> > > > to get the TDX Quote data (included usage example below).
> > > >
> > > >   report=/sys/kernel/config/tsm/report/report0
> > > >   mkdir $report
> > > >   dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > > >   hexdump -C $report/outblob
> > > >   rmdir $report
> > > >
> > > > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > > > with TDREPORT data as input, which is further used by the VMM to copy
> > > > the TD Quote result after successful Quote generation. To create the
> > > > shared buffer, allocate a large enough memory and mark it shared using
> > > > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > > > for GetQuote requests in the TDX TSM handler.
> > > >
> > > > Although this method reserves a fixed chunk of memory for GetQuote
> > > > requests, such one time allocation can help avoid memory fragmentation
> > > > related allocation failures later in the uptime of the guest.
> > > >
> > > > Since the Quote generation process is not time-critical or frequently
> > > > used, the current version uses a polling model for Quote requests and
> > > > it also does not support parallel GetQuote requests.
> > > >
> > > > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > > > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > >
> > > Hey Dan,
> > >
> > > I tried running your test commands on an SNP enabled guest. To build
> > > the kernel I just checked out linus/master and applied your series. I
> > > haven't done any debugging yet, so I will update with what I find.
> > >
> > > root@Ubuntu2004:~#   hexdump -C $report/outblob
> > > [  219.871875] ------------[ cut here ]------------
> > > [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> >
> > Ok, it does not like virtual address of one of the buffers, but my
> > changes "should" not have affected that as get_ext_report() internally
> > uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> > actual request / response memory. First test I want to try once I can
> > get on an SNP system is compare this to the ioctl path just make sure
> > that succeeds.
>

I think there may be an issue with CONFIG_DEBUG_SG. That was the
warning we were getting in my above stack trace:

> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!

This was for this line in enc_dec_message():

        sg_set_buf(&src[1], src_buf, hdr->msg_sz);

I am not sure why in sg_set_buf() virt_addr_valid() returns false for
the address given in the sev_report_new() which is from the variable
'ext_req' which is stack allocated?

static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
     unsigned int buflen)
{
#ifdef CONFIG_DEBUG_SG
    BUG_ON(!virt_addr_valid(buf));
#endif
    sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
}

When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
well at least it doesn't crash the guest. I haven't checked if the
report is valid yet.
  
Kuppuswamy Sathyanarayanan Oct. 3, 2023, 7:29 p.m. UTC | #6
Hi,

On 10/3/2023 11:37 AM, Peter Gonda wrote:
> On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <pgonda@google.com> wrote:
>>
>> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <dan.j.williams@intel.com> wrote:
>>>
>>> Peter Gonda wrote:
>>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
>>>>>
>>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>>
>>>>> In TDX guest, the attestation process is used to verify the TDX guest
>>>>> trustworthiness to other entities before provisioning secrets to the
>>>>> guest. The first step in the attestation process is TDREPORT
>>>>> generation, which involves getting the guest measurement data in the
>>>>> format of TDREPORT, which is further used to validate the authenticity
>>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
>>>>> only be verified on the local machine.
>>>>>
>>>>> To support remote verification of the TDREPORT in a SGX-based
>>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
>>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
>>>>> only run outside of the TDX guest (i.e. in a host process or in a
>>>>> normal VM) and guest can use communication channels like vsock or
>>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
>>>>> TDX guest may not support these communication channels. To handle such
>>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
>>>>> to request the host VMM to communicate with the SGX QE. More details
>>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
>>>>> Interface (GHCI) for Intel TDX 1.0, section titled
>>>>> "TDG.VP.VMCALL<GetQuote>".
>>>>>
>>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
>>>>> Computing Guest platforms to get the measurement data via ConfigFS.
>>>>> Extend the TSM framework and add support to allow an attestation agent
>>>>> to get the TDX Quote data (included usage example below).
>>>>>
>>>>>   report=/sys/kernel/config/tsm/report/report0
>>>>>   mkdir $report
>>>>>   dd if=/dev/urandom bs=64 count=1 > $report/inblob
>>>>>   hexdump -C $report/outblob
>>>>>   rmdir $report
>>>>>
>>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
>>>>> with TDREPORT data as input, which is further used by the VMM to copy
>>>>> the TD Quote result after successful Quote generation. To create the
>>>>> shared buffer, allocate a large enough memory and mark it shared using
>>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
>>>>> for GetQuote requests in the TDX TSM handler.
>>>>>
>>>>> Although this method reserves a fixed chunk of memory for GetQuote
>>>>> requests, such one time allocation can help avoid memory fragmentation
>>>>> related allocation failures later in the uptime of the guest.
>>>>>
>>>>> Since the Quote generation process is not time-critical or frequently
>>>>> used, the current version uses a polling model for Quote requests and
>>>>> it also does not support parallel GetQuote requests.
>>>>>
>>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
>>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>> Reviewed-by: Erdem Aktas <erdemaktas@google.com>
>>>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>>>
>>>> Hey Dan,
>>>>
>>>> I tried running your test commands on an SNP enabled guest. To build
>>>> the kernel I just checked out linus/master and applied your series. I
>>>> haven't done any debugging yet, so I will update with what I find.
>>>>
>>>> root@Ubuntu2004:~#   hexdump -C $report/outblob
>>>> [  219.871875] ------------[ cut here ]------------
>>>> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
>>>
>>> Ok, it does not like virtual address of one of the buffers, but my
>>> changes "should" not have affected that as get_ext_report() internally
>>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
>>> actual request / response memory. First test I want to try once I can
>>> get on an SNP system is compare this to the ioctl path just make sure
>>> that succeeds.
>>
> 
> I think there may be an issue with CONFIG_DEBUG_SG. That was the
> warning we were getting in my above stack trace:
> 
>> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> 
> This was for this line in enc_dec_message():
> 
>         sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> 
> I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> the address given in the sev_report_new() which is from the variable
> 'ext_req' which is stack allocated?
> 
> static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
>      unsigned int buflen)
> {
> #ifdef CONFIG_DEBUG_SG
>     BUG_ON(!virt_addr_valid(buf));
> #endif
>     sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> }
> 
> When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> well at least it doesn't crash the guest. I haven't checked if the
> report is valid yet.
> 

Dan, do you think it is related to not allocating direct mapped memory (using
kvalloc)?
  
Peter Gonda Oct. 3, 2023, 8:06 p.m. UTC | #7
On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>
> Hi,
>
> On 10/3/2023 11:37 AM, Peter Gonda wrote:
> > On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <pgonda@google.com> wrote:
> >>
> >> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >>>
> >>> Peter Gonda wrote:
> >>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >>>>>
> >>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >>>>>
> >>>>> In TDX guest, the attestation process is used to verify the TDX guest
> >>>>> trustworthiness to other entities before provisioning secrets to the
> >>>>> guest. The first step in the attestation process is TDREPORT
> >>>>> generation, which involves getting the guest measurement data in the
> >>>>> format of TDREPORT, which is further used to validate the authenticity
> >>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
> >>>>> only be verified on the local machine.
> >>>>>
> >>>>> To support remote verification of the TDREPORT in a SGX-based
> >>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> >>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> >>>>> only run outside of the TDX guest (i.e. in a host process or in a
> >>>>> normal VM) and guest can use communication channels like vsock or
> >>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> >>>>> TDX guest may not support these communication channels. To handle such
> >>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
> >>>>> to request the host VMM to communicate with the SGX QE. More details
> >>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
> >>>>> Interface (GHCI) for Intel TDX 1.0, section titled
> >>>>> "TDG.VP.VMCALL<GetQuote>".
> >>>>>
> >>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> >>>>> Computing Guest platforms to get the measurement data via ConfigFS.
> >>>>> Extend the TSM framework and add support to allow an attestation agent
> >>>>> to get the TDX Quote data (included usage example below).
> >>>>>
> >>>>>   report=/sys/kernel/config/tsm/report/report0
> >>>>>   mkdir $report
> >>>>>   dd if=/dev/urandom bs=64 count=1 > $report/inblob
> >>>>>   hexdump -C $report/outblob
> >>>>>   rmdir $report
> >>>>>
> >>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> >>>>> with TDREPORT data as input, which is further used by the VMM to copy
> >>>>> the TD Quote result after successful Quote generation. To create the
> >>>>> shared buffer, allocate a large enough memory and mark it shared using
> >>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> >>>>> for GetQuote requests in the TDX TSM handler.
> >>>>>
> >>>>> Although this method reserves a fixed chunk of memory for GetQuote
> >>>>> requests, such one time allocation can help avoid memory fragmentation
> >>>>> related allocation failures later in the uptime of the guest.
> >>>>>
> >>>>> Since the Quote generation process is not time-critical or frequently
> >>>>> used, the current version uses a polling model for Quote requests and
> >>>>> it also does not support parallel GetQuote requests.
> >>>>>
> >>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> >>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >>>>> Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> >>>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >>>>
> >>>> Hey Dan,
> >>>>
> >>>> I tried running your test commands on an SNP enabled guest. To build
> >>>> the kernel I just checked out linus/master and applied your series. I
> >>>> haven't done any debugging yet, so I will update with what I find.
> >>>>
> >>>> root@Ubuntu2004:~#   hexdump -C $report/outblob
> >>>> [  219.871875] ------------[ cut here ]------------
> >>>> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> >>>
> >>> Ok, it does not like virtual address of one of the buffers, but my
> >>> changes "should" not have affected that as get_ext_report() internally
> >>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> >>> actual request / response memory. First test I want to try once I can
> >>> get on an SNP system is compare this to the ioctl path just make sure
> >>> that succeeds.
> >>
> >
> > I think there may be an issue with CONFIG_DEBUG_SG. That was the
> > warning we were getting in my above stack trace:
> >
> >> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> >
> > This was for this line in enc_dec_message():
> >
> >         sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> >
> > I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> > the address given in the sev_report_new() which is from the variable
> > 'ext_req' which is stack allocated?
> >
> > static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> >      unsigned int buflen)
> > {
> > #ifdef CONFIG_DEBUG_SG
> >     BUG_ON(!virt_addr_valid(buf));
> > #endif
> >     sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> > }
> >
> > When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> > well at least it doesn't crash the guest. I haven't checked if the
> > report is valid yet.
> >
>
> Dan, do you think it is related to not allocating direct mapped memory (using
> kvalloc)?

But I think the issue is the stack allocated variable 'ext_req' here:

sev_report_new()
+       void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
+       if (!buf)
+               return -ENOMEM;
+
+       guard(mutex)(&snp_cmd_mutex);
+       certs_address = buf + report_size;
+       struct snp_ext_report_req ext_req = {
+               .data = { .vmpl = desc->privlevel },
+               .certs_address = (__u64)certs_address,
+               .certs_len = ext_size,
+       };
+       memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);


>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
  
Dan Williams Oct. 4, 2023, 12:54 a.m. UTC | #8
Peter Gonda wrote:
> On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> >
> > Hi,
> >
> > On 10/3/2023 11:37 AM, Peter Gonda wrote:
> > > On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <pgonda@google.com> wrote:
> > >>
> > >> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > >>>
> > >>> Peter Gonda wrote:
> > >>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > >>>>>
> > >>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > >>>>>
> > >>>>> In TDX guest, the attestation process is used to verify the TDX guest
> > >>>>> trustworthiness to other entities before provisioning secrets to the
> > >>>>> guest. The first step in the attestation process is TDREPORT
> > >>>>> generation, which involves getting the guest measurement data in the
> > >>>>> format of TDREPORT, which is further used to validate the authenticity
> > >>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
> > >>>>> only be verified on the local machine.
> > >>>>>
> > >>>>> To support remote verification of the TDREPORT in a SGX-based
> > >>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > >>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > >>>>> only run outside of the TDX guest (i.e. in a host process or in a
> > >>>>> normal VM) and guest can use communication channels like vsock or
> > >>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > >>>>> TDX guest may not support these communication channels. To handle such
> > >>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
> > >>>>> to request the host VMM to communicate with the SGX QE. More details
> > >>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
> > >>>>> Interface (GHCI) for Intel TDX 1.0, section titled
> > >>>>> "TDG.VP.VMCALL<GetQuote>".
> > >>>>>
> > >>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > >>>>> Computing Guest platforms to get the measurement data via ConfigFS.
> > >>>>> Extend the TSM framework and add support to allow an attestation agent
> > >>>>> to get the TDX Quote data (included usage example below).
> > >>>>>
> > >>>>>   report=/sys/kernel/config/tsm/report/report0
> > >>>>>   mkdir $report
> > >>>>>   dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > >>>>>   hexdump -C $report/outblob
> > >>>>>   rmdir $report
> > >>>>>
> > >>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > >>>>> with TDREPORT data as input, which is further used by the VMM to copy
> > >>>>> the TD Quote result after successful Quote generation. To create the
> > >>>>> shared buffer, allocate a large enough memory and mark it shared using
> > >>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > >>>>> for GetQuote requests in the TDX TSM handler.
> > >>>>>
> > >>>>> Although this method reserves a fixed chunk of memory for GetQuote
> > >>>>> requests, such one time allocation can help avoid memory fragmentation
> > >>>>> related allocation failures later in the uptime of the guest.
> > >>>>>
> > >>>>> Since the Quote generation process is not time-critical or frequently
> > >>>>> used, the current version uses a polling model for Quote requests and
> > >>>>> it also does not support parallel GetQuote requests.
> > >>>>>
> > >>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > >>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > >>>>> Reviewed-by: Erdem Aktas <erdemaktas@google.com>
> > >>>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > >>>>
> > >>>> Hey Dan,
> > >>>>
> > >>>> I tried running your test commands on an SNP enabled guest. To build
> > >>>> the kernel I just checked out linus/master and applied your series. I
> > >>>> haven't done any debugging yet, so I will update with what I find.
> > >>>>
> > >>>> root@Ubuntu2004:~#   hexdump -C $report/outblob
> > >>>> [  219.871875] ------------[ cut here ]------------
> > >>>> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >>>
> > >>> Ok, it does not like virtual address of one of the buffers, but my
> > >>> changes "should" not have affected that as get_ext_report() internally
> > >>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> > >>> actual request / response memory. First test I want to try once I can
> > >>> get on an SNP system is compare this to the ioctl path just make sure
> > >>> that succeeds.
> > >>
> > >
> > > I think there may be an issue with CONFIG_DEBUG_SG. That was the
> > > warning we were getting in my above stack trace:
> > >
> > >> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >
> > > This was for this line in enc_dec_message():
> > >
> > >         sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> > >
> > > I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> > > the address given in the sev_report_new() which is from the variable
> > > 'ext_req' which is stack allocated?
> > >
> > > static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> > >      unsigned int buflen)
> > > {
> > > #ifdef CONFIG_DEBUG_SG
> > >     BUG_ON(!virt_addr_valid(buf));
> > > #endif
> > >     sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> > > }
> > >
> > > When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> > > well at least it doesn't crash the guest. I haven't checked if the
> > > report is valid yet.
> > >
> >
> > Dan, do you think it is related to not allocating direct mapped memory (using
> > kvalloc)?
> 
> But I think the issue is the stack allocated variable 'ext_req' here:
> 
> sev_report_new()
> +       void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       guard(mutex)(&snp_cmd_mutex);
> +       certs_address = buf + report_size;
> +       struct snp_ext_report_req ext_req = {
> +               .data = { .vmpl = desc->privlevel },
> +               .certs_address = (__u64)certs_address,
> +               .certs_len = ext_size,
> +       };
> +       memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);

If the failure is coming from:

	sg_set_buf(&src[1], src_buf, hdr->msg_sz);

...then that is always coming from the stack as get_ext_report()
internally copies either from the user ioctl() address or the kernel
stack into the local stack copy in both cases:

get_ext_report(...)
	...
	struct snp_ext_report_req req;
	...
        if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
                return -EFAULT;
	...
        ret = handle_guest_request(..., &req.data, ...);

...where that "&req.data" always becomes the @src_buf argument to
enc_dec_message(). So while I do understand why sg_set_buf() is
complaining, I don't understand why it is not *always* complaining,
regardless of configfs-tsm or ioctl() with CONFIG_DEBUG_SG=y builds.

I will be able to dig deeper once I can test on hardware, but I am
thinking that the entire scheme to pass the source buffer on the kernel
stack is broken and is only happening to work because there are no
crypto-accelerators attached that require that the virtual addresses be
virt_addr_valid() for a later dma_map_sg() event.

...or my eyes are overlooking how the ioctl() path is succeeding.
  
Dan Williams Oct. 10, 2023, 7:36 p.m. UTC | #9
Dan Williams wrote:
> Peter Gonda wrote:
> > On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
> > <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
[..]
> > > Dan, do you think it is related to not allocating direct mapped memory (using
> > > kvalloc)?
> > 
> > But I think the issue is the stack allocated variable 'ext_req' here:
> > 
> > sev_report_new()
> > +       void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
> > +       if (!buf)
> > +               return -ENOMEM;
> > +
> > +       guard(mutex)(&snp_cmd_mutex);
> > +       certs_address = buf + report_size;
> > +       struct snp_ext_report_req ext_req = {
> > +               .data = { .vmpl = desc->privlevel },
> > +               .certs_address = (__u64)certs_address,
> > +               .certs_len = ext_size,
> > +       };
> > +       memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);
> 
> If the failure is coming from:
> 
> 	sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> 
> ...then that is always coming from the stack as get_ext_report()
> internally copies either from the user ioctl() address or the kernel
> stack into the local stack copy in both cases:
> 
> get_ext_report(...)
> 	...
> 	struct snp_ext_report_req req;
> 	...
>         if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
>                 return -EFAULT;
> 	...
>         ret = handle_guest_request(..., &req.data, ...);
> 
> ...where that "&req.data" always becomes the @src_buf argument to
> enc_dec_message(). So while I do understand why sg_set_buf() is
> complaining, I don't understand why it is not *always* complaining,
> regardless of configfs-tsm or ioctl() with CONFIG_DEBUG_SG=y builds.
> 
> I will be able to dig deeper once I can test on hardware, but I am
> thinking that the entire scheme to pass the source buffer on the kernel
> stack is broken and is only happening to work because there are no
> crypto-accelerators attached that require that the virtual addresses be
> virt_addr_valid() for a later dma_map_sg() event.
> 
> ...or my eyes are overlooking how the ioctl() path is succeeding.

Confirmed, I can make DEBUG_SG equally unhappy via the ioctl() path.
Note I changed the BUG_ON() to WARN_ON() in this kernel to keep the
system alive over reproductions:

 WARNING: CPU: 0 PID: 1175 at include/linux/scatterlist.h:187 enc_dec_message+0x518/0x5b0 [sev_guest]
[..]
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
 RIP: 0010:enc_dec_message+0x518/0x5b0 [sev_guest]
 Call Trace:
  <TASK>
[..]
  handle_guest_request+0x135/0x520 [sev_guest]
  get_ext_report+0x1ec/0x3e0 [sev_guest]
  snp_guest_ioctl+0x157/0x200 [sev_guest]

So the required fix here will address both cases.

I will note that on this instance no certificate data is being returned,
so I can't test that path, but at least the report retrieval will be
tested for the next posting of this series.
  

Patch

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..752867b1d11b 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -104,6 +104,27 @@  int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
 }
 EXPORT_SYMBOL_GPL(tdx_mcall_get_report0);
 
+/**
+ * tdx_hcall_get_quote() - Wrapper to request TD Quote using GetQuote
+ *                         hypercall.
+ * @buf: Address of the directly mapped shared kernel buffer which
+ *       contains TDREPORT. The same buffer will be used by VMM to
+ *       store the generated TD Quote output.
+ * @size: size of the tdquote buffer (4KB-aligned).
+ *
+ * Refer to section titled "TDG.VP.VMCALL<GetQuote>" in the TDX GHCI
+ * v1.0 specification for more information on GetQuote hypercall.
+ * It is used in the TDX guest driver module to get the TD Quote.
+ *
+ * Return 0 on success or error code on failure.
+ */
+u64 tdx_hcall_get_quote(u8 *buf, size_t size)
+{
+	/* Since buf is a shared memory, set the shared (decrypted) bits */
+	return _tdx_hypercall(TDVMCALL_GET_QUOTE, cc_mkdec(virt_to_phys(buf)), size, 0, 0);
+}
+EXPORT_SYMBOL_GPL(tdx_hcall_get_quote);
+
 static void __noreturn tdx_panic(const char *msg)
 {
 	struct tdx_hypercall_args args = {
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 7513b3bb69b7..9eab19950f39 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -22,6 +22,7 @@ 
 
 /* TDX hypercall Leaf IDs */
 #define TDVMCALL_MAP_GPA		0x10001
+#define TDVMCALL_GET_QUOTE		0x10002
 #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
 
 #ifndef __ASSEMBLY__
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 603e6d1e9d4a..ebd1cda4875f 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -52,6 +52,8 @@  bool tdx_early_handle_ve(struct pt_regs *regs);
 
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
 
+u64 tdx_hcall_get_quote(u8 *buf, size_t size);
+
 #else
 
 static inline void tdx_early_init(void) { };
diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
index 14246fc2fb02..22dd59e19431 100644
--- a/drivers/virt/coco/tdx-guest/Kconfig
+++ b/drivers/virt/coco/tdx-guest/Kconfig
@@ -1,6 +1,7 @@ 
 config TDX_GUEST_DRIVER
 	tristate "TDX Guest driver"
 	depends on INTEL_TDX_GUEST
+	select TSM_REPORTS
 	help
 	  The driver provides userspace interface to communicate with
 	  the TDX module to request the TDX guest details like attestation
diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
index 5e44a0fa69bd..1253bf76b570 100644
--- a/drivers/virt/coco/tdx-guest/tdx-guest.c
+++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
@@ -12,12 +12,60 @@ 
 #include <linux/mod_devicetable.h>
 #include <linux/string.h>
 #include <linux/uaccess.h>
+#include <linux/set_memory.h>
+#include <linux/io.h>
+#include <linux/delay.h>
+#include <linux/tsm.h>
+#include <linux/sizes.h>
 
 #include <uapi/linux/tdx-guest.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/tdx.h>
 
+/*
+ * Intel's SGX QE implementation generally uses Quote size less
+ * than 8K (2K Quote data + ~5K of certificate blob).
+ */
+#define GET_QUOTE_BUF_SIZE		SZ_8K
+
+#define GET_QUOTE_CMD_VER		1
+
+/* TDX GetQuote status codes */
+#define GET_QUOTE_SUCCESS		0
+#define GET_QUOTE_IN_FLIGHT		0xffffffffffffffff
+
+/* struct tdx_quote_buf: Format of Quote request buffer.
+ * @version: Quote format version, filled by TD.
+ * @status: Status code of Quote request, filled by VMM.
+ * @in_len: Length of TDREPORT, filled by TD.
+ * @out_len: Length of Quote data, filled by VMM.
+ * @data: Quote data on output or TDREPORT on input.
+ *
+ * More details of Quote request buffer can be found in TDX
+ * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
+ * section titled "TDG.VP.VMCALL<GetQuote>"
+ */
+struct tdx_quote_buf {
+	u64 version;
+	u64 status;
+	u32 in_len;
+	u32 out_len;
+	u8 data[];
+};
+
+/* Quote data buffer */
+static void *quote_data;
+
+/* Lock to streamline quote requests */
+static DEFINE_MUTEX(quote_lock);
+
+/*
+ * GetQuote request timeout in seconds. Expect that 30 seconds
+ * is enough time for QE to respond to any Quote requests.
+ */
+static u32 getquote_timeout = 30;
+
 static long tdx_get_report0(struct tdx_report_req __user *req)
 {
 	u8 *reportdata, *tdreport;
@@ -53,6 +101,154 @@  static long tdx_get_report0(struct tdx_report_req __user *req)
 	return ret;
 }
 
+static void free_quote_buf(void *buf)
+{
+	size_t len = PAGE_ALIGN(GET_QUOTE_BUF_SIZE);
+	unsigned int count = len >> PAGE_SHIFT;
+
+	if (set_memory_encrypted((unsigned long)buf, count)) {
+		pr_err("Failed to restore encryption mask for Quote buffer, leak it\n");
+		return;
+	}
+
+	free_pages_exact(buf, len);
+}
+
+static void *alloc_quote_buf(void)
+{
+	size_t len = PAGE_ALIGN(GET_QUOTE_BUF_SIZE);
+	unsigned int count = len >> PAGE_SHIFT;
+	void *addr;
+
+	addr = alloc_pages_exact(len, GFP_KERNEL | __GFP_ZERO);
+	if (!addr)
+		return NULL;
+
+	if (set_memory_decrypted((unsigned long)addr, count)) {
+		free_pages_exact(addr, len);
+		return NULL;
+	}
+
+	return addr;
+}
+
+/*
+ * wait_for_quote_completion() - Wait for Quote request completion
+ * @quote_buf: Address of Quote buffer.
+ * @timeout: Timeout in seconds to wait for the Quote generation.
+ *
+ * As per TDX GHCI v1.0 specification, sec titled "TDG.VP.VMCALL<GetQuote>",
+ * the status field in the Quote buffer will be set to GET_QUOTE_IN_FLIGHT
+ * while VMM processes the GetQuote request, and will change it to success
+ * or error code after processing is complete. So wait till the status
+ * changes from GET_QUOTE_IN_FLIGHT or the request being timed out.
+ */
+static int wait_for_quote_completion(struct tdx_quote_buf *quote_buf, u32 timeout)
+{
+	int i = 0;
+
+	/*
+	 * Quote requests usually take a few seconds to complete, so waking up
+	 * once per second to recheck the status is fine for this use case.
+	 */
+	while (quote_buf->status == GET_QUOTE_IN_FLIGHT && i++ < timeout) {
+		if (msleep_interruptible(MSEC_PER_SEC))
+			return -EINTR;
+	}
+
+	return (i == timeout) ? -ETIMEDOUT : 0;
+}
+
+static int tdx_report_new(struct tsm_report *report, void *data)
+{
+	u8 *buf, *reportdata = NULL, *tdreport = NULL;
+	struct tdx_quote_buf *quote_buf = quote_data;
+	struct tsm_desc *desc = &report->desc;
+	int ret;
+	u64 err;
+
+	/* TODO: switch to guard(mutex_intr) */
+	if (mutex_lock_interruptible(&quote_lock))
+		return -EINTR;
+
+	/*
+	 * If the previous request is timedout or interrupted, and the
+	 * Quote buf status is still in GET_QUOTE_IN_FLIGHT (owned by
+	 * VMM), don't permit any new request.
+	 */
+	if (quote_buf->status == GET_QUOTE_IN_FLIGHT) {
+		ret = -EBUSY;
+		goto done;
+	}
+
+	if (desc->inblob_len != TDX_REPORTDATA_LEN) {
+		ret = -EINVAL;
+		goto done;
+	}
+
+	reportdata = kmalloc(TDX_REPORTDATA_LEN, GFP_KERNEL);
+	if (!reportdata) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
+	if (!tdreport) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	memcpy(reportdata, desc->inblob, desc->inblob_len);
+
+	/* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
+	ret = tdx_mcall_get_report0(reportdata, tdreport);
+	if (ret) {
+		pr_err("GetReport call failed\n");
+		goto done;
+	}
+
+	memset(quote_data, 0, GET_QUOTE_BUF_SIZE);
+
+	/* Update Quote buffer header */
+	quote_buf->version = GET_QUOTE_CMD_VER;
+	quote_buf->in_len = TDX_REPORT_LEN;
+
+	memcpy(quote_buf->data, tdreport, TDX_REPORT_LEN);
+
+	err = tdx_hcall_get_quote(quote_data, GET_QUOTE_BUF_SIZE);
+	if (err) {
+		pr_err("GetQuote hypercall failed, status:%llx\n", err);
+		ret = -EIO;
+		goto done;
+	}
+
+	ret = wait_for_quote_completion(quote_buf, getquote_timeout);
+	if (ret) {
+		pr_err("GetQuote request timedout\n");
+		goto done;
+	}
+
+	buf = kvmemdup(quote_buf->data, quote_buf->out_len, GFP_KERNEL);
+	if (!buf) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	report->outblob = buf;
+	report->outblob_len = quote_buf->out_len;
+
+	/*
+	 * TODO: parse the PEM-formatted cert chain out of the quote buffer when
+	 * provided
+	 */
+done:
+	mutex_unlock(&quote_lock);
+	kfree(reportdata);
+	kfree(tdreport);
+
+	return ret;
+}
+
 static long tdx_guest_ioctl(struct file *file, unsigned int cmd,
 			    unsigned long arg)
 {
@@ -82,17 +278,48 @@  static const struct x86_cpu_id tdx_guest_ids[] = {
 };
 MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
 
+static const struct tsm_ops tdx_tsm_ops = {
+	.name = KBUILD_MODNAME,
+	.report_new = tdx_report_new,
+};
+
 static int __init tdx_guest_init(void)
 {
+	int ret;
+
 	if (!x86_match_cpu(tdx_guest_ids))
 		return -ENODEV;
 
-	return misc_register(&tdx_misc_dev);
+	ret = misc_register(&tdx_misc_dev);
+	if (ret)
+		return ret;
+
+	quote_data = alloc_quote_buf();
+	if (!quote_data) {
+		pr_err("Failed to allocate Quote buffer\n");
+		ret = -ENOMEM;
+		goto free_misc;
+	}
+
+	ret = tsm_register(&tdx_tsm_ops, NULL, NULL);
+	if (ret)
+		goto free_quote;
+
+	return 0;
+
+free_quote:
+	free_quote_buf(quote_data);
+free_misc:
+	misc_deregister(&tdx_misc_dev);
+
+	return ret;
 }
 module_init(tdx_guest_init);
 
 static void __exit tdx_guest_exit(void)
 {
+	tsm_unregister(&tdx_tsm_ops);
+	free_quote_buf(quote_data);
 	misc_deregister(&tdx_misc_dev);
 }
 module_exit(tdx_guest_exit);