[RFC,00/48] RISC-V CoVE support

Message ID 20230419221716.3603068-1-atishp@rivosinc.com

Atish Patra April 19, 2023, 10:16 p.m. UTC
This patch series adds RISC-V Confidential VM Extension (CoVE) support to the
Linux kernel. The RISC-V CoVE specification introduces non-ISA SBI APIs. These
APIs enable a confidential environment in which a guest VM's data can be isolated
from the host while the host retains control of guest VM management and platform
resources (memory, CPU, I/O).

This is very early work in progress. We want to share it with the community to get
feedback on the overall architecture and direction. Any other feedback is welcome too.

The detailed CoVE architecture document can be found here [0]. It used to be
called AP-TEE and was recently renamed to CoVE to avoid overloading the term TEE.
The specification is in the draft stage and is subject to change based
on feedback from the community.

The CoVE specification introduces 3 new SBI extensions.
COVH - CoVE Host side interface
COVG - CoVE Guest side interface
COVI - CoVE Secure Interrupt management extension

Some key acronyms introduced:

TSM - TEE Security Manager
TVM - TEE VM (aka Confidential VM)

CoVE Architecture:
====================
The CoVE APIs are designed to be implementation and architecture agnostic,
allowing for different deployment models while retaining common host and guest
kernel code. Two examples are shown in Figure 1 and Figure 2.
As shown in both figures, the architecture introduces a new software component
called the "TEE Security Manager" (TSM) that runs in HS mode. The TSM has a minimal
hardware-attested footprint in the TCB, as it is a passive component that doesn't support
scheduling or timer interrupts. Both example deployment models provide memory
isolation between the host and the TEE VM (TVM).

        
	Non secure world       |         Secure world         |
                               |                              |
        Non                    |                              |
    Virtualized |  Virtualized |   Virtualized  Virtualized   |            
        Env     |      Env     |       Env          Env       |                
   +----------+ | +----------+ |  +----------+ +----------+   |  --------------        
   |          | | |          | |  |          | |          |   |  
   | Host Apps| | |   Apps   | |  |   Apps   | |   Apps   |   |        VU-Mode
   |  (VMM)   | | |          | |  |          | |          |   |         
   +----------+ | +----------+ |  +----------+ +----------+   |  --------------
        |       | +----------+ |  +----------+ +----------+   |                
        |       | |          | |  |          | |          |   |      
        |       | |          | |  |    TVM   | |    TVM   |   |      
        |       | |   Guest  | |  |   Guest  | |   Guest  |   |       VS-Mode
     Syscalls   | +----------+ |  +----------+ +----------+   |      
        |              |       |        |                     |
        |             SBI      |   SBI(COVG + COVI)           |   
        |              |       |        |                     |
  +--------------------------+ |  +---------------------------+  --------------
  |     Host (Linux)         | |  |       TSM (Salus)         |        
  +--------------------------+ |  +---------------------------+
             |                 |            |                       HS-Mode
     SBI (COVH + COVI)         |     SBI (COVH + COVI)            
             |                 |            |
  +-----------------------------------------------------------+  --------------
  |                    Firmware(OpenSBI) + TSM Driver         |        M-Mode
  +-----------------------------------------------------------+  --------------
 +-----------------------------------------------------------------------------
  |                    Hardware (RISC-V CPU + RoT + IOMMU)
  +---------------------------------------------------------------------------- 
 		Figure 1: Host in HS model


The deployment model shown in Figure 1 runs the host in HS mode, where it is a peer
of the TSM, which also runs in HS mode. It requires another component, known as the TSM
driver, running in a higher privilege mode than the host and the TSM. The TSM driver is
responsible for switching context between the host and the TSM. It also manages the
platform-specific hardware solution, via the confidential domain bit described in the
specification [0], to provide the required memory isolation.

 
	     Non secure world  |         Secure world
                               |
         Virtualized Env       |   Virtualized   Virtualized  |                  
             		              Env           Env       |                       
   +-------------------------+ |  +----------+  +----------+  |    ------------ 
   |          | | |          | |  |          |  |          |  |                           
   | Host Apps| | |   Apps   | |  |   Apps   |  |   Apps   |  |        VU-Mode              
   +----------+ | +----------+ |  +----------+  +----------+  |    ------------ 
        |                      |        |             |       |                          
    Syscalls             SBI   |      	|             |       |                           
        |                      |        |             |       |                           
  +--------------------------+ |  +-----------+ +-----------+ |                          
  |     Host (Linux)         | |  |  TVM Guest| |  TVM Guest| |       VS-Mode                
  +--------------------------+ |  +-----------+ +-----------+ |               
             |                 |        |             |       |               
     SBI (COVH + COVI)         |       SBI           SBI      |              
             |                 |   (COVG + COVI) (COVG + COVI)|              
	     |                 |        |             |       |              
  +-----------------------------------------------------------+    --------------
  |                    TSM(Salus)	                      |        HS-Mode
  +-----------------------------------------------------------+    --------------
 			      | 
  			     SBI
			      |
  +---------------------------------------------------------+    --------------
  |                    Firmware(OpenSBI)                  |        M-Mode
  +---------------------------------------------------------+    --------------
 +-----------------------------------------------------------------------------
  |                    Hardware (RISC-V CPU + RoT + IOMMU)
  +---------------------------------------------------------------------------- 
 			Figure 2: Host in VS model


The deployment model shown in Figure 2 simplifies context switching and memory isolation
by running the host in VS mode as a guest of the TSM. Memory isolation is thus
achieved through gstage mappings managed by the TSM, and no additional hardware confidential
domain bit is needed. The downside of this model is that the host has to run
non-confidential VMs in a nested environment, which may have lower performance (yet to be measured).
The current TSM implementation (Salus) doesn't support full nested virtualization yet.

The platform must have a RoT to provide attestation in either model.
This patch series implements the APIs defined by CoVE. The current TSM implementation
allows the host to run TVMs as shown in Figure 2. We are working on deployment
model 1 in parallel. We do not expect it to require any significant changes to either
the host-side or the guest-side ABI.

Shared memory between the host & TSM:
=====================================
To accelerate H-mode CSR/GPR access, CoVE also reuses the Nested Acceleration (NACL)
SBI extension [1]. NACL defines a per-physical-CPU shared memory area that is allocated
at boot. It allows the host running in VS mode to access H-mode CSRs/GPRs
without trapping into the TSM. The CoVE specification defines the exact
state of the shared memory, with r/w permissions, at every call.

Secure Interrupt management:
============================
The CoVE specification relies on the MSI-based interrupt scheme defined in the Advanced Interrupt
Architecture specification [2]. The COVI SBI extension adds functions to bind
a guest interrupt file to a TVM. After that, only TCB components (TSM, TVM, TSM driver)
can modify it. The host can inject an interrupt only via the TSM.
A TVM is also in complete control of which interrupts it can receive. By default,
all interrupts are denied. In this proof-of-concept implementation, the guest allows
all interrupts at boot time to keep things simple.
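
As an illustration of the default-deny policy described above, here is a minimal
user-space sketch of a per-TVM interrupt allow list kept as a bitmap. It is not the
COVI interface: every name (demo_*) and the limit of 256 interrupt identities are
assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit; not taken from the CoVE or AIA specifications. */
#define DEMO_MAX_IRQ_ID 256

static_assert(DEMO_MAX_IRQ_ID % 64 == 0, "bitmap is stored in 64-bit words");

/* One bit per interrupt identity; a zeroed map denies everything. */
struct demo_irq_allowlist {
	uint64_t bits[DEMO_MAX_IRQ_ID / 64];
};

/* Default state: all interrupts are denied. */
static void demo_allowlist_init(struct demo_irq_allowlist *al)
{
	memset(al, 0, sizeof(*al));
}

/* The guest explicitly allows a single interrupt identity. */
static void demo_allow_irq(struct demo_irq_allowlist *al, unsigned int id)
{
	if (id < DEMO_MAX_IRQ_ID)
		al->bits[id / 64] |= UINT64_C(1) << (id % 64);
}

/* What the proof of concept does at boot: allow everything. */
static void demo_allow_all(struct demo_irq_allowlist *al)
{
	memset(al, 0xff, sizeof(*al));
}

/* Check made before an injected interrupt is delivered to the guest. */
static bool demo_irq_allowed(const struct demo_irq_allowlist *al, unsigned int id)
{
	if (id >= DEMO_MAX_IRQ_ID)
		return false;
	return (al->bits[id / 64] >> (id % 64)) & 1;
}
```

Hardening the allowed list (see the TODOs) would then amount to replacing the
demo_allow_all() step with per-device demo_allow_irq() calls.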

Device I/O:
===========
To support paravirt I/O devices, the guest must use a SWIOTLB bounce buffer.
As the host cannot access confidential memory, this buffer memory
must be shared with the host via the share/unshare functions defined in the COVG SBI extension.
The RISC-V implementation achieves this by generalizing mem_encrypt_init(), similar to TDX/SEV/CCA.
For the same reason, a CoVE guest is only allowed to use virtio devices with VIRTIO_F_ACCESS_PLATFORM
and VIRTIO_F_VERSION_1, as these features force virtio drivers to use the DMA API.
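
The virtio restriction above boils down to a feature-bit check. The following
stand-alone sketch shows the idea; it is not the code from this series. The bit
numbers are the ones assigned by the virtio specification, while the DEMO_ prefixed
names and the helper function are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio specification. */
#define DEMO_VIRTIO_F_VERSION_1        32
#define DEMO_VIRTIO_F_ACCESS_PLATFORM  33

static_assert(DEMO_VIRTIO_F_ACCESS_PLATFORM < 64, "feature bits fit in a u64");

/*
 * Hypothetical helper: a confidential guest accepts a device only if it
 * offers both features, so that the driver goes through the DMA API and
 * therefore through the shared bounce buffer.
 */
static bool demo_cove_virtio_features_ok(uint64_t features)
{
	const uint64_t required =
		(UINT64_C(1) << DEMO_VIRTIO_F_VERSION_1) |
		(UINT64_C(1) << DEMO_VIRTIO_F_ACCESS_PLATFORM);

	return (features & required) == required;
}
```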

MMIO emulation:
===============
A TVM can register regions of its address space as MMIO regions to be emulated by
the host. The TSM provides explicit SBI functions, i.e. SBI_EXT_COVG_[ADD/REMOVE]_MMIO_REGION,
to request/remove MMIO regions. Any reads or writes to those MMIO regions after the
SBI_EXT_COVG_ADD_MMIO_REGION call are forwarded to the host for emulation.
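
The add/remove/forward behaviour described above can be modelled with a small
region table. This is only a user-space sketch of the bookkeeping, not the TSM or
kernel implementation; the table size, the structure layout and all demo_* names
are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_MMIO_REGIONS 8	/* hypothetical limit */

static_assert(DEMO_MAX_MMIO_REGIONS > 0, "table needs at least one slot");

struct demo_mmio_region {
	uint64_t addr;
	uint64_t len;
	bool used;
};

struct demo_mmio_table {
	struct demo_mmio_region regions[DEMO_MAX_MMIO_REGIONS];
};

/* Models the effect of an add-MMIO-region request; returns 0 on success. */
static int demo_add_mmio_region(struct demo_mmio_table *t, uint64_t addr,
				uint64_t len)
{
	if (len == 0 || addr + len < addr)	/* empty or wrapping range */
		return -1;
	for (size_t i = 0; i < DEMO_MAX_MMIO_REGIONS; i++) {
		if (!t->regions[i].used) {
			t->regions[i] = (struct demo_mmio_region){ addr, len, true };
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Models the effect of a remove-MMIO-region request; returns 0 on success. */
static int demo_remove_mmio_region(struct demo_mmio_table *t, uint64_t addr,
				   uint64_t len)
{
	for (size_t i = 0; i < DEMO_MAX_MMIO_REGIONS; i++) {
		struct demo_mmio_region *r = &t->regions[i];

		if (r->used && r->addr == addr && r->len == len) {
			r->used = false;
			return 0;
		}
	}
	return -1;	/* no such region */
}

/* True if a guest access at addr would be forwarded to the host. */
static bool demo_access_is_forwarded(const struct demo_mmio_table *t,
				     uint64_t addr)
{
	for (size_t i = 0; i < DEMO_MAX_MMIO_REGIONS; i++) {
		const struct demo_mmio_region *r = &t->regions[i];

		if (r->used && addr >= r->addr && addr - r->addr < r->len)
			return true;
	}
	return false;
}
```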

This series allows any ioremapped memory to be emulated as an MMIO region with
the above APIs via arch hooks inspired by the pKVM work. We are aware that this model
doesn't address all the threat vectors. We have also implemented the device
filtering/authorization approach adopted by TDX[4]. However, those patches are not
part of this series as the base TDX patches are still under active development.
RISC-V CoVE will also adopt the revamped device filtering work once it is accepted
by the Linux community.

Direct assignment of devices is a work in progress and will be added in the future[4].

VMM support:
============
This series has only been tested with kvmtool. Support for other VMMs (qemu-kvm, crosvm/rust-vmm)
will be added later.

Test cases:
===========
We are working on KVM selftests for CoVE and will post them as soon as they are ready.
We haven't started any work on kvm-unit-tests as RISC-V doesn't have the basic infrastructure
to support them. Once the kvm-unit-tests infrastructure is in place, we will add
support for CoVE as well.

Open design questions:
======================

1. The current implementation has two separate configs, one for the guest (CONFIG_RISCV_COVE_GUEST)
and one for the host (CONFIG_RISCV_COVE_HOST). The default defconfig will enable both so that
the same unified image works as both host and guest. Most distros will likely prefer this
to minimize the maintenance burden, but some may want a minimal CoVE guest image
that has only hardened drivers. In addition, Android runs a microdroid instance
in confidential guests. Separate configs will help in those cases. Please let us
know if there is any concern with having two configs.

2. Lazy gstage page allocation vs upfront allocation with page pool.
Currently, all gstage mappings happen at runtime during the fault. This is expensive
as we need to convert that page to confidential memory as well. A page pool framework
that holds confidential pages pre-allocated upfront may be a better choice.
Would a generic page pool infrastructure benefit other CC solutions?

3. In order to allow both confidential and non-confidential VMs, the series
uses regular branching instead of static branches for CoVE VM specific cases
throughout KVM. That may cause a few more branch penalties while running regular VMs.
The alternative is to use function pointers for any function that needs to
take a different path. As per my understanding, that would be worse than branches.
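
To make design question 2 more concrete, here is a minimal user-space sketch of
the page pool idea: pages are converted to confidential memory once, when the pool
is filled, so the fault path only hands out an already-converted page. Everything
here is hypothetical (the demo_* names, the fixed pool size, and the stand-in for
the conversion call); it only illustrates where the conversion cost moves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_POOL_PAGES 4	/* hypothetical pool size */

static_assert(DEMO_POOL_PAGES > 0, "pool needs at least one page");

struct demo_page {
	bool confidential;
};

struct demo_page_pool {
	struct demo_page pages[DEMO_POOL_PAGES];
	size_t free_count;
	unsigned long conversions;	/* counts the expensive operation */
};

/* Stand-in for the expensive conversion of a page to confidential memory. */
static void demo_convert_page(struct demo_page_pool *pool, struct demo_page *pg)
{
	pg->confidential = true;
	pool->conversions++;
}

/* Upfront work: convert every page once, outside the fault path. */
static void demo_pool_fill(struct demo_page_pool *pool)
{
	pool->conversions = 0;
	for (size_t i = 0; i < DEMO_POOL_PAGES; i++)
		demo_convert_page(pool, &pool->pages[i]);
	pool->free_count = DEMO_POOL_PAGES;
}

/* Fault path: return a pre-converted page, or NULL when the pool is empty. */
static struct demo_page *demo_pool_get(struct demo_page_pool *pool)
{
	if (pool->free_count == 0)
		return NULL;
	return &pool->pages[--pool->free_count];
}
```

With lazy allocation, the conversion would instead happen inside the equivalent
of demo_pool_get(), once per fault.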

Patch organization:
===================
This series depends on quite a few RISC-V patches that are not upstream yet. 
Here are the dependencies.

1. RISC-V IPI improvement series
2. RISC-V AIA support series.
3. RISC-V NACL support series

In this series, PATCH [0-5] are generic improvement and cleanup patches which
can be merged independently.

PATCH [6-26, 34-37] add host-side support for CoVE.
PATCH [27-33] add the interrupt-related changes.
PATCH [34-49] add the guest-side changes for CoVE.

The TSM project is written in Rust and can be found here:
https://github.com/rivosinc/salus

Running the stack
=================

To run/test the stack, you need the following components:

1) QEMU
2) Common Host & Guest Kernel
3) kvmtool
4) Host RootFS with KVMTOOL and Guest Kernel
5) Salus

The detailed steps are available at [6].

The Linux kernel patches are also available at [7] and the kvmtool patches
are available at [8].

TODOs
=====
As this is very early work, the todo list is quite long :).
Here are some of the items (in no specific order):

1. Support the fd-based private memory interface proposed in
   https://lkml.org/lkml/2022/1/18/395
2. Align with updated guest runtime device filtering approach.
3. IOMMU integration
4. Dedicated device assignment via TDISP & SPDM[4]
5. Support huge pages
6. Page pool allocator to avoid convert/reclaim at every fault
7. Other VMM support (qemu-kvm, crosvm)
8. Complete the PoC for the deployment model 1 where host runs in HS mode
9. Attestation integration
10. Harden the interrupt allowed list
11. kvm self-tests support for CoVE
12. kvm unit-tests support for CoVE
13. Guest hardening
14. Port pKVM on RISC-V using CoVE
15. Any other ?

Links
============
[0] CoVE architecture Specification.
    https://github.com/riscv-non-isa/riscv-ap-tee/blob/main/specification/riscv-aptee-spec.pdf
[1] https://lists.riscv.org/g/sig-hypervisors/message/260 
[2] https://github.com/riscv/riscv-aia/releases/download/1.0-RC2/riscv-interrupts-1.0-RC2.pdf 
[3] https://github.com/rivosinc/linux/tree/cove_integration_device_filtering1
[4] https://github.com/intel/tdx/commits/guest-filter-upstream 
[5] https://lists.riscv.org/g/tech-ap-tee/message/83
[6] https://github.com/rivosinc/cove/wiki/CoVE-KVM-RISCV64-on-QEMU
[7] https://github.com/rivosinc/linux/commits/cove-integration
[8] https://github.com/rivosinc/kvmtool/tree/cove-integration-03072023

Atish Patra (33):
RISC-V: KVM: Improve KVM error reporting to the user space
RISC-V: KVM: Invoke aia_update with preempt disabled/irq enabled
RISC-V: KVM: Add a helper function to get pgd size
RISC-V: Add COVH SBI extensions definitions
RISC-V: KVM: Implement COVH SBI extension
RISC-V: KVM: Add a barebone CoVE implementation
RISC-V: KVM: Add UABI to support static memory region attestation
RISC-V: KVM: Add CoVE related nacl helpers
RISC-V: KVM: Implement static memory region measurement
RISC-V: KVM: Use the new VM IOCTL for measuring pages
RISC-V: KVM: Exit to the user space for trap redirection
RISC-V: KVM: Return early for gstage modifications
RISC-V: KVM: Skip dirty logging updates for TVM
RISC-V: KVM: Add a helper function to trigger fence ops
RISC-V: KVM: Skip most VCPU requests for TVMs
RISC-V : KVM: Skip vmid/hgatp management for TVMs
RISC-V: KVM: Skip TLB management for TVMs
RISC-V: KVM: Register memory regions as confidential for TVMs
RISC-V: KVM: Add gstage mapping for TVMs
RISC-V: KVM: Handle SBI call forward from the TSM
RISC-V: KVM: Implement vcpu load/put functions for CoVE guests
RISC-V: KVM: Wireup TVM world switch
RISC-V: KVM: Skip HVIP update for TVMs
RISC-V: KVM: Implement COVI SBI extension
RISC-V: KVM: Add interrupt management functions for TVM
RISC-V: KVM: Skip AIA CSR updates for TVMs
RISC-V: KVM: Perform limited operations in hardware enable/disable
RISC-V: KVM: Indicate no support user space emulated IRQCHIP
RISC-V: KVM: Add AIA support for TVMs
RISC-V: KVM: Hookup TVM VCPU init/destroy
RISC-V: KVM: Initialize CoVE
RISC-V: KVM: Add TVM init/destroy calls
drivers/hvc: sbi: Disable HVC console for TVMs

Rajnesh Kanwal (15):
mm/vmalloc: Introduce arch hooks to notify ioremap/unmap changes
RISC-V: KVM: Update timer functionality for TVMs.
RISC-V: Add COVI extension definitions
RISC-V: KVM: Read/write gprs from/to shmem in case of TVM VCPU.
RISC-V: Add COVG SBI extension definitions
RISC-V: Add CoVE guest config and helper functions
RISC-V: Implement COVG SBI extension
RISC-V: COVE: Add COVH invalidate, validate, promote, demote and remove APIs.
RISC-V: KVM: Add host side support to handle COVG SBI calls.
RISC-V: Allow host to inject any ext interrupt id to a CoVE guest.
RISC-V: Add base memory encryption functions.
RISC-V: Add cc_platform_has() for RISC-V for CoVE
RISC-V: ioremap: Implement for arch specific ioremap hooks
riscv/virtio: Have CoVE guests enforce restricted virtio memory access.
RISC-V: Add shared bounce buffer to support DBCN for CoVE Guest.

arch/riscv/Kbuild                       |    2 +
arch/riscv/Kconfig                      |   27 +
arch/riscv/cove/Makefile                |    2 +
arch/riscv/cove/core.c                  |   40 +
arch/riscv/cove/cove_guest_sbi.c        |  109 +++
arch/riscv/include/asm/cove.h           |   27 +
arch/riscv/include/asm/covg_sbi.h       |   38 +
arch/riscv/include/asm/csr.h            |    2 +
arch/riscv/include/asm/kvm_cove.h       |  206 +++++
arch/riscv/include/asm/kvm_cove_sbi.h   |  101 +++
arch/riscv/include/asm/kvm_host.h       |   10 +-
arch/riscv/include/asm/kvm_vcpu_sbi.h   |    3 +
arch/riscv/include/asm/mem_encrypt.h    |   26 +
arch/riscv/include/asm/sbi.h            |  107 +++
arch/riscv/include/uapi/asm/kvm.h       |   17 +
arch/riscv/kernel/irq.c                 |   12 +
arch/riscv/kernel/setup.c               |    2 +
arch/riscv/kvm/Makefile                 |    1 +
arch/riscv/kvm/aia.c                    |  101 ++-
arch/riscv/kvm/aia_device.c             |   41 +-
arch/riscv/kvm/aia_imsic.c              |  127 ++-
arch/riscv/kvm/cove.c                   | 1005 +++++++++++++++++++++++
arch/riscv/kvm/cove_sbi.c               |  490 +++++++++++
arch/riscv/kvm/main.c                   |   30 +-
arch/riscv/kvm/mmu.c                    |   45 +-
arch/riscv/kvm/tlb.c                    |   11 +-
arch/riscv/kvm/vcpu.c                   |   69 +-
arch/riscv/kvm/vcpu_exit.c              |   34 +-
arch/riscv/kvm/vcpu_insn.c              |  115 ++-
arch/riscv/kvm/vcpu_sbi.c               |   16 +
arch/riscv/kvm/vcpu_sbi_covg.c          |  232 ++++++
arch/riscv/kvm/vcpu_timer.c             |   26 +-
arch/riscv/kvm/vm.c                     |   34 +-
arch/riscv/kvm/vmid.c                   |   17 +-
arch/riscv/mm/Makefile                  |    3 +
arch/riscv/mm/init.c                    |   17 +-
arch/riscv/mm/ioremap.c                 |   45 +
arch/riscv/mm/mem_encrypt.c             |   61 ++
drivers/tty/hvc/hvc_riscv_sbi.c         |    5 +
drivers/tty/serial/earlycon-riscv-sbi.c |   51 +-
include/uapi/linux/kvm.h                |    8 +
mm/vmalloc.c                            |   16 +
42 files changed, 3222 insertions(+), 109 deletions(-)
create mode 100644 arch/riscv/cove/Makefile
create mode 100644 arch/riscv/cove/core.c
create mode 100644 arch/riscv/cove/cove_guest_sbi.c
create mode 100644 arch/riscv/include/asm/cove.h
create mode 100644 arch/riscv/include/asm/covg_sbi.h
create mode 100644 arch/riscv/include/asm/kvm_cove.h
create mode 100644 arch/riscv/include/asm/kvm_cove_sbi.h
create mode 100644 arch/riscv/include/asm/mem_encrypt.h
create mode 100644 arch/riscv/kvm/cove.c
create mode 100644 arch/riscv/kvm/cove_sbi.c
create mode 100644 arch/riscv/kvm/vcpu_sbi_covg.c
create mode 100644 arch/riscv/mm/ioremap.c
create mode 100644 arch/riscv/mm/mem_encrypt.c

--
2.25.1
  

Comments

Atish Patra April 19, 2023, 10:58 p.m. UTC | #1
On Thu, Apr 20, 2023 at 3:47 AM Atish Patra <atishp@rivosinc.com> wrote:
>
> This patch series adds the RISC-V Confidential VM Extension (CoVE) support to
> Linux kernel. The RISC-V CoVE specification introduces non-ISA, SBI APIs. These
> APIs enable a confidential environment in which a guest VM's data can be isolated
> from the host while the host retains control of guest VM management and platform
> resources(memory, CPU, I/O).
>
> This is a very early WIP work. We want to share this with the community to get any
> feedback on overall architecture and direction. Any other feedback is welcome too.
>
> The detailed CoVE architecture document can be found here [0]. It used to be
> called AP-TEE and renamed to CoVE recently to avoid overloading term of TEE in
> general. The specification is in the draft stages and is subjected to change based
> on the feedback from the community.
>
> The CoVE specification introduces 3 new SBI extensions.
> COVH - CoVE Host side interface
> COVG - CoVE Guest side interface
> COVI - CoVE Secure Interrupt management extension
>
> Some key acronyms introduced:
>
> TSM - TEE Security Manager
> TVM - TEE VM (aka Confidential VM)
>
> CoVE Architecture:
> ====================
> The CoVE APIs are designed to be implementation and architecture agnostic,
> allowing for different deployment models while retaining common host and guest
> kernel code. Two examples are shown in Figure 1 and Figure 2.
> As shown in both figures, the architecture introduces a new software component
> called the "TEE Security Manager" (TSM) that runs in HS mode. The TSM has minimal
> hw attested footprint on TCB as it is a passive component that doesn't support
> scheduling or timer interrupts. Both example deployment models provide memory
> isolation between the host and the TEE VM (TVM).
>
>
>         Non secure world       |         Secure world         |
>                                |                              |
>         Non                    |                              |
>     Virtualized |  Virtualized |   Virtualized  Virtualized   |
>         Env     |      Env     |       Env          Env       |
>    +----------+ | +----------+ |  +----------+ +----------+   |  --------------
>    |          | | |          | |  |          | |          |   |
>    | Host Apps| | |   Apps   | |  |   Apps   | |   Apps   |   |        VU-Mode
>    |  (VMM)   | | |          | |  |          | |          |   |
>    +----------+ | +----------+ |  +----------+ +----------+   |  --------------
>         |       | +----------+ |  +----------+ +----------+   |
>         |       | |          | |  |          | |          |   |
>         |       | |          | |  |    TVM   | |    TVM   |   |
>         |       | |   Guest  | |  |   Guest  | |   Guest  |   |       VS-Mode
>      Syscalls   | +----------+ |  +----------+ +----------+   |
>         |              |       |        |                     |
>         |             SBI      |   SBI(COVG + COVI)           |
>         |              |       |        |                     |
>   +--------------------------+ |  +---------------------------+  --------------
>   |     Host (Linux)         | |  |       TSM (Salus)         |
>   +--------------------------+ |  +---------------------------+
>              |                 |            |                       HS-Mode
>      SBI (COVH + COVI)         |     SBI (COVH + COVI)
>              |                 |            |
>   +-----------------------------------------------------------+  --------------
>   |                    Firmware(OpenSBI) + TSM Driver         |        M-Mode
>   +-----------------------------------------------------------+  --------------
>  +-----------------------------------------------------------------------------
>   |                    Hardware (RISC-V CPU + RoT + IOMMU)
>   +----------------------------------------------------------------------------
>                 Figure 1: Host in HS model
>
>
> The deployment model shown in Figure 1 runs the host in HS mode where it is peer
> to the TSM which also runs in HS mode. It requires another component known as TSM
> Driver running in higher privilege mode than host/TSM. It is responsible for switching
> the context between the host and the TSM. TSM driver also manages the platform
> specific hardware solution via confidential domain bit as described in the specification[0]
> to provide the required memory isolation.
>
>
>              Non secure world  |         Secure world
>                                |
>          Virtualized Env       |   Virtualized   Virtualized  |
>                                       Env           Env       |
>    +-------------------------+ |  +----------+  +----------+  |    ------------
>    |          | | |          | |  |          |  |          |  |
>    | Host Apps| | |   Apps   | |  |   Apps   |  |   Apps   |  |        VU-Mode
>    +----------+ | +----------+ |  +----------+  +----------+  |    ------------
>         |                      |        |             |       |
>     Syscalls             SBI   |        |             |       |
>         |                      |        |             |       |
>   +--------------------------+ |  +-----------+ +-----------+ |
>   |     Host (Linux)         | |  |  TVM Guest| |  TVM Guest| |       VS-Mode
>   +--------------------------+ |  +-----------+ +-----------+ |
>              |                 |        |             |       |
>      SBI (COVH + COVI)         |       SBI           SBI      |
>              |                 |   (COVG + COVI) (COVG + COVI)|
>              |                 |        |             |       |
>   +-----------------------------------------------------------+    --------------
>   |                    TSM(Salus)                             |        HS-Mode
>   +-----------------------------------------------------------+    --------------
>                               |
>                              SBI
>                               |
>   +---------------------------------------------------------+    --------------
>   |                    Firmware(OpenSBI)                  |        M-Mode
>   +---------------------------------------------------------+    --------------
>  +-----------------------------------------------------------------------------
>   |                    Hardware (RISC-V CPU + RoT + IOMMU)
>   +----------------------------------------------------------------------------
>                         Figure 2: Host in VS model
>
>
> The deployment model shown in Figure 2 simplifies the context switch and memory isolation
> by running the host in VS mode as a guest of TSM. Thus, the memory isolation is
> achieved by gstage mapping by the TSM. We don't need any additional hardware confidential
> domain bit to provide memory isolation. The downside of this model the host has to run the
> non-confidential VMs in nested environment which may have lower performance (yet to be measured).
> The current implementation Salus(TSM) doesn't support full nested virtualization yet.
>
> The platform must have a RoT to provide attestation in either model.
> This patch series implements the APIs defined by CoVE. The current TSM implementation
> allows the host to run TVMs as shown in figure 2. We are working on deployment
> model 1 in parallel. We do not expect any significant changes in either host/guest side
> ABI due to that.
>
> Shared memory between the host & TSM:
> =====================================
> To accelerate the H-mode CSR/GPR access, CoVE also reuses the Nested Acceleration (NACL)
> SBI extension[1]. NACL defines a per physical cpu shared memory area that is allocated
> at the boot. It allows the host running in VS mode to access H-mode CSR/GPR easily
> without trapping into the TSM. The CoVE specification clearly defines the exact
> state of the shared memory with r/w permissions at every call.
>
> Secure Interrupt management:
> ===========================
> The CoVE specification relies on the MSI based interrupt scheme defined in Advanced Interrupt
> Architecture specification[2]. The COVI SBI extension adds functions to bind
> a guest interrupt file to a TVMs. After that, only TCB components (TSM, TVM, TSM driver)
> can modify that. The host can inject an interrupt via TSM only.
> The TVMs are also in complete control of which interrupts it can receive. By default,
> all interrupts are denied. In this proof-of-concept implementation, all the interrupts
> are allowed by the guest at boot time to keep it simple.
>
> Device I/O:
> ===========
> In order to support paravirt I/O devices, SWIOTLB bounce buffer must be used by the
> guest. As the host can not access confidential memory, this buffer memory
> must be shared with the host via share/unshare functions defined in COVG SBI extension.
> RISC-V implementation achieves this generalizing mem_encrypt_init() similar to TDX/SEV/CCA.
> That's why, the CoVE Guest is only allowed to use virtio devices with VIRTIO_F_ACCESS_PLATFORM
> and VIRTIO_F_VERSION_1 as they force virtio drivers to use the DMA API.
>
> MMIO emulation:
> ======================
> TVM can register regions of address space as MMIO regions to be emulated by
> the host. TSM provides explicit SBI functions i.e. SBI_EXT_COVG_[ADD/REMOVE]_MMIO_REGION
> to request/remove MMIO regions. Any reads or writes to those MMIO region after
> SBI_EXT_COVG_ADD_MMIO_REGION call are forwarded to the host for emulation.
>
> This series allows any ioremapped memory to be emulated as MMIO region with
> above APIs via arch hookups inspired from pKVM work. We are aware that this model
> doesn't address all the threat vectors. We have also implemented the device
> filtering/authorization approach adopted by TDX[4]. However, those patches are not
> part of this series as the base TDX patches are still under active development.
> RISC-V CoVE will also adapt the revamped device filtering work once it is accepted
> by the Linux community in the future.
>
> The direct assignment of devices are a work in progress and will be added in the future[4].
>
> VMM support:
> ============
> This series is only tested with kvmtool support. Other VMM support (qemu-kvm, crossvm/rust-vmm)
> will be added later.
>
> Test cases:
> ===========
> We are working on kvm selftest for CoVE. We will post them as soon as they are ready.
> We haven't started any work on kvm unit-tests as RISC-V doesn't have basic infrastructure
> to support that. Once the kvm uni-test infrastructure is in place, we will add
> support for CoVE as well.
>
> Open design questions:
> ======================
>
> 1. The current implementation has two separate configs for guest(CONFIG_RISCV_COVE_GUEST)
> and the host (RISCV_COVE_HOST). The default defconfig will enable both so that
> same unified image works as both host & guest. Most likely distro prefer this way
> to minimize the maintenance burden but some may want a minimal CoVE guest image
> that has only hardened drivers. In addition to that, Android runs a microdroid instance
> in the confidential guests. A separate config will help in those case. Please let us
> know if there is any concern with two configs.
>
> 2. Lazy gstage page allocation vs upfront allocation with page pool.
> Currently, all gstage mappings happen at runtime during the fault. This is expensive
> as we need to convert that page to confidential memory as well. A page pool framework
> may be a better choice which can hold all the confidential pages which can be
> pre-allocated upfront. A generic page pool infrastructure may benefit other CC solutions ?
>
> 3. In order to allow both confidential VM and non-confidential VM, the series
> uses regular branching instead of static branches for CoVE VM specific cases through
> out KVM. That may cause a few more branch penalties while running regular VMs.
> The alternate option is to use function pointers for any function that needs to
> take a different path. As per my understanding, that would be worse than branches.
>
> Patch organization:
> ===================
> This series depends on quite a few RISC-V patches that are not upstream yet.
> Here are the dependencies.
>
> 1. RISC-V IPI improvement series
> 2. RISC-V AIA support series.
> 3. RISC-V NACL support series
>
> In this series, PATCH [0-5] are generic improvement and cleanup patches which
> can be merged independently.
>
> PATCH [6-26, 34-37] add the host side changes for CoVE.
> PATCH [27-33] add the interrupt related changes.
> PATCH [34-49] add the guest side changes for CoVE.
>
> The TSM project is written in Rust and can be found here:
> https://github.com/rivosinc/salus
>
> Running the stack
> ====================
>
> To run/test the stack, you would need the following components:
>
> 1) Qemu
> 2) Common Host & Guest Kernel
> 3) kvmtool
> 4) Host RootFS with KVMTOOL and Guest Kernel
> 5) Salus
>
> The detailed steps are available at [6].
>
> The Linux kernel patches are also available at [7] and the kvmtool patches
> are available at [8].
>
> TODOs
> =======
> As this is a very early work, the todo list is quite long :).
> Here are some of them (not in any specific order)
>
> 1. Support fd based private memory interface proposed in
>    https://lkml.org/lkml/2022/1/18/395
> 2. Align with updated guest runtime device filtering approach.
> 3. IOMMU integration
> 4. Dedicated device assignment via TDISP & SPDM[4]
> 5. Support huge pages
> 6. Page pool allocator to avoid convert/reclaim at every fault
> 7. Other VMM support (qemu-kvm, crosvm)
> 8. Complete the PoC for the deployment model 1 where host runs in HS mode
> 9. Attestation integration
> 10. Harden the interrupt allowed list
> 11. kvm self-tests support for CoVE
> 12. kvm unit-tests support for CoVE
> 13. Guest hardening
> 14. Port pKVM on RISC-V using CoVE
> 15. Any other?
>
> Links
> ============
> [0] CoVE architecture Specification.
>     https://github.com/riscv-non-isa/riscv-ap-tee/blob/main/specification/riscv-aptee-spec.pdf

I just noticed that this link is broken due to a recent PR merge. Here
is the updated link:
https://github.com/riscv-non-isa/riscv-ap-tee/blob/main/specification/riscv-cove.pdf

Sorry for the noise.

> [1] https://lists.riscv.org/g/sig-hypervisors/message/260
> [2] https://github.com/riscv/riscv-aia/releases/download/1.0-RC2/riscv-interrupts-1.0-RC2.pdf
> [3] https://github.com/rivosinc/linux/tree/cove_integration_device_filtering1
> [4] https://github.com/intel/tdx/commits/guest-filter-upstream
> [5] https://lists.riscv.org/g/tech-ap-tee/message/83
> [6] https://github.com/rivosinc/cove/wiki/CoVE-KVM-RISCV64-on-QEMU
> [7] https://github.com/rivosinc/linux/commits/cove-integration
> [8] https://github.com/rivosinc/kvmtool/tree/cove-integration-03072023
>
> Atish Patra (33):
> RISC-V: KVM: Improve KVM error reporting to the user space
> RISC-V: KVM: Invoke aia_update with preempt disabled/irq enabled
> RISC-V: KVM: Add a helper function to get pgd size
> RISC-V: Add COVH SBI extensions definitions
> RISC-V: KVM: Implement COVH SBI extension
> RISC-V: KVM: Add a barebone CoVE implementation
> RISC-V: KVM: Add UABI to support static memory region attestation
> RISC-V: KVM: Add CoVE related nacl helpers
> RISC-V: KVM: Implement static memory region measurement
> RISC-V: KVM: Use the new VM IOCTL for measuring pages
> RISC-V: KVM: Exit to the user space for trap redirection
> RISC-V: KVM: Return early for gstage modifications
> RISC-V: KVM: Skip dirty logging updates for TVM
> RISC-V: KVM: Add a helper function to trigger fence ops
> RISC-V: KVM: Skip most VCPU requests for TVMs
> RISC-V : KVM: Skip vmid/hgatp management for TVMs
> RISC-V: KVM: Skip TLB management for TVMs
> RISC-V: KVM: Register memory regions as confidential for TVMs
> RISC-V: KVM: Add gstage mapping for TVMs
> RISC-V: KVM: Handle SBI call forward from the TSM
> RISC-V: KVM: Implement vcpu load/put functions for CoVE guests
> RISC-V: KVM: Wireup TVM world switch
> RISC-V: KVM: Skip HVIP update for TVMs
> RISC-V: KVM: Implement COVI SBI extension
> RISC-V: KVM: Add interrupt management functions for TVM
> RISC-V: KVM: Skip AIA CSR updates for TVMs
> RISC-V: KVM: Perform limited operations in hardware enable/disable
> RISC-V: KVM: Indicate no support user space emulated IRQCHIP
> RISC-V: KVM: Add AIA support for TVMs
> RISC-V: KVM: Hookup TVM VCPU init/destroy
> RISC-V: KVM: Initialize CoVE
> RISC-V: KVM: Add TVM init/destroy calls
> drivers/hvc: sbi: Disable HVC console for TVMs
>
> Rajnesh Kanwal (15):
> mm/vmalloc: Introduce arch hooks to notify ioremap/unmap changes
> RISC-V: KVM: Update timer functionality for TVMs.
> RISC-V: Add COVI extension definitions
> RISC-V: KVM: Read/write gprs from/to shmem in case of TVM VCPU.
> RISC-V: Add COVG SBI extension definitions
> RISC-V: Add CoVE guest config and helper functions
> RISC-V: Implement COVG SBI extension
> RISC-V: COVE: Add COVH invalidate, validate, promote, demote and
> remove APIs.
> RISC-V: KVM: Add host side support to handle COVG SBI calls.
> RISC-V: Allow host to inject any ext interrupt id to a CoVE guest.
> RISC-V: Add base memory encryption functions.
> RISC-V: Add cc_platform_has() for RISC-V for CoVE
> RISC-V: ioremap: Implement for arch specific ioremap hooks
> riscv/virtio: Have CoVE guests enforce restricted virtio memory
> access.
> RISC-V: Add shared bounce buffer to support DBCN for CoVE Guest.
>
> arch/riscv/Kbuild                       |    2 +
> arch/riscv/Kconfig                      |   27 +
> arch/riscv/cove/Makefile                |    2 +
> arch/riscv/cove/core.c                  |   40 +
> arch/riscv/cove/cove_guest_sbi.c        |  109 +++
> arch/riscv/include/asm/cove.h           |   27 +
> arch/riscv/include/asm/covg_sbi.h       |   38 +
> arch/riscv/include/asm/csr.h            |    2 +
> arch/riscv/include/asm/kvm_cove.h       |  206 +++++
> arch/riscv/include/asm/kvm_cove_sbi.h   |  101 +++
> arch/riscv/include/asm/kvm_host.h       |   10 +-
> arch/riscv/include/asm/kvm_vcpu_sbi.h   |    3 +
> arch/riscv/include/asm/mem_encrypt.h    |   26 +
> arch/riscv/include/asm/sbi.h            |  107 +++
> arch/riscv/include/uapi/asm/kvm.h       |   17 +
> arch/riscv/kernel/irq.c                 |   12 +
> arch/riscv/kernel/setup.c               |    2 +
> arch/riscv/kvm/Makefile                 |    1 +
> arch/riscv/kvm/aia.c                    |  101 ++-
> arch/riscv/kvm/aia_device.c             |   41 +-
> arch/riscv/kvm/aia_imsic.c              |  127 ++-
> arch/riscv/kvm/cove.c                   | 1005 +++++++++++++++++++++++
> arch/riscv/kvm/cove_sbi.c               |  490 +++++++++++
> arch/riscv/kvm/main.c                   |   30 +-
> arch/riscv/kvm/mmu.c                    |   45 +-
> arch/riscv/kvm/tlb.c                    |   11 +-
> arch/riscv/kvm/vcpu.c                   |   69 +-
> arch/riscv/kvm/vcpu_exit.c              |   34 +-
> arch/riscv/kvm/vcpu_insn.c              |  115 ++-
> arch/riscv/kvm/vcpu_sbi.c               |   16 +
> arch/riscv/kvm/vcpu_sbi_covg.c          |  232 ++++++
> arch/riscv/kvm/vcpu_timer.c             |   26 +-
> arch/riscv/kvm/vm.c                     |   34 +-
> arch/riscv/kvm/vmid.c                   |   17 +-
> arch/riscv/mm/Makefile                  |    3 +
> arch/riscv/mm/init.c                    |   17 +-
> arch/riscv/mm/ioremap.c                 |   45 +
> arch/riscv/mm/mem_encrypt.c             |   61 ++
> drivers/tty/hvc/hvc_riscv_sbi.c         |    5 +
> drivers/tty/serial/earlycon-riscv-sbi.c |   51 +-
> include/uapi/linux/kvm.h                |    8 +
> mm/vmalloc.c                            |   16 +
> 42 files changed, 3222 insertions(+), 109 deletions(-)
> create mode 100644 arch/riscv/cove/Makefile
> create mode 100644 arch/riscv/cove/core.c
> create mode 100644 arch/riscv/cove/cove_guest_sbi.c
> create mode 100644 arch/riscv/include/asm/cove.h
> create mode 100644 arch/riscv/include/asm/covg_sbi.h
> create mode 100644 arch/riscv/include/asm/kvm_cove.h
> create mode 100644 arch/riscv/include/asm/kvm_cove_sbi.h
> create mode 100644 arch/riscv/include/asm/mem_encrypt.h
> create mode 100644 arch/riscv/kvm/cove.c
> create mode 100644 arch/riscv/kvm/cove_sbi.c
> create mode 100644 arch/riscv/kvm/vcpu_sbi_covg.c
> create mode 100644 arch/riscv/mm/ioremap.c
> create mode 100644 arch/riscv/mm/mem_encrypt.c
>
> --
> 2.25.1
>
  
Sean Christopherson April 20, 2023, 4:30 p.m. UTC | #2
On Wed, Apr 19, 2023, Atish Patra wrote:
> 2. Lazy gstage page allocation vs upfront allocation with page pool.
> Currently, all gstage mappings happen at runtime during the fault. This is expensive
> as we need to convert that page to confidential memory as well. A page pool framework
> may be a better choice which can hold all the confidential pages which can be
> pre-allocated upfront. A generic page pool infrastructure may benefit other CC solutions ?

I'm sorry, what?  Do y'all really not pay any attention to what is happening
outside of the RISC-V world?

We, where "we" is KVM x86 and ARM, with folks contributing from 5+ companies,
have been working on this problem for going on three *years*.  And that's just
from the first public posting[1], there have been discussions about how to approach
this for even longer.  There have been multiple related presentations at KVM Forum,
something like 4 or 5 just at KVM Forum 2022 alone.

Patch 1 says "This patch is based on pkvm patches", so clearly you are at least
aware that there is other work going on in this space.

At a very quick glance, this series suffers from all of the same flaws that SNP,
TDX, and pKVM have encountered.  E.g. assuming guest memory is backed by struct page
memory, relying on pinning to solve all problems (hint, it doesn't), and so on and
so forth.

And to make things worse, this series is riddled with bugs.  E.g. patch 19 alone
manages to squeeze in multiple fatal bugs in five new lines of code: deadlock due
to not releasing mmap_lock on failure, failure to correctly handle MOVE, failure to
handle DELETE at all, failure to honor (or reject) READONLY, and probably several
others.

diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 4b0f09e..63889d9 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -499,6 +499,11 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,

        mmap_read_lock(current->mm);

+       if (is_cove_vm(kvm)) {
+               ret = kvm_riscv_cove_vm_add_memreg(kvm, base_gpa, size);
+               if (ret)
+                       return ret;
+       }
        /*
         * A memory region could potentially cover multiple VMAs, and
         * any holes between them, so iterate over all of them to find
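
The mmap_lock problem in the hunk above can be modelled in user space: once
the read lock is taken, every failure path has to go through the unlock
rather than returning directly. This is only a sketch under stated
assumptions. struct mock_lock, mock_add_memreg() and prepare_region() are
hypothetical stand-ins, not kernel code, and the single exit label is an
assumed shape for the fix rather than the actual revised patch.

```c
#include <errno.h>

/* Hypothetical stand-in for mmap_lock: counts outstanding read holds. */
struct mock_lock {
	int readers;
};

static void mock_read_lock(struct mock_lock *lock)
{
	lock->readers++;
}

static void mock_read_unlock(struct mock_lock *lock)
{
	lock->readers--;
}

/* Stand-in for kvm_riscv_cove_vm_add_memreg(); fails on request. */
static int mock_add_memreg(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

static int prepare_region(struct mock_lock *lock, int is_cove, int should_fail)
{
	int ret = 0;

	mock_read_lock(lock);

	if (is_cove) {
		ret = mock_add_memreg(should_fail);
		if (ret)
			goto out; /* a bare "return ret" would leak the lock */
	}

	/* ... the walk over the VMAs would happen here ... */

out:
	mock_read_unlock(lock);
	return ret;
}
```

With this shape, a failing call leaves the reader count at zero, so a later
attempt to take the lock for writing is not blocked by a stale read hold.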

I get that this is an RFC, but for a series of this size, operating in an area that
is under heavy development by multiple other architectures, to have a diffstat that
shows _zero_ changes to common KVM is simply unacceptable.

Please, go look at restrictedmem[2] and work on building CoVE support on top of
that.  If the current proposal doesn't fit CoVE's needs, then we need to know _before_
all of that code gets merged.

[1] https://lore.kernel.org/linux-mm/20200522125214.31348-1-kirill.shutemov@linux.intel.com
[2] https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng%40linux.intel.com

  
Atish Patra April 20, 2023, 7:13 p.m. UTC | #3
On Thu, Apr 20, 2023 at 10:00 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 19, 2023, Atish Patra wrote:
> > 2. Lazy gstage page allocation vs upfront allocation with page pool.
> > Currently, all gstage mappings happen at runtime during the fault. This is expensive
> > as we need to convert that page to confidential memory as well. A page pool framework
> > may be a better choice which can hold all the confidential pages which can be
> > pre-allocated upfront. A generic page pool infrastructure may benefit other CC solutions ?
>
> I'm sorry, what?  Do y'all really not pay any attention to what is happening
> outside of the RISC-V world?
>
> We, where "we" is KVM x86 and ARM, with folks contributing from 5+ companies,
> have been working on this problem for going on three *years*.  And that's just
> from the first public posting[1], there have been discussions about how to approach
> this for even longer.  There have been multiple related presentations at KVM Forum,
> something like 4 or 5 just at KVM Forum 2022 alone.
>

Yes. We are following the restrictedmem effort and were reviewing v10 this week.
I did mention that in the 1st item of the TODO list. We are planning to use the
restrictedmem feature once it is closer to upstream (which seems to be the case
looking at v10).
Another reason is that this initial series is based on kvmtool only. We are working
on qemu-kvm right now but have some RISC-V specific dependencies (interrupt
controller stuff) which are not there yet.
As the restrictedmem patches are already available in qemu-kvm too, our plan was to
support CoVE in qemu-kvm first and work on restrictedmem after that.

This item was just based on this RFC implementation, which uses lazy gstage page
allocation. The idea was to check if there is any interest at all in this approach.
I should have mentioned the restrictedmem plan in this section as well. Sorry for
the confusion.
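
For reference, the trade-off behind that item can be sketched in user space:
pages converted upfront go into a pool, and the fault path only converts a
page when the pool is empty. All names here (struct page_pool, pool_fill(),
handle_fault(), convert_page()) are hypothetical, and the sketch illustrates
the idea only. It is not the restrictedmem design or code from this series.

```c
#include <stddef.h>

#define POOL_CAPACITY 8

/* Counts where conversions happened, to make the trade-off visible. */
struct stats {
	unsigned long upfront;
	unsigned long at_fault;
};

/* Hypothetical pool of page frame numbers that are already confidential. */
struct page_pool {
	unsigned long pfn[POOL_CAPACITY];
	size_t count;
};

/* Stand-in for the (expensive) conversion of a page to confidential memory. */
static void convert_page(unsigned long pfn, unsigned long *counter)
{
	(void)pfn;
	(*counter)++;
}

/* Convert up to n pages ahead of time and park them in the pool. */
static size_t pool_fill(struct page_pool *pool, unsigned long first_pfn,
			size_t n, struct stats *st)
{
	size_t added = 0;

	while (added < n && pool->count < POOL_CAPACITY) {
		convert_page(first_pfn + added, &st->upfront);
		pool->pfn[pool->count++] = first_pfn + added;
		added++;
	}
	return added;
}

/* Fault path: take a pooled page if possible, else convert lazily. */
static unsigned long handle_fault(struct page_pool *pool,
				  unsigned long fresh_pfn, struct stats *st)
{
	if (pool->count > 0)
		return pool->pfn[--pool->count];

	convert_page(fresh_pfn, &st->at_fault);
	return fresh_pfn;
}
```

Filling the pool with four pages and then taking five faults performs four
conversions upfront and only one at fault time. With an empty pool, every
fault pays for a conversion, which is the lazy behaviour of the RFC.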

Thanks for your suggestion. It seems we should just directly move to
restrictedmem asap.

> Patch 1 says "This patch is based on pkvm patches", so clearly you are at least
> aware that there is other work going on in this space.
>

Yes. We have been following the pkvm, TDX & CCA patches. The MMIO section has more
details on TDX/pkvm related aspects.

> At a very quick glance, this series suffers from all of the same flaws that SNP,
> TDX, and pKVM have encountered.  E.g. assuming guest memory is backed by struct page
> memory, relying on pinning to solve all problems (hint, it doesn't), and so on and
> so forth.
>
> And to make things worse, this series is riddled with bugs.  E.g. patch 19 alone
> manages to squeeze in multiple fatal bugs in five new lines of code: deadlock due
> to not releasing mmap_lock on failure, failure to correctly handle MOVE, failure to

That's an oversight. Apologies for that. Thanks for pointing it out.

> handle DELETE at all, failure to honor (or reject) READONLY, and probably several
> others.
>
READONLY should be rejected, as our APIs don't have any permission flags yet.
I think we should add such flags so that the CoVE APIs can support it as well?

The same goes for DELETE ops, as we don't have an API to delete a confidential
memory region yet. I was not very sure about the use case for MOVE though
(migration, possibly?).

kvm_riscv_cove_vm_add_memreg() should have been invoked only for CREATE, with
the other operations rejected for now.
I will revise the patch accordingly and leave a TODO comment for the
future about API updates.
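
A minimal user-space sketch of that filtering, assuming a check that runs
before the region is registered with the TSM: enum mr_change and MEM_READONLY
are local stand-ins modelled on the kernel's enum kvm_mr_change and
KVM_MEM_READONLY, cove_check_memreg_change() is a hypothetical helper, and
the error codes are an assumption rather than what the revised patch will
return.

```c
#include <errno.h>

/* Local stand-in modelled on the kernel's enum kvm_mr_change. */
enum mr_change {
	MR_CREATE,
	MR_DELETE,
	MR_MOVE,
	MR_FLAGS_ONLY,
};

/* Local stand-in for the KVM_MEM_READONLY memslot flag. */
#define MEM_READONLY (1U << 1)

/*
 * Hypothetical pre-check for a CoVE VM: only the creation of a writable
 * region is accepted; everything else is rejected until the CoVE APIs
 * grow support for it.
 */
static int cove_check_memreg_change(enum mr_change change, unsigned int flags)
{
	if (change != MR_CREATE)
		return -EOPNOTSUPP; /* DELETE, MOVE, FLAGS_ONLY: no API yet */

	if (flags & MEM_READONLY)
		return -EINVAL; /* no permission flags in the API yet */

	return 0;
}
```

Only a return value of 0 would let the caller go on to register the region.
All other combinations fail early, before any state is changed.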

> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index 4b0f09e..63889d9 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -499,6 +499,11 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>
>         mmap_read_lock(current->mm);
>
> +       if (is_cove_vm(kvm)) {
> +               ret = kvm_riscv_cove_vm_add_memreg(kvm, base_gpa, size);
> +               if (ret)
> +                       return ret;
> +       }
>         /*
>          * A memory region could potentially cover multiple VMAs, and
>          * any holes between them, so iterate over all of them to find
>
> I get that this is an RFC, but for a series of this size, operating in an area that
> is under heavy development by multiple other architectures, to have a diffstat that
> shows _zero_ changes to common KVM is simply unacceptable.
>

Thanks for the valuable feedback. This is pretty much pre-RFC, as the spec is still
very much in draft state. We wanted to share it with the larger Linux community to
gather feedback sooner rather than later, so that we can incorporate any feedback
into the spec.

> Please, go look at restrictedmem[2] and work on building CoVE support on top of
> that.  If the current proposal doesn't fit CoVE's needs, then we need to know _before_
> all of that code gets merged.
>

Absolutely. That has always been the plan.

> [1] https://lore.kernel.org/linux-mm/20200522125214.31348-1-kirill.shutemov@linux.intel.com
> [2] https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng%40linux.intel.com
>
  
Sean Christopherson April 20, 2023, 8:21 p.m. UTC | #4
On Fri, Apr 21, 2023, Atish Kumar Patra wrote:
> On Thu, Apr 20, 2023 at 10:00 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Apr 19, 2023, Atish Patra wrote:
> > > 2. Lazy gstage page allocation vs upfront allocation with page pool.
> > > Currently, all gstage mappings happen at runtime during the fault. This is expensive
> > > as we need to convert that page to confidential memory as well. A page pool framework
> > > may be a better choice which can hold all the confidential pages which can be
> > > pre-allocated upfront. A generic page pool infrastructure may benefit other CC solutions ?
> >
> > I'm sorry, what?  Do y'all really not pay any attention to what is happening
> > outside of the RISC-V world?
> >
> > We, where "we" is KVM x86 and ARM, with folks contributing from 5+ companies,
> > have been working on this problem for going on three *years*.  And that's just
> > from the first public posting[1], there have been discussions about how to approach
> > this for even longer.  There have been multiple related presentations at KVM Forum,
> > something like 4 or 5 just at KVM Forum 2022 alone.
> >
> 
> I did mention about that in the 1st item in the TODO list.

My apologies, I completely missed the todo list.

> Thanks for your suggestion. It seems we should just directly move to
> restrictedmem asap.

Yes please, for the sake of everyone involved.  It will likely save you from
running into the same pitfalls that x86 and ARM already encountered, and the more
eyeballs and use cases on whatever restrictedmem ends up being called, the better.

Thanks!
  
Michael Roth April 21, 2023, 3:35 p.m. UTC | #5
On Thu, Apr 20, 2023 at 09:30:29AM -0700, Sean Christopherson wrote:
> Please, go look at restrictedmem[2] and work on building CoVE support on top of
> that.  If the current proposal doesn't fit CoVE's needs, then we need to know _before_
> all of that code gets merged.

I agree it's preferable to know beforehand to avoid potential
maintainability quagmires when bringing additional architectures onboard, and
that it probably makes sense here to get that early input. But as a
general statement, it's not necessarily a *requirement*.

I worry that if we commit to such a policy, then by the time restrictedmem
gets close to merge, yet another architecture/use-case will come along that
delays things further for architectures that already have hardware in the
field.

Not saying that's the case here, but just in general I think it's worth
keeping the option open on iterating on a partial solution vs. trying to
address everything on the first shot, depending on how the timing works
out.

Thanks,

Mike

> 
> [1] https://lore.kernel.org/linux-mm/20200522125214.31348-1-kirill.shutemov@linux.intel.com
> [2] https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng%40linux.intel.com
  
Christophe de Dinechin April 24, 2023, 12:23 p.m. UTC | #6
On 2023-04-19 at 15:16 -07, Atish Patra <atishp@rivosinc.com> wrote...
> This patch series adds RISC-V Confidential VM Extension (CoVE) support to the
> Linux kernel. The RISC-V CoVE specification introduces non-ISA, SBI APIs. These
> APIs enable a confidential environment in which a guest VM's data can be isolated
> from the host while the host retains control of guest VM management and platform
> resources (memory, CPU, I/O).
>
> This is very early WIP. We want to share this with the community to get
> feedback on the overall architecture and direction. Any other feedback is welcome too.
>
> The detailed CoVE architecture document can be found here [0]. It used to be
> called AP-TEE and was recently renamed to CoVE to avoid overloading the term TEE in
> general. The specification is in the draft stages and is subject to change based
> on feedback from the community.
>
> The CoVE specification introduces 3 new SBI extensions.
> COVH - CoVE Host side interface
> COVG - CoVE Guest side interface
> COVI - CoVE Secure Interrupt management extension
>
> Some key acronyms introduced:
>
> TSM - TEE Security Manager
> TVM - TEE VM (aka Confidential VM)
>
> CoVE Architecture:
> ====================
> The CoVE APIs are designed to be implementation and architecture agnostic,
> allowing for different deployment models while retaining common host and guest
> kernel code. Two examples are shown in Figure 1 and Figure 2.
> As shown in both figures, the architecture introduces a new software component
> called the "TEE Security Manager" (TSM) that runs in HS mode. The TSM has minimal
> hw attested footprint on TCB as it is a passive component that doesn't support
> scheduling or timer interrupts. Both example deployment models provide memory
> isolation between the host and the TEE VM (TVM).
>
>
> 	Non secure world       |         Secure world         |
>                                |                              |
>         Non                    |                              |
>     Virtualized |  Virtualized |   Virtualized  Virtualized   |
>         Env     |      Env     |       Env          Env       |
>    +----------+ | +----------+ |  +----------+ +----------+   |  --------------
>    |          | | |          | |  |          | |          |   |
>    | Host Apps| | |   Apps   | |  |   Apps   | |   Apps   |   |        VU-Mode
>    |  (VMM)   | | |          | |  |          | |          |   |
>    +----------+ | +----------+ |  +----------+ +----------+   |  --------------
>         |       | +----------+ |  +----------+ +----------+   |
>         |       | |          | |  |          | |          |   |
>         |       | |          | |  |    TVM   | |    TVM   |   |
>         |       | |   Guest  | |  |   Guest  | |   Guest  |   |       VS-Mode
>      Syscalls   | +----------+ |  +----------+ +----------+   |
>         |              |       |        |                     |
>         |             SBI      |   SBI(COVG + COVI)           |
>         |              |       |        |                     |
>   +--------------------------+ |  +---------------------------+  --------------
>   |     Host (Linux)         | |  |       TSM (Salus)         |
>   +--------------------------+ |  +---------------------------+
>              |                 |            |                       HS-Mode
>      SBI (COVH + COVI)         |     SBI (COVH + COVI)
>              |                 |            |
>   +-----------------------------------------------------------+  --------------
>   |                    Firmware(OpenSBI) + TSM Driver         |        M-Mode
>   +-----------------------------------------------------------+  --------------
>   +----------------------------------------------------------------------------
>   |                    Hardware (RISC-V CPU + RoT + IOMMU)
>   +----------------------------------------------------------------------------
>  		Figure 1: Host in HS model
>
>
> The deployment model shown in Figure 1 runs the host in HS mode, where it is a peer
> of the TSM, which also runs in HS mode. It requires another component, known as the TSM
> driver, running in a higher privilege mode than the host/TSM. The TSM driver is responsible
> for switching the context between the host and the TSM. It also manages the platform
> specific hardware solution via the confidential domain bit, as described in the specification[0],
> to provide the required memory isolation.
>
>
> 	     Non secure world  |         Secure world
>                                |
>          Virtualized Env       |   Virtualized   Virtualized  |
>              		              Env           Env       |
>    +-------------------------+ |  +----------+  +----------+  |    ------------
>    |          | | |          | |  |          |  |          |  |
>    | Host Apps| | |   Apps   | |  |   Apps   |  |   Apps   |  |        VU-Mode
>    +----------+ | +----------+ |  +----------+  +----------+  |    ------------
>         |                      |        |             |       |
>     Syscalls             SBI   |      	|             |       |
>         |                      |        |             |       |
>   +--------------------------+ |  +-----------+ +-----------+ |
>   |     Host (Linux)         | |  |  TVM Guest| |  TVM Guest| |       VS-Mode
>   +--------------------------+ |  +-----------+ +-----------+ |
>              |                 |        |             |       |
>      SBI (COVH + COVI)         |       SBI           SBI      |
>              |                 |   (COVG + COVI) (COVG + COVI)|
> 	     |                 |        |             |       |
>   +-----------------------------------------------------------+    --------------
>   |                    TSM(Salus)	                      |        HS-Mode
>   +-----------------------------------------------------------+    --------------
>  			      |
>   			     SBI
> 			      |
>   +---------------------------------------------------------+    --------------
>   |                    Firmware(OpenSBI)                  |        M-Mode
>   +---------------------------------------------------------+    --------------
>  +-----------------------------------------------------------------------------
>   |                    Hardware (RISC-V CPU + RoT + IOMMU)
>   +----------------------------------------------------------------------------
>  			Figure 2: Host in VS model
>
>
> The deployment model shown in Figure 2 simplifies the context switch and memory isolation
> by running the host in VS mode as a guest of the TSM. Thus, memory isolation is
> achieved by gstage mapping by the TSM. We don't need any additional hardware confidential
> domain bit to provide memory isolation. The downside of this model is that the host has to run
> the non-confidential VMs in a nested environment, which may have lower performance (yet to be
> measured). The current TSM implementation (Salus) doesn't support full nested virtualization yet.
>
> The platform must have a RoT to provide attestation in either model.
> This patch series implements the APIs defined by CoVE. The current TSM implementation
> allows the host to run TVMs as shown in Figure 2. We are working on deployment
> model 1 in parallel. We do not expect any significant changes in either the host or the
> guest side ABI due to that.
>
> Shared memory between the host & TSM:
> =====================================
> To accelerate H-mode CSR/GPR access, CoVE also reuses the Nested Acceleration (NACL)
> SBI extension[1]. NACL defines a per-physical-CPU shared memory area that is allocated
> at boot. It allows the host running in VS mode to access H-mode CSR/GPR state
> without trapping into the TSM. The CoVE specification defines the exact
> state of the shared memory, with r/w permissions, at every call.
>
> Secure Interrupt management:
> ===========================
> The CoVE specification relies on the MSI based interrupt scheme defined in the Advanced
> Interrupt Architecture specification[2]. The COVI SBI extension adds functions to bind
> a guest interrupt file to a TVM. After that, only TCB components (TSM, TVM, TSM driver)
> can modify it. The host can inject an interrupt only via the TSM.
> A TVM is also in complete control of which interrupts it can receive. By default,
> all interrupts are denied. In this proof-of-concept implementation, the guest allows
> all interrupts at boot time to keep things simple.
>
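
To check my reading of the deny-by-default rule, here is a toy Python model.
It is not code from the series, and the class and method names are
hypothetical. It only captures two statements from the text above: nothing
may be injected until the guest allows it, and the PoC guest allows
everything at boot.

```python
class InterruptAllowList:
    """Toy model of a TVM's interrupt allow list (hypothetical names)."""

    def __init__(self):
        # Deny by default: nothing is allowed until the guest says so.
        self._allowed = set()
        self._allow_all = False

    def allow(self, irq_id):
        """Guest side: permit a single external interrupt id."""
        self._allowed.add(irq_id)

    def allow_all(self):
        """Guest side: what the PoC does at boot to keep things simple."""
        self._allow_all = True

    def may_inject(self, irq_id):
        """TSM side: would an injection request from the host be accepted?"""
        return self._allow_all or irq_id in self._allowed


hardened = InterruptAllowList()
hardened.allow(10)

poc = InterruptAllowList()
poc.allow_all()
```

A hardened guest would presumably call only allow() for the ids it expects,
which is what TODO item 10 seems to be about.
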
> Device I/O:
> ===========
> In order to support paravirt I/O devices, a SWIOTLB bounce buffer must be used by the
> guest. As the host cannot access confidential memory, this buffer memory
> must be shared with the host via the share/unshare functions defined in the COVG SBI
> extension. The RISC-V implementation achieves this by generalizing mem_encrypt_init(),
> similar to TDX/SEV/CCA. For the same reason, the CoVE guest is only allowed to use virtio
> devices with VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1, as they force virtio drivers
> to use the DMA API.
>
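
For readers less familiar with virtio, the restriction amounts to a check on
two feature bits. A minimal Python sketch follows. The bit numbers are the
ones from the virtio specification (and include/uapi/linux/virtio_config.h),
but the helper name is hypothetical and this is not how the series itself
implements the check.

```python
# Feature bit numbers as defined by the virtio specification.
VIRTIO_F_VERSION_1 = 32
VIRTIO_F_ACCESS_PLATFORM = 33

# Both bits must be offered by the device for a CoVE guest to use it.
REQUIRED_FEATURES = (1 << VIRTIO_F_VERSION_1) | (1 << VIRTIO_F_ACCESS_PLATFORM)


def cove_guest_accepts(device_features):
    """Hypothetical helper: True only if both required bits are set."""
    return (device_features & REQUIRED_FEATURES) == REQUIRED_FEATURES


legacy_device = 0
modern_no_platform = 1 << VIRTIO_F_VERSION_1
modern_with_platform = REQUIRED_FEATURES | (1 << 0)  # plus some device bit
```
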
> MMIO emulation:
> ======================
> A TVM can register regions of its address space as MMIO regions to be emulated by
> the host. The TSM provides explicit SBI functions, i.e. SBI_EXT_COVG_[ADD/REMOVE]_MMIO_REGION,
> to request/remove MMIO regions. Any reads or writes to those MMIO regions after the
> SBI_EXT_COVG_ADD_MMIO_REGION call are forwarded to the host for emulation.
>
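
As I understand the add/remove semantics, they behave roughly like the toy
registry below. This is a Python sketch with hypothetical names. It assumes
a region is identified by (address, length) and that an access is forwarded
only if it lies entirely inside one registered region. The specification may
define the edge cases differently.

```python
class MmioRegions:
    """Toy model of the regions a TVM has asked the host to emulate."""

    def __init__(self):
        self._regions = []  # list of (start, length) tuples

    def add(self, addr, length):
        """Models the ADD_MMIO_REGION request."""
        if length <= 0:
            raise ValueError("length must be positive")
        self._regions.append((addr, length))

    def remove(self, addr, length):
        """Models the REMOVE_MMIO_REGION request.

        Raises ValueError if the region was never added.
        """
        self._regions.remove((addr, length))

    def is_forwarded(self, addr, size=1):
        """True if an access of `size` bytes at `addr` goes to the host."""
        return any(start <= addr and addr + size <= start + length
                   for start, length in self._regions)


regions = MmioRegions()
regions.add(0x10000000, 0x1000)
```
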
> This series allows any ioremapped memory to be emulated as an MMIO region with the
> above APIs via arch hooks inspired by the pKVM work. We are aware that this model
> doesn't address all the threat vectors. We have also implemented the device
> filtering/authorization approach adopted by TDX[4]. However, those patches are not
> part of this series as the base TDX patches are still under active development.
> RISC-V CoVE will also adopt the revamped device filtering work once it is accepted
> by the Linux community in the future.
>
> The direct assignment of devices is a work in progress and will be added in the future[4].
>
> VMM support:
> ============
> This series has only been tested with kvmtool. Support for other VMMs (qemu-kvm,
> crosvm/rust-vmm) will be added later.
>
> Test cases:
> ===========
> We are working on kvm selftests for CoVE. We will post them as soon as they are ready.
> We haven't started any work on kvm unit-tests as RISC-V doesn't have the basic infrastructure
> to support that. Once the kvm unit-test infrastructure is in place, we will add
> support for CoVE as well.
>
> Open design questions:
> ======================
>
> 1. The current implementation has two separate configs for the guest (CONFIG_RISCV_COVE_GUEST)
> and the host (RISCV_COVE_HOST). The default defconfig will enable both so that the
> same unified image works as both host & guest. Most distros will likely prefer this
> to minimize the maintenance burden, but some may want a minimal CoVE guest image
> that has only hardened drivers. In addition to that, Android runs a microdroid instance
> in the confidential guests. A separate config will help in those cases. Please let us
> know if there is any concern with two configs.
>
> 2. Lazy gstage page allocation vs upfront allocation with page pool.
> Currently, all gstage mappings happen at runtime during the fault. This is expensive
> as we need to convert that page to confidential memory as well. A page pool framework
> that holds confidential pages pre-allocated upfront may be a better choice.
> Would a generic page pool infrastructure benefit other CC solutions?
>
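
The trade-off in question 2 can be illustrated with a small counting model.
This is only a Python sketch under my own assumptions: the class names are
hypothetical, and "conversion" stands in for whatever it costs to turn a
page into confidential memory. A pool moves that cost out of the fault path
and falls back to lazy conversion once it runs dry.

```python
import itertools


class ConfidentialPageSource:
    """Hands out page numbers and counts every conversion performed."""

    def __init__(self):
        self._next_page = itertools.count()
        self.converted = 0

    def convert_one(self):
        # Stand-in for converting one page to confidential memory.
        self.converted += 1
        return next(self._next_page)


class PagePool:
    """Pre-converts `prealloc` pages so that faults can reuse them."""

    def __init__(self, source, prealloc):
        self._source = source
        self._free = [source.convert_one() for _ in range(prealloc)]

    def take(self):
        """Called from the (modelled) fault path."""
        if self._free:
            return self._free.pop()
        # Pool exhausted: fall back to lazy conversion at fault time.
        return self._source.convert_one()


source = ConfidentialPageSource()
pool = PagePool(source, prealloc=4)
upfront = source.converted
pages = [pool.take() for _ in range(4)]
during_faults = source.converted - upfront
```

With four pre-allocated pages, the first four faults cause no conversion in
this model. Only the fifth one pays the cost again.
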
> 3. In order to allow both confidential and non-confidential VMs, the series
> uses regular branching instead of static branches for CoVE VM specific cases
> throughout KVM. That may cause a few more branch penalties while running regular VMs.
> The alternative is to use function pointers for any function that needs to
> take a different path. As per my understanding, that would be worse than branches.
>
> Patch organization:
> ===================
> This series depends on quite a few RISC-V patches that are not upstream yet.
> Here are the dependencies.
>
> 1. RISC-V IPI improvement series
> 2. RISC-V AIA support series.
> 3. RISC-V NACL support series
>
> In this series, PATCH [0-5] are generic improvement and cleanup patches which
> can be merged independently.
>
> PATCH [6-26, 34-37] add the host side support for CoVE.
> PATCH [27-33] add the interrupt related changes.
> PATCH [34-49] add the guest side changes for CoVE.
>
> The TSM project is written in Rust and can be found here:
> https://github.com/rivosinc/salus
>
> Running the stack
> ====================
>
> To run/test the stack, you need the following components:
>
> 1) Qemu
> 2) Common Host & Guest Kernel
> 3) kvmtool
> 4) Host RootFS with KVMTOOL and Guest Kernel
> 5) Salus
>
> The detailed steps are available at [6].
>
> The Linux kernel patches are also available at [7] and the kvmtool patches
> are available at [8].
>
> TODOs
> =======
> As this is a very early work, the todo list is quite long :).
> Here are some of them (not in any specific order)
>
> 1. Support fd based private memory interface proposed in
>    https://lkml.org/lkml/2022/1/18/395
> 2. Align with updated guest runtime device filtering approach.
> 3. IOMMU integration
> 4. Dedicated device assignment via TDISP & SPDM[4]
> 5. Support huge pages
> 6. Page pool allocator to avoid convert/reclaim at every fault
> 7. Other VMM support (qemu-kvm, crosvm)
> 8. Complete the PoC for the deployment model 1 where host runs in HS mode
> 9. Attestation integration
> 10. Harden the interrupt allowed list
> 11. kvm self-tests support for CoVE
> 12. kvm unit-tests support for CoVE
> 13. Guest hardening
> 14. Port pKVM on RISC-V using CoVE
> 15. Any other ?
>
> Links
> ============
> [0] CoVE architecture Specification.
>     https://github.com/riscv-non-isa/riscv-ap-tee/blob/main/specification/riscv-aptee-spec.pdf

URL does not work for me.

I found this:
https://github.com/riscv-non-isa/riscv-ap-tee/blob/main/specification/riscv-cove.pdf

> [1] https://lists.riscv.org/g/sig-hypervisors/message/260
> [2] https://github.com/riscv/riscv-aia/releases/download/1.0-RC2/riscv-interrupts-1.0-RC2.pdf
> [3] https://github.com/rivosinc/linux/tree/cove_integration_device_filtering1
> [4] https://github.com/intel/tdx/commits/guest-filter-upstream
> [5] https://lists.riscv.org/g/tech-ap-tee/message/83
> [6] https://github.com/rivosinc/cove/wiki/CoVE-KVM-RISCV64-on-QEMU
> [7] https://github.com/rivosinc/linux/commits/cove-integration
> [8] https://github.com/rivosinc/kvmtool/tree/cove-integration-03072023
>
> Atish Patra (33):
> RISC-V: KVM: Improve KVM error reporting to the user space
> RISC-V: KVM: Invoke aia_update with preempt disabled/irq enabled
> RISC-V: KVM: Add a helper function to get pgd size
> RISC-V: Add COVH SBI extensions definitions
> RISC-V: KVM: Implement COVH SBI extension
> RISC-V: KVM: Add a barebone CoVE implementation
> RISC-V: KVM: Add UABI to support static memory region attestation
> RISC-V: KVM: Add CoVE related nacl helpers
> RISC-V: KVM: Implement static memory region measurement
> RISC-V: KVM: Use the new VM IOCTL for measuring pages
> RISC-V: KVM: Exit to the user space for trap redirection
> RISC-V: KVM: Return early for gstage modifications
> RISC-V: KVM: Skip dirty logging updates for TVM
> RISC-V: KVM: Add a helper function to trigger fence ops
> RISC-V: KVM: Skip most VCPU requests for TVMs
> RISC-V : KVM: Skip vmid/hgatp management for TVMs
> RISC-V: KVM: Skip TLB management for TVMs
> RISC-V: KVM: Register memory regions as confidential for TVMs
> RISC-V: KVM: Add gstage mapping for TVMs
> RISC-V: KVM: Handle SBI call forward from the TSM
> RISC-V: KVM: Implement vcpu load/put functions for CoVE guests
> RISC-V: KVM: Wireup TVM world switch
> RISC-V: KVM: Skip HVIP update for TVMs
> RISC-V: KVM: Implement COVI SBI extension
> RISC-V: KVM: Add interrupt management functions for TVM
> RISC-V: KVM: Skip AIA CSR updates for TVMs
> RISC-V: KVM: Perform limited operations in hardware enable/disable
> RISC-V: KVM: Indicate no support user space emulated IRQCHIP
> RISC-V: KVM: Add AIA support for TVMs
> RISC-V: KVM: Hookup TVM VCPU init/destroy
> RISC-V: KVM: Initialize CoVE
> RISC-V: KVM: Add TVM init/destroy calls
> drivers/hvc: sbi: Disable HVC console for TVMs
>
> Rajnesh Kanwal (15):
> mm/vmalloc: Introduce arch hooks to notify ioremap/unmap changes
> RISC-V: KVM: Update timer functionality for TVMs.
> RISC-V: Add COVI extension definitions
> RISC-V: KVM: Read/write gprs from/to shmem in case of TVM VCPU.
> RISC-V: Add COVG SBI extension definitions
> RISC-V: Add CoVE guest config and helper functions
> RISC-V: Implement COVG SBI extension
> RISC-V: COVE: Add COVH invalidate, validate, promote, demote and
> remove APIs.
> RISC-V: KVM: Add host side support to handle COVG SBI calls.
> RISC-V: Allow host to inject any ext interrupt id to a CoVE guest.
> RISC-V: Add base memory encryption functions.
> RISC-V: Add cc_platform_has() for RISC-V for CoVE
> RISC-V: ioremap: Implement for arch specific ioremap hooks
> riscv/virtio: Have CoVE guests enforce restricted virtio memory
> access.
> RISC-V: Add shared bounce buffer to support DBCN for CoVE Guest.
>
> arch/riscv/Kbuild                       |    2 +
> arch/riscv/Kconfig                      |   27 +
> arch/riscv/cove/Makefile                |    2 +
> arch/riscv/cove/core.c                  |   40 +
> arch/riscv/cove/cove_guest_sbi.c        |  109 +++
> arch/riscv/include/asm/cove.h           |   27 +
> arch/riscv/include/asm/covg_sbi.h       |   38 +
> arch/riscv/include/asm/csr.h            |    2 +
> arch/riscv/include/asm/kvm_cove.h       |  206 +++++
> arch/riscv/include/asm/kvm_cove_sbi.h   |  101 +++
> arch/riscv/include/asm/kvm_host.h       |   10 +-
> arch/riscv/include/asm/kvm_vcpu_sbi.h   |    3 +
> arch/riscv/include/asm/mem_encrypt.h    |   26 +
> arch/riscv/include/asm/sbi.h            |  107 +++
> arch/riscv/include/uapi/asm/kvm.h       |   17 +
> arch/riscv/kernel/irq.c                 |   12 +
> arch/riscv/kernel/setup.c               |    2 +
> arch/riscv/kvm/Makefile                 |    1 +
> arch/riscv/kvm/aia.c                    |  101 ++-
> arch/riscv/kvm/aia_device.c             |   41 +-
> arch/riscv/kvm/aia_imsic.c              |  127 ++-
> arch/riscv/kvm/cove.c                   | 1005 +++++++++++++++++++++++
> arch/riscv/kvm/cove_sbi.c               |  490 +++++++++++
> arch/riscv/kvm/main.c                   |   30 +-
> arch/riscv/kvm/mmu.c                    |   45 +-
> arch/riscv/kvm/tlb.c                    |   11 +-
> arch/riscv/kvm/vcpu.c                   |   69 +-
> arch/riscv/kvm/vcpu_exit.c              |   34 +-
> arch/riscv/kvm/vcpu_insn.c              |  115 ++-
> arch/riscv/kvm/vcpu_sbi.c               |   16 +
> arch/riscv/kvm/vcpu_sbi_covg.c          |  232 ++++++
> arch/riscv/kvm/vcpu_timer.c             |   26 +-
> arch/riscv/kvm/vm.c                     |   34 +-
> arch/riscv/kvm/vmid.c                   |   17 +-
> arch/riscv/mm/Makefile                  |    3 +
> arch/riscv/mm/init.c                    |   17 +-
> arch/riscv/mm/ioremap.c                 |   45 +
> arch/riscv/mm/mem_encrypt.c             |   61 ++
> drivers/tty/hvc/hvc_riscv_sbi.c         |    5 +
> drivers/tty/serial/earlycon-riscv-sbi.c |   51 +-
> include/uapi/linux/kvm.h                |    8 +
> mm/vmalloc.c                            |   16 +
> 42 files changed, 3222 insertions(+), 109 deletions(-)
> create mode 100644 arch/riscv/cove/Makefile
> create mode 100644 arch/riscv/cove/core.c
> create mode 100644 arch/riscv/cove/cove_guest_sbi.c
> create mode 100644 arch/riscv/include/asm/cove.h
> create mode 100644 arch/riscv/include/asm/covg_sbi.h
> create mode 100644 arch/riscv/include/asm/kvm_cove.h
> create mode 100644 arch/riscv/include/asm/kvm_cove_sbi.h
> create mode 100644 arch/riscv/include/asm/mem_encrypt.h
> create mode 100644 arch/riscv/kvm/cove.c
> create mode 100644 arch/riscv/kvm/cove_sbi.c
> create mode 100644 arch/riscv/kvm/vcpu_sbi_covg.c
> create mode 100644 arch/riscv/mm/ioremap.c
> create mode 100644 arch/riscv/mm/mem_encrypt.c


--
Cheers,
Christophe de Dinechin (https://c3d.github.io)
Freedom Covenant (https://github.com/c3d/freedom-covenant)
Theory of Incomplete Measurements (https://c3d.github.io/TIM)