[drm-misc-next,v4,0/8,RFC] DRM GPUVA Manager GPU-VM features

Message ID 20230920144343.64830-1-dakr@redhat.com

Danilo Krummrich Sept. 20, 2023, 2:42 p.m. UTC
So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch series
aims at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
   this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
   shared with other GPU-VMs).

3) Provide functions to efficiently lock the dma-resv of all GEM objects
   the GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
   of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.
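
To make items 1) to 4) more concrete, here is a minimal userspace sketch of
the bookkeeping they describe. It is not the code from this series: struct
resv, struct gem_obj, struct vm and all vm_*() helpers are hypothetical
stand-ins, and "locking" is modeled as a plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJS 8

/* Hypothetical stand-in for a dma-resv lock; only counts acquisitions. */
struct resv {
	unsigned int lock_count;
};

/* Hypothetical stand-in for a GEM object. */
struct gem_obj {
	struct resv own_resv;	/* used only when the object is shared */
	struct resv *resv;	/* reservation object actually in effect */
	bool evicted;
};

/* Hypothetical stand-in for a GPU-VM. */
struct vm {
	struct resv resv;			/* 1) common resv */
	struct gem_obj *extobjs[MAX_OBJS];	/* 2) external objects */
	size_t n_extobjs;
	struct gem_obj *evicted[MAX_OBJS];	/* 4) evicted objects */
	size_t n_evicted;
};

/* A private object simply borrows the VM's reservation object. */
static void vm_add_private(struct vm *vm, struct gem_obj *obj)
{
	obj->resv = &vm->resv;
}

/* An external object keeps its own resv and is tracked on a list. */
static int vm_add_external(struct vm *vm, struct gem_obj *obj)
{
	if (vm->n_extobjs == MAX_OBJS)
		return -1;
	obj->resv = &obj->own_resv;
	vm->extobjs[vm->n_extobjs++] = obj;
	return 0;
}

/* 3) Lock everything: one common lock plus one per external object. */
static size_t vm_lock_all(struct vm *vm)
{
	size_t i, taken = 1;

	vm->resv.lock_count++;
	for (i = 0; i < vm->n_extobjs; i++) {
		struct gem_obj *obj = vm->extobjs[i];

		/* External objects never share the common resv. */
		assert(obj->resv != &vm->resv);
		obj->resv->lock_count++;
		taken++;
	}
	return taken;
}

/* Put an object on the evicted list once; repeated calls are no-ops. */
static int vm_mark_evicted(struct vm *vm, struct gem_obj *obj)
{
	if (obj->evicted)
		return 0;
	if (vm->n_evicted == MAX_OBJS)
		return -1;
	obj->evicted = true;
	vm->evicted[vm->n_evicted++] = obj;
	return 0;
}

/* 4) Validation walks only the evicted list, not every mapped object. */
static size_t vm_validate(struct vm *vm)
{
	size_t i, n = vm->n_evicted;

	for (i = 0; i < n; i++)
		vm->evicted[i]->evicted = false;
	vm->n_evicted = 0;
	return n;
}
```

In this model, a VM with three private objects and one external object needs
two lock acquisitions in vm_lock_all() rather than four, and vm_validate()
only touches the objects that were actually evicted.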

The implementation introduces struct drm_gpuvm_bo, which serves as an
abstraction combining a struct drm_gpuvm and a struct drm_gem_object, similar
to what amdgpu does with struct amdgpu_bo_vm. While this adds a bit of
complexity, it improves the efficiency of tracking external and evicted GEM
objects.
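
As an illustration of the VM / BO combination idea, the following userspace
sketch keeps at most one reference-counted combination object per (VM, GEM
object) pair. It is a simplified model under stated assumptions, not the
implementation in this series: struct vm, struct gem_obj, struct vm_bo,
vm_bo_obtain() and vm_bo_put() are hypothetical names, and all locking is
omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GPU-VM. */
struct vm {
	int id;
};

struct vm_bo;

/* Hypothetical stand-in for a GEM object; it lists its combinations. */
struct gem_obj {
	struct vm_bo *vm_bos;
};

/* One instance per (vm, obj) pair, shared by all mappings of that pair. */
struct vm_bo {
	struct vm *vm;
	struct gem_obj *obj;
	unsigned int refcount;
	struct vm_bo *next;
};

/* Return the existing combination for (vm, obj) or create a new one. */
static struct vm_bo *vm_bo_obtain(struct vm *vm, struct gem_obj *obj)
{
	struct vm_bo *vm_bo;

	for (vm_bo = obj->vm_bos; vm_bo; vm_bo = vm_bo->next) {
		if (vm_bo->vm == vm) {
			vm_bo->refcount++;
			return vm_bo;
		}
	}

	vm_bo = calloc(1, sizeof(*vm_bo));
	if (!vm_bo)
		return NULL;

	vm_bo->vm = vm;
	vm_bo->obj = obj;
	vm_bo->refcount = 1;
	/* Linking at creation time is what keeps instances unique. */
	vm_bo->next = obj->vm_bos;
	obj->vm_bos = vm_bo;
	return vm_bo;
}

/* Drop a reference; unlink and free on the last put. Returns true then. */
static bool vm_bo_put(struct vm_bo *vm_bo)
{
	struct vm_bo **link;

	assert(vm_bo->refcount > 0);
	if (--vm_bo->refcount)
		return false;

	for (link = &vm_bo->obj->vm_bos; *link; link = &(*link)->next) {
		if (*link == vm_bo) {
			*link = vm_bo->next;
			break;
		}
	}
	free(vm_bo);
	return true;
}
```

Because per-VM state such as "external" or "evicted" can hang off this single
object, it only has to be tracked once per (VM, GEM object) pair instead of
once per mapping.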

This patch series also renames struct drm_gpuva_manager to struct drm_gpuvm,
along with the corresponding functions. This way, the GPUVA manager's
structures align better with the documentation of VM_BIND [1] and VM_BIND
locking [2]. It also provides a better foundation for the naming of data
structures and functions introduced for implementing the features of this
patch series.

This patch series is also available at [3].

[1] Documentation/gpu/drm-vm-bind-async.rst
[2] Documentation/gpu/drm-vm-bind-locking.rst
[3] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/gpuvm-next

Changes in V2:
==============
  - rename 'drm_gpuva_manager' -> 'drm_gpuvm' which generally leads to more
    consistent naming
  - properly separate commits (introduce common dma-resv, drm_gpuvm_bo
    abstraction, etc.)
  - remove maple tree for tracking external objects, use a list of
    drm_gpuvm_bos per drm_gpuvm instead
  - rework dma-resv locking helpers (Thomas)
  - add a locking helper for a given range of the VA space (Christian)
  - make the GPUVA manager buildable as module, rather than drm_exec
    builtin (Christian)

Changes in V3:
==============
  - rename missing function and files (Boris)
  - warn if vm_obj->obj != obj in drm_gpuva_link() (Boris)
  - don't expose drm_gpuvm_bo_destroy() (Boris)
  - unlink VM_BO from GEM in drm_gpuvm_bo_destroy() rather than
    drm_gpuva_unlink() and link within drm_gpuvm_bo_obtain() to keep
    drm_gpuvm_bo instances unique
  - add internal locking to external and evicted object lists to support drivers
    updating the VA space from within the fence signalling critical path (Boris)
  - unlink external objects and evicted objects from the GPUVM's list in
    drm_gpuvm_bo_destroy()
  - add more documentation and fix some kernel doc issues

Changes in V4:
==============
  - add a drm_gpuvm_resv() helper (Boris)
  - add a drm_gpuvm::<list_name>::local_list field (Boris)
  - remove drm_gpuvm_bo_get_unless_zero() helper (Boris)
  - fix missing NULL assignment in get_next_vm_bo_from_list() (Boris)
  - keep a drm_gem_object reference on potential vm_bo destroy (alternatively we
    could free the vm_bo and drop the vm_bo's drm_gem_object reference through
    async work)
  - introduce DRM_GPUVM_RESV_PROTECTED flag to indicate external locking through
    the corresponding dma-resv locks to optimize for drivers already holding
    them when needed; add the corresponding lock_assert_held() calls (Thomas)
  - make drm_gpuvm_bo_evict() per vm_bo and add a drm_gpuvm_bo_gem_evict()
    helper (Thomas)
  - pass a drm_gpuvm_bo in drm_gpuvm_ops::vm_bo_validate() (Thomas)
  - documentation fixes
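
The idea behind a "resv protected" mode can be sketched in userspace as
follows: when the flag is set, list updates rely on a lock the caller already
holds and only assert it; otherwise they take an internal lock. This is a
rough model, not the code from this series: VM_RESV_PROTECTED, struct lock,
struct vm and vm_evict_list_add() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a flag like DRM_GPUVM_RESV_PROTECTED. */
#define VM_RESV_PROTECTED (1u << 0)

/* Toy lock that records whether it is held and how often it was taken. */
struct lock {
	bool held;
	unsigned int acquisitions;
};

static void lock_acquire(struct lock *l)
{
	assert(!l->held);
	l->held = true;
	l->acquisitions++;
}

static void lock_release(struct lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical stand-in for a GPU-VM. */
struct vm {
	unsigned int flags;
	struct lock resv;	/* models the externally held dma-resv lock */
	struct lock list_lock;	/* models the internal list lock */
	unsigned int n_evicted;	/* stands in for the evicted list */
};

static void vm_evict_list_add(struct vm *vm)
{
	if (vm->flags & VM_RESV_PROTECTED) {
		/* Caller must already hold the resv; just check it. */
		assert(vm->resv.held);
		vm->n_evicted++;
		return;
	}

	/* Otherwise fall back to the internal lock. */
	lock_acquire(&vm->list_lock);
	vm->n_evicted++;
	lock_release(&vm->list_lock);
}
```

A driver that already holds the relevant lock at this point avoids the extra
internal lock round trip, at the cost of having to uphold that locking rule
itself.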

Danilo Krummrich (8):
  drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
  drm/gpuvm: allow building as module
  drm/nouveau: uvmm: rename 'umgr' to 'base'
  drm/gpuvm: add common dma-resv per struct drm_gpuvm
  drm/gpuvm: add an abstraction for a VM / BO combination
  drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm
  drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  drm/nouveau: GPUVM dma-resv/extobj handling, GEM validation

 drivers/gpu/drm/Kconfig                   |    7 +
 drivers/gpu/drm/Makefile                  |    2 +-
 drivers/gpu/drm/drm_debugfs.c             |   16 +-
 drivers/gpu/drm/drm_gpuva_mgr.c           | 1725 --------------
 drivers/gpu/drm/drm_gpuvm.c               | 2600 +++++++++++++++++++++
 drivers/gpu/drm/nouveau/Kconfig           |    1 +
 drivers/gpu/drm/nouveau/nouveau_bo.c      |    4 +-
 drivers/gpu/drm/nouveau/nouveau_debugfs.c |    2 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c    |   52 +-
 drivers/gpu/drm/nouveau/nouveau_exec.h    |    4 -
 drivers/gpu/drm/nouveau/nouveau_gem.c     |    5 +-
 drivers/gpu/drm/nouveau/nouveau_sched.h   |    4 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c    |  207 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.h    |    8 +-
 include/drm/drm_debugfs.h                 |    6 +-
 include/drm/drm_gem.h                     |   32 +-
 include/drm/drm_gpuva_mgr.h               |  706 ------
 include/drm/drm_gpuvm.h                   | 1142 +++++++++
 18 files changed, 3934 insertions(+), 2589 deletions(-)
 delete mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
 create mode 100644 drivers/gpu/drm/drm_gpuvm.c
 delete mode 100644 include/drm/drm_gpuva_mgr.h
 create mode 100644 include/drm/drm_gpuvm.h


base-commit: 1c7a387ffef894b1ab3942f0482dac7a6e0a909c
  

Comments

Boris Brezillon Sept. 28, 2023, 12:09 p.m. UTC
On Wed, 20 Sep 2023 16:42:33 +0200
Danilo Krummrich <dakr@redhat.com> wrote:

> Danilo Krummrich (8):
>   drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
>   drm/gpuvm: allow building as module
>   drm/nouveau: uvmm: rename 'umgr' to 'base'
>   drm/gpuvm: add common dma-resv per struct drm_gpuvm
>   drm/gpuvm: add an abstraction for a VM / BO combination
>   drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm
>   drm/gpuvm: generalize dma_resv/extobj handling and GEM validation

Tested-by: Boris Brezillon <boris.brezillon@collabora.com>

>   drm/nouveau: GPUVM dma-resv/extobj handling, GEM validation
> 