[v4,00/10] arm64: ptdump: View the second stage page-tables

Message ID 20231218135859.2513568-2-sebastianene@google.com
Headers
Series arm64: ptdump: View the second stage page-tables |

Message

Sebastian Ene Dec. 18, 2023, 1:58 p.m. UTC
  Hi,

This can be used as a debugging tool for dumping the second stage
page-tables.

When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump registers 
'/sys/debug/kvm/<guest_id>/stage2_page_tables' entry with debugfs
upon guest creation. This allows userspace tools (eg. cat) to dump the
stage-2 pagetables by reading the registered file.

Reading the debugfs file shows stage-2 memory ranges in following format:
<IPA range> <size> <descriptor type> <access permissions> <mem_attributes>

Under pKVM configuration(kvm-arm.mode=protected) ptdump registers an entry
for the host stage-2 pagetables in the following path:
/sys/debug/kvm/host_stage2_page_tables/

The tool interprets the pKVM ownership annotation stored in the invalid
entries and dumps to the console the ownership information. To be able
to access the host stage-2 page-tables from the kernel, a new hypervisor
call was introduced which allows us to snapshot the page-tables in a host
provided buffer. The hypervisor call is hidden behind CONFIG_NVHE_EL2_DEBUG
as this should be used under debugging environment.

Link to the third version:
https://lore.kernel.org/all/20231115171639.2852644-2-sebastianene@google.com/

Link to the second version:
https://lore.kernel.org/all/20231019144032.2943044-1-sebastianene@google.com/#r

Link to the first version:
https://lore.kernel.org/all/20230927112517.2631674-1-sebastianene@google.com/

Changelog:
  v3 -> current_version:
  * refactorization: moved all the **KVM** specific components under
    kvm/ as suggested by Oliver. Introduced a new file
    arm64/kvm/ptdump.c which handled the second stage translation.
    re-used only the display portion from mm/ptdump.c

  * pagetable snapshot creation now uses memory donated from the host.
    The memory is no longer shared with the host as this can pose a security
    risk if the host has access to manipulate the pagetable copy while
    the hypervisor iterates it.

  * fixed a memory leak: while memory was used from the memcache for
    building the snapshot pagetable, it was no longer giving back the
    pages to the host for freeing. A separate array was introduced to
    keep track of the pages allocated from the memcache.

  v2 -> v3:
  * register the stage-2 debugfs entry for the host under
    /sys/debug/kvm/host_stage2_page_tables and in
    /sys/debug/kvm/<guest_id>/stage2_page_tables for guests.

  * don't use a static array for parsing the attributes description,
    generate it dynamically based on the number of pagetable levels

  * remove the lock that was guarding the seq_file private inode data,
    and keep the data private to the open file session.

  * minor fixes & renaming of CONFIG_NVHE_EL2_PTDUMP_DEBUGFS to
    CONFIG_PTDUMP_STAGE2_DEBUGFS

  v1 -> v2:
  * use the stage-2 pagetable walker for dumping descriptors instead of
    the one provided by ptdump.

  * support for guests pagetables dumping under VHE/nVHE non-protected

Thanks,

Sebastian Ene (10):
  KVM: arm64: Add snapshot interface for the host stage-2 pagetable
  KVM: arm64: Add ptdump registration with debugfs for the stage-2
    pagetables
  KVM: arm64: Invoke the snapshot interface for the host stage-2
    pagetable
  arm64: ptdump: Expose the attribute parsing functionality
  arm64: ptdump: Use the mask from the state structure
  KVM: arm64: Move pagetable definitions to common header
  KVM: arm64: Walk the pagetable snapshot and parse the ptdump
    descriptors
  arm64: ptdump: Interpret memory attributes based on the runtime config
  arm64: ptdump: Interpret pKVM ownership annotations
  arm64: ptdump: Add guest stage-2 pagetables dumping

 arch/arm64/include/asm/kvm_asm.h              |   1 +
 arch/arm64/include/asm/kvm_pgtable.h          | 111 +++++
 arch/arm64/include/asm/ptdump.h               |  49 +-
 arch/arm64/kvm/Kconfig                        |  13 +
 arch/arm64/kvm/Makefile                       |   1 +
 arch/arm64/kvm/arm.c                          |   2 +
 arch/arm64/kvm/debug.c                        |   6 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  27 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  12 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 186 ++++++++
 arch/arm64/kvm/hyp/pgtable.c                  |  85 ++--
 arch/arm64/kvm/kvm_ptdump.h                   |  21 +
 arch/arm64/kvm/ptdump.c                       | 436 ++++++++++++++++++
 arch/arm64/mm/ptdump.c                        |  55 +--
 14 files changed, 897 insertions(+), 108 deletions(-)
 create mode 100644 arch/arm64/kvm/kvm_ptdump.h
 create mode 100644 arch/arm64/kvm/ptdump.c
  

Comments

Oliver Upton Dec. 21, 2023, 6:36 p.m. UTC | #1
Hi Sebastian,

I'm going to try and review the series a bit more after the holidays,
but in general this is unecessarily complex. Furthermore, building out
the feature for pKVM *first* rather than 'normal' VMs makes it all very
challenging to follow.

I would strongly prefer that this series be split in two, adding support
for 'normal' VMs with a follow-up series for getting the pKVM plumbing
in place for host stage-2.
  
Sebastian Ene Dec. 21, 2023, 6:41 p.m. UTC | #2
On Thu, Dec 21, 2023 at 06:36:28PM +0000, Oliver Upton wrote:

Hello Oliver,

> Hi Sebastian,
> 
> I'm going to try and review the series a bit more after the holidays,
> but in general this is unecessarily complex. Furthermore, building out
> the feature for pKVM *first* rather than 'normal' VMs makes it all very
> challenging to follow.
> 

Thanks for having a look. This sounds good, I can split the series into
two afer the holidays and push a new version because I am really not
happy with my current state of the patches (especially because I
discovered some issues after I subimtted on the list).


> I would strongly prefer that this series be split in two, adding support
> for 'normal' VMs with a follow-up series for getting the pKVM plumbing
> in place for host stage-2.
> 
> -- 
> Thanks,
> Oliver

Thank you,
Seb