[RFC,0/3] Export APICv-related state via binary stats interface

Message ID 20240215160136.1256084-1-alejandro.j.jimenez@oracle.com
Headers
Series Export APICv-related state via binary stats interface |

Message

Alejandro Jimenez Feb. 15, 2024, 4:01 p.m. UTC
  The goal of this RFC is to agree on a mechanism for querying the state (and
related stats) of APICv/AVIC. I clearly have an AVIC bias when approaching this
topic since that is the side that I have mostly looked at, and has the greater
number of possible inhibits, but I believe the argument applies for both
vendor's technologies.

Currently, a user or monitoring app trying to determine if APICv is actually
being used needs implementation-specific knowlegde in order to look for specific
types of #VMEXIT (i.e. AVIC_INCOMPLETE_IPI/AVIC_NOACCEL), checking GALog events
by watching /proc/interrupts for AMD-Vi*-GA, etc. There are existing tracepoints
(e.g. kvm_apicv_accept_irq, kvm_avic_ga_log) that make this task easier, but
tracefs is not viable in some scenarios. Adding kvm debugfs entries has similar
downsides. Suravee has previously proposed a new IOCTL interface[0] to expose
this information, but there has not been any development in that direction.
Sean has mentioned a preference for using BPF to extract info from the current
tracepoints, which would require reworking existing structs to access some
desired data, but as far as I know there isn't any work done on that approach
yet.

Recently Joao mentioned another alternative: the binary stats framework that is
already supported by kernel[1] and QEMU[2]. This RFC has minimal code changes to
expose the relevant info based on the existing data types the framework already
supports. If there is consensus on using this approach, I can expand the fd
stats subsystem to include other data types (e.g. a bitmap type for exposing the
inhibit reasons), as well as adding documentation on KVM explaining which stats
are relevant for APICv and how to query them.

A basic example of retrieving the stats via qmp-shell, showing both a VM and
per-vCPU case:

# /usr/local/bin/qmp-shell --pretty ./qmp-sock

(QEMU) query-stats target=vm providers=[{'provider':'kvm','names':['apicv_inhibited']}]
{
    "return": [
        {
            "provider": "kvm",
            "stats": [
                {
                    "name": "apicv_inhibited",
                    "value": false
                }
            ]
        }
    ]
}

(QEMU) query-stats target=vcpu vcpus=['/machine/unattached/device[0]'] providers=[{'provider':'kvm','names':['apicv_accept_irq','ga_log_event']}]
{
    "return": [
        {
            "provider": "kvm",
            "qom-path": "/machine/unattached/device[0]",
            "stats": [
                {
                    "name": "ga_log_event",
                    "value": 98
                },
                {
                    "name": "apicv_accept_irq",
                    "value": 166920
                }
            ]
        }
    ]
}

If other alternatives are preferred, please let's use this thread to discuss and
I can take a shot at implementing the desired solution.

Regards,
Alejandro

[0] https://lore.kernel.org/qemu-devel/7e0d22fa-b9b0-ad1a-3a37-a450ec5d73e8@amd.com/
[1] https://lore.kernel.org/all/20210618222709.1858088-1-jingzhangos@google.com/
[2] https://lore.kernel.org/qemu-devel/20220530150714.756954-1-pbonzini@redhat.com/

Alejandro Jimenez (3):
  x86: KVM: stats: Add a stat to report status of APICv inhibition
  x86: KVM: stats: Add stat counter for IRQs injected via APICv
  x86: KVM: stats: Add a stat counter for GALog events

 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/svm/avic.c         |  4 +++-
 arch/x86/kvm/svm/svm.c          |  3 +++
 arch/x86/kvm/vmx/vmx.c          |  2 ++
 arch/x86/kvm/x86.c              | 12 +++++++++++-
 5 files changed, 22 insertions(+), 2 deletions(-)


base-commit: 7455665a3521aa7b56245c0a2810f748adc5fdd4