[RFC,v5,0/6] KVM: x86: add per-vCPU exits disable capability

Message ID	20230113220114.2437-1-kechenl@nvidia.com
Headers
  Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20;
  Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.117.161 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.117.161; helo=mail.nvidia.com; pr=C
  From: Kechen Lu <kechenl@nvidia.com>
  To: <kvm@vger.kernel.org>, <seanjc@google.com>, <pbonzini@redhat.com>
  CC: <chao.gao@intel.com>, <shaoqin.huang@intel.com>, <vkuznets@redhat.com>, <kechenl@nvidia.com>, <linux-kernel@vger.kernel.org>
  Subject: [RFC PATCH v5 0/6] KVM: x86: add per-vCPU exits disable capability
  Date: Fri, 13 Jan 2023 22:01:08 +0000
  Message-ID: <20230113220114.2437-1-kechenl@nvidia.com>
  MIME-Version: 1.0
  Content-Transfer-Encoding: 8bit
  Content-Type: text/plain
  Precedence: bulk
Series	KVM: x86: add per-vCPU exits disable capability
  [RFC,v5,0/6] KVM: x86: add per-vCPU exits disable capability
  [RFC,v5,1/6] KVM: x86: only allow exits disable before vCPUs created
  [RFC,v5,2/6] KVM: x86: Move *_in_guest power management flags to vCPU scope
  [RFC,v5,3/6] KVM: x86: Reject disabling of MWAIT interception when not allowed
  [RFC,v5,4/6] KVM: x86: Let userspace re-enable previously disabled exits
  [RFC,v5,5/6] KVM: x86: add vCPU scoped toggling for disabled exits
  [RFC,v5,6/6] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS

Message

Kechen Lu Jan. 13, 2023, 10:01 p.m. UTC

Summary
===========
Introduce support for a vCPU-scoped ioctl with the KVM_CAP_X86_DISABLE_EXITS
cap, so that VM exits can be disabled at per-vCPU granularity instead of
for the whole guest. This patch series enables vCPU-scoped exit control
and toggling.
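
For illustration only, a minimal userspace sketch of how the capability
might be enabled. It assumes the existing convention for
KVM_CAP_X86_DISABLE_EXITS, where KVM_ENABLE_CAP takes the exit mask in
args[0]; accepting the same call on a vCPU fd is what this series
proposes, not something every kernel supports. The helper names
make_disable_exits_cap() and disable_exits_on_fd() are hypothetical and
are not taken from the patches or the selftest.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallback for headers that do not expose the x86 flag (assumed uapi value). */
#ifndef KVM_X86_DISABLE_EXITS_HLT
#define KVM_X86_DISABLE_EXITS_HLT (1 << 1)
#endif

/*
 * Hypothetical helper: build the KVM_ENABLE_CAP argument.
 * args[0] carries the bitmask of exits to disable; flags stays zero.
 */
struct kvm_enable_cap make_disable_exits_cap(uint64_t exit_mask)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_X86_DISABLE_EXITS;
	cap.args[0] = exit_mask;
	return cap;
}

/*
 * Hypothetical helper: issue KVM_ENABLE_CAP on an already-open fd.
 * A VM fd gives the existing VM-scoped behaviour; a vCPU fd is assumed
 * to give the per-vCPU behaviour proposed by this series.
 * Returns 0 on success, -1 with errno set on failure.
 */
int disable_exits_on_fd(int fd, uint64_t exit_mask)
{
	struct kvm_enable_cap cap = make_disable_exits_cap(exit_mask);

	assert(fd >= 0); /* caller must pass an open KVM fd */
	return ioctl(fd, KVM_ENABLE_CAP, &cap);
}
```

With a vCPU fd obtained from KVM_CREATE_VCPU, a caller would use
disable_exits_on_fd(vcpu_fd, KVM_X86_DISABLE_EXITS_HLT) and check the
return value, since a kernel without this series is expected to reject
the vCPU-scoped call.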

Motivation
============
In use cases like a Windows guest running heavy CPU-bound workloads,
disabling HLT VM-exits can mitigate host scheduler context-switch
overhead. Simply disabling HLT exits on all vCPUs can bring performance
benefits, but if no pCPUs are reserved for host threads, it can lead to
forced preemption, because the host does not know when to schedule the
other host threads that want to run. With this series, we can disable
HLT exits on only a subset of a guest's vCPUs. This still keeps the
performance benefits, and also shows resiliency to a host stress
workload running at the same time.

Performance and Testing
=========================
In an experiment running a host stress workload alongside heavy
CPU-bound workloads in a Windows guest, the series shows good resiliency
and a ~3% performance improvement. E.g. Passmark running in a Windows
guest with HLT exits disabled on only half of the vCPUs still shows a
2.4% higher main score vs. baseline.

Tested everything on AMD machines.

v4->v5 :
- Drop the usage of KVM request, keep the VM-scoped exits disable
  as the existing design, and only allow per-vCPU settings to
  override the per-VM settings (Sean Christopherson)
- Refactor the disable exits selftest without introducing any
  new prerequisite patch, tests per-vCPU exits disable and overrides,
  and per-VM exits disable

v3->v4 (Chao Gao) :
- Use kvm vCPU request KVM_REQ_DISABLE_EXIT to perform the arch
  VMCS updating (patch 5)
- Fix selftests redundant arguments (patch 7)
- Merge overlapped fix bits from patch 4 to patch 3

v2->v3 (Sean Christopherson) :
- Reject KVM_CAP_X86_DISABLE_EXITS if userspace disables MWAIT exits
  when MWAIT is not allowed in guest (patch 3)
- Make userspace able to re-enable previously disabled exits (patch 4)
- Add mwait/pause/cstate exits flag toggling instead of only hlt
  exits (patch 5)
- Add selftests for KVM_CAP_X86_DISABLE_EXITS (patch 7)

v1->v2 (Sean Christopherson) :
- Add explicit restriction for VM-scoped exits disabling to be called
  before vCPUs creation (patch 1)
- Use vCPU ioctl instead of 64bit vCPU bitmask (patch 5), and make exits
  disable flags check purely for vCPU instead of VM (patch 2)

Best Regards,
Kechen

Kechen Lu (3):
  KVM: x86: Move *_in_guest power management flags to vCPU scope
  KVM: x86: add vCPU scoped toggling for disabled exits
  KVM: selftests: Add tests for VM and vCPU cap
    KVM_CAP_X86_DISABLE_EXITS

Sean Christopherson (3):
  KVM: x86: only allow exits disable before vCPUs created
  KVM: x86: Reject disabling of MWAIT interception when not allowed
  KVM: x86: Let userspace re-enable previously disabled exits

 Documentation/virt/kvm/api.rst                |   8 +-
 arch/x86/include/asm/kvm-x86-ops.h            |   1 +
 arch/x86/include/asm/kvm_host.h               |   7 +
 arch/x86/kvm/cpuid.c                          |   4 +-
 arch/x86/kvm/lapic.c                          |   7 +-
 arch/x86/kvm/svm/nested.c                     |   4 +-
 arch/x86/kvm/svm/svm.c                        |  42 +-
 arch/x86/kvm/vmx/vmx.c                        |  53 +-
 arch/x86/kvm/x86.c                            |  69 ++-
 arch/x86/kvm/x86.h                            |  16 +-
 include/uapi/linux/kvm.h                      |   4 +-
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/disable_exits_test.c | 457 ++++++++++++++++++
 13 files changed, 626 insertions(+), 47 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/disable_exits_test.c

Comments

Zhi Wang Jan. 18, 2023, 8:30 a.m. UTC | #1

On Fri, 13 Jan 2023 22:01:08 +0000
Kechen Lu <kechenl@nvidia.com> wrote:

Hi:

checkpatch.pl throws a lot of warnings and errors when I tried
this series. Can you fix them?

total: 470 errors, 22 warnings, 464 lines checked


Zhi Wang Jan. 18, 2023, 1:32 p.m. UTC | #2

On Wed, 18 Jan 2023 10:30:03 +0200
Zhi Wang <zhi.wang.linux@gmail.com> wrote:

Hi:

Not sure why the test never finishes on my test machine. Will take a look
later today.

My CPU: 
model name      : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz

branch kvm.git/master top commit:
310bc39546a435c83cc27a0eba878afac0d74714

-----

[inno@inno-lk-server x86_64]$ time ./disable_exits_test
VM-scoped tests start
Halter vCPU thread started
vCPU thread running vCPU 0
Halter vCPU thread reported its first HLT executed after 1 seconds.
vCPU thread running vCPU 1
Halter vCPU had 10 halt exits
Guest records 10 HLTs executed, waked 9 times
Halter vCPU thread started
vCPU thread running vCPU 0



^C

real    19m0.923s
user    37m56.512s
sys     0m1.086s

-----
