[v3,00/11] KVM: x86/mmu: refine memtype related mmu zap

Message ID 20230616023101.7019-1-yan.y.zhao@intel.com
Series KVM: x86/mmu: refine memtype related mmu zap

Message

Yan Zhao June 16, 2023, 2:31 a.m. UTC
  This series refines mmu zap caused by EPT memory type update when guest
MTRRs are honored.

The first 5 patches use helper functions to check whether KVM TDP honors
guest MTRRs, so that TDP zaps and page fault max_level reduction are
applied only to TDPs that honor guest MTRRs.
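
A minimal sketch of how such a helper could gate both the zap and the
max_level reduction. This is a user-space model, not the code from the
patches: struct vm_model, its fields, and both function names are
hypothetical, and the exact condition (TDP enabled, memtype carried in
TDP entries, at least one non-coherent DMA device) is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-VM state; not the real struct kvm. */
struct vm_model {
	bool tdp_enabled;           /* TDP (e.g. EPT) is in use */
	bool memtype_in_tdp;        /* TDP entries carry a memory type */
	int  noncoherent_dma_count; /* assigned non-coherent DMA devices */
};

/* Assumed condition: guest MTRRs only matter when all three hold. */
bool vm_honors_guest_mtrrs(const struct vm_model *vm)
{
	return vm->tdp_enabled && vm->memtype_in_tdp &&
	       vm->noncoherent_dma_count > 0;
}

/*
 * Page fault path: cap the mapping level only for VMs that honor guest
 * MTRRs. consistent_level is the largest level at which the guest MTRR
 * memtype is uniform for the faulting range (computed elsewhere).
 */
int fault_max_level(const struct vm_model *vm, int max_level,
		    int consistent_level)
{
	if (!vm_honors_guest_mtrrs(vm))
		return max_level;
	return consistent_level < max_level ? consistent_level : max_level;
}
```

Callers that zap on a memtype change would test the same helper first,
so VMs that ignore guest MTRRs skip the zap entirely.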

- The 5th patch triggers zapping of TDP leaf entries when the count of
  non-coherent DMA devices goes from 0 to 1 or from 1 to 0.
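
The 0 <-> 1 transition check can be modelled as below. This is only a
sketch: struct vm_model and all function names are hypothetical, the zap
is reduced to a counter, and a plain int stands in for whatever atomic
counter real code would need.

```c
#include <assert.h>

/* Hypothetical per-VM model; the zap is represented by a counter. */
struct vm_model {
	int noncoherent_dma_count;
	unsigned long zap_requests; /* number of "zap all TDP leafs" calls */
};

/* Stand-in for zapping all TDP leaf entries. */
void zap_all_tdp_leafs(struct vm_model *vm)
{
	vm->zap_requests++;
}

void register_noncoherent_dma(struct vm_model *vm)
{
	/* Only the first device (0 -> 1) changes the effective memtypes. */
	if (++vm->noncoherent_dma_count == 1)
		zap_all_tdp_leafs(vm);
}

void unregister_noncoherent_dma(struct vm_model *vm)
{
	assert(vm->noncoherent_dma_count > 0);
	/* Only removing the last device (1 -> 0) changes them back. */
	if (--vm->noncoherent_dma_count == 0)
		zap_all_tdp_leafs(vm);
}
```

Adding a second device or removing one of several leaves the count
non-zero, so no zap is requested in those cases.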

The last 6 patches are fixes and optimizations for the mmu zaps that
happen when guest MTRRs are honored. Those mmu zaps are usually triggered
from all vCPUs in bursts on all GFN ranges, and are intended to remove
stale memtypes from TDP entries.

- The 6th patch moves the TDP zap to when CR0.CD toggles and when guest
  MTRRs are updated under CR0.CD=0.

- The 7th-8th patches refine KVM_X86_QUIRK_CD_NW_CLEARED by removing the
  IPAT bit in EPT memtype when CR0.CD=1 and guest MTRRs are honored.

- The 9th-11th patches optimize the mmu zap when guest MTRRs are honored
  by serializing vCPUs' gfn zap requests and calculating precise
  fine-grained ranges to zap.
  They are put in mtrr.c because the optimizations apply only when guest
  MTRRs are honored and because they need to read guest MTRRs to compute
  the fine-grained ranges.
  Calls to kvm_unmap_gfn_range() are not included in the optimization,
  because they are not triggered from all vCPUs in bursts and not all of
  them are blockable. They usually happen at memslot removal and thus do
  not affect the mmu zaps when guest MTRRs are honored. Also, current
  performance data shows no observable difference in mmu zap performance
  when kvm_unmap_gfn_range() calls triggered by auto NUMA balancing are
  turned on or off.
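
To illustrate why serializing the requests helps when all vCPUs ask for
the same range in a burst, here is a single-threaded sketch of a pending
list that drops duplicate ranges. It is not the implementation in
patch 9: struct zap_queue, MAX_PENDING, and both functions are
hypothetical, and the locking a real per-VM list would need is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long long gfn_t;

struct zap_range {
	gfn_t start, end; /* half-open: [start, end) */
};

#define MAX_PENDING 16 /* arbitrary bound for this sketch */

struct zap_queue {
	struct zap_range pending[MAX_PENDING];
	size_t count;
	unsigned long zaps_done;
};

/* Queue a range unless an identical one is already pending. */
bool zap_queue_add(struct zap_queue *q, gfn_t start, gfn_t end)
{
	assert(start < end);
	for (size_t i = 0; i < q->count; i++)
		if (q->pending[i].start == start && q->pending[i].end == end)
			return false; /* another vCPU already asked for it */
	assert(q->count < MAX_PENDING);
	q->pending[q->count++] = (struct zap_range){ start, end };
	return true;
}

/* One caller at a time drains the list; returns ranges zapped. */
size_t zap_queue_drain(struct zap_queue *q)
{
	size_t n = q->count;

	q->zaps_done += n; /* a real zap of each range would go here */
	q->count = 0;
	return n;
}
```

With 8 vCPUs each requesting the whole 16G guest (gfns [0, 0x400000)
with 4K pages), this model queues the range once and zaps it once
instead of eight times.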

Reference performance data for the last 6 patches is shown below:

Base: base code before patch 6
C6-8: includes base code + patches 6 + 7 + 8
      patch 6: move TDP zaps from guest MTRRs update to CR0.CD toggling
      patch 7: drop IPAT in memtype when CD=1 for
               KVM_X86_QUIRK_CD_NW_CLEARED
      patch 8: move vmx code to get EPT memtype when CR0.CD=1 to x86 common
               code
C9:   includes C6-8 + patch 9
      patch 9: serialize vCPUs to zap gfn when guest MTRRs are honored
C10:  includes C9 + patch 10
      patch 10: fine-grained gfn zap when guest MTRRs are honored
C11:  includes C10 + patch 11
      patch 11: split a single gfn zap range when guest MTRRs are honored

vCPUs cnt: 8,  guest memory: 16G
Physical CPU frequency: 3100 MHz

     |              OVMF            |             Seabios          |
     | EPT zap cycles | EPT zap cnt | EPT zap cycles | EPT zap cnt |
Base |    3444.97M    |      84     |      61.29M    |      50     |
C6-8 |    4343.68M    |      74     |     503.04M    |      42     |*
C9   |     261.45M    |      74     |     106.64M    |      42     |
C10  |     157.42M    |      74     |      71.04M    |      42     |
C11  |      33.95M    |      74     |      24.04M    |      42     |

* With C6-8, EPT zap cnt is reduced because some MTRR updates happen
  under CR0.CD=1.
  EPT zap cycles increase a bit (especially with Seabios) because
  concurrency is more intense when CR0.CD toggles than when guest MTRRs
  are updated.
  (Patches 7 and 8 have negligible performance impact.)
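
For a feel of the scale, the cycle counts can be converted to time at
the stated 3100 MHz. The helpers below are only an illustration with
hypothetical names. The table does not say whether the cycles are summed
across vCPUs, so the results are aggregate figures, not necessarily
wall-clock boot delay.

```c
/* Millions of cycles at a frequency in MHz -> milliseconds. */
double mcycles_to_ms(double mcycles, double mhz)
{
	/* (mcycles * 1e6) / (mhz * 1e6) seconds, times 1000 for ms */
	return mcycles / mhz * 1000.0;
}

/* Ratio of zap cycles before and after a set of patches. */
double speedup(double before_mcycles, double after_mcycles)
{
	return before_mcycles / after_mcycles;
}
```

Under these assumptions, OVMF goes from about 1.11 s of zap cycles
(Base, 3444.97M) to about 11 ms (C11, 33.95M), roughly a 100x reduction.
Seabios goes from about 19.8 ms to about 7.8 ms, roughly 2.5x.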

Changelog:
v2 --> v3:
1. Updated patch 1 about definition of honor guest MTRRs helper. (Sean)
2. Added patch 2 to use honor guest MTRRs helper in kvm_tdp_page_fault().
   (Sean)
3. Removed unnecessary calculation of MTRR ranges.
   (Chao Gao, Kai Huang, Sean)
4. Updated patches 3-5 to use the helper. (Chao Gao, Kai Huang, Sean)
5. Added patches 6,7 to reposition TDP zap and drop IPAT bit. (Sean)
6. Added patch 8 to prepare for patch 10's memtype calculation when
   CR0.CD=1.
7. Added patches 9-11 to speed up MTRR update / CR0.CD toggle when guest
   MTRRs are honored. (Sean)
8. Dropped per-VM based MTRRs in v2 (Sean)

v1 --> v2:
1. Added a helper to skip non EPT case in patch 1
2. Added patch 2 to skip mmu zap when guest CR0_CD changes if EPT is not
   enabled. (Chao Gao)
3. Added patch 3 to skip mmu zap when guest MTRR changes if EPT is not
   enabled.
4. Do not mention TDX in patch 4 as the code is not merged yet (Chao Gao)
5. Added patches 5-6 to reduce EPT zap during guest bootup.

v2:
https://lore.kernel.org/all/20230509134825.1523-1-yan.y.zhao@intel.com/

v1:
https://lore.kernel.org/all/20230508034700.7686-1-yan.y.zhao@intel.com/

Yan Zhao (11):
  KVM: x86/mmu: helpers to return if KVM honors guest MTRRs
  KVM: x86/mmu: Use KVM honors guest MTRRs helper in
    kvm_tdp_page_fault()
  KVM: x86/mmu: Use KVM honors guest MTRRs helper when CR0.CD toggles
  KVM: x86/mmu: Use KVM honors guest MTRRs helper when update mtrr
  KVM: x86/mmu: zap KVM TDP when noncoherent DMA assignment starts/stops
  KVM: x86/mmu: move TDP zaps from guest MTRRs update to CR0.CD toggling
  KVM: VMX: drop IPAT in memtype when CD=1 for
    KVM_X86_QUIRK_CD_NW_CLEARED
  KVM: x86: move vmx code to get EPT memtype when CR0.CD=1 to x86 common
    code
  KVM: x86/mmu: serialize vCPUs to zap gfn when guest MTRRs are honored
  KVM: x86/mmu: fine-grained gfn zap when guest MTRRs are honored
  KVM: x86/mmu: split a single gfn zap range when guest MTRRs are
    honored

 arch/x86/include/asm/kvm_host.h |   4 +
 arch/x86/kvm/mmu.h              |   7 +
 arch/x86/kvm/mmu/mmu.c          |  18 +-
 arch/x86/kvm/mtrr.c             | 286 +++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c          |  11 +-
 arch/x86/kvm/x86.c              |  25 ++-
 arch/x86/kvm/x86.h              |   2 +
 7 files changed, 333 insertions(+), 20 deletions(-)


base-commit: 24ff4c08e5bbdd7399d45f940f10fed030dfadda
  

Comments

Sean Christopherson June 28, 2023, 11:02 p.m. UTC | #1
On Fri, Jun 16, 2023, Yan Zhao wrote:
> This series refines mmu zap caused by EPT memory type update when guest
> MTRRs are honored.

...

> Yan Zhao (11):
>   KVM: x86/mmu: helpers to return if KVM honors guest MTRRs
>   KVM: x86/mmu: Use KVM honors guest MTRRs helper in
>     kvm_tdp_page_fault()
>   KVM: x86/mmu: Use KVM honors guest MTRRs helper when CR0.CD toggles
>   KVM: x86/mmu: Use KVM honors guest MTRRs helper when update mtrr
>   KVM: x86/mmu: zap KVM TDP when noncoherent DMA assignment starts/stops
>   KVM: x86/mmu: move TDP zaps from guest MTRRs update to CR0.CD toggling
>   KVM: VMX: drop IPAT in memtype when CD=1 for
>     KVM_X86_QUIRK_CD_NW_CLEARED
>   KVM: x86: move vmx code to get EPT memtype when CR0.CD=1 to x86 common
>     code
>   KVM: x86/mmu: serialize vCPUs to zap gfn when guest MTRRs are honored
>   KVM: x86/mmu: fine-grained gfn zap when guest MTRRs are honored
>   KVM: x86/mmu: split a single gfn zap range when guest MTRRs are
>     honored

I got through the easy patches, I'll circle back for the last few patches in a
few weeks (probably 3+ weeks at this point).
  
Yan Zhao July 14, 2023, 7:11 a.m. UTC | #2
On Wed, Jun 28, 2023 at 04:02:05PM -0700, Sean Christopherson wrote:
> On Fri, Jun 16, 2023, Yan Zhao wrote:
> > This series refines mmu zap caused by EPT memory type update when guest
> > MTRRs are honored.
> 
> ...
> 
> > Yan Zhao (11):
> >   KVM: x86/mmu: helpers to return if KVM honors guest MTRRs
> >   KVM: x86/mmu: Use KVM honors guest MTRRs helper in
> >     kvm_tdp_page_fault()
> >   KVM: x86/mmu: Use KVM honors guest MTRRs helper when CR0.CD toggles
> >   KVM: x86/mmu: Use KVM honors guest MTRRs helper when update mtrr
> >   KVM: x86/mmu: zap KVM TDP when noncoherent DMA assignment starts/stops
> >   KVM: x86/mmu: move TDP zaps from guest MTRRs update to CR0.CD toggling
> >   KVM: VMX: drop IPAT in memtype when CD=1 for
> >     KVM_X86_QUIRK_CD_NW_CLEARED
> >   KVM: x86: move vmx code to get EPT memtype when CR0.CD=1 to x86 common
> >     code
> >   KVM: x86/mmu: serialize vCPUs to zap gfn when guest MTRRs are honored
> >   KVM: x86/mmu: fine-grained gfn zap when guest MTRRs are honored
> >   KVM: x86/mmu: split a single gfn zap range when guest MTRRs are
> >     honored
> 
> I got through the easy patches, I'll circle back for the last few patches in a
> few weeks (probably 3+ weeks at this point).
Thanks for this heads-up.
I have now addressed almost all the comments on v3, except the one about
where to get the memtype for CR0.CD=1. Feel free to decline my new
proposal in v4, as explained in another mail :)
v4 is available here:
https://lore.kernel.org/all/20230714064656.20147-1-yan.y.zhao@intel.com/
Please review the new version directly.

Thanks!