[RFC,00/14] KVM: x86/MMU: Formalize the Shadow MMU

Message ID	20221221222418.3307832-1-bgardon@google.com
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Date: Wed, 21 Dec 2022 22:24:04 +0000 Mime-Version: 1.0 Message-ID: <20221221222418.3307832-1-bgardon@google.com> Subject: [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU From: Ben Gardon <bgardon@google.com> To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>, Sean Christopherson <seanjc@google.com>, David Matlack <dmatlack@google.com>, Vipin Sharma <vipinsh@google.com>, Nagareddy Reddy <nspreddy@google.com>, Ben Gardon <bgardon@google.com> Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	KVM: x86/MMU: Formalize the Shadow MMU \| [RFC,00/14] KVM: x86/MMU: Formalize the Shadow MMU [RFC,01/14] KVM: x86/MMU: Add shadow_mmu.(c\|h) [RFC,02/14] KVM: x86/MMU: Expose functions for the Shadow MMU [RFC,03/14] KVM: x86/MMU: Move the Shadow MMU implementation to shadow_mmu.c [RFC,04/14] KVM: x86/MMU: Expose functions for paging_tmpl.h [RFC,05/14] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c [RFC,06/14] KVM: x86/MMU: Clean up Shadow MMU exports [RFC,07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU [RFC,08/14] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions [RFC,09/14] KVM: x86/MMU: Only make pages available on Shadow MMU fault [RFC,10/14] KVM: x86/MMU: Fix naming on prepare / commit zap page functions [RFC,11/14] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c [RFC,12/14] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c [RFC,13/14] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c [RFC,14/14] KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in shadow_mmu.h

Message ID

20221221222418.3307832-1-bgardon@google.com

Headers

Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org
 designates 2620:137:e000::1:20 as permitted sender)
 client-ip=2620:137:e000::1:20;
Date: Wed, 21 Dec 2022 22:24:04 +0000
Mime-Version: 1.0
Message-ID: <20221221222418.3307832-1-bgardon@google.com>
Subject: [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU
From: Ben Gardon <bgardon@google.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        David Matlack <dmatlack@google.com>,
        Vipin Sharma <vipinsh@google.com>,
        Nagareddy Reddy <nspreddy@google.com>,
        Ben Gardon <bgardon@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk

Series

KVM: x86/MMU: Formalize the Shadow MMU |

Message

Ben Gardon Dec. 21, 2022, 10:24 p.m. UTC

  This series makes the Shadow MMU a distinct part of the KVM x86 MMU,
implemented in separate files, with a defined interface to common code.

When the TDP (Two Dimensional Paging) MMU was added to x86 KVM, it came in
a separate file with a (reasonably) clear interface. This lead to many points
in the KVM MMU like this:

if (tdp_mmu_on())
	kvm_tdp_mmu_do_stuff()

if (memslots_have_rmaps())
	/* Do whatever was being done before */

The implementations of various functions which preceded the TDP MMU have
remained scattered around mmu.c with no clear identity or interface. Over the
last couple years, the KVM x86 community has settled on calling the KVM MMU
implementation which preceded the TDP MMU the "Shadow MMU", as it grew
from shadow paging, which supported virtualization on hardware pre-TDP.

Splitting it out into separate files will give a clear interface and make it
easier to distinguish common x86 MMU code from the code specific to the two
implementations.

This series is almost all pure refactors, with just one functional
change to clean up unnecessary work in the page fault handler, revealed
by the refactors.

Patches 1 and 2 prepare for the refactor by adding files and exporting
functions.
(I'm not familiar with the current rules about copyright notices and
authorship credit, so I didn't put anything at the top of shadow_mmu.c,
but since it's going to be filled with code from mmu.c please let me
know if that should change.)

Patch 3 is the big move, transferring 3.5K lines from mmu.c to
shadow_mmu.c
(It may be best if whoever ends up preparing the pull request with
this patch just dumps my version and re-does the move so that no code is
lost.)

Patches 4-6 move the includes for paging_tmpl.h to shadow_mmu.c

Patch 9 is the only functional change, removing an unnecessary operation
from the TDP MMU PF path.

The remaining patches clean up the interface between the Shadow MMU and
common MMU code.

Patch 3 is an enormous change, and doing it all at once in a single
commit all but guarantees merge conflicts and makes it hard to review. I
don't have a good answer to this problem as there's no easy way to move
3.5K lines between files. I tried moving the code bit-by-bit but the
intermediate steps added complexity and ultimately the 50+ patches it
created didn't seem any easier to review.
Doing the big move all at once at least makes it easier to get past when
doing Git archeology, and doing it at the beggining of the series allows the
rest of the commits to still show up in Git blame.

I've tested this series on an Intel Skylake host with kvm-unit-tests and
selftests.

This series builds on 9352e7470a1b4edd2fa9d235420ecc7bc3971bdc. Sean
Christopherson suggested I send out another version right before the
beginning of a merge window so that we can merge this onto the queue early
and have time to pile everything else on top of it. I'll do that once we've
flushed out reviews and feedback.

Please consider this a replacement for the previous series of rmap
refactors I sent out. This replaces all that code movement but does not
preclude any of the great pte_list refactor / rename ideas which were
discussed there.

Thanks everyone for your feedback and happy holidays.

Ben Gardon (14):
  KVM: x86/MMU: Add shadow_mmu.(c|h)
  KVM: x86/MMU: Expose functions for the Shadow MMU
  KVM: x86/MMU: Move the Shadow MMU implementation to shadow_mmu.c
  KVM: x86/MMU: Expose functions for paging_tmpl.h
  KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c
  KVM: x86/MMU: Clean up Shadow MMU exports
  KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
  KVM: x86/MMU: Clean up naming of exported Shadow MMU functions
  KVM: x86/MMU: Only make pages available on Shadow MMU fault
  KVM: x86/MMU: Fix naming on prepare / commit zap page functions
  KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c
  KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c
  KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c
  KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in
    shadow_mmu.h

 arch/x86/kvm/Makefile           |    2 +-
 arch/x86/kvm/debugfs.c          |    1 +
 arch/x86/kvm/mmu/mmu.c          | 4671 ++++---------------------------
 arch/x86/kvm/mmu/mmu_internal.h |   44 +-
 arch/x86/kvm/mmu/paging_tmpl.h  |   13 +-
 arch/x86/kvm/mmu/shadow_mmu.c   | 3538 +++++++++++++++++++++++
 arch/x86/kvm/mmu/shadow_mmu.h   |  114 +
 7 files changed, 4296 insertions(+), 4087 deletions(-)
 create mode 100644 arch/x86/kvm/mmu/shadow_mmu.c
 create mode 100644 arch/x86/kvm/mmu/shadow_mmu.h

Comments

David Matlack Jan. 6, 2023, 7:18 p.m. UTC | #1

On Wed, Dec 21, 2022 at 10:24:04PM +0000, Ben Gardon wrote:
> This series makes the Shadow MMU a distinct part of the KVM x86 MMU,
> implemented in separate files, with a defined interface to common code.

Overall I really like the end result.

While looking through I found a few more bits of code that should
probably be moved into shadow_mmu.c:

 - kvm_mmu_zap_all(): Move the shadow MMU zapping to shadow_mmu.c (the
   active_mmu_pages loop + commit_zap_page).

 - need_topup(), need_topup_split_caches_or_resched()
   topup_split_caches() should be static functions in shadow_mmu.c.

 - Split out kvm_mmu_init/uninit_vm() functions for the shadow MMU.
   Notably, the split caches, active_mmu_pages, zapped_obsolete_pages,
   and other Shadow MMU-specific stuff can go in shadow_mmu.c.

 - The Shadow MMU parts of walk_shadow_page_lockless_begin/end() should
   go in shadow_mmu.c. e.g. kvm_shadow_mmu_walk_lockless_begin/end().

> Patch 3 is an enormous change, and doing it all at once in a single
> commit all but guarantees merge conflicts and makes it hard to review. I
> don't have a good answer to this problem as there's no easy way to move
> 3.5K lines between files. I tried moving the code bit-by-bit but the
> intermediate steps added complexity and ultimately the 50+ patches it
> created didn't seem any easier to review.
> Doing the big move all at once at least makes it easier to get past when
> doing Git archeology, and doing it at the beggining of the series allows the
> rest of the commits to still show up in Git blame.

An alternative would be to rename mmu.c to shadow_mmu.c first and then
move code in the opposite direction. That would preserve the git-blame
history for shadow_mmu.c. But by the end of the series mmu.c and
shadow_mmu.c are both ~3K LOC, so I don't think doing this is really any
better. Either way, you have to move ~3K LOC.

Ben Gardon Jan. 9, 2023, 6:43 p.m. UTC | #2

On Fri, Jan 6, 2023 at 11:18 AM David Matlack <dmatlack@google.com> wrote:
>
> On Wed, Dec 21, 2022 at 10:24:04PM +0000, Ben Gardon wrote:
> > This series makes the Shadow MMU a distinct part of the KVM x86 MMU,
> > implemented in separate files, with a defined interface to common code.
>
> Overall I really like the end result.
>
> While looking through I found a few more bits of code that should
> probably be moved into shadow_mmu.c:
>
>  - kvm_mmu_zap_all(): Move the shadow MMU zapping to shadow_mmu.c (the
>    active_mmu_pages loop + commit_zap_page).
>
>  - need_topup(), need_topup_split_caches_or_resched()
>    topup_split_caches() should be static functions in shadow_mmu.c.
>
>  - Split out kvm_mmu_init/uninit_vm() functions for the shadow MMU.
>    Notably, the split caches, active_mmu_pages, zapped_obsolete_pages,
>    and other Shadow MMU-specific stuff can go in shadow_mmu.c.
>
>  - The Shadow MMU parts of walk_shadow_page_lockless_begin/end() should
>    go in shadow_mmu.c. e.g. kvm_shadow_mmu_walk_lockless_begin/end().

Awesome, thank you for pointing these out. I'll work them into a V1.

>
> > Patch 3 is an enormous change, and doing it all at once in a single
> > commit all but guarantees merge conflicts and makes it hard to review. I
> > don't have a good answer to this problem as there's no easy way to move
> > 3.5K lines between files. I tried moving the code bit-by-bit but the
> > intermediate steps added complexity and ultimately the 50+ patches it
> > created didn't seem any easier to review.
> > Doing the big move all at once at least makes it easier to get past when
> > doing Git archeology, and doing it at the beggining of the series allows the
> > rest of the commits to still show up in Git blame.
>
> An alternative would be to rename mmu.c to shadow_mmu.c first and then
> move code in the opposite direction. That would preserve the git-blame
> history for shadow_mmu.c. But by the end of the series mmu.c and
> shadow_mmu.c are both ~3K LOC, so I don't think doing this is really any
> better. Either way, you have to move ~3K LOC.

I tried implementing this refactor both ways and ultimately found this
way to be a lot cleaner. Preserving the git blame for the Shadow MMU
code would be nice, since IMO it's the more complex code, but it got
complicated quickly. The in-between stages of moving around function
definitions to header files, and detangling code to move it back to
mmu.c, was a nightmare. It's relatively easy to move the leaf
functions in the call-tree, but I found moving the upper level
functions was difficult to do bit-by-bit.
If anyone wants to try implementing this commit in a more elegant way,
I'm happy to rebase the rest of the series on  top of it.
As you said, either way we gotta move 3K lines of code.

Sean Christopherson Feb. 1, 2023, 8:02 p.m. UTC | #3

On Wed, Dec 21, 2022, Ben Gardon wrote:
> This series builds on 9352e7470a1b4edd2fa9d235420ecc7bc3971bdc.

Before you send the next version, can you tweak your workflow to generate the
base commit via `git format-patch --base`?  That makes it much easier for humans
and scripts to find the base commit, and saves you from having to remember to
manually specify the base.  Because of the code movement, applying this series
without the precise base is an exercise in frustration.

E.g. my workflow does

	git format-patch --base=HEAD~$nr <more crud>

where $nr is the number of patches to generate.  There's also an "auto" option,
but IIRC that only works if you have the upstream pointing at the base, e.g. it
falls apart if upstream points at your own remote "backup" repo.

Ben Gardon Feb. 1, 2023, 8:45 p.m. UTC | #4

On Wed, Feb 1, 2023 at 12:02 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Dec 21, 2022, Ben Gardon wrote:
> > This series builds on 9352e7470a1b4edd2fa9d235420ecc7bc3971bdc.
>
> Before you send the next version, can you tweak your workflow to generate the
> base commit via `git format-patch --base`?  That makes it much easier for humans
> and scripts to find the base commit, and saves you from having to remember to
> manually specify the base.  Because of the code movement, applying this series
> without the precise base is an exercise in frustration.
>
> E.g. my workflow does
>
>         git format-patch --base=HEAD~$nr <more crud>
>
> where $nr is the number of patches to generate.  There's also an "auto" option,
> but IIRC that only works if you have the upstream pointing at the base, e.g. it
> falls apart if upstream points at your own remote "backup" repo.

Sure thing, thanks for the tip!