[v1,00/18] KVM selftests code consolidation and cleanup

Message ID 20221024113445.1022147-1-wei.w.wang@intel.com
Headers
Series KVM selftests code consolidation and cleanup |

Message

Wang, Wei W Oct. 24, 2022, 11:34 a.m. UTC
  This patch series intends to improve kvm selftests with better code
consolidation using the helper functions to perform vcpu and thread
related operations.

In general, several aspects are improved:
1) The current users allocate an array of vcpu pointers to the vcpus that
   are added to a vm, and an array of vcpu threads. This isn't necessary
   as kvm_vm already maintains a list of added vcpus. This series changes
   the list of vcpus in the kvm_vm struct to a vcpu array for users to
   work with and removes each user's own allocation of such vcpu arrays.
   Aslo add the vcpu thread to the kvm_vcpu struct, so that users don't
   need to explicitly allocate a thread array to manage vcpu threads on
   their own.
2) Change the users to use the helper functions provided by this series
   with the following enhancements:
   - Many users working with pthread_create/join forgot to check if
     error on returning. The helper functions have handled thoses inside,
     so users don't need to handle them by themselves;
   - The vcpu threads created via the helper functions are named in
     "vcpu-##id" format. Naming the threads facilitates debugging,
     performance tuning, runtime pining etc;
   - helper functions named with "vm_vcpu_threads_" iterates over all the
     vcpus that have been added to the vm. Users don't need a explicit
     loop to go through the added cpus by themselves.
3) kvm_vcpu is used as the interface parameter to the vcpu thread's
   start routine, and the user specific data is made to be the private
   data in kvm_vcpu. This can simplify the user specific data structures,
   as kvm_vcpu has already included the required info for the thread, for
   example, in patch 13, the cpu_idx field from "struct vcpu_thread"
   is a duplicate of vcpu->id.

The changes have been tested on an SPR server. Patch 15 and 16 haven't
been tested due to the lack of an ARM environment currently.

Wei Wang (18):
  KVM: selftests/kvm_util: use array of pointers to maintain vcpus in
    kvm_vm
  KVM: selftests/kvm_util: use vm->vcpus[] when create vm with vcpus
  KVM: selftests/kvm_util: helper functions for vcpus and threads
  KVM: selftests/kvm_page_table_test: vcpu related code consolidation
  KVM: selftests/hardware_disable_test: code consolidation and cleanup
  KVM: selftests/dirty_log_test: vcpu related code consolidation
  KVM: selftests/max_guest_memory_test: vcpu related code consolidation
  KVM: selftests/set_memory_region_test: vcpu related code consolidation
  KVM: selftests/steal_time: vcpu related code consolidation and cleanup
  KVM: selftests/tsc_scaling_sync: vcpu related code consolidation
  KVM: selftest/xapic_ipi_test: vcpu related code consolidation
  KVM: selftests/rseq_test: name the migration thread and some cleanup
  KVM: selftests/perf_test_util: vcpu related code consolidation
  KVM: selftest/memslot_perf_test: vcpu related code consolidation
  KVM: selftests/vgic_init: vcpu related code consolidation
  KVM: selftest/arch_timer: vcpu related code consolidation
  KVM: selftests: remove the *vcpu[] input from __vm_create_with_vcpus
  KVM: selftests/kvm_create_max_vcpus: check KVM_MAX_VCPUS

 .../selftests/kvm/aarch64/arch_timer.c        |  42 ++--
 .../testing/selftests/kvm/aarch64/vgic_init.c |  35 ++-
 .../selftests/kvm/access_tracking_perf_test.c |  18 +-
 .../selftests/kvm/demand_paging_test.c        |   9 +-
 .../selftests/kvm/dirty_log_perf_test.c       |  11 +-
 tools/testing/selftests/kvm/dirty_log_test.c  |  16 +-
 .../selftests/kvm/hardware_disable_test.c     |  56 ++---
 .../testing/selftests/kvm/include/kvm_util.h  |  24 ++
 .../selftests/kvm/include/kvm_util_base.h     |  12 +-
 .../selftests/kvm/include/perf_test_util.h    |   9 +-
 .../selftests/kvm/kvm_create_max_vcpus.c      |   7 +
 .../selftests/kvm/kvm_page_table_test.c       |  16 +-
 tools/testing/selftests/kvm/lib/kvm_util.c    | 217 +++++++++++++++---
 .../selftests/kvm/lib/perf_test_util.c        |  68 ++----
 .../selftests/kvm/lib/x86_64/perf_test_util.c |  11 +-
 tools/testing/selftests/kvm/lib/x86_64/vmx.c  |   2 +-
 .../selftests/kvm/max_guest_memory_test.c     |  53 ++---
 .../kvm/memslot_modification_stress_test.c    |   9 +-
 .../testing/selftests/kvm/memslot_perf_test.c | 137 +++++------
 tools/testing/selftests/kvm/rseq_test.c       |  10 +-
 .../selftests/kvm/set_memory_region_test.c    |  16 +-
 tools/testing/selftests/kvm/steal_time.c      |  15 +-
 .../selftests/kvm/x86_64/tsc_scaling_sync.c   |  25 +-
 .../selftests/kvm/x86_64/xapic_ipi_test.c     |  54 ++---
 24 files changed, 476 insertions(+), 396 deletions(-)
  

Comments

David Matlack Oct. 26, 2022, 9:22 p.m. UTC | #1
On Mon, Oct 24, 2022 at 07:34:27PM +0800, Wei Wang wrote:
> This patch series intends to improve kvm selftests with better code
> consolidation using the helper functions to perform vcpu and thread
> related operations.
> 
> In general, several aspects are improved:
> 1) The current users allocate an array of vcpu pointers to the vcpus that
>    are added to a vm, and an array of vcpu threads. This isn't necessary
>    as kvm_vm already maintains a list of added vcpus. This series changes
>    the list of vcpus in the kvm_vm struct to a vcpu array for users to
>    work with and removes each user's own allocation of such vcpu arrays.
>    Aslo add the vcpu thread to the kvm_vcpu struct, so that users don't
>    need to explicitly allocate a thread array to manage vcpu threads on
>    their own.
> 2) Change the users to use the helper functions provided by this series
>    with the following enhancements:
>    - Many users working with pthread_create/join forgot to check if
>      error on returning. The helper functions have handled thoses inside,
>      so users don't need to handle them by themselves;
>    - The vcpu threads created via the helper functions are named in
>      "vcpu-##id" format. Naming the threads facilitates debugging,
>      performance tuning, runtime pining etc;
>    - helper functions named with "vm_vcpu_threads_" iterates over all the
>      vcpus that have been added to the vm. Users don't need a explicit
>      loop to go through the added cpus by themselves.
> 3) kvm_vcpu is used as the interface parameter to the vcpu thread's
>    start routine, and the user specific data is made to be the private
>    data in kvm_vcpu. This can simplify the user specific data structures,
>    as kvm_vcpu has already included the required info for the thread, for
>    example, in patch 13, the cpu_idx field from "struct vcpu_thread"
>    is a duplicate of vcpu->id.

I haven't dug too much into the actual code yet, but I have some high
level feedback based on a quick look through the series:

 - Use the format "KVM: selftests: <Decsription>" for the shortlog.

 - Make the shortlog more specific. "vcpu related code consolidation" is
   vague.

 - Do not introduce bugs and then fix them in subsequent commits.  This
   breaks bisection. For example, kvm_page_table_test is broken at "KVM:
   selftests/kvm_util: use vm->vcpus[] when create vm with vcpus" and
   then fixed by "KVM: selftests/kvm_page_table_test: vcpu related code
   consolidation".

 - Try to limit each patch to one logical change. This is somewhat more
   art than science, but the basic idea is to avoid changing too much at
   once so that the code is easier to review and bisect. For example,
   "KVM: selftests/perf_test_util: vcpu related code consolidation" has
   a list of 6 different changes being made in the commit description.
   This is a sure sign this commit should be broken up. The same applies
   to many of the other patches. This will also make it easier to come
   up with more specific shortlogs.

> 
> The changes have been tested on an SPR server. Patch 15 and 16 haven't
> been tested due to the lack of an ARM environment currently.
> 
> Wei Wang (18):
>   KVM: selftests/kvm_util: use array of pointers to maintain vcpus in
>     kvm_vm
>   KVM: selftests/kvm_util: use vm->vcpus[] when create vm with vcpus
>   KVM: selftests/kvm_util: helper functions for vcpus and threads
>   KVM: selftests/kvm_page_table_test: vcpu related code consolidation
>   KVM: selftests/hardware_disable_test: code consolidation and cleanup
>   KVM: selftests/dirty_log_test: vcpu related code consolidation
>   KVM: selftests/max_guest_memory_test: vcpu related code consolidation
>   KVM: selftests/set_memory_region_test: vcpu related code consolidation
>   KVM: selftests/steal_time: vcpu related code consolidation and cleanup
>   KVM: selftests/tsc_scaling_sync: vcpu related code consolidation
>   KVM: selftest/xapic_ipi_test: vcpu related code consolidation
>   KVM: selftests/rseq_test: name the migration thread and some cleanup
>   KVM: selftests/perf_test_util: vcpu related code consolidation
>   KVM: selftest/memslot_perf_test: vcpu related code consolidation
>   KVM: selftests/vgic_init: vcpu related code consolidation
>   KVM: selftest/arch_timer: vcpu related code consolidation
>   KVM: selftests: remove the *vcpu[] input from __vm_create_with_vcpus
>   KVM: selftests/kvm_create_max_vcpus: check KVM_MAX_VCPUS
> 
>  .../selftests/kvm/aarch64/arch_timer.c        |  42 ++--
>  .../testing/selftests/kvm/aarch64/vgic_init.c |  35 ++-
>  .../selftests/kvm/access_tracking_perf_test.c |  18 +-
>  .../selftests/kvm/demand_paging_test.c        |   9 +-
>  .../selftests/kvm/dirty_log_perf_test.c       |  11 +-
>  tools/testing/selftests/kvm/dirty_log_test.c  |  16 +-
>  .../selftests/kvm/hardware_disable_test.c     |  56 ++---
>  .../testing/selftests/kvm/include/kvm_util.h  |  24 ++
>  .../selftests/kvm/include/kvm_util_base.h     |  12 +-
>  .../selftests/kvm/include/perf_test_util.h    |   9 +-
>  .../selftests/kvm/kvm_create_max_vcpus.c      |   7 +
>  .../selftests/kvm/kvm_page_table_test.c       |  16 +-
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 217 +++++++++++++++---
>  .../selftests/kvm/lib/perf_test_util.c        |  68 ++----
>  .../selftests/kvm/lib/x86_64/perf_test_util.c |  11 +-
>  tools/testing/selftests/kvm/lib/x86_64/vmx.c  |   2 +-
>  .../selftests/kvm/max_guest_memory_test.c     |  53 ++---
>  .../kvm/memslot_modification_stress_test.c    |   9 +-
>  .../testing/selftests/kvm/memslot_perf_test.c | 137 +++++------
>  tools/testing/selftests/kvm/rseq_test.c       |  10 +-
>  .../selftests/kvm/set_memory_region_test.c    |  16 +-
>  tools/testing/selftests/kvm/steal_time.c      |  15 +-
>  .../selftests/kvm/x86_64/tsc_scaling_sync.c   |  25 +-
>  .../selftests/kvm/x86_64/xapic_ipi_test.c     |  54 ++---
>  24 files changed, 476 insertions(+), 396 deletions(-)
> 
> -- 
> 2.27.0
>
  
Wang, Wei W Oct. 27, 2022, 12:18 p.m. UTC | #2
On Thursday, October 27, 2022 5:23 AM, David Matlack:
> On Mon, Oct 24, 2022 at 07:34:27PM +0800, Wei Wang wrote:
> > This patch series intends to improve kvm selftests with better code
> > consolidation using the helper functions to perform vcpu and thread
> > related operations.
> >
> > In general, several aspects are improved:
> > 1) The current users allocate an array of vcpu pointers to the vcpus that
> >    are added to a vm, and an array of vcpu threads. This isn't necessary
> >    as kvm_vm already maintains a list of added vcpus. This series changes
> >    the list of vcpus in the kvm_vm struct to a vcpu array for users to
> >    work with and removes each user's own allocation of such vcpu arrays.
> >    Aslo add the vcpu thread to the kvm_vcpu struct, so that users don't
> >    need to explicitly allocate a thread array to manage vcpu threads on
> >    their own.
> > 2) Change the users to use the helper functions provided by this series
> >    with the following enhancements:
> >    - Many users working with pthread_create/join forgot to check if
> >      error on returning. The helper functions have handled thoses inside,
> >      so users don't need to handle them by themselves;
> >    - The vcpu threads created via the helper functions are named in
> >      "vcpu-##id" format. Naming the threads facilitates debugging,
> >      performance tuning, runtime pining etc;
> >    - helper functions named with "vm_vcpu_threads_" iterates over all the
> >      vcpus that have been added to the vm. Users don't need a explicit
> >      loop to go through the added cpus by themselves.
> > 3) kvm_vcpu is used as the interface parameter to the vcpu thread's
> >    start routine, and the user specific data is made to be the private
> >    data in kvm_vcpu. This can simplify the user specific data structures,
> >    as kvm_vcpu has already included the required info for the thread, for
> >    example, in patch 13, the cpu_idx field from "struct vcpu_thread"
> >    is a duplicate of vcpu->id.
> 
> I haven't dug too much into the actual code yet, but I have some high level
> feedback based on a quick look through the series:
> 
>  - Use the format "KVM: selftests: <Decsription>" for the shortlog.

I know it's not common to see so far, but curious is this the required format?
I didn't find where it's documented. If it's indeed a requirement, probably we
also need to enhance checkpatch.pl to detect this.

If it's not required, I think it is more obvious to have /sub_field in the title,
e.g. selftests/hardware_disable_test, to outline which specific part of
selftests the patch is changing. (the selftests are growing larger with many
usages independent of each other).

> 
>  - Make the shortlog more specific. "vcpu related code consolidation" is
>    vague.
> 
>  - Do not introduce bugs and then fix them in subsequent commits.  This
>    breaks bisection. For example, kvm_page_table_test is broken at "KVM:
>    selftests/kvm_util: use vm->vcpus[] when create vm with vcpus" and
>    then fixed by "KVM: selftests/kvm_page_table_test: vcpu related code
>    consolidation".
> 
>  - Try to limit each patch to one logical change. This is somewhat more
>    art than science, but the basic idea is to avoid changing too much at
>    once so that the code is easier to review and bisect. For example,
>    "KVM: selftests/perf_test_util: vcpu related code consolidation" has
>    a list of 6 different changes being made in the commit description.
>    This is a sure sign this commit should be broken up. The same applies
>    to many of the other patches. This will also make it easier to come
>    up with more specific shortlogs.

OK, will re-organize the patches.
  
Sean Christopherson Oct. 27, 2022, 3:44 p.m. UTC | #3
On Thu, Oct 27, 2022, Wang, Wei W wrote:
> On Thursday, October 27, 2022 5:23 AM, David Matlack:
> > I haven't dug too much into the actual code yet, but I have some high level
> > feedback based on a quick look through the series:
> > 
> >  - Use the format "KVM: selftests: <Decsription>" for the shortlog.
> 
> I know it's not common to see so far, but curious is this the required format?

It's definitely the preferred format.

> I didn't find where it's documented.

Heh, for all shortlog scopes, the "documentation" is `git log --pretty=oneline` :-)

> If it's indeed a requirement, probably we also need to enhance checkpatch.pl
> to detect this.

I like the idea in theory, but that'd be a daunting task to set up, and quite the
maintenance nightmare.  There are probably thousands of file => scope mappings
throughout the kernel, with any number of exceptions and arbitrary rules.

> If it's not required, I think it is more obvious to have /sub_field in the title,
> e.g. selftests/hardware_disable_test, to outline which specific part of
> selftests the patch is changing. (the selftests are growing larger with many
> usages independent of each other).

I agree that "KVM: selftests:" is a rather large umbrella, but it's not obvious
that "KVM: selfetest/<test>" is unequivocally better, e.g. if someone is making a
change to x86_64/vmx_exception_with_invalid_guest_state.c, the scope will be

  KVM: selftests/x86_64/vmx_exception_with_invalid_guest_state:

or 

  KVM: selftests/vmx_exception_with_invalid_guest_state:

which doesn't leave a whole lot of room for an actual shortlog.

When reviewing selftests patches, I do find myself pausing sometimes to see exactly
what file/test is being modified, but in almost all cases a quick glance at the
diffstat provides the answer.  And on the flip side, having all selftests patches
exactly match "KVM: selftests:" makes it super easy to identify selftest changes,
i.e. it's mostly a wash overall in terms of efficiency (for me at least).
  
David Matlack Oct. 27, 2022, 4:24 p.m. UTC | #4
On Thu, Oct 27, 2022 at 8:44 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Oct 27, 2022, Wang, Wei W wrote:
> > On Thursday, October 27, 2022 5:23 AM, David Matlack:
> > > I haven't dug too much into the actual code yet, but I have some high level
> > > feedback based on a quick look through the series:
> > >
> > >  - Use the format "KVM: selftests: <Decsription>" for the shortlog.
> >
> > I know it's not common to see so far, but curious is this the required format?
>
> It's definitely the preferred format.
>
> > I didn't find where it's documented.
>
> Heh, for all shortlog scopes, the "documentation" is `git log --pretty=oneline` :-)
>
> > If it's indeed a requirement, probably we also need to enhance checkpatch.pl
> > to detect this.
>
> I like the idea in theory, but that'd be a daunting task to set up, and quite the
> maintenance nightmare.  There are probably thousands of file => scope mappings
> throughout the kernel, with any number of exceptions and arbitrary rules.

I was thinking about proposing this in checkpatch.pl, or in some
KVM-specific check script. It seems like the following rule: If a
commit only modifies files in tools/testing/selftests/kvm/*, then
requires the shortlog match the regex "KVM: selftests: .*". That would
handle the vast majority of cases without affecting other subsystems.

Sean are you more concerned that if we start validating shortlogs in
checkpatch.pl then eventually it will get too out of hand? (i.e. not
so concerned with this specific case, but the general problem?)

Either way, we should at least document the preferred KVM shortlog
styles in Documentation/virtual/kvm/.
  
Sean Christopherson Oct. 27, 2022, 6:27 p.m. UTC | #5
On Thu, Oct 27, 2022, David Matlack wrote:
> On Thu, Oct 27, 2022 at 8:44 AM Sean Christopherson <seanjc@google.com> wrote:
> > I like the idea in theory, but that'd be a daunting task to set up, and quite the
> > maintenance nightmare.  There are probably thousands of file => scope mappings
> > throughout the kernel, with any number of exceptions and arbitrary rules.
> 
> I was thinking about proposing this in checkpatch.pl, or in some
> KVM-specific check script. It seems like the following rule: If a
> commit only modifies files in tools/testing/selftests/kvm/*, then
> requires the shortlog match the regex "KVM: selftests: .*". That would
> handle the vast majority of cases without affecting other subsystems.
> 
> Sean are you more concerned that if we start validating shortlogs in
> checkpatch.pl then eventually it will get too out of hand? (i.e. not
> so concerned with this specific case, but the general problem?)

Ya, the general problem.  Hardcoding anything KVM specific in checkpatch.pl isn't
going to fly.  The checkpatch maintainers most definitely don't want to take on
the burden of maintaining subsystem rules.  Letting one subsystem add custom rules
effectively opens the flood gates to all subsystems adding custom rules.  And from
a KVM perspective, I don't want to have to get an Acked-by from a checkpatch
maintiainer just to tweak a KVM rule.

The only somewhat feasible approach I can think of would be to provide a generic
"language" for shortlog scope rules, and have checkpatch look for a well-known
file in relevant directories, e.g. add arch/x86/kvm/SCOPES or whatever.  But even
that is a non-trivial problem to solve, as it means coming up with a "language"
that has a reasonable chance of working for many subsystems without generating too
many false positives.

It's definitely doable, and likely not actually a maintenance nightmare (I wrote
that thinking of modifying a common rules file).  But it's still fairly daunting
as getting buy-in on something that affects the kernel at large tends to be easier
said then done.  Then again, I'm probably being pessimistic due to my sub-par
regex+scripting skills :-)
  
Andrew Jones Oct. 28, 2022, 12:41 p.m. UTC | #6
On Thu, Oct 27, 2022 at 06:27:39PM +0000, Sean Christopherson wrote:
> On Thu, Oct 27, 2022, David Matlack wrote:
> > On Thu, Oct 27, 2022 at 8:44 AM Sean Christopherson <seanjc@google.com> wrote:
> > > I like the idea in theory, but that'd be a daunting task to set up, and quite the
> > > maintenance nightmare.  There are probably thousands of file => scope mappings
> > > throughout the kernel, with any number of exceptions and arbitrary rules.
> > 
> > I was thinking about proposing this in checkpatch.pl, or in some
> > KVM-specific check script. It seems like the following rule: If a
> > commit only modifies files in tools/testing/selftests/kvm/*, then
> > requires the shortlog match the regex "KVM: selftests: .*". That would
> > handle the vast majority of cases without affecting other subsystems.
> > 
> > Sean are you more concerned that if we start validating shortlogs in
> > checkpatch.pl then eventually it will get too out of hand? (i.e. not
> > so concerned with this specific case, but the general problem?)
> 
> Ya, the general problem.  Hardcoding anything KVM specific in checkpatch.pl isn't
> going to fly.  The checkpatch maintainers most definitely don't want to take on
> the burden of maintaining subsystem rules.  Letting one subsystem add custom rules
> effectively opens the flood gates to all subsystems adding custom rules.  And from
> a KVM perspective, I don't want to have to get an Acked-by from a checkpatch
> maintiainer just to tweak a KVM rule.
> 
> The only somewhat feasible approach I can think of would be to provide a generic
> "language" for shortlog scope rules, and have checkpatch look for a well-known
> file in relevant directories, e.g. add arch/x86/kvm/SCOPES or whatever.  But even
> that is a non-trivial problem to solve, as it means coming up with a "language"
> that has a reasonable chance of working for many subsystems without generating too
> many false positives.
> 
> It's definitely doable, and likely not actually a maintenance nightmare (I wrote
> that thinking of modifying a common rules file).  But it's still fairly daunting
> as getting buy-in on something that affects the kernel at large tends to be easier
> said then done.  Then again, I'm probably being pessimistic due to my sub-par
> regex+scripting skills :-)

How about adding support for checkpatch extension plugins? If we could add
a plugin script, e.g. tools/testing/selftests/kvm/.checkpatch, and modify
checkpatch to run .checkpatch scripts in the patched files' directories
(and recursively in the parent directories) when found, then we'd get
custom checkpatch behaviors. The scripts wouldn't even have to be written
in Perl (but I say that a bit sadly, because I like Perl).

Thanks,
drew
  
Sean Christopherson Oct. 28, 2022, 3:49 p.m. UTC | #7
On Fri, Oct 28, 2022, Andrew Jones wrote:
> On Thu, Oct 27, 2022 at 06:27:39PM +0000, Sean Christopherson wrote:
> > On Thu, Oct 27, 2022, David Matlack wrote:
> > > On Thu, Oct 27, 2022 at 8:44 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > I like the idea in theory, but that'd be a daunting task to set up, and quite the
> > > > maintenance nightmare.  There are probably thousands of file => scope mappings
> > > > throughout the kernel, with any number of exceptions and arbitrary rules.
> > > 
> > > I was thinking about proposing this in checkpatch.pl, or in some
> > > KVM-specific check script. It seems like the following rule: If a
> > > commit only modifies files in tools/testing/selftests/kvm/*, then
> > > requires the shortlog match the regex "KVM: selftests: .*". That would
> > > handle the vast majority of cases without affecting other subsystems.
> > > 
> > > Sean are you more concerned that if we start validating shortlogs in
> > > checkpatch.pl then eventually it will get too out of hand? (i.e. not
> > > so concerned with this specific case, but the general problem?)
> > 
> > Ya, the general problem.  Hardcoding anything KVM specific in checkpatch.pl isn't
> > going to fly.  The checkpatch maintainers most definitely don't want to take on
> > the burden of maintaining subsystem rules.  Letting one subsystem add custom rules
> > effectively opens the flood gates to all subsystems adding custom rules.  And from
> > a KVM perspective, I don't want to have to get an Acked-by from a checkpatch
> > maintiainer just to tweak a KVM rule.
> > 
> > The only somewhat feasible approach I can think of would be to provide a generic
> > "language" for shortlog scope rules, and have checkpatch look for a well-known
> > file in relevant directories, e.g. add arch/x86/kvm/SCOPES or whatever.  But even
> > that is a non-trivial problem to solve, as it means coming up with a "language"
> > that has a reasonable chance of working for many subsystems without generating too
> > many false positives.
> > 
> > It's definitely doable, and likely not actually a maintenance nightmare (I wrote
> > that thinking of modifying a common rules file).  But it's still fairly daunting
> > as getting buy-in on something that affects the kernel at large tends to be easier
> > said then done.  Then again, I'm probably being pessimistic due to my sub-par
> > regex+scripting skills :-)
> 
> How about adding support for checkpatch extension plugins? If we could add
> a plugin script, e.g. tools/testing/selftests/kvm/.checkpatch, and modify
> checkpatch to run .checkpatch scripts in the patched files' directories
> (and recursively in the parent directories) when found, then we'd get
> custom checkpatch behaviors. The scripts wouldn't even have to be written
> in Perl (but I say that a bit sadly, because I like Perl).

That will work for simple cases, but patches that touch files in multiple directories
will be messy.  E.g. a patch that touches virt/kvm/ and arch/x86/kvm/ will have
two separate custom rules enforcing two different scopes.

Recursively executing plugins will also be problematic, e.g. except for KVM, arch/x86/
is maintained by the tip-tree folks, and the tip-tree is quite opinionated on all
sorts of things, whereas KVM tends to be a bit more relaxed.

Enforcing scope through plugins would also lead to some amount of duplicate code
throught subsystems.

Anyways, if someone wants to pursue this, these ideas and the "requirement" should
be run by the checkpatch maintainers.  They have far more experience and authority
in this area, and I suspect we aren't the first people to want checkpatch to get
involved in enforcing shortlog scope.
  
David Matlack Nov. 7, 2022, 6:11 p.m. UTC | #8
On Fri, Oct 28, 2022 at 8:49 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Oct 28, 2022, Andrew Jones wrote:
> > On Thu, Oct 27, 2022 at 06:27:39PM +0000, Sean Christopherson wrote:
> > > On Thu, Oct 27, 2022, David Matlack wrote:
> > > > On Thu, Oct 27, 2022 at 8:44 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > > I like the idea in theory, but that'd be a daunting task to set up, and quite the
> > > > > maintenance nightmare.  There are probably thousands of file => scope mappings
> > > > > throughout the kernel, with any number of exceptions and arbitrary rules.
> > > >
> > > > I was thinking about proposing this in checkpatch.pl, or in some
> > > > KVM-specific check script. It seems like the following rule: If a
> > > > commit only modifies files in tools/testing/selftests/kvm/*, then
> > > > requires the shortlog match the regex "KVM: selftests: .*". That would
> > > > handle the vast majority of cases without affecting other subsystems.
> > > >
> > > > Sean are you more concerned that if we start validating shortlogs in
> > > > checkpatch.pl then eventually it will get too out of hand? (i.e. not
> > > > so concerned with this specific case, but the general problem?)
> > >
> > > Ya, the general problem.  Hardcoding anything KVM specific in checkpatch.pl isn't
> > > going to fly.  The checkpatch maintainers most definitely don't want to take on
> > > the burden of maintaining subsystem rules.  Letting one subsystem add custom rules
> > > effectively opens the flood gates to all subsystems adding custom rules.  And from
> > > a KVM perspective, I don't want to have to get an Acked-by from a checkpatch
> > > maintiainer just to tweak a KVM rule.
> > >
> > > The only somewhat feasible approach I can think of would be to provide a generic
> > > "language" for shortlog scope rules, and have checkpatch look for a well-known
> > > file in relevant directories, e.g. add arch/x86/kvm/SCOPES or whatever.  But even
> > > that is a non-trivial problem to solve, as it means coming up with a "language"
> > > that has a reasonable chance of working for many subsystems without generating too
> > > many false positives.
> > >
> > > It's definitely doable, and likely not actually a maintenance nightmare (I wrote
> > > that thinking of modifying a common rules file).  But it's still fairly daunting
> > > as getting buy-in on something that affects the kernel at large tends to be easier
> > > said then done.  Then again, I'm probably being pessimistic due to my sub-par
> > > regex+scripting skills :-)
> >
> > How about adding support for checkpatch extension plugins? If we could add
> > a plugin script, e.g. tools/testing/selftests/kvm/.checkpatch, and modify
> > checkpatch to run .checkpatch scripts in the patched files' directories
> > (and recursively in the parent directories) when found, then we'd get
> > custom checkpatch behaviors. The scripts wouldn't even have to be written
> > in Perl (but I say that a bit sadly, because I like Perl).
>
> That will work for simple cases, but patches that touch files in multiple directories
> will be messy.  E.g. a patch that touches virt/kvm/ and arch/x86/kvm/ will have
> two separate custom rules enforcing two different scopes.
>
> Recursively executing plugins will also be problematic, e.g. except for KVM, arch/x86/
> is maintained by the tip-tree folks, and the tip-tree is quite opinionated on all
> sorts of things, whereas KVM tends to be a bit more relaxed.
>
> Enforcing scope through plugins would also lead to some amount of duplicate code
> throught subsystems.
>
> Anyways, if someone wants to pursue this, these ideas and the "requirement" should
> be run by the checkpatch maintainers.  They have far more experience and authority
> in this area, and I suspect we aren't the first people to want checkpatch to get
> involved in enforcing shortlog scope.

Documenting would at least be an improvement over what we have today
since it would eliminate the need to re-explain the preferred rules
every time. We can just point to the documentation when reviewing
patches.

`git log --pretty=oneline` is not a great way to document shortlog
scopes because it does not explain the rules (e.g. when to use "KVM:
x86: " vs "KVM: x86/mmu: "), does not explain why things the way they
are, and is inconsistent since we don't always catch every patch that
goes by with a non-preferred shortlog scope.
  
Sean Christopherson Nov. 7, 2022, 6:19 p.m. UTC | #9
On Mon, Nov 07, 2022, David Matlack wrote:
> On Fri, Oct 28, 2022 at 8:49 AM Sean Christopherson <seanjc@google.com> wrote:
> > Anyways, if someone wants to pursue this, these ideas and the "requirement" should
> > be run by the checkpatch maintainers.  They have far more experience and authority
> > in this area, and I suspect we aren't the first people to want checkpatch to get
> > involved in enforcing shortlog scope.
> 
> Documenting would at least be an improvement over what we have today
> since it would eliminate the need to re-explain the preferred rules
> every time. We can just point to the documentation when reviewing
> patches.

Agreed.  And there are many other things I want to formalize for KVM x86, e.g.
testing expectations, health requirements for the various branches, what each
branch is used for etc...

If you want to send a patch for the shortlogs thing, maybe create

  Documentation/process/maintainer-kvm-x86.rst

and link it into Documentation/process/maintainer-handbooks.rst?

> `git log --pretty=oneline` is not a great way to document shortlog
> scopes because it does not explain the rules (e.g. when to use "KVM:
> x86: " vs "KVM: x86/mmu: "), does not explain why things the way they
> are, and is inconsistent since we don't always catch every patch that
> goes by with a non-preferred shortlog scope.
  
David Matlack Nov. 9, 2022, 7:05 p.m. UTC | #10
On Mon, Nov 7, 2022 at 10:19 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Nov 07, 2022, David Matlack wrote:
> > On Fri, Oct 28, 2022 at 8:49 AM Sean Christopherson <seanjc@google.com> wrote:
> > > Anyways, if someone wants to pursue this, these ideas and the "requirement" should
> > > be run by the checkpatch maintainers.  They have far more experience and authority
> > > in this area, and I suspect we aren't the first people to want checkpatch to get
> > > involved in enforcing shortlog scope.
> >
> > Documenting would at least be an improvement over what we have today
> > since it would eliminate the need to re-explain the preferred rules
> > every time. We can just point to the documentation when reviewing
> > patches.
>
> Agreed.  And there are many other things I want to formalize for KVM x86, e.g.
> testing expectations, health requirements for the various branches, what each
> branch is used for etc...
>
> If you want to send a patch for the shortlogs thing, maybe create
>
>   Documentation/process/maintainer-kvm-x86.rst
>
> and link it into Documentation/process/maintainer-handbooks.rst?

Can do. I'll try to take a look later this week or next week.