[00/12] Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()

Message ID 20230531130833.635651916@infradead.org

Message

Peter Zijlstra May 31, 2023, 1:08 p.m. UTC
  Hi!

After much breaking of things, find here the improved version.

Since v3:

 - unbreak everything that does *NOT* have cmpxchg128()

   Notably this_cpu_cmpxchg_double() is used unconditionally by SLUB
   which means that this_cpu_try_cmpxchg128() needs to be unconditionally
   available on all 64bit architectures.

 - fixed up x86/x86_64 cmpxchg{8,16}b emulation for this_cpu_cmpxchg{64,128}()

 - introduce {raw,this}_cpu_try_cmpxchg*()

 - add fallback for !__SIZEOF_INT128__ 64bit architectures

   Sadly there are supported 64bit architecture/compiler combinations that do
   not have __SIZEOF_INT128__, specifically it was found that HPPA64 only added
   this with GCC-11.

   This is yuck, and ideally we'd simply raise compiler requirements, but this
   'works'.
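
For readers unfamiliar with the try_cmpxchg() calling convention and the
!__SIZEOF_INT128__ problem, here is a minimal userspace sketch. It is not the
code in this series: every demo_* name is hypothetical, the two-word struct is
just one plausible way to stand in for a missing compiler type, and the
operation is deliberately NOT atomic. It only models the semantics (compare,
store on match, report the current value on mismatch); a generic per-CPU
fallback would, presumably, have to run something like it with interrupts
disabled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical 128-bit type: use the compiler's native type when
 * __SIZEOF_INT128__ is defined, otherwise fall back to two 64-bit words.
 */
#ifdef __SIZEOF_INT128__
typedef unsigned __int128 demo_u128;
#else
typedef struct { uint64_t lo, hi; } demo_u128;
#endif

/* Build a 128-bit value from its high and low 64-bit halves. */
static demo_u128 demo_make_u128(uint64_t hi, uint64_t lo)
{
#ifdef __SIZEOF_INT128__
	return ((demo_u128)hi << 64) | lo;
#else
	demo_u128 v = { .lo = lo, .hi = hi };
	return v;
#endif
}

/* Equality that works for both representations. */
static bool demo_u128_equal(demo_u128 a, demo_u128 b)
{
#ifdef __SIZEOF_INT128__
	return a == b;
#else
	return a.lo == b.lo && a.hi == b.hi;
#endif
}

/*
 * Model of try_cmpxchg semantics (NOT atomic, illustration only):
 *  - if *ptr equals *old, store newval into *ptr and return true;
 *  - otherwise copy the current *ptr into *old and return false,
 *    so the caller can retry without re-reading *ptr itself.
 */
static bool demo_try_cmpxchg128(demo_u128 *ptr, demo_u128 *old,
				demo_u128 newval)
{
	if (demo_u128_equal(*ptr, *old)) {
		*ptr = newval;
		return true;
	}
	*old = *ptr;
	return false;
}
```

The boolean return plus the updated *old is what makes retry loops compact:
a caller can recompute its new value from *old and call again, instead of
comparing a returned value against its expectation by hand.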

My plan is to re-add this to tip/locking/core and thus -next later this week.

Also available at:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/core

---
 Documentation/core-api/this_cpu_ops.rst     |   2 -
 arch/arm64/include/asm/atomic_ll_sc.h       |  56 +++---
 arch/arm64/include/asm/atomic_lse.h         |  39 ++---
 arch/arm64/include/asm/cmpxchg.h            |  48 ++----
 arch/arm64/include/asm/percpu.h             |  30 ++--
 arch/s390/include/asm/cmpxchg.h             |  32 +---
 arch/s390/include/asm/cpu_mf.h              |   2 +-
 arch/s390/include/asm/percpu.h              |  34 ++--
 arch/s390/kernel/perf_cpum_sf.c             |  16 +-
 arch/x86/include/asm/cmpxchg.h              |  25 ---
 arch/x86/include/asm/cmpxchg_32.h           |   2 +-
 arch/x86/include/asm/cmpxchg_64.h           |  63 ++++++-
 arch/x86/include/asm/percpu.h               | 102 ++++++-----
 arch/x86/lib/Makefile                       |   3 +-
 arch/x86/lib/cmpxchg16b_emu.S               |  43 +++--
 arch/x86/lib/cmpxchg8b_emu.S                |  67 ++++++--
 drivers/iommu/amd/amd_iommu_types.h         |   9 +-
 drivers/iommu/amd/iommu.c                   |  10 +-
 drivers/iommu/intel/irq_remapping.c         |   8 +-
 include/asm-generic/percpu.h                | 257 ++++++++++++++++++++++------
 include/crypto/b128ops.h                    |  14 +-
 include/linux/atomic/atomic-arch-fallback.h |  95 +++++++++-
 include/linux/atomic/atomic-instrumented.h  |  93 ++++++++--
 include/linux/dmar.h                        | 125 +++++++-------
 include/linux/percpu-defs.h                 |  45 ++---
 include/linux/slub_def.h                    |  12 +-
 include/linux/types.h                       |  12 ++
 include/uapi/linux/types.h                  |   4 +
 lib/crypto/curve25519-hacl64.c              |   2 -
 lib/crypto/poly1305-donna64.c               |   2 -
 mm/slab.h                                   |  53 +++++-
 mm/slub.c                                   | 139 +++++++++------
 scripts/atomic/gen-atomic-fallback.sh       |   4 +-
 scripts/atomic/gen-atomic-instrumented.sh   |  19 +-
 34 files changed, 952 insertions(+), 515 deletions(-)
  

Comments

Mark Rutland May 31, 2023, 2:47 p.m. UTC | #1
On Wed, May 31, 2023 at 03:08:33PM +0200, Peter Zijlstra wrote:
> Hi!
> 
> After much breaking of things, find here the improved version.
> 
> Since v3:
> 
>  - unbreak everything that does *NOT* have cmpxchg128()
> 
>    Notably this_cpu_cmpxchg_double() is used unconditionally by SLUB
>    which means that this_cpu_try_cmpxchg128() needs to be unconditionally
>    available on all 64bit architectures.
> 
>  - fixed up x86/x86_64 cmpxchg{8,16}b emulation for this_cpu_cmpxchg{64,128}()
> 
>  - introduce {raw,this}_cpu_try_cmpxchg*()
> 
>  - add fallback for !__SIZEOF_INT128__ 64bit architectures
> 
>    Sadly there are supported 64bit architecture/compiler combinations that do
>    not have __SIZEOF_INT128__, specifically it was found that HPPA64 only added
>    this with GCC-11.
> 
>    This is yuck, and ideally we'd simply raise compiler requirements, but this
>    'works'.

The patches look good to me. I used my local cross-build script to build-test
this with the kernel.org GCC 10.3.0 cross toolchain for all of the following
arch/triplet/config combinations:

  alpha           alpha-linux             defconfig
  arc             arc-linux               defconfig
  arm             arm-linux-gnueabi       multi_v4t_defconfig
  arm             arm-linux-gnueabi       multi_v5_defconfig
  arm             arm-linux-gnueabi       multi_v7_defconfig
  arm             arm-linux-gnueabi       omap1_defconfig
  arm64           aarch64-linux           defconfig
  csky            csky-linux              defconfig
  i386            i386-linux              defconfig
  ia64            ia64-linux              defconfig
  m68k            m68k-linux              defconfig
  microblaze      microblaze-linux        defconfig
  mips            mips-linux              32r1_defconfig
  mips            mips-linux              32r2_defconfig
  mips            mips-linux              32r6_defconfig
  mips            mips64-linux            64r1_defconfig
  mips            mips64-linux            64r2_defconfig
  mips            mips64-linux            64r6_defconfig
  nios2           nios2-linux             defconfig
  openrisc        or1k-linux              defconfig
  parisc          hppa-linux              generic-32bit_defconfig
  parisc          hppa64-linux            generic-64bit_defconfig
  powerpc         powerpc-linux           ppc40x_defconfig
  powerpc         powerpc64-linux         ppc64_defconfig
  powerpc         powerpc64-linux         ppc64e_defconfig
  riscv           riscv32-linux           rv32_defconfig
  riscv           riscv64-linux           defconfig
  s390            s390-linux              defconfig
  sh              sh4-linux               defconfig
  sparc           sparc-linux             sparc32_defconfig
  sparc           sparc64-linux           sparc64_defconfig
  x86_64          x86_64-linux            defconfig
  xtensa          xtensa-linux            defconfig

... and everything seemed happy.

I've also boot-tested arm64 defconfig.

So FWIW, for the series:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>

> My plan is to re-add this to tip/locking/core and thus -next later this week.

I'll need to rebase my kerneldoc series atop this, so getting this into a
stable branch soon would be great!

Thanks,
Mark.

> 
> Also available at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/core
> 
> ---
>  Documentation/core-api/this_cpu_ops.rst     |   2 -
>  arch/arm64/include/asm/atomic_ll_sc.h       |  56 +++---
>  arch/arm64/include/asm/atomic_lse.h         |  39 ++---
>  arch/arm64/include/asm/cmpxchg.h            |  48 ++----
>  arch/arm64/include/asm/percpu.h             |  30 ++--
>  arch/s390/include/asm/cmpxchg.h             |  32 +---
>  arch/s390/include/asm/cpu_mf.h              |   2 +-
>  arch/s390/include/asm/percpu.h              |  34 ++--
>  arch/s390/kernel/perf_cpum_sf.c             |  16 +-
>  arch/x86/include/asm/cmpxchg.h              |  25 ---
>  arch/x86/include/asm/cmpxchg_32.h           |   2 +-
>  arch/x86/include/asm/cmpxchg_64.h           |  63 ++++++-
>  arch/x86/include/asm/percpu.h               | 102 ++++++-----
>  arch/x86/lib/Makefile                       |   3 +-
>  arch/x86/lib/cmpxchg16b_emu.S               |  43 +++--
>  arch/x86/lib/cmpxchg8b_emu.S                |  67 ++++++--
>  drivers/iommu/amd/amd_iommu_types.h         |   9 +-
>  drivers/iommu/amd/iommu.c                   |  10 +-
>  drivers/iommu/intel/irq_remapping.c         |   8 +-
>  include/asm-generic/percpu.h                | 257 ++++++++++++++++++++++------
>  include/crypto/b128ops.h                    |  14 +-
>  include/linux/atomic/atomic-arch-fallback.h |  95 +++++++++-
>  include/linux/atomic/atomic-instrumented.h  |  93 ++++++++--
>  include/linux/dmar.h                        | 125 +++++++-------
>  include/linux/percpu-defs.h                 |  45 ++---
>  include/linux/slub_def.h                    |  12 +-
>  include/linux/types.h                       |  12 ++
>  include/uapi/linux/types.h                  |   4 +
>  lib/crypto/curve25519-hacl64.c              |   2 -
>  lib/crypto/poly1305-donna64.c               |   2 -
>  mm/slab.h                                   |  53 +++++-
>  mm/slub.c                                   | 139 +++++++++------
>  scripts/atomic/gen-atomic-fallback.sh       |   4 +-
>  scripts/atomic/gen-atomic-instrumented.sh   |  19 +-
>  34 files changed, 952 insertions(+), 515 deletions(-)
>