[v9,0/4] Implement MTE tag compression for swapped pages

Message ID 20231113105234.32058-1-glider@google.com


Alexander Potapenko Nov. 13, 2023, 10:52 a.m. UTC
  Currently, when MTE pages are swapped out, the tags are kept in
memory, occupying PAGE_SIZE/32 bytes per page. This is especially
problematic for devices that use zram-backed in-memory swap, because
tags stored uncompressed in the heap effectively reduce the available
amount of swap memory.
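
The PAGE_SIZE/32 figure follows from the MTE layout: one 4-bit tag per
16-byte granule. A minimal sketch of that arithmetic is below; the macro
and function names are illustrative and are not taken from the series.

```c
#include <assert.h>
#include <stddef.h>

/* MTE architectural constants: one 4-bit tag per 16-byte granule. */
#define MTE_GRANULE_BYTES 16
#define MTE_TAG_BITS      4

/* Bytes needed to keep all tags of one page uncompressed (hypothetical helper). */
static size_t mte_tag_bytes_per_page(size_t page_size)
{
	size_t granules;

	assert(page_size % MTE_GRANULE_BYTES == 0);
	granules = page_size / MTE_GRANULE_BYTES;
	return granules * MTE_TAG_BITS / 8;	/* equals page_size / 32 */
}
```

For a 4K page this gives 256 granules and 128 bytes of tags.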

The RLE-based algorithm suggested by Evgenii Stepanov and implemented in
this patch series is able to efficiently compress fixed-size tag buffers,
resulting in a practical compression ratio of 2x. In many cases it is
possible to store the compressed data in 63-bit Xarray values, resulting
in no extra memory allocations.
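
To illustrate the general idea only, here is a toy run-length encoder for
the tags of a 4K page. It is NOT the encoding used by this series: the
12-bit run format (4-bit tag plus 8-bit length minus one), the nibble
order, and all function names are assumptions made for this sketch. It
shows how a page with few distinct tag runs can fit into 63 bits, the
payload size of an Xarray value entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULES_PER_PAGE 256			/* 4K page, 16-byte granules */
#define TAG_BYTES (GRANULES_PER_PAGE / 2)	/* two 4-bit tags per byte */
#define BITS_PER_RUN (4 + 8)	/* assumed: 4-bit tag + 8-bit (length - 1) */

/* Tag of granule i; assumed layout: even granule in the low nibble. */
static unsigned int tag_at(const uint8_t *tags, size_t i)
{
	uint8_t byte = tags[i / 2];

	return (i % 2) ? (unsigned int)(byte >> 4) : (unsigned int)(byte & 0xf);
}

/* Count runs of equal consecutive tags in one page's tag buffer. */
static size_t count_runs(const uint8_t *tags)
{
	size_t runs = 1;

	for (size_t i = 1; i < GRANULES_PER_PAGE; i++)
		if (tag_at(tags, i) != tag_at(tags, i - 1))
			runs++;
	return runs;
}

/* Pack the runs into 63 bits. Returns 1 on success, 0 if there are too many runs. */
static int rle_pack63(const uint8_t *tags, uint64_t *out)
{
	uint64_t packed = 0;
	unsigned int shift = 0;
	size_t start = 0;

	if (count_runs(tags) * BITS_PER_RUN > 63)
		return 0;
	for (size_t i = 1; i <= GRANULES_PER_PAGE; i++) {
		uint64_t run;

		if (i < GRANULES_PER_PAGE && tag_at(tags, i) == tag_at(tags, start))
			continue;
		/* Emit run [start, i): tag in the low 4 bits, length - 1 above it. */
		run = tag_at(tags, start) | ((uint64_t)(i - start - 1) << 4);
		packed |= run << shift;
		shift += BITS_PER_RUN;
		start = i;
	}
	assert((packed >> 63) == 0);
	*out = packed;
	return 1;
}

/* Inverse of rle_pack63(): expand the runs back into a TAG_BYTES buffer. */
static void rle_unpack63(uint64_t packed, uint8_t *tags)
{
	size_t i = 0;

	memset(tags, 0, TAG_BYTES);
	while (i < GRANULES_PER_PAGE) {
		unsigned int tag = packed & 0xf;
		size_t end = i + ((packed >> 4) & 0xff) + 1;

		packed >>= BITS_PER_RUN;
		for (; i < end && i < GRANULES_PER_PAGE; i++)
			tags[i / 2] |= (uint8_t)((i % 2) ? tag << 4 : tag);
	}
}
```

With this toy format at most five runs fit (5 * 12 = 60 bits), so a page
tagged uniformly packs into a single value, while a page whose tags
alternate on every granule does not and would need a separate buffer.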

This patch series depends on "lib/bitmap: add bitmap_{read,write}()"
(https://lore.kernel.org/linux-arm-kernel/20231030153210.139512-1-glider@google.com/T/)
that is mailed separately.

v9:
 - split off the stats collection code into a separate patch in the
   series (as suggested by Yury Norov)

v8:
 - split off the bitmap_read()/bitmap_write() series
 - simplified the compression logic (only compress data if it fits into
   a pointer)

v7:
 - fixed comments by Yury Norov, Andy Shevchenko, Rasmus Villemoes
 - added perf tests for bitmap_read()/bitmap_write()
 - more efficient bitmap_write() implementation (meant to be sent in v5)

v6:
 - fixed comments by Yury Norov
 - fixed handling of sizes divisible by MTE_GRANULES_PER_PAGE / 2
   (caught while testing on a real device)

v5:
 - fixed comments by Andy Shevchenko, Catalin Marinas, and Yury Norov
 - added support for 16K and 64K pages
 - more efficient bitmap_write() implementation

v4:
 - fixed a bunch of comments by Andy Shevchenko and Yury Norov
 - added Documentation/arch/arm64/mte-tag-compression.rst

v3:
 - as suggested by Andy Shevchenko, use
   bitmap_get_value()/bitmap_set_value() written by Syed Nayyar Waris
 - switched to unsigned long to reduce typecasts
 - simplified the compression code

v2:
 - as suggested by Yury Norov, replace the poorly implemented struct
   bitq with <linux/bitmap.h>


Alexander Potapenko (4):
  arm64: mte: implement CONFIG_ARM64_MTE_COMP
  arm64: mte: add a test for MTE tags compression
  arm64: mte: add compression support to mteswap.c
  arm64: mte: implement CONFIG_ARM64_MTE_SWAP_STATS

 Documentation/arch/arm64/index.rst            |   1 +
 .../arch/arm64/mte-tag-compression.rst        | 166 ++++++++
 arch/arm64/Kconfig                            |  37 ++
 arch/arm64/include/asm/mtecomp.h              |  39 ++
 arch/arm64/mm/Makefile                        |   2 +
 arch/arm64/mm/mtecomp.c                       | 257 +++++++++++++
 arch/arm64/mm/mtecomp.h                       |  12 +
 arch/arm64/mm/mteswap.c                       | 110 +++++-
 arch/arm64/mm/test_mtecomp.c                  | 364 ++++++++++++++++++
 9 files changed, 985 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/arch/arm64/mte-tag-compression.rst
 create mode 100644 arch/arm64/include/asm/mtecomp.h
 create mode 100644 arch/arm64/mm/mtecomp.c
 create mode 100644 arch/arm64/mm/mtecomp.h
 create mode 100644 arch/arm64/mm/test_mtecomp.c
  

Comments

Will Deacon Dec. 13, 2023, 12:31 p.m. UTC | #1
On Mon, Nov 13, 2023 at 11:52:29AM +0100, Alexander Potapenko wrote:
> Currently, when MTE pages are swapped out, the tags are kept in the
> memory, occupying PAGE_SIZE/32 bytes per page. This is especially
> problematic for devices that use zram-backed in-memory swap, because
> tags stored uncompressed in the heap effectively reduce the available
> amount of swap memory.
> 
> The RLE-based algorithm suggested by Evgenii Stepanov and implemented in
> this patch series is able to efficiently compress fixed-size tag buffers,
> resulting in practical compression ratio of 2x. In many cases it is
> possible to store the compressed data in 63-bit Xarray values, resulting
> in no extra memory allocations.
> 
> This patch series depends on "lib/bitmap: add bitmap_{read,write}()"
> (https://lore.kernel.org/linux-arm-kernel/20231030153210.139512-1-glider@google.com/T/)
> that is mailed separately.

That's a shame, because it means I can't apply the series as-is:


arch/arm64/mm/mtecomp.c: In function ‘mte_bitmap_write’:
arch/arm64/mm/mtecomp.c:105:2: error: implicit declaration of function ‘bitmap_write’; did you mean ‘bitmap_free’? [-Werror=implicit-function-declaration]
  105 |  bitmap_write(bitmap, value, *pos, bits);
      |  ^~~~~~~~~~~~
      |  bitmap_free
arch/arm64/mm/mtecomp.c: In function ‘mte_bitmap_read’:
arch/arm64/mm/mtecomp.c:198:9: error: implicit declaration of function ‘bitmap_read’; did you mean ‘bitmap_remap’? [-Werror=implicit-function-declaration]
  198 |  return bitmap_read(bitmap, start, bits);
      |         ^~~~~~~~~~~
      |         bitmap_remap
cc1: some warnings being treated as errors
make[5]: *** [scripts/Makefile.build:243: arch/arm64/mm/mtecomp.o] Error 1


Do you really have such a hard dependency on those new bitmap ops?

Will
  
Alexander Potapenko Dec. 13, 2023, 2:01 p.m. UTC | #2
On Wed, Dec 13, 2023 at 1:31 PM Will Deacon <will@kernel.org> wrote:
>
> On Mon, Nov 13, 2023 at 11:52:29AM +0100, Alexander Potapenko wrote:
> > Currently, when MTE pages are swapped out, the tags are kept in the
> > memory, occupying PAGE_SIZE/32 bytes per page. This is especially
> > problematic for devices that use zram-backed in-memory swap, because
> > tags stored uncompressed in the heap effectively reduce the available
> > amount of swap memory.
> >
> > The RLE-based algorithm suggested by Evgenii Stepanov and implemented in
> > this patch series is able to efficiently compress fixed-size tag buffers,
> > resulting in practical compression ratio of 2x. In many cases it is
> > possible to store the compressed data in 63-bit Xarray values, resulting
> > in no extra memory allocations.
> >
> > This patch series depends on "lib/bitmap: add bitmap_{read,write}()"
> > (https://lore.kernel.org/linux-arm-kernel/20231030153210.139512-1-glider@google.com/T/)
> > that is mailed separately.
>
> That's a shame, because it means I can't apply the series as-is:
>

Uh-oh, sorry about that. There was another series depending on the
bitmap_read/bitmap_write API, and I thought mailing those patches
separately would speed things up.
But in fact Yury requested them to have at least one user, so now that
the MTE series is also acked I'd better bring everything back
together.
Let me send out another version.

>
> arch/arm64/mm/mtecomp.c: In function ‘mte_bitmap_write’:
> arch/arm64/mm/mtecomp.c:105:2: error: implicit declaration of function ‘bitmap_write’; did you mean ‘bitmap_free’? [-Werror=implicit-function-declaration]
>   105 |  bitmap_write(bitmap, value, *pos, bits);
>       |  ^~~~~~~~~~~~
>       |  bitmap_free
> arch/arm64/mm/mtecomp.c: In function ‘mte_bitmap_read’:
> arch/arm64/mm/mtecomp.c:198:9: error: implicit declaration of function ‘bitmap_read’; did you mean ‘bitmap_remap’? [-Werror=implicit-function-declaration]
>   198 |  return bitmap_read(bitmap, start, bits);
>       |         ^~~~~~~~~~~
>       |         bitmap_remap
> cc1: some warnings being treated as errors
> make[5]: *** [scripts/Makefile.build:243: arch/arm64/mm/mtecomp.o] Error 1
>
>
> Do you really have such a hard dependency on those new bitmap ops?

Having them in bitmap.h seems natural - this way they can be reused by
other people.
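
For readers without the bitmap series at hand, the semantics the MTE code
relies on can be approximated in userspace as below. This is a sketch
under assumptions, not the kernel implementation: bm_read() and bm_write()
are hypothetical stand-ins that mirror the argument order visible in the
build log above (map, [value,] start, nbits), using 64-bit words.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64

/* Read nbits (1..64) starting at bit offset start; the field may straddle two words. */
static uint64_t bm_read(const uint64_t *map, unsigned int start, unsigned int nbits)
{
	unsigned int idx = start / WORD_BITS, off = start % WORD_BITS;
	uint64_t mask, val;

	assert(nbits >= 1 && nbits <= WORD_BITS);
	mask = nbits == WORD_BITS ? ~0ULL : (1ULL << nbits) - 1;
	val = map[idx] >> off;
	if (off + nbits > WORD_BITS)	/* implies off >= 1, so the shift below is < 64 */
		val |= map[idx + 1] << (WORD_BITS - off);
	return val & mask;
}

/* Write the low nbits (1..64) of value at bit offset start, leaving other bits intact. */
static void bm_write(uint64_t *map, uint64_t value, unsigned int start, unsigned int nbits)
{
	unsigned int idx = start / WORD_BITS, off = start % WORD_BITS;
	uint64_t mask;

	assert(nbits >= 1 && nbits <= WORD_BITS);
	mask = nbits == WORD_BITS ? ~0ULL : (1ULL << nbits) - 1;
	value &= mask;
	map[idx] = (map[idx] & ~(mask << off)) | (value << off);
	if (off + nbits > WORD_BITS) {
		unsigned int fit = WORD_BITS - off;	/* bits that landed in the first word */

		map[idx + 1] = (map[idx + 1] & ~(mask >> fit)) | (value >> fit);
	}
}
```

A 12-bit field written at bit 58 lands partly in the first word and
partly in the second, and reads back unchanged; this is the kind of
unaligned access a bit-packed tag encoder needs.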