[Resend,v1,1/5] linux/bitqueue.h: add the bit queue implementation

Message ID 20230711144233.3129207-2-glider@google.com
State New
Headers
Series Implement MTE tag compression for swapped pages |

Commit Message

Alexander Potapenko July 11, 2023, 2:42 p.m. UTC
  struct bitq represents a bit queue with external storage.

Its purpose is to easily pack sub-byte values, which can be used, for
example, to implement RLE algorithms.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
 include/linux/bitqueue.h | 144 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)
 create mode 100644 include/linux/bitqueue.h
  

Comments

Yury Norov July 11, 2023, 7:20 p.m. UTC | #1
+ Andy and Rasmus

On Tue, Jul 11, 2023 at 04:42:29PM +0200, Alexander Potapenko wrote:
> struct bitq represents a bit queue with external storage.
> 
> Its purpose is to easily pack sub-byte values, which can be used, for
> example, to implement RLE algorithms.

Whatever it is, it's not a queue. The queue implies O(1) for insertion
and deletion, but your 'dequeue' is clearly an O(n) procedure.

I'm not sure if I completely understand the purpose of the series, but
from this description:
        enqueueing/dequeueing of sub-byte values

I think, the simplest solution would be a circular queue (ringbuffer)
based on bitmaps:

Init an empty 10-bit bitmap:

Head = 0
v
..... .....
 ^
 Tail = 1

Enqueue 6-bit value 0b101010 at the end:

Head = 0
v
10101 0....
      ^
      Tail = 1

Dequeue 3 bits from the beginning:

   Head = 0
   v
...01 0....
      ^
      Tail = 1

And so on...

See some other comments inline.

Thanks,
Yury

> Signed-off-by: Alexander Potapenko <glider@google.com>
> ---
>  include/linux/bitqueue.h | 144 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 144 insertions(+)
>  create mode 100644 include/linux/bitqueue.h
> 
> diff --git a/include/linux/bitqueue.h b/include/linux/bitqueue.h
> new file mode 100644
> index 0000000000000..c4393f703c697
> --- /dev/null
> +++ b/include/linux/bitqueue.h
> @@ -0,0 +1,144 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * A simple bit queue which supports enqueueing/dequeueing of sub-byte values.
> + *
> + * This can be used to pack complex bitfields into byte arrays.
> + */
> +#ifndef _LINUX_BITQUEUE_H
> +#define _LINUX_BITQUEUE_H
> +
> +#include <linux/string.h>
> +#include <linux/types.h>
> +
> +/**
> + * struct bitq - represents a bit queue with external storage.
> + * @data:	data buffer used by the queue.
> + * @size:	size of @data in bytes.
> + * @bit_pos:	current bit position.
> + */
> +struct bitq {
> +	u8 *data;
> +	int size, bit_pos;
> +};
> +
> +/**
> + * bitq_init - initialize an empty bit queue.
> + * @q:		struct bitq to be initialized.
> + * @data:	external data buffer to use.
> + * @size:	capacity in bytes.
> + *
> + * Return:	0 in the case of success, -1 if either of the pointers is NULL.

ENIVAL?

> + */
> +static inline int bitq_init(struct bitq *q, u8 *data, int size)
> +{
> +	if (!q || !data)
> +		return -1;

This is a useless check. Erroneous code may (and often does) pass a
broken pointer other than NULL.

And to make it worse, this error handling (instead of run-time
segfault which can be easily caught at debugging) may make users think
that passing NULL is the right thing. Check bit/bitmap and other kernel
functions: they don't check against NULL as a general rule, except for
well-justified cases, like 'free(NULL)'.


> +	q->data = data;
> +	q->size = size;
> +	memset(data, 0, size);

Useless memset?

> +	q->bit_pos = 0;
> +	return 0;
> +}
> +
> +/**
> + * bitq_init_full - make a bit queue from an initialized byte array.
> + * @q:		struct bitq to be initialized.
> + * @data:	external data buffer to use.
> + * @size:	capacity in bytes.
> + *
> + * Return:	0 in the case of success, -1 if either of the pointers is NULL.
> + */
> +static inline int bitq_init_full(struct bitq *q, u8 *data, int size)
> +{
> +	if (!q || !data)
> +		return -1;
> +	q->data = data;
> +	q->size = size;
> +	q->bit_pos = q->size * 8;
> +	return 0;
> +}

This all should not reside in a header.

> +
> +/**
> + * bitq_enqueue - push up to 8 bits to the end of the queue.
> + * @q:		struct bitq.
> + * @value:	byte containing the value to be pushed.
> + * @bits:	number of bits (1 to 8) to push.
> + *
> + * Return:	number of bits pushed, or -1 in the case of an error.
> + */
> +static inline int bitq_enqueue(struct bitq *q, u8 value, int bits)
> +{
> +	int byte_pos, left_in_byte, max_pos;
> +	u8 hi, lo;
> +
> +	if (!q || (bits < 1) || (bits > 8))
> +		return -1;

Pushing 0 elements in queue is usually not an error. Implementations
usually return and do nothing. From the malloc() man page:

        If size is 0, then malloc() returns a unique pointer value that
        can later be successfully passed to free().

> +	max_pos = q->size * 8;
> +	if ((max_pos - q->bit_pos) < bits)
> +		return -1;

ENOMEM? Or probably better to resize the queue.

> +
> +	left_in_byte = 8 - (q->bit_pos % 8);
> +	byte_pos = q->bit_pos / 8;
> +	/* Clamp @value. */
> +	value %= (1 << bits);
> +	if (left_in_byte >= bits) {
> +		/* @value fits into the current byte. */
> +		value <<= (left_in_byte - bits);
> +		q->data[byte_pos] |= value;
> +	} else {
> +		/*
> +		 * @value needs to be split between the current and the
> +		 * following bytes.
> +		 */
> +		hi = value >> (bits - left_in_byte);
> +		q->data[byte_pos] |= hi;
> +		byte_pos++;
> +		lo = value << (8 - (bits - left_in_byte));
> +		q->data[byte_pos] |= lo;
> +	}

This piece should be a bitmap_append() function, like:
bitmap_append(addr, 3, 2, 0b11) would append 0b11 to the bitmap at
offset 3. We already have bitmap_{set,get}_value8, so I suggest
to extend the interface for unaligned offsets and lengths up to
BITS_PER_LONG.

> +	q->bit_pos += bits;
> +	return bits;
> +}
> +
> +/**
> + * bitq_dequeue - pop up to 8 bits from the beginning of the queue.
> + * @q:		struct bitq.
> + * @value:	u8* to store the popped value (can be NULL).
> + * @bits:	number of bits (1 to 8) to pop.
> + *
> + * Return:	number of bits popped, or -1 in the case of an error.
> + */
> +
> +#include <linux/printk.h>
> +static inline int bitq_dequeue(struct bitq *q, u8 *value, int bits)
> +{
> +	int rem_bits = 8 - bits, i;
> +	u8 output;
> +
> +	/* Invalid arguments. */
> +	if (!q || (bits < 1) || (bits > 8))
> +		return -1;
> +	/* Not enough space to insert @bits. */
> +	if (q->bit_pos < bits)
> +		return -1;
> +	/* Take the first @bits bits from the first byte. */
> +	output = q->data[0];
> +	output >>= rem_bits;
> +	if (value)
> +		*value = output;
> +
> +	/*
> +	 * Shift every byte in the queue to the left by @bits, carrying over to
> +	 * the previous byte.
> +	 */
> +	for (i = 0; i < q->size - 1; i++) {
> +		q->data[i] = (q->data[i] << bits) |
> +			     (q->data[i + 1] >> rem_bits);
> +	}

As I already mentioned, this is O(N), which is wrong for queues. Add a
pointer to the head in the bitq structure to avoid shifting every
byte.

BTW, we've got bitmap_shift_{left,right} for this.

> +	q->data[q->size - 1] <<= bits;
> +	q->bit_pos -= bits;
> +	return bits;
> +}
> +
> +#endif // _LINUX_BITQUEUE_H
> -- 
> 2.41.0.255.g8b1d071c50-goog
  
Alexander Potapenko July 12, 2023, 10:24 a.m. UTC | #2
On Tue, Jul 11, 2023 at 9:20 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> + Andy and Rasmus
>
> On Tue, Jul 11, 2023 at 04:42:29PM +0200, Alexander Potapenko wrote:
> > struct bitq represents a bit queue with external storage.
> >
> > Its purpose is to easily pack sub-byte values, which can be used, for
> > example, to implement RLE algorithms.
>
> Whatever it is, it's not a queue. The queue implies O(1) for insertion
> and deletion, but your 'dequeue' is clearly an O(n) procedure.

Thanks for spotting this!
I have indeed done a poor job implementing the dequeue method.

> I'm not sure if I completely understand the purpose of the series,

To implement tag compression, we need to serialize/deserialize "bit
fields" looking e.g. like this:

    int largest_idx : 6;
    unsigned char tags[N] : 4*N;
    unsigned char sizes[N-1] : 7*(N-1)

to/from a byte array. This actually needs to be done only once, and
enqueue()/dequeue() operations do not interleave, so there is no need
for an actual queue.

I'll try to come up with something simple - maybe reimplement it as a
ring buffer, or even skip the "ring" part, because it is not needed
for my purpose.
(The struct may end up being less generic - in that case I'll move it
from include/linux arch/arm64/mm/)


> but
> from this description:
>         enqueueing/dequeueing of sub-byte values
>
> I think, the simplest solution would be a circular queue (ringbuffer)
> based on bitmaps:


> > +/**
> > + * bitq_init - initialize an empty bit queue.
> > + * @q:               struct bitq to be initialized.
> > + * @data:    external data buffer to use.
> > + * @size:    capacity in bytes.
> > + *
> > + * Return:   0 in the case of success, -1 if either of the pointers is NULL.
>
> ENIVAL?

Ack, better use the common error values.

>
> > + */
> > +static inline int bitq_init(struct bitq *q, u8 *data, int size)
> > +{
> > +     if (!q || !data)
> > +             return -1;
>
> This is a useless check. Erroneous code may (and often does) pass a
> broken pointer other than NULL.

I am actually a fan of defensive programming, but it's a good point
that it does not defend against non-NULL pointers, and NULL is anyway
an unexpected input value.

>
> > +     q->data = data;
> > +     q->size = size;
> > +     memset(data, 0, size);
>
> Useless memset?

An overly cautious one, that lets us fetch values from partially
initialized bytes. This code will be removed anyway.

> > +static inline int bitq_init_full(struct bitq *q, u8 *data, int size)
> > +{
> > +     if (!q || !data)
> > +             return -1;
> > +     q->data = data;
> > +     q->size = size;
> > +     q->bit_pos = q->size * 8;
> > +     return 0;
> > +}
>
> This all should not reside in a header.

There's a handful of examples in include/linux where meaningful code
is written in the headers, but I agree that in this particular case it
is probably not justified by performance reasons.

> > +     if (!q || (bits < 1) || (bits > 8))
> > +             return -1;
>
> Pushing 0 elements in queue is usually not an error. Implementations
> usually return and do nothing. From the malloc() man page:

Agreed.

>         If size is 0, then malloc() returns a unique pointer value that
>         can later be successfully passed to free().
>
> > +     max_pos = q->size * 8;
> > +     if ((max_pos - q->bit_pos) < bits)
> > +             return -1;
>
> ENOMEM? Or probably better to resize the queue.

This "queue" relies on the external storage that may not be easily
resizeable (e.g. when we are using a local u64 as a storage).
ENOMEM sounds better (should we stick to this interface).


> > +             /*
> > +              * @value needs to be split between the current and the
> > +              * following bytes.
> > +              */
> > +             hi = value >> (bits - left_in_byte);
> > +             q->data[byte_pos] |= hi;
> > +             byte_pos++;
> > +             lo = value << (8 - (bits - left_in_byte));
> > +             q->data[byte_pos] |= lo;
> > +     }
>
> This piece should be a bitmap_append() function, like:
> bitmap_append(addr, 3, 2, 0b11) would append 0b11 to the bitmap at
> offset 3. We already have bitmap_{set,get}_value8, so I suggest
> to extend the interface for unaligned offsets and lengths up to
> BITS_PER_LONG.

Interesting. Yeah, this could be part of bitmap.h instead.


> > +     /*
> > +      * Shift every byte in the queue to the left by @bits, carrying over to
> > +      * the previous byte.
> > +      */
> > +     for (i = 0; i < q->size - 1; i++) {
> > +             q->data[i] = (q->data[i] << bits) |
> > +                          (q->data[i + 1] >> rem_bits);
> > +     }
>
> As I already mentioned, this is O(N), which is wrong for queues. Add a
> pointer to the head in the bitq structure to avoid shifting every
> byte.
>
> BTW, we've got bitmap_shift_{left,right} for this.

Ack.
  

Patch

diff --git a/include/linux/bitqueue.h b/include/linux/bitqueue.h
new file mode 100644
index 0000000000000..c4393f703c697
--- /dev/null
+++ b/include/linux/bitqueue.h
@@ -0,0 +1,144 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * A simple bit queue which supports enqueueing/dequeueing of sub-byte values.
+ *
+ * This can be used to pack complex bitfields into byte arrays.
+ */
+#ifndef _LINUX_BITQUEUE_H
+#define _LINUX_BITQUEUE_H
+
+#include <linux/string.h>
+#include <linux/types.h>
+
+/**
+ * struct bitq - represents a bit queue with external storage.
+ * @data:	data buffer used by the queue.
+ * @size:	size of @data in bytes.
+ * @bit_pos:	current bit position.
+ */
+struct bitq {
+	u8 *data;
+	int size, bit_pos;
+};
+
+/**
+ * bitq_init - initialize an empty bit queue.
+ * @q:		struct bitq to be initialized.
+ * @data:	external data buffer to use.
+ * @size:	capacity in bytes.
+ *
+ * Return:	0 in the case of success, -1 if either of the pointers is NULL.
+ */
+static inline int bitq_init(struct bitq *q, u8 *data, int size)
+{
+	if (!q || !data)
+		return -1;
+	q->data = data;
+	q->size = size;
+	memset(data, 0, size);
+	q->bit_pos = 0;
+	return 0;
+}
+
+/**
+ * bitq_init_full - make a bit queue from an initialized byte array.
+ * @q:		struct bitq to be initialized.
+ * @data:	external data buffer to use.
+ * @size:	capacity in bytes.
+ *
+ * Return:	0 in the case of success, -1 if either of the pointers is NULL.
+ */
+static inline int bitq_init_full(struct bitq *q, u8 *data, int size)
+{
+	if (!q || !data)
+		return -1;
+	q->data = data;
+	q->size = size;
+	q->bit_pos = q->size * 8;
+	return 0;
+}
+
+/**
+ * bitq_enqueue - push up to 8 bits to the end of the queue.
+ * @q:		struct bitq.
+ * @value:	byte containing the value to be pushed.
+ * @bits:	number of bits (1 to 8) to push.
+ *
+ * Return:	number of bits pushed, or -1 in the case of an error.
+ */
+static inline int bitq_enqueue(struct bitq *q, u8 value, int bits)
+{
+	int byte_pos, left_in_byte, max_pos;
+	u8 hi, lo;
+
+	if (!q || (bits < 1) || (bits > 8))
+		return -1;
+
+	max_pos = q->size * 8;
+	if ((max_pos - q->bit_pos) < bits)
+		return -1;
+
+	left_in_byte = 8 - (q->bit_pos % 8);
+	byte_pos = q->bit_pos / 8;
+	/* Clamp @value. */
+	value %= (1 << bits);
+	if (left_in_byte >= bits) {
+		/* @value fits into the current byte. */
+		value <<= (left_in_byte - bits);
+		q->data[byte_pos] |= value;
+	} else {
+		/*
+		 * @value needs to be split between the current and the
+		 * following bytes.
+		 */
+		hi = value >> (bits - left_in_byte);
+		q->data[byte_pos] |= hi;
+		byte_pos++;
+		lo = value << (8 - (bits - left_in_byte));
+		q->data[byte_pos] |= lo;
+	}
+	q->bit_pos += bits;
+	return bits;
+}
+
+/**
+ * bitq_dequeue - pop up to 8 bits from the beginning of the queue.
+ * @q:		struct bitq.
+ * @value:	u8* to store the popped value (can be NULL).
+ * @bits:	number of bits (1 to 8) to pop.
+ *
+ * Return:	number of bits popped, or -1 in the case of an error.
+ */
+
+#include <linux/printk.h>
+static inline int bitq_dequeue(struct bitq *q, u8 *value, int bits)
+{
+	int rem_bits = 8 - bits, i;
+	u8 output;
+
+	/* Invalid arguments. */
+	if (!q || (bits < 1) || (bits > 8))
+		return -1;
+	/* Not enough space to insert @bits. */
+	if (q->bit_pos < bits)
+		return -1;
+	/* Take the first @bits bits from the first byte. */
+	output = q->data[0];
+	output >>= rem_bits;
+	if (value)
+		*value = output;
+
+	/*
+	 * Shift every byte in the queue to the left by @bits, carrying over to
+	 * the previous byte.
+	 */
+	for (i = 0; i < q->size - 1; i++) {
+		q->data[i] = (q->data[i] << bits) |
+			     (q->data[i + 1] >> rem_bits);
+	}
+	q->data[q->size - 1] <<= bits;
+	q->bit_pos -= bits;
+	return bits;
+}
+
+#endif // _LINUX_BITQUEUE_H