[1/2] s390/rwonce: add READ_ONCE_ALIGNED_128() macro

Message ID 20230224100237.3247871-2-hca@linux.ibm.com
State New
Series s390: don't use 128-bit cmpxchg for READ_ONCE() purposes

Commit Message

Heiko Carstens Feb. 24, 2023, 10:02 a.m. UTC
  Add an s390 specific READ_ONCE_ALIGNED_128() helper, which can be used for
fast block concurrent (atomic) 128-bit accesses.

The used lpq instruction requires 128-bit alignment. This is also the
reason why the compiler doesn't emit this instruction if __READ_ONCE() is
used for 128-bit accesses.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
 arch/s390/include/asm/rwonce.h | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 arch/s390/include/asm/rwonce.h
  

Comments

Peter Zijlstra Feb. 25, 2023, 4:50 p.m. UTC | #1
On Fri, Feb 24, 2023 at 11:02:36AM +0100, Heiko Carstens wrote:
> Add an s390 specific READ_ONCE_ALIGNED_128() helper, which can be used for
> fast block concurrent (atomic) 128-bit accesses.
> 
> The used lpq instruction requires 128-bit alignment. This is also the
> reason why the compiler doesn't emit this instruction if __READ_ONCE() is
> used for 128-bit accesses.

Does your u128 not have natural alignment? Does it help if you force
align the u128 type?

> 
> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
> ---
>  arch/s390/include/asm/rwonce.h | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>  create mode 100644 arch/s390/include/asm/rwonce.h
> 
> diff --git a/arch/s390/include/asm/rwonce.h b/arch/s390/include/asm/rwonce.h
> new file mode 100644
> index 000000000000..91fc24520e82
> --- /dev/null
> +++ b/arch/s390/include/asm/rwonce.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __ASM_S390_RWONCE_H
> +#define __ASM_S390_RWONCE_H
> +
> +#include <linux/compiler_types.h>
> +
> +/*
> + * Use READ_ONCE_ALIGNED_128() for 128-bit block concurrent (atomic) read
> + * accesses. Note that x must be 128-bit aligned, otherwise a specification
> + * exception is generated.
> + */
> +#define READ_ONCE_ALIGNED_128(x)			\
> +({							\
> +	union {						\
> +		typeof(x) __x;				\
> +		__uint128_t val;			\
> +	} __u;						\
> +							\
> +	BUILD_BUG_ON(sizeof(x) != 16);			\
> +	asm volatile(					\
> +		"	lpq	%[val],%[_x]\n"		\
> +		: [val] "=d" (__u.val)			\
> +		: [_x] "QS" (x)				\
> +		: "memory");				\
> +	__u.__x;					\
> +})
> +
> +#include <asm-generic/rwonce.h>
> +
> +#endif	/* __ASM_S390_RWONCE_H */
> -- 
> 2.37.2
>
  
Heiko Carstens Feb. 26, 2023, 8:56 p.m. UTC | #2
On Sat, Feb 25, 2023 at 05:50:58PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 24, 2023 at 11:02:36AM +0100, Heiko Carstens wrote:
> > Add an s390 specific READ_ONCE_ALIGNED_128() helper, which can be used for
> > fast block concurrent (atomic) 128-bit accesses.
> > 
> > The used lpq instruction requires 128-bit alignment. This is also the
> > reason why the compiler doesn't emit this instruction if __READ_ONCE() is
> > used for 128-bit accesses.
> 
> Does your u128 not have natural alignment? Does it help if you force
> align the u128 type?

s390 seems to be the only architecture which has a 64 bit alignment for
__uint128_t. But making it explicitly naturally aligned doesn't help.
I guess that's because the lpq instruction requires an even-odd register
pair where it reads to, while the now used lmg instruction can use any
register pair; but lmg doesn't come with atomic semantics.
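
For readers following the alignment question, here is a minimal user-space
sketch of what "force align the u128 type" can look like with GCC/Clang
attributes. The names aligned_u128, is_aligned_128() and struct pair_forced
are hypothetical and not part of the patch. The sketch only shows how the
type's alignment can be raised and checked; as noted above, doing so does not
by itself make the compiler emit lpq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical typedef: __uint128_t with 16-byte alignment forced via the
 * GCC/Clang aligned attribute, regardless of the ABI default.
 */
typedef __uint128_t aligned_u128 __attribute__((aligned(16)));

/* Hypothetical helper: nonzero if p is 16-byte (128-bit) aligned. */
static inline int is_aligned_128(const void *p)
{
	return ((uintptr_t)p & 15) == 0;
}

/* The member's forced alignment pushes it to offset 16 after the pad byte. */
struct pair_forced {
	char pad;
	aligned_u128 v;
};
```

With such a typedef, every object of type aligned_u128 meets the alignment
that lpq requires, so a runtime check like is_aligned_128() is only needed
for pointers of unknown origin.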
  
Peter Zijlstra Feb. 27, 2023, 11:51 a.m. UTC | #3
On Sun, Feb 26, 2023 at 09:56:44PM +0100, Heiko Carstens wrote:
> On Sat, Feb 25, 2023 at 05:50:58PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 24, 2023 at 11:02:36AM +0100, Heiko Carstens wrote:
> > > Add an s390 specific READ_ONCE_ALIGNED_128() helper, which can be used for
> > > fast block concurrent (atomic) 128-bit accesses.
> > > 
> > > The used lpq instruction requires 128-bit alignment. This is also the
> > > reason why the compiler doesn't emit this instruction if __READ_ONCE() is
> > > used for 128-bit accesses.
> > 
> > Does your u128 not have natural alignment? Does it help if you force
> > align the u128 type?
> 
> s390 seems to be the only architecture which has a 64 bit alignment for
> __uint128_t. But making it explicitly naturally aligned doesn't help.
> I guess that's because the lpq instruction requires an even-odd register
> pair where it reads to, while the now used lmg instruction can use any
> register pair; but lmg doesn't come with atomic semantics.

One thing you could do is talk with your compiler folks to allow using
lpq for volatile loads. That won't help you now and you'll still have to do
these patches, but to me it makes sense to change the toolchains.
  

Patch

diff --git a/arch/s390/include/asm/rwonce.h b/arch/s390/include/asm/rwonce.h
new file mode 100644
index 000000000000..91fc24520e82
--- /dev/null
+++ b/arch/s390/include/asm/rwonce.h
@@ -0,0 +1,31 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_S390_RWONCE_H
+#define __ASM_S390_RWONCE_H
+
+#include <linux/compiler_types.h>
+
+/*
+ * Use READ_ONCE_ALIGNED_128() for 128-bit block concurrent (atomic) read
+ * accesses. Note that x must be 128-bit aligned, otherwise a specification
+ * exception is generated.
+ */
+#define READ_ONCE_ALIGNED_128(x)			\
+({							\
+	union {						\
+		typeof(x) __x;				\
+		__uint128_t val;			\
+	} __u;						\
+							\
+	BUILD_BUG_ON(sizeof(x) != 16);			\
+	asm volatile(					\
+		"	lpq	%[val],%[_x]\n"		\
+		: [val] "=d" (__u.val)			\
+		: [_x] "QS" (x)				\
+		: "memory");				\
+	__u.__x;					\
+})
+
+#include <asm-generic/rwonce.h>
+
+#endif	/* __ASM_S390_RWONCE_H */
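
A minimal usage sketch, assuming a user-space build rather than the kernel
tree: struct seq_pair and seq_pair_snapshot() are hypothetical names, and
BUILD_BUG_ON() is replaced by _Static_assert since the kernel helper is not
available here. On s390x the macro body mirrors the patch. On any other host
the fallback is a plain copy that is NOT atomic and exists only so that the
sketch compiles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte payload; aligned(16) satisfies lpq's requirement. */
struct seq_pair {
	uint64_t seq;
	uint64_t data;
} __attribute__((aligned(16)));

#ifdef __s390x__
/* Same shape as the patch, with BUILD_BUG_ON swapped for _Static_assert. */
#define READ_ONCE_ALIGNED_128(x)				\
({								\
	union {							\
		__typeof__(x) __x;				\
		__uint128_t val;				\
	} __u;							\
								\
	_Static_assert(sizeof(x) == 16, "need 16 bytes");	\
	asm volatile(						\
		"	lpq	%[val],%[_x]\n"			\
		: [val] "=d" (__u.val)				\
		: [_x] "QS" (x)					\
		: "memory");					\
	__u.__x;						\
})
#else
/* Assumption: non-atomic stand-in so the sketch builds on other hosts. */
#define READ_ONCE_ALIGNED_128(x)				\
({								\
	__typeof__(x) __tmp;					\
	_Static_assert(sizeof(x) == 16, "need 16 bytes");	\
	memcpy(&__tmp, (const void *)&(x), sizeof(__tmp));	\
	__tmp;							\
})
#endif

/* Hypothetical reader: takes one snapshot of both fields. */
static inline struct seq_pair seq_pair_snapshot(struct seq_pair *p)
{
	/* lpq raises a specification exception on unaligned operands. */
	assert(((uintptr_t)p & 15) == 0);
	return READ_ONCE_ALIGNED_128(*p);
}
```

The caller gets both 64-bit halves from a single access, which is the point
of the helper: on s390x the two fields cannot be observed torn, without
resorting to a 128-bit cmpxchg just to read them.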