[xstormy16] Add support for byte and word swapping instructions.
Commit Message
This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap
words) instructions. The most obvious application of these is to implement
the __builtin_bswap16 and __builtin_bswap32 intrinsics.
Currently, __builtin_bswap16 is implemented as:
foo:    mov r7,r2
        shl r7,#8
        shr r2,#8
        or r2,r7
        ret
but with this patch becomes:
foo:    swpb r2
        ret
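For reference, a minimal C sketch of the two source-level forms involved: the shift-and-or idiom that corresponds to the old mov/shl/shr/or sequence, and the builtin itself. The function names are illustrative only and are not part of the patch.

```c
#include <stdint.h>

/* Shift-and-or byte swap: the high byte moves down, the low byte moves up.
   This mirrors the four-instruction sequence shown above.  */
static uint16_t bswap16_shift(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/* The same operation expressed with the GCC builtin.  */
static uint16_t bswap16_builtin(uint16_t x)
{
    return __builtin_bswap16(x);
}
```

Both forms compute the same value; the swpb.c test in this patch checks that the shift-and-or form is also emitted as swpb.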
Likewise, __builtin_bswap32 now becomes:
foo:    swpb r2 | swpb r3 | swpw r2,r3
        ret
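To see why three instructions suffice for the 32-bit case, here is a rough C model of the sequence: swap the bytes within each 16-bit half, then exchange the halves. The helper names swpb_model and bswap32_by_halves are hypothetical, and which physical register holds which half is deliberately not modelled.

```c
#include <stdint.h>

/* Model of swpb: swap the two bytes of one 16-bit word.  */
static uint16_t swpb_model(uint16_t w)
{
    return (uint16_t)((w >> 8) | (w << 8));
}

/* Model of "swpb lo | swpb hi | swpw lo,hi" on a 32-bit value
   held as two 16-bit halves.  */
static uint32_t bswap32_by_halves(uint32_t x)
{
    uint16_t lo = (uint16_t)(x & 0xffffu);
    uint16_t hi = (uint16_t)(x >> 16);

    lo = swpb_model(lo);   /* swap bytes within the low word */
    hi = swpb_model(hi);   /* swap bytes within the high word */

    /* swpw: exchange the two words.  */
    uint16_t tmp = lo;
    lo = hi;
    hi = tmp;

    return ((uint32_t)hi << 16) | lo;
}
```

For example, 0x11223344 splits into 0x1122 and 0x3344, which become 0x2211 and 0x4433 after the byte swaps, and exchanging the words gives 0x44332211.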
Finally, the swpw instruction on its own can be used to exchange
two word mode registers without a temporary, so a new pattern and
peephole2 have been added to catch this. As described in the
PR rtl-optimization/106518, register allocation can (in theory)
be more efficient on targets that provide a swap/exchange instruction.
The slightly unusual swap<mode> naming matches that used in i386.md.
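As a sketch of what the new peephole2 is looking for, the following C models the three-move exchange through a scratch value, plus the XOR formulation exercised by swpw-2.c. The function names are illustrative; the operand comments follow the numbering used in the peephole2 pattern.

```c
#include <stdint.h>

/* Three-move exchange through a scratch value.  The peephole2 can
   replace this with a single swpw when the scratch is dead afterwards.  */
static void exchange_with_temp(uint16_t *a, uint16_t *b)
{
    uint16_t t = *a;   /* op0 <- op1 */
    *a = *b;           /* op1 <- op2 */
    *b = t;            /* op2 <- op0; t is not used again */
}

/* XOR formulation: t2 ends up as the old *b and t3 as the old *a,
   so the net effect is again a plain exchange.  */
static void exchange_with_xor(uint16_t *a, uint16_t *b)
{
    uint16_t t1 = *a ^ *b;
    uint16_t t2 = t1 ^ *a;
    uint16_t t3 = t1 ^ *b;
    *a = t2;
    *b = t3;
}
```

Either form leaves the two values exchanged, which is what a single "swpw" of the two registers achieves without needing the scratch register.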
This patch has been tested by building a cross-compiler to xstormy16-elf
from x86_64-pc-linux-gnu, and confirming the new test cases pass.
Ok for mainline?
2023-04-25 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/stormy16/stormy16.md (bswaphi2): New define_insn.
(bswapsi2): New define_insn.
(swaphi): New define_insn to exchange two registers (swpw).
(define_peephole2): Recognize exchange of registers as swaphi.
gcc/testsuite/ChangeLog
* gcc.target/xstormy16/bswap16.c: New test case.
* gcc.target/xstormy16/bswap32.c: Likewise.
* gcc.target/xstormy16/swpb.c: Likewise.
* gcc.target/xstormy16/swpw-1.c: Likewise.
* gcc.target/xstormy16/swpw-2.c: Likewise.
Thanks in advance,
Roger
Comments
On 4/25/23 14:20, Roger Sayle wrote:
>
> This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap
> words) instructions. The most obvious application of these is to implement
> the __builtin_bswap16 and __builtin_bswap32 intrinsics.
>
> Currently, __builtin_bswap16 is implemented as:
> foo: mov r7,r2
> shl r7,#8
> shr r2,#8
> or r2,r7
> ret
>
> but with this patch becomes:
> foo: swpb r2
> ret
>
> Likewise, __builtin_bswap32 now becomes:
> foo: swpb r2 | swpb r3 | swpw r2,r3
> ret
>
> Finally, the swpw instruction on its own can be used to exchange
> two word mode registers without a temporary, so a new pattern and
> peephole2 have been added to catch this. As described in the
> PR rtl-optimization/106518, register allocation can (in theory)
> be more efficient on targets that provide a swap/exchange instruction.
> The slightly unusual swap<mode> naming matches that used in i386.md.
>
> This patch has been tested by building a cross-compiler to xstormy16-elf
> from x86_64-pc-linux-gnu, and confirming the new test cases pass.
> Ok for mainline?
>
>
> 2023-04-25 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> * config/stormy16/stormy16.md (bswaphi2): New define_insn.
> (bswapsi2): New define_insn.
> (swaphi): New define_insn to exchange two registers (swpw).
> (define_peephole2): Recognize exchange of registers as swaphi.
>
> gcc/testsuite/ChangeLog
> * gcc.target/xstormy16/bswap16.c: New test case.
> * gcc.target/xstormy16/bswap32.c: Likewise.
> * gcc.target/xstormy16/swpb.c: Likewise.
> * gcc.target/xstormy16/swpw-1.c: Likewise.
> * gcc.target/xstormy16/swpw-2.c: Likewise.
OK. And like prior patches, if it causes any problems in wider testing,
we'll know ~24hrs after the bits go in.
jeff
@@ -1265,3 +1265,39 @@
"bp %1,#7,%l0"
[(set_attr "length" "4")
(set_attr "psw_operand" "nop")])
+
+(define_insn "bswaphi2"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (bswap:HI (match_operand:HI 1 "register_operand" "0")))]
+  ""
+  "swpb %0")
+
+(define_insn "bswapsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (bswap:SI (match_operand:SI 1 "register_operand" "0")))]
+  ""
+  "swpb %0 | swpb %h0 | swpw %0,%h0"
+  [(set_attr "length" "6")])
+
+(define_insn "swaphi"
+  [(set (match_operand:HI 0 "register_operand" "+r")
+        (match_operand:HI 1 "register_operand" "+r"))
+   (set (match_dup 1)
+        (match_dup 0))]
+  ""
+  "swpw %0,%1")
+
+(define_peephole2
+  [(set (match_operand:HI 0 "register_operand")
+        (match_operand:HI 1 "register_operand"))
+   (set (match_dup 1)
+        (match_operand:HI 2 "register_operand"))
+   (set (match_dup 2)
+        (match_dup 0))]
+  "REGNO (operands[0]) != REGNO (operands[1])
+   && REGNO (operands[0]) != REGNO (operands[2])
+   && REGNO (operands[1]) != REGNO (operands[2])
+   && peep2_reg_dead_p (3, operands[0])"
+  [(parallel [(set (match_dup 2) (match_dup 1))
+              (set (match_dup 1) (match_dup 2))])])
+
new file mode 100644
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned short foo(unsigned short x)
+{
+  return __builtin_bswap16 (x);
+}
+
+/* { dg-final { scan-assembler "swpb r2" } } */
new file mode 100644
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned long foo(unsigned long x)
+{
+  return __builtin_bswap32 (x);
+}
+
+/* { dg-final { scan-assembler "swpb" } } */
new file mode 100644
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned short foo(unsigned short x)
+{
+  return (x>>8) | (x<<8);
+}
+
+/* { dg-final { scan-assembler "swpb r2" } } */
new file mode 100644
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void ext(int x, int y);
+
+void foo(int x, int y) { ext(y,x); }
+
+/* { dg-final { scan-assembler "swpw r3,r2" } } */
new file mode 100644
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void ext(int x, int y);
+
+void foo(int x, int y)
+{
+  int t1 = x ^ y;
+  int t2 = t1 ^ x;
+  int t3 = t1 ^ y;
+  ext(t2,t3);
+}
+
+/* { dg-final { scan-assembler "swpw r3,r2" } } */