[v1] LoongArch: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO
Commit Message
LoongArch already defines ctz and clz in the backend, but for GCC to
perform the CTZ transformation in the forwprop2 pass, it needs to know
the value of c[lt]z at zero. This can benefit some workloads
(such as SPEC2017 deepsjeng_r).
After implementing the macros, we measured the dynamic instruction count on
deepsjeng_r:
- before 1688423249186
- after 1660311215745 (1.66% reduction)
---
gcc/config/loongarch/loongarch.h | 5 +++++
gcc/testsuite/gcc.dg/pr90838.c | 5 +++++
2 files changed, 10 insertions(+)
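For readers unfamiliar with the transformation, the following is a minimal sketch of the table-based count-trailing-zeros idiom that gcc.dg/pr90838.c exercises, next to a model of what it can be replaced with once ctz has a known value at zero. The function names (`ctz_table`, `ctz_defined_at_zero`, `ctz_masked`, `ctz_forms_agree`) are hypothetical and only illustrate the idea; they are not taken from the patch. The sketch assumes ctz of zero yields the operand width (32 here), as the new macros state, so masking with 31 reproduces the 0 that the table lookup returns for a zero input.

```c
#include <stdint.h>

/* De Bruijn table lookup: the source-level idiom that forwprop can
   recognize as a count-trailing-zeros operation.  Returns 0 for x == 0.  */
static unsigned ctz_table(uint32_t x)
{
  static const unsigned char table[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
  };
  return table[((x & -x) * 0x077CB531u) >> 27];
}

/* Software model of a ctz whose value at zero is the operand width
   (an assumption mirroring CTZ_DEFINED_VALUE_AT_ZERO in the patch).  */
static unsigned ctz_defined_at_zero(uint32_t x)
{
  unsigned n = 0;
  if (x == 0)
    return 32;
  while ((x & 1u) == 0) {
    x >>= 1;
    n++;
  }
  return n;
}

/* Replacement form: one ctz plus a mask, so that ctz(0) == 32 maps to 0.  */
static unsigned ctz_masked(uint32_t x)
{
  return ctz_defined_at_zero(x) & 31u;
}

/* Returns 1 if both forms agree on zero, every single-bit value,
   and a few multi-bit values; 0 otherwise.  */
static int ctz_forms_agree(void)
{
  static const uint32_t extra[] = { 0x50u, 0xFFFF0000u, 0x12345678u };
  unsigned i;
  if (ctz_table(0) != ctz_masked(0))
    return 0;
  for (i = 0; i < 32; i++)
    if (ctz_table((uint32_t)1 << i) != ctz_masked((uint32_t)1 << i))
      return 0;
  for (i = 0; i < sizeof extra / sizeof extra[0]; i++)
    if (ctz_table(extra[i]) != ctz_masked(extra[i]))
      return 0;
  return 1;
}
```

The mask in `ctz_masked` is likely what the `andi` counts in the new test expectations correspond to: one mask per recognized ctz.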
Comments
On Thu, 2023-11-16 at 20:30 +0800, Li Wei wrote:
> LoongArch already defines ctz and clz in the backend, but for GCC to
> perform the CTZ transformation in the forwprop2 pass, it needs to know
> the value of c[lt]z at zero. This can benefit some workloads
> (such as SPEC2017 deepsjeng_r).
>
> After implementing the macros, we measured the dynamic instruction count on
> deepsjeng_r:
> - before 1688423249186
> - after 1660311215745 (1.66% reduction)
LGTM, nice catch!
> ---
> gcc/config/loongarch/loongarch.h | 5 +++++
> gcc/testsuite/gcc.dg/pr90838.c | 5 +++++
> 2 files changed, 10 insertions(+)
>
> diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
> index ddac8e98ea9..115222e70fd 100644
> --- a/gcc/config/loongarch/loongarch.h
> +++ b/gcc/config/loongarch/loongarch.h
> @@ -1239,3 +1239,8 @@ struct GTY (()) machine_function
>
> #define TARGET_EXPLICIT_RELOCS \
> (la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS)
> +
> +#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> + ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> + ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> diff --git a/gcc/testsuite/gcc.dg/pr90838.c b/gcc/testsuite/gcc.dg/pr90838.c
> index 759059683a9..40aad70499d 100644
> --- a/gcc/testsuite/gcc.dg/pr90838.c
> +++ b/gcc/testsuite/gcc.dg/pr90838.c
> @@ -83,3 +83,8 @@ int ctz4 (unsigned long x)
> /* { dg-final { scan-assembler-times "ctz\t" 3 { target { rv32 } } } } */
> /* { dg-final { scan-assembler-times "andi\t" 1 { target { rv32 } } } } */
> /* { dg-final { scan-assembler-times "mul\t" 1 { target { rv32 } } } } */
> +
> +/* { dg-final { scan-tree-dump-times {= \.CTZ} 4 "forwprop2" { target { loongarch64*-*-* } } } } */
> +/* { dg-final { scan-assembler-times "ctz.d\t" 1 { target { loongarch64*-*-* } } } } */
> +/* { dg-final { scan-assembler-times "ctz.w\t" 3 { target { loongarch64*-*-* } } } } */
> +/* { dg-final { scan-assembler-times "andi\t" 4 { target { loongarch64*-*-* } } } } */
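The two macros in the loongarch.h hunk are comma expressions: they store the value at zero into VALUE and then evaluate to 2. Per the GCC internals documentation, 0 means the value is undefined, 1 means it is defined only for the RTL expression, and 2 means it also applies to the corresponding optab. The sketch below imitates that shape outside of GCC. Everything prefixed `demo_` or `DEMO_` is a hypothetical stand-in (including the mode enum and the bit-size helper) and is not GCC's real API.

```c
/* Hypothetical stand-in for GCC's machine modes.  */
enum demo_mode { DEMO_SImode, DEMO_DImode };

/* Hypothetical stand-in for GET_MODE_UNIT_BITSIZE.  */
static int demo_mode_unit_bitsize(enum demo_mode mode)
{
  return mode == DEMO_SImode ? 32 : 64;
}

/* Same shape as the macros in the patch: assign the value at zero,
   then yield 2 as the result of the comma expression.  */
#define DEMO_CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
  ((VALUE) = demo_mode_unit_bitsize (MODE), 2)

/* Query helper: returns the macro's result and stores the value at
   zero through VALUE_AT_ZERO.  */
static int demo_query_ctz_at_zero(enum demo_mode mode, int *value_at_zero)
{
  int value = -1;
  int kind = DEMO_CTZ_DEFINED_VALUE_AT_ZERO(mode, value);
  *value_at_zero = value;
  return kind;
}
```

A caller would typically compare the result against 2 before relying on the stored value, for example `if (demo_query_ctz_at_zero(DEMO_SImode, &v) == 2)`.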