LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

Message ID 20231212095006.12830-1-xujiahao@loongson.cn
State Accepted
Series LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

Checks

Context                Check    Description
snail/gcc-patch-check  success  Github commit url

Commit Message

Jiahao Xu Dec. 12, 2023, 9:50 a.m. UTC
  Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0 so that a short-circuit condition is
compiled as short-circuit branches instead of being converted to a
non-short-circuit operation.

This gives a 1.8% improvement in SPECCPU 2017 fprate on 3A6000.

gcc/ChangeLog:

	* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/short-circuit.c: New test.
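
For illustration only, the two evaluation strategies that the macro chooses
between can be written by hand as below. This is a sketch, not compiler
output: the function names are hypothetical, and GCC performs the
transformation on its internal representation rather than on source code.

```c
#include <stdbool.h>

/* Short-circuit form: each comparison gets its own conditional branch,
   and evaluation stops at the first comparison that is true.
   (Hypothetical helper name, for illustration only.)  */
bool
any_short_circuit (float a, float b, float c, float d)
{
  if (a > b)
    return true;
  if (c < d)
    return true;
  return false;
}

/* Non-short-circuit form: both comparisons are always evaluated and
   their results are combined with a bitwise OR, so only the final
   result needs a branch.  (Hypothetical helper name.)  */
bool
any_non_short_circuit (float a, float b, float c, float d)
{
  bool t1 = a > b;
  bool t2 = c < d;
  return (t1 | t2) != 0;
}
```

Both functions return the same result for every input; they differ only in
how many comparisons and branches are executed, which is the trade-off under
discussion here.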
  

Comments

Xi Ruoyao Dec. 12, 2023, 10:05 a.m. UTC | #1
On Tue, 2023-12-12 at 17:50 +0800, Jiahao Xu wrote:
> diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> new file mode 100644
> index 00000000000..2cef0193466
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-gimple" } */
> +
> +int 
> +short_circuit (float *a)
> +{
> +  float t1x = a[0];
> +  float t2x = a[1];
> +  float t1y = a[2];
> +  float t2y = a[3];
> +  float t1z = a[4];
> +  float t2z = a[5];
> +
> +  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
> +    return 0;
> +
> +  return 1;
> +}
> +/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */

This test already passes without defining LOGICAL_OP_NON_SHORT_CIRCUIT.
Or am I missing something here?
  
Jiahao Xu Dec. 12, 2023, 11:08 a.m. UTC | #2
On 2023/12/12 at 6:05 PM, Xi Ruoyao wrote:
> On Tue, 2023-12-12 at 17:50 +0800, Jiahao Xu wrote:
>> diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
>> new file mode 100644
>> index 00000000000..2cef0193466
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
>> @@ -0,0 +1,19 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-gimple" } */
>> +
>> +int
>> +short_circuit (float *a)
>> +{
>> +  float t1x = a[0];
>> +  float t2x = a[1];
>> +  float t1y = a[2];
>> +  float t2y = a[3];
>> +  float t1z = a[4];
>> +  float t2z = a[5];
>> +
>> +  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
>> +    return 0;
>> +
>> +  return 1;
>> +}
>> +/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */
> This test already passes without defining LOGICAL_OP_NON_SHORT_CIRCUIT.
> Or am I missing something here?
This test also needs the -ffast-math compilation option, which I missed.
Thanks for the reminder.
  
Xi Ruoyao Dec. 12, 2023, 11:21 a.m. UTC | #3
On Tue, 2023-12-12 at 19:08 +0800, Jiahao Xu wrote:
> This test also needs the -ffast-math compilation option, which I missed.
> Thanks for the reminder.

In r14-15 we removed the LOGICAL_OP_NON_SHORT_CIRCUIT definition because the
default value (1 for all current LoongArch CPUs with branch_cost = 6)
may reduce the number of conditional branch instructions.

I guess the problem here is that floating-point compare instructions are
much more costly than other instructions, but this fact is not correctly
modeled yet.  Could you try
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640012.html
where I've raised the fp_add cost (which is used for estimating the
floating-point compare cost) to 5 instructions, and see if it solves your
problem without LOGICAL_OP_NON_SHORT_CIRCUIT?
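
To make the "default value" above concrete, here is a simplified, standalone
sketch of how the fallback is derived from BRANCH_COST when a target does not
define the macro. It assumes the middle-end fallback has the form
"BRANCH_COST (...) >= 2"; the constant first argument and the helper function
name are stand-ins introduced for this sketch, not GCC code.

```c
/* Stand-in for the LoongArch tuning value mentioned above
   (branch_cost = 6).  */
static const int loongarch_branch_cost = 6;

/* Same shape as the definition in loongarch.h; both arguments are
   ignored.  */
#define BRANCH_COST(speed_p, predictable_p) loongarch_branch_cost

/* Assumed fallback when the target leaves the macro undefined.
   Simplified: a constant 1 stands in for the "optimizing for speed"
   query that the compiler would pass as the first argument.  */
#ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
#define LOGICAL_OP_NON_SHORT_CIRCUIT (BRANCH_COST (1, 0) >= 2)
#endif

/* Hypothetical helper that exposes the resulting value.  */
int
default_logical_op_non_short_circuit (void)
{
  return LOGICAL_OP_NON_SHORT_CIRCUIT;
}
```

With a branch cost of 6 the expression evaluates to 1, so the
non-short-circuit form is preferred unless the target overrides the macro,
as the patch below does.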
  

Patch

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index f1350b6048f..880c576c35b 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -869,6 +869,7 @@  typedef struct {
    1 is the default; other values are interpreted relative to that.  */
 
 #define BRANCH_COST(speed_p, predictable_p) loongarch_branch_cost
+#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
 
 /* Return the asm template for a conditional branch instruction.
    OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
new file mode 100644
index 00000000000..2cef0193466
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple" } */
+
+int 
+short_circuit (float *a)
+{
+  float t1x = a[0];
+  float t2x = a[1];
+  float t1y = a[2];
+  float t2y = a[3];
+  float t1z = a[4];
+  float t2z = a[5];
+
+  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
+    return 0;
+
+  return 1;
+}
+/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */