[v3] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT

Message ID 20240116023231.53164-1-xujiahao@loongson.cn
State Unresolved
Headers
Series [v3] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Jiahao Xu Jan. 16, 2024, 2:32 a.m. UTC
  Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the
short-circuit operation instead of the non-short-circuit operation.

SPEC2017 performance evaluation shows 1% performance improvement for fprate
GEOMEAN and no obvious regression for others. Especially, 526.blender_r +10.6%
on 3A6000.

gcc/ChangeLog:

	* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/short-circuit.c: New test.
  

Comments

chenglulu Jan. 19, 2024, 8:54 a.m. UTC | #1
Hi, Jiahao:

This patch will introduce redundant FAIL, and the reason needs to be explained.

+FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Conditional combines static and invariant" 1
+FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Will duplicate bb" 2
+FAIL: gcc.dg/tree-ssa/update-threading.c scan-tree-dump-times optimized "Invalid sum" 0

在 2024/1/16 上午10:32, Jiahao Xu 写道:
> Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the
> short-circuit operation instead of the non-short-circuit operation.
>
> SPEC2017 performance evaluation shows 1% performance improvement for fprate
> GEOMEAN and no obvious regression for others. Especially, 526.blender_r +10.6%
> on 3A6000.
>
> gcc/ChangeLog:
>
> 	* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/loongarch/short-circuit.c: New test.
>
> diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
> index 4e6ede926d3..8b453ab3140 100644
> --- a/gcc/config/loongarch/loongarch.h
> +++ b/gcc/config/loongarch/loongarch.h
> @@ -869,6 +869,7 @@ typedef struct {
>      1 is the default; other values are interpreted relative to that.  */
>   
>   #define BRANCH_COST(speed_p, predictable_p) la_branch_cost
> +#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
>   
>   /* Return the asm template for a conditional branch instruction.
>      OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
> diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> new file mode 100644
> index 00000000000..bed585ee172
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */
> +
> +int
> +short_circuit (float *a)
> +{
> +  float t1x = a[0];
> +  float t2x = a[1];
> +  float t1y = a[2];
> +  float t2y = a[3];
> +  float t1z = a[4];
> +  float t2z = a[5];
> +
> +  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
> +    return 0;
> +
> +  return 1;
> +}
> +/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */
  
Jiahao Xu Jan. 19, 2024, 9:50 a.m. UTC | #2
The test case gcc.dg/tree-ssa/copy-headers-8.c fails for a target where 
LOGICAL_OP_NON_SHORT_CIRCUIT is defined as 0.It is suggested to add 
`--param logical-op-non-short-circuit=1` to the test case to make it a 
target-independent testcase.

see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111248.

在 2024/1/19 下午4:54, chenglulu 写道:
> Hi, Jiahao:
>
> This patch will introduce redundant FAIL, and the reason needs to be explained.
>
> +FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Conditional combines static and invariant" 1
> +FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Will duplicate bb" 2
> +FAIL: gcc.dg/tree-ssa/update-threading.c scan-tree-dump-times optimized "Invalid sum" 0
> 在 2024/1/16 上午10:32, Jiahao Xu 写道:
>> Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the
>> short-circuit operation instead of the non-short-circuit operation.
>>
>> SPEC2017 performance evaluation shows 1% performance improvement for fprate
>> GEOMEAN and no obvious regression for others. Especially, 526.blender_r +10.6%
>> on 3A6000.
>>
>> gcc/ChangeLog:
>>
>> 	* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	* gcc.target/loongarch/short-circuit.c: New test.
>>
>> diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
>> index 4e6ede926d3..8b453ab3140 100644
>> --- a/gcc/config/loongarch/loongarch.h
>> +++ b/gcc/config/loongarch/loongarch.h
>> @@ -869,6 +869,7 @@ typedef struct {
>>      1 is the default; other values are interpreted relative to that.  */
>>   
>>   #define BRANCH_COST(speed_p, predictable_p) la_branch_cost
>> +#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
>>   
>>   /* Return the asm template for a conditional branch instruction.
>>      OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
>> diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
>> new file mode 100644
>> index 00000000000..bed585ee172
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
>> @@ -0,0 +1,19 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */
>> +
>> +int
>> +short_circuit (float *a)
>> +{
>> +  float t1x = a[0];
>> +  float t2x = a[1];
>> +  float t1y = a[2];
>> +  float t2y = a[3];
>> +  float t1z = a[4];
>> +  float t2z = a[5];
>> +
>> +  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
>> +    return 0;
>> +
>> +  return 1;
>> +}
>> +/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */
  
chenglulu Jan. 26, 2024, 8:13 a.m. UTC | #3
Pushed to r14-8446.

在 2024/1/16 上午10:32, Jiahao Xu 写道:
> Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the
> short-circuit operation instead of the non-short-circuit operation.
>
> SPEC2017 performance evaluation shows 1% performance improvement for fprate
> GEOMEAN and no obvious regression for others. Especially, 526.blender_r +10.6%
> on 3A6000.
>
> gcc/ChangeLog:
>
> 	* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/loongarch/short-circuit.c: New test.
>
> diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
> index 4e6ede926d3..8b453ab3140 100644
> --- a/gcc/config/loongarch/loongarch.h
> +++ b/gcc/config/loongarch/loongarch.h
> @@ -869,6 +869,7 @@ typedef struct {
>      1 is the default; other values are interpreted relative to that.  */
>   
>   #define BRANCH_COST(speed_p, predictable_p) la_branch_cost
> +#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
>   
>   /* Return the asm template for a conditional branch instruction.
>      OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
> diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> new file mode 100644
> index 00000000000..bed585ee172
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */
> +
> +int
> +short_circuit (float *a)
> +{
> +  float t1x = a[0];
> +  float t2x = a[1];
> +  float t1y = a[2];
> +  float t2y = a[3];
> +  float t1z = a[4];
> +  float t2z = a[5];
> +
> +  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
> +    return 0;
> +
> +  return 1;
> +}
> +/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */
  

Patch

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index 4e6ede926d3..8b453ab3140 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -869,6 +869,7 @@  typedef struct {
    1 is the default; other values are interpreted relative to that.  */
 
 #define BRANCH_COST(speed_p, predictable_p) la_branch_cost
+#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
 
 /* Return the asm template for a conditional branch instruction.
    OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
new file mode 100644
index 00000000000..bed585ee172
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */
+
+int
+short_circuit (float *a)
+{
+  float t1x = a[0];
+  float t2x = a[1];
+  float t1y = a[2];
+  float t2y = a[3];
+  float t1z = a[4];
+  float t2z = a[5];
+
+  if (t1x > t2y  || t2x < t1y  || t1x > t2z || t2x < t1z || t1y > t2z || t2y < t1z)
+    return 0;
+
+  return 1;
+}
+/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */