[arm] adjust expectations for armv8_2-fp16-move-[12].c

Message ID orfsb4sr3w.fsf@lxoliva.fsfla.org
State Accepted

Commit Message

Alexandre Oliva Feb. 17, 2023, 7:12 a.m. UTC
  Commit 3a7ba8fd0cda387809e4902328af2473662b6a4a, a patch for
tree-ssa-sink, enabled the removal of basic blocks in ways that
affected the generated code for both of these tests, deviating from
the expectations of the tests.

The simplest case is that of -2, in which the edge unsplitting ends up
enabling a conditional return rather than a conditional branch to a
set-and-return block.  That looks like an improvement to me.  Since the
condition under which the branch or the return takes place can
reasonably be reversed (and, with the current code, it is), I've
relaxed the pattern in the test so as to accept reversed and
unreversed conditions applied to return or branch opcodes.
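
As an illustration only (not part of the patch), the sketch below checks
what the relaxed pattern accepts compared with the original one.  It
uses Python's re module as a stand-in for the Tcl regexps that
scan-assembler evaluates, which should behave the same for patterns
this simple.  The assembly lines, the .L2 label and the matches()
helper are made up for the example; they are not actual compiler
output.

```python
import re

# Pattern from the updated armv8_2-fp16-move-2.c directive.
RELAXED = re.compile(r"bx?(mi|pl)")
# Pattern the test used before the change, kept for comparison.
ORIGINAL = re.compile(r"bmi")


def matches(pattern, asm):
    """Mimic scan-assembler: succeed if the pattern matches anywhere."""
    return pattern.search(asm) is not None


# Hypothetical assembly lines, one per combination of opcode and condition.
SAMPLES = {
    "branch, original condition": "\tbmi\t.L2\n",
    "branch, reversed condition": "\tbpl\t.L2\n",
    "return, original condition": "\tbxmi\tlr\n",
    "return, reversed condition": "\tbxpl\tlr\n",
}

for label, asm in SAMPLES.items():
    print(f"{label}: old={matches(ORIGINAL, asm)} new={matches(RELAXED, asm)}")
```

Only the first sample satisfies the old pattern, while all four satisfy
the relaxed one.  Like the original, the relaxed pattern is unanchored,
so it matches those letters anywhere in the assembly output.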

The situation in -1 is a little more elaborate: conditional branches
based on FP compares in test_select_[78] are initially expanded with
CCFPE compare-and-cbranch on G{T,E}, but when ce2 turns those into a
cmove, because now we have a different fallthrough block, the
condition is reversed, and that lands us with a compare-and-cmove
sequence that needs CCFP for UNL{E,T}.  The insn output reverses the
condition and swaps the cmove input operands, so the vcmp and vsel
insns come out the same except for the missing 'e' (for the compare
mode) in vcmp.  Since such reversals could have happened to any of
the tests depending on legitimate basic block layout, I've combined
the vcmp and vcmpe counts.
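
As an illustration only (not part of the patch), the sketch below shows
why a single combined count is insensitive to such reversals while the
two separate counts are not.  Python's re module again stands in for
the Tcl regexps used by scan-assembler-times.  The fake_asm() and
count() helpers, the register operands and the 6/6 split are
assumptions made up for the example; only the 4 and 8 totals come from
the old directives.

```python
import re

# Combined pattern from the updated armv8_2-fp16-move-1.c directive.
COMBINED = re.compile(r"vcmpe?\.f32")
# The two patterns the test counted separately before the change.
OLD_VCMP = re.compile(r"vcmp\.f32")
OLD_VCMPE = re.compile(r"vcmpe\.f32")


def count(pattern, asm):
    """Mimic scan-assembler-times: count non-overlapping matches."""
    return len(pattern.findall(asm))


def fake_asm(n_vcmp, n_vcmpe):
    """Build hypothetical assembly with the given number of each compare."""
    lines = ["\tvcmp.f32\ts0, s1"] * n_vcmp + ["\tvcmpe.f32\ts0, s1"] * n_vcmpe
    return "\n".join(lines) + "\n"


# Split matching the old expectations: 4 vcmp and 8 vcmpe.
before = fake_asm(4, 8)
# Hypothetical split after two compares lose their 'e'.
after = fake_asm(6, 6)

for name, asm in (("before", before), ("after", after)):
    print(name, count(OLD_VCMP, asm), count(OLD_VCMPE, asm), count(COMBINED, asm))
```

The separate counts change between the two inputs, but the combined
count stays at 12 in both, which is what the new directive checks.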

I see room for improving cmove sequence generation, e.g. trying direct
and reversed conditions and selecting the cheapest one (which would
require CCFP conditions to be modeled as more expensive than CCFPE),
or for some other machine-specific (peephole2?) optimization to turn
CCFP-requiring compare and cmove into CCFPE compare and swapped-inputs
cmove, but I haven't tried that.

Regstrapped on x86_64-linux-gnu.
Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?

for  gcc/testsuite/ChangeLog

	* gcc.target/arm/armv8_2-fp16-move-1.c: Combine vcmp and vcmpe
	expected counts into a single pattern.
	* gcc.target/arm/armv8_2-fp16-move-2.c: Accept conditional
	return and reversed conditions.
---
 gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c |    3 +--
 gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c |    2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)
  

Comments

Alexandre Oliva March 3, 2023, 8:28 a.m. UTC | #1
On Feb 17, 2023, Alexandre Oliva <oliva@adacore.com> wrote:

> 	* gcc.target/arm/armv8_2-fp16-move-1.c: Combine vcmp and vcmpe
> 	expected counts into a single pattern.
> 	* gcc.target/arm/armv8_2-fp16-move-2.c: Accept conditional
> 	return and reversed conditions.

Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612190.html
  
Kyrylo Tkachov March 3, 2023, 9:19 a.m. UTC | #2
> -----Original Message-----
> From: Alexandre Oliva <oliva@adacore.com>
> Sent: Friday, February 17, 2023 7:12 AM
> To: gcc-patches@gcc.gnu.org
> Cc: nickc@redhat.com; Richard Earnshaw <Richard.Earnshaw@arm.com>
> Subject: [PATCH] [arm] adjust expectations for armv8_2-fp16-move-[12].c
> 
> 
> Commit 3a7ba8fd0cda387809e4902328af2473662b6a4a, a patch for
> tree-ssa-sink, enabled the removal of basic blocks in ways that
> affected the generated code for both of these tests, deviating from
> the expectations of the tests.
> 
> The simplest case is that of -2, in which the edge unsplitting ends up
> enabling a conditional return rather than a conditional branch to a
> set-and-return block.  That looks like an improvement to me, but the
> condition in which the branch or the return takes place can be
> reasonably reversed (and, with the current code, it is), I've relaxed
> the pattern in the test so as to accept reversed and unreversed
> conditions applied to return or branch opcodes.
> 
> The situation in -1 is a little more elaborate: conditional branches
> based on FP compares in test_select_[78] are initially expanded with
> CCFPE compare-and-cbranch on G{T,E}, but when ce2 turns those into a
> cmove, because now we have a different fallthrough block, the
> condition is reversed, and that lands us with a compare-and-cmove
> sequence that needs CCFP for UNL{E,T}.  The insn output reverses the
> condition and swaps the cmove input operands, so the vcmp and vsel
> insns come out the same except for the missing 'e' (for the compare
> mode) in vcmp, so, since such reversals could have happened to any of
> the tests depending on legitimate basic block layout, I've combined
> the vcmp and vcmpe counts.
> 
> I see room for improving cmove sequence generation, e.g. trying direct
> and reversed conditions and selecting the cheapest one (which would
> require CCFP conditions to be modeled as more expensive than CCFPE),
> or for some other machine-specific (peephole2?) optimization to turn
> CCFP-requiring compare and cmove into CCFPE compare and swapped-inputs
> cmove, but I haven't tried that.
> 
> Regstrapped on x86_64-linux-gnu.
> Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?

The changes in the patch are okay for now. We can look at other improvements separately.
Thanks,
Kyrill

> 
> for  gcc/testsuite/ChangeLog
> 
> 	* gcc.target/arm/armv8_2-fp16-move-1.c: Combine vcmp and vcmpe
> 	expected counts into a single pattern.
> 	* gcc.target/arm/armv8_2-fp16-move-2.c: Accept conditional
> 	return and reversed conditions.
> ---
>  gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c |    3 +--
>  gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c |    2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
> index 009bb8d1575a4..444c4a3353555 100644
> --- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
> +++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
> @@ -196,5 +196,4 @@ test_compare_5 (__fp16 a, __fp16 b)
>  /* { dg-final { scan-assembler-not {vcmp\.f16} } }  */
>  /* { dg-final { scan-assembler-not {vcmpe\.f16} } }  */
> 
> -/* { dg-final { scan-assembler-times {vcmp\.f32} 4 } }  */
> -/* { dg-final { scan-assembler-times {vcmpe\.f32} 8 } }  */
> +/* { dg-final { scan-assembler-times {vcmpe?\.f32} 12 } }  */
> diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c
> index fcb857f29ff15..dff57ac8147c2 100644
> --- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c
> +++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c
> @@ -8,4 +8,4 @@ test_select (__fp16 a, __fp16 b, __fp16 c)
>  {
>    return (a < b) ? b : c;
>  }
> -/* { dg-final { scan-assembler "bmi" } } */
> +/* { dg-final { scan-assembler "bx?(mi|pl)" } } */
> 
  

Patch

diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
index 009bb8d1575a4..444c4a3353555 100644
--- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
+++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
@@ -196,5 +196,4 @@  test_compare_5 (__fp16 a, __fp16 b)
 /* { dg-final { scan-assembler-not {vcmp\.f16} } }  */
 /* { dg-final { scan-assembler-not {vcmpe\.f16} } }  */
 
-/* { dg-final { scan-assembler-times {vcmp\.f32} 4 } }  */
-/* { dg-final { scan-assembler-times {vcmpe\.f32} 8 } }  */
+/* { dg-final { scan-assembler-times {vcmpe?\.f32} 12 } }  */
diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c
index fcb857f29ff15..dff57ac8147c2 100644
--- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c
+++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-2.c
@@ -8,4 +8,4 @@  test_select (__fp16 a, __fp16 b, __fp16 c)
 {
   return (a < b) ? b : c;
 }
-/* { dg-final { scan-assembler "bmi" } } */
+/* { dg-final { scan-assembler "bx?(mi|pl)" } } */