[x86] Fix FAIL of gcc.target/i386/pr78794.c on ia32.

Message ID 020901d9a926$cc4b7950$64e26bf0$@nextmovesoftware.com
State Accepted
Series [x86] Fix FAIL of gcc.target/i386/pr78794.c on ia32.

Checks

Context                 Check     Description
snail/gcc-patch-check   success   Github commit url

Commit Message

Roger Sayle June 27, 2023, 6:40 p.m. UTC
  This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
is caused by minor STV rtx_cost differences with -march=silvermont.
It turns out that generic tuning results in pandn, but the lack of
accurate parameterization for COMPARE in compute_convert_gain combined
with small differences in scalar<->SSE costs on silvermont results in
this DImode chain not being converted.

The solution is to provide more accurate costs/gains for converting
(DImode and SImode) comparisons.
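
As a rough illustration of the new gains, here is a standalone C++ sketch of
the case analysis for COMPARE.  It is not the GCC code itself: CompareShape,
compare_gain and costs_n_insns are hypothetical names, the scale factor of 4
for COSTS_N_INSNS is an assumption, and m is assumed to be the number of
scalar instructions per operation (2 for DImode on ia32, 1 for SImode).

```cpp
#include <cassert>

// Stand-in for GCC's COSTS_N_INSNS; the factor of 4 is an assumption.
constexpr int costs_n_insns(int n) { return n * 4; }

// Hypothetical classification of the comparison being converted.
enum class CompareShape {
  NonZeroRhs,  // cmp x, y          -> pxor;pshufd;ptest
  PlainTest,   // x == 0            -> pshufd;ptest
  AndTest,     // (x & y) == 0      -> pshufd;ptest
  AndNotTest,  // (~x & y) == 0     -> pandn;pshufd;ptest
};

// Gain (scalar cost minus vector cost) for one comparison.
// m:       assumed scalar instructions per operation (2 for DImode on ia32).
// has_bmi: whether a scalar andn instruction is available.
constexpr int compare_gain(CompareShape shape, int m, bool has_bmi) {
  switch (shape) {
    case CompareShape::NonZeroRhs:
      return costs_n_insns(m - 3);
    case CompareShape::PlainTest:
      return costs_n_insns(m - 2);
    case CompareShape::AndTest:
      return costs_n_insns(2 * m - 2);
    case CompareShape::AndNotTest:
      return has_bmi ? costs_n_insns(2 * m - 3)   // andn;test
                     : costs_n_insns(3 * m - 3);  // not;and;test
  }
  return 0;  // not reached for valid enumerators
}

// Worked values under the assumptions above.
static_assert(compare_gain(CompareShape::AndNotTest, 2, false) == 12,
              "DImode on ia32, no BMI: conversion is a clear win");
static_assert(compare_gain(CompareShape::AndNotTest, 2, true) == 4,
              "DImode on ia32, BMI: smaller win");
static_assert(compare_gain(CompareShape::NonZeroRhs, 1, false) == -8,
              "SImode cmp: conversion is a loss");
```

Under these assumptions the (~x & y) == 0 shape without BMI gets the largest
gain of the five cases, where the old code contributed no gain at all.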

I'd been holding off on doing this as I'd thought it would be possible
to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector
win) but I've recently realized that these optimizations (as I've
implemented them) occur in the wrong order (stv2 occurs after
combine), so it isn't easy for STV to convert CCZmode into CCCmode.
Doh!  Perhaps something can be done in peephole2...
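
For reference, the pandn;ptestz versus ptestc equivalence can be checked with
a small portable C++ model.  This is only a sketch of the flag semantics as I
understand them (ZF set when a & b is zero, CF set when ~a & b is zero); V128
and the helper names are made up for illustration and no real SSE
instructions are executed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value modelled as two 64-bit halves.
struct V128 {
  std::uint64_t lo, hi;
};

// Model of pandn: (~a) & b.
constexpr V128 pandn(V128 a, V128 b) {
  return {~a.lo & b.lo, ~a.hi & b.hi};
}

// Model of the ZF result of ptest: set when (a & b) is all zeros.
constexpr bool ptest_zf(V128 a, V128 b) {
  return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}

// Model of the CF result of ptest: set when (~a & b) is all zeros.
constexpr bool ptest_cf(V128 a, V128 b) {
  return ((~a.lo & b.lo) | (~a.hi & b.hi)) == 0;
}

// Two-instruction form: pandn, then ptest of the result against itself (ZF).
constexpr bool via_pandn_ptestz(V128 a, V128 b) {
  V128 t = pandn(a, b);
  return ptest_zf(t, t);
}

// One-instruction form: ptest on the original operands, reading CF.
constexpr bool via_ptestc(V128 a, V128 b) {
  return ptest_cf(a, b);
}

// b is a subset of a's bits: both forms report "zero".
static_assert(via_pandn_ptestz({0xff, 0}, {0x0f, 0}) &&
              via_ptestc({0xff, 0}, {0x0f, 0}), "subset case");
// b has a bit outside a: both forms report "nonzero".
static_assert(!via_pandn_ptestz({0xff, 0}, {0x100, 0}) &&
              !via_ptestc({0xff, 0}, {0x100, 0}), "non-subset case");
```

The two forms agree on the result but read different flags, which is why the
single-instruction form needs CCCmode rather than CCZmode.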


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/78794
        * config/i386/i386-features.cc (compute_convert_gain): Provide
        more accurate gains for conversion of scalar comparisons to
        PTEST.


Thanks for your patience.
Roger
--
  

Comments

Uros Bizjak June 27, 2023, 8:02 p.m. UTC | #1
On Tue, Jun 27, 2023 at 8:40 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
> is caused by minor STV rtx_cost differences with -march=silvermont.
> It turns out that generic tuning results in pandn, but the lack of
> accurate parameterization for COMPARE in compute_convert_gain combined
> with small differences in scalar<->SSE costs on silvermont results in
> this DImode chain not being converted.
>
> The solution is to provide more accurate costs/gains for converting
> (DImode and SImode) comparisons.
>
> I'd been holding off on doing this as I'd thought it would be possible
> to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector
> win) but I've recently realized that these optimizations (as I've
> implemented them) occur in the wrong order (stv2 occurs after
> combine), so it isn't easy for STV to convert CCZmode into CCCmode.
> Doh!  Perhaps something can be done in peephole2...
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2023-06-27  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/78794
>         * config/i386/i386-features.cc (compute_convert_gain): Provide
>         more accurate gains for conversion of scalar comparisons to
>         PTEST.

LGTM.

Thanks,
Uros.

>
> Thanks for your patience.
> Roger
> --
>
  

Patch

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4a3b07a..53bec08 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -631,7 +631,31 @@  general_scalar_chain::compute_convert_gain ()
 	    break;
 
 	  case COMPARE:
-	    /* Assume comparison cost is the same.  */
+	    if (XEXP (src, 1) != const0_rtx)
+	      {
+		/* cmp vs. pxor;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (m - 3);
+	      }
+	    else if (GET_CODE (XEXP (src, 0)) != AND)
+	      {
+		/* test vs. pshufd;ptest.  */
+		igain += COSTS_N_INSNS (m - 2);
+	      }
+	    else if (GET_CODE (XEXP (XEXP (src, 0), 0)) != NOT)
+	      {
+		/* and;test vs. pshufd;ptest.  */
+		igain += COSTS_N_INSNS (2 * m - 2);
+	      }
+	    else if (TARGET_BMI)
+	      {
+		/* andn;test vs. pandn;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (2 * m - 3);
+	      }
+	    else
+	      {
+		/* not;and;test vs. pandn;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (3 * m - 3);
+	      }
 	    break;
 
 	  case CONST_INT: