[1/3] PHIOPT: Add diamond bb form to factor_out_conditional_conversion

Message ID 20230507044332.1122612-1-apinski@marvell.com
State Accepted
Headers
Series [1/3] PHIOPT: Add diamond bb form to factor_out_conditional_conversion |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Andrew Pinski May 7, 2023, 4:43 a.m. UTC
  So the function factor_out_conditional_conversion already supports
diamond shaped bb forms, just need to be called for such a thing.

harden-cond-comp.c needed to be changed as we would optimize out the
conversion now and that causes the compare hardening not needing to
split the block which it was testing. So change it such that there
would be no chance of optimization.

Also add two testcases that showed the improvement. PR 103771 is
solved in ifconvert also for the vectorizer but now it is solved
in a general sense.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/49959
	PR tree-optimization/103771

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (pass_phiopt::execute): Support
	Diamond shapped bb form for factor_out_conditional_conversion.

gcc/testsuite/ChangeLog:

	* c-c++-common/torture/harden-cond-comp.c: Change testcase
	slightly to avoid the new phiopt optimization.
	* gcc.dg/tree-ssa/abs-2.c: New test.
	* gcc.dg/tree-ssa/pr103771.c: New test.
---
 .../c-c++-common/torture/harden-cond-comp.c   |  6 +++---
 gcc/testsuite/gcc.dg/tree-ssa/abs-2.c         | 20 +++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr103771.c      | 18 +++++++++++++++++
 gcc/tree-ssa-phiopt.cc                        |  2 +-
 4 files changed, 42 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
  

Comments

Richard Biener May 8, 2023, 6:51 a.m. UTC | #1
On Sun, May 7, 2023 at 7:05 AM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> So the function factor_out_conditional_conversion already supports
> diamond shaped bb forms, just need to be called for such a thing.
>
> harden-cond-comp.c needed to be changed as we would optimize out the
> conversion now and that causes the compare hardening not needing to
> split the block which it was testing. So change it such that there
> would be no chance of optimization.
>
> Also add two testcases that showed the improvement. PR 103771 is
> solved in ifconvert also for the vectorizer but now it is solved
> in a general sense.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

So what does it do for the diamond case?  Factor out _common_ operations
only or also promote conditionally executed operations to unconditionally
executed?  Apart from cost considerations (with the looping patch any number
might end up promoted) there's correctness issues for negation (signed overflow)
and for non-call exceptions there's possible external throws.

If it only factors out common operations the patch is OK (but the testcases
suggest that's not the case).  Otherwise we should add a limit as to how many
ops we hoist (maybe one for the diamond case?).  Thinking over it the same
issue of course exists for the non-diamond case?

Richard.

>         PR tree-optimization/49959
>         PR tree-optimization/103771
>
> gcc/ChangeLog:
>
>         * tree-ssa-phiopt.cc (pass_phiopt::execute): Support
>         Diamond shapped bb form for factor_out_conditional_conversion.
>
> gcc/testsuite/ChangeLog:
>
>         * c-c++-common/torture/harden-cond-comp.c: Change testcase
>         slightly to avoid the new phiopt optimization.
>         * gcc.dg/tree-ssa/abs-2.c: New test.
>         * gcc.dg/tree-ssa/pr103771.c: New test.
> ---
>  .../c-c++-common/torture/harden-cond-comp.c   |  6 +++---
>  gcc/testsuite/gcc.dg/tree-ssa/abs-2.c         | 20 +++++++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/pr103771.c      | 18 +++++++++++++++++
>  gcc/tree-ssa-phiopt.cc                        |  2 +-
>  4 files changed, 42 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
>
> diff --git a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> index 5aad890a1d3..dcf364ee993 100644
> --- a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> +++ b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> @@ -1,11 +1,11 @@
>  /* { dg-do compile } */
>  /* { dg-options "-fharden-conditional-branches -fharden-compares -fdump-tree-hardcbr -fdump-tree-hardcmp -ffat-lto-objects" } */
>
> -int f(int i, int j) {
> +int f(int i, int j, int k, int l) {
>    if (i == 0)
> -    return j != 0;
> +    return (j != 0) + l;
>    else
> -    return i * j != 0;
> +    return (i * j != 0) * k;
>  }
>
>  /* { dg-final { scan-tree-dump-times "Splitting edge" 2 "hardcbr" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> new file mode 100644
> index 00000000000..328b1802541
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> @@ -0,0 +1,20 @@
> +/* PR tree-optimization/49959 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-phiopt1-details" } */
> +
> +#define ABS(X)    (((X)>0)?(X):-(X))
> +unsigned long
> +test_abs(int *cur)
> +{
> +  unsigned long sad = 0;
> +  if (cur[0] > 0)
> +    sad = cur[0];
> +  else
> +    sad = -cur[0];
> +  return sad;
> +}
> +
> +/* We should figure out that test_abs has an ABS_EXPR in it. */
> +/* { dg-final { scan-tree-dump " = ABS_EXPR" "phiopt1"} } */
> +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from" 1 "phiopt1"} } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> new file mode 100644
> index 00000000000..97c9db846cb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-tree-phiopt1-details" } */
> +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from COND_EXPR." 1 "phiopt1" } } */
> +
> +typedef unsigned char uint8_t;
> +
> +static uint8_t x264_clip_uint8 (int x)
> +{
> +  return x & (~255) ? (-x) >> 31 : x;
> +}
> +
> +void
> +mc_weight (uint8_t* __restrict dst, uint8_t* __restrict src,
> +          int i_width,int i_scale)
> +{
> +  for(int x = 0; x < i_width; x++)
> +    dst[x] = x264_clip_uint8 (src[x] * i_scale);
> +}
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index f14b7e8b7e6..41fea78dc8d 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -4087,7 +4087,7 @@ pass_phiopt::execute (function *)
>
>        gphi *newphi;
>        if (single_pred_p (bb1)
> -         && !diamond_p
> +         && EDGE_COUNT (merge->preds) == 2
>           && (newphi = factor_out_conditional_conversion (e1, e2, phi,
>                                                           arg0, arg1,
>                                                           cond_stmt)))
> --
> 2.31.1
>
  
Richard Biener May 8, 2023, 6:52 a.m. UTC | #2
On Mon, May 8, 2023 at 8:51 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Sun, May 7, 2023 at 7:05 AM Andrew Pinski via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > So the function factor_out_conditional_conversion already supports
> > diamond shaped bb forms, just need to be called for such a thing.
> >
> > harden-cond-comp.c needed to be changed as we would optimize out the
> > conversion now and that causes the compare hardening not needing to
> > split the block which it was testing. So change it such that there
> > would be no chance of optimization.
> >
> > Also add two testcases that showed the improvement. PR 103771 is
> > solved in ifconvert also for the vectorizer but now it is solved
> > in a general sense.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> So what does it do for the diamond case?  Factor out _common_ operations
> only or also promote conditionally executed operations to unconditionally
> executed?  Apart from cost considerations (with the looping patch any number
> might end up promoted) there's correctness issues for negation (signed overflow)
> and for non-call exceptions there's possible external throws.
>
> If it only factors out common operations the patch is OK (but the testcases
> suggest that's not the case).  Otherwise we should add a limit as to how many
> ops we hoist (maybe one for the diamond case?).  Thinking over it the same
> issue of course exists for the non-diamond case?

Oh, and we hoist the stuff without being sure the final PHI will be if-converted
in the end, correct?  Can we somehow improve on that, aka tentatively
convert if-convert the PHI first?

Richard.

> Richard.
>
> >         PR tree-optimization/49959
> >         PR tree-optimization/103771
> >
> > gcc/ChangeLog:
> >
> >         * tree-ssa-phiopt.cc (pass_phiopt::execute): Support
> >         Diamond shapped bb form for factor_out_conditional_conversion.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * c-c++-common/torture/harden-cond-comp.c: Change testcase
> >         slightly to avoid the new phiopt optimization.
> >         * gcc.dg/tree-ssa/abs-2.c: New test.
> >         * gcc.dg/tree-ssa/pr103771.c: New test.
> > ---
> >  .../c-c++-common/torture/harden-cond-comp.c   |  6 +++---
> >  gcc/testsuite/gcc.dg/tree-ssa/abs-2.c         | 20 +++++++++++++++++++
> >  gcc/testsuite/gcc.dg/tree-ssa/pr103771.c      | 18 +++++++++++++++++
> >  gcc/tree-ssa-phiopt.cc                        |  2 +-
> >  4 files changed, 42 insertions(+), 4 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> >
> > diff --git a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > index 5aad890a1d3..dcf364ee993 100644
> > --- a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > +++ b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > @@ -1,11 +1,11 @@
> >  /* { dg-do compile } */
> >  /* { dg-options "-fharden-conditional-branches -fharden-compares -fdump-tree-hardcbr -fdump-tree-hardcmp -ffat-lto-objects" } */
> >
> > -int f(int i, int j) {
> > +int f(int i, int j, int k, int l) {
> >    if (i == 0)
> > -    return j != 0;
> > +    return (j != 0) + l;
> >    else
> > -    return i * j != 0;
> > +    return (i * j != 0) * k;
> >  }
> >
> >  /* { dg-final { scan-tree-dump-times "Splitting edge" 2 "hardcbr" } } */
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > new file mode 100644
> > index 00000000000..328b1802541
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > @@ -0,0 +1,20 @@
> > +/* PR tree-optimization/49959 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -fdump-tree-phiopt1-details" } */
> > +
> > +#define ABS(X)    (((X)>0)?(X):-(X))
> > +unsigned long
> > +test_abs(int *cur)
> > +{
> > +  unsigned long sad = 0;
> > +  if (cur[0] > 0)
> > +    sad = cur[0];
> > +  else
> > +    sad = -cur[0];
> > +  return sad;
> > +}
> > +
> > +/* We should figure out that test_abs has an ABS_EXPR in it. */
> > +/* { dg-final { scan-tree-dump " = ABS_EXPR" "phiopt1"} } */
> > +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from" 1 "phiopt1"} } */
> > +
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > new file mode 100644
> > index 00000000000..97c9db846cb
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > @@ -0,0 +1,18 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -fdump-tree-phiopt1-details" } */
> > +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from COND_EXPR." 1 "phiopt1" } } */
> > +
> > +typedef unsigned char uint8_t;
> > +
> > +static uint8_t x264_clip_uint8 (int x)
> > +{
> > +  return x & (~255) ? (-x) >> 31 : x;
> > +}
> > +
> > +void
> > +mc_weight (uint8_t* __restrict dst, uint8_t* __restrict src,
> > +          int i_width,int i_scale)
> > +{
> > +  for(int x = 0; x < i_width; x++)
> > +    dst[x] = x264_clip_uint8 (src[x] * i_scale);
> > +}
> > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > index f14b7e8b7e6..41fea78dc8d 100644
> > --- a/gcc/tree-ssa-phiopt.cc
> > +++ b/gcc/tree-ssa-phiopt.cc
> > @@ -4087,7 +4087,7 @@ pass_phiopt::execute (function *)
> >
> >        gphi *newphi;
> >        if (single_pred_p (bb1)
> > -         && !diamond_p
> > +         && EDGE_COUNT (merge->preds) == 2
> >           && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> >                                                           arg0, arg1,
> >                                                           cond_stmt)))
> > --
> > 2.31.1
> >
  
Andrew Pinski May 8, 2023, 7:09 a.m. UTC | #3
On Sun, May 7, 2023 at 11:53 PM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Mon, May 8, 2023 at 8:51 AM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Sun, May 7, 2023 at 7:05 AM Andrew Pinski via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > So the function factor_out_conditional_conversion already supports
> > > diamond shaped bb forms, just need to be called for such a thing.
> > >
> > > harden-cond-comp.c needed to be changed as we would optimize out the
> > > conversion now and that causes the compare hardening not needing to
> > > split the block which it was testing. So change it such that there
> > > would be no chance of optimization.
> > >
> > > Also add two testcases that showed the improvement. PR 103771 is
> > > solved in ifconvert also for the vectorizer but now it is solved
> > > in a general sense.
> > >
> > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > So what does it do for the diamond case?  Factor out _common_ operations
> > only or also promote conditionally executed operations to unconditionally
> > executed?  Apart from cost considerations (with the looping patch any number
> > might end up promoted) there's correctness issues for negation (signed overflow)
> > and for non-call exceptions there's possible external throws.
> >
> > If it only factors out common operations the patch is OK (but the testcases
> > suggest that's not the case).

Right because the testcases are also testing if there was another
phiopt that happened in the same pass rather than having to wait
again.

>> Otherwise we should add a limit as to how many
> > ops we hoist (maybe one for the diamond case?).  Thinking over it the same
> > issue of course exists for the non-diamond case?
>
> Oh, and we hoist the stuff without being sure the final PHI will be if-converted
> in the end, correct?  Can we somehow improve on that, aka tentatively
> convert if-convert the PHI first?

Yes it hoists the stuff without any checks on if the final PHI will
have happened.
We have the same issue with hoisting adjacent loads.
I will think of a way to handle to maybe handle this better and even
undoing it; I will have to get back to you the details though.
And yes that was already an existing issue with the non-diamond case
of this; there was some extra checks added to prevent moving some
casts from being moved out of the PHI due to extra zero extends with
respect of __builtin_popcount/clz, etc.

Thanks,
Andrew



>
> Richard.
>
> > Richard.
> >
> > >         PR tree-optimization/49959
> > >         PR tree-optimization/103771
> > >
> > > gcc/ChangeLog:
> > >
> > >         * tree-ssa-phiopt.cc (pass_phiopt::execute): Support
> > >         Diamond shapped bb form for factor_out_conditional_conversion.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * c-c++-common/torture/harden-cond-comp.c: Change testcase
> > >         slightly to avoid the new phiopt optimization.
> > >         * gcc.dg/tree-ssa/abs-2.c: New test.
> > >         * gcc.dg/tree-ssa/pr103771.c: New test.
> > > ---
> > >  .../c-c++-common/torture/harden-cond-comp.c   |  6 +++---
> > >  gcc/testsuite/gcc.dg/tree-ssa/abs-2.c         | 20 +++++++++++++++++++
> > >  gcc/testsuite/gcc.dg/tree-ssa/pr103771.c      | 18 +++++++++++++++++
> > >  gcc/tree-ssa-phiopt.cc                        |  2 +-
> > >  4 files changed, 42 insertions(+), 4 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > >
> > > diff --git a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > > index 5aad890a1d3..dcf364ee993 100644
> > > --- a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > > +++ b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > > @@ -1,11 +1,11 @@
> > >  /* { dg-do compile } */
> > >  /* { dg-options "-fharden-conditional-branches -fharden-compares -fdump-tree-hardcbr -fdump-tree-hardcmp -ffat-lto-objects" } */
> > >
> > > -int f(int i, int j) {
> > > +int f(int i, int j, int k, int l) {
> > >    if (i == 0)
> > > -    return j != 0;
> > > +    return (j != 0) + l;
> > >    else
> > > -    return i * j != 0;
> > > +    return (i * j != 0) * k;
> > >  }
> > >
> > >  /* { dg-final { scan-tree-dump-times "Splitting edge" 2 "hardcbr" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > > new file mode 100644
> > > index 00000000000..328b1802541
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > > @@ -0,0 +1,20 @@
> > > +/* PR tree-optimization/49959 */
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -fdump-tree-phiopt1-details" } */
> > > +
> > > +#define ABS(X)    (((X)>0)?(X):-(X))
> > > +unsigned long
> > > +test_abs(int *cur)
> > > +{
> > > +  unsigned long sad = 0;
> > > +  if (cur[0] > 0)
> > > +    sad = cur[0];
> > > +  else
> > > +    sad = -cur[0];
> > > +  return sad;
> > > +}
> > > +
> > > +/* We should figure out that test_abs has an ABS_EXPR in it. */
> > > +/* { dg-final { scan-tree-dump " = ABS_EXPR" "phiopt1"} } */
> > > +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from" 1 "phiopt1"} } */
> > > +
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > > new file mode 100644
> > > index 00000000000..97c9db846cb
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > > @@ -0,0 +1,18 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O3 -fdump-tree-phiopt1-details" } */
> > > +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from COND_EXPR." 1 "phiopt1" } } */
> > > +
> > > +typedef unsigned char uint8_t;
> > > +
> > > +static uint8_t x264_clip_uint8 (int x)
> > > +{
> > > +  return x & (~255) ? (-x) >> 31 : x;
> > > +}
> > > +
> > > +void
> > > +mc_weight (uint8_t* __restrict dst, uint8_t* __restrict src,
> > > +          int i_width,int i_scale)
> > > +{
> > > +  for(int x = 0; x < i_width; x++)
> > > +    dst[x] = x264_clip_uint8 (src[x] * i_scale);
> > > +}
> > > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > > index f14b7e8b7e6..41fea78dc8d 100644
> > > --- a/gcc/tree-ssa-phiopt.cc
> > > +++ b/gcc/tree-ssa-phiopt.cc
> > > @@ -4087,7 +4087,7 @@ pass_phiopt::execute (function *)
> > >
> > >        gphi *newphi;
> > >        if (single_pred_p (bb1)
> > > -         && !diamond_p
> > > +         && EDGE_COUNT (merge->preds) == 2
> > >           && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > >                                                           arg0, arg1,
> > >                                                           cond_stmt)))
> > > --
> > > 2.31.1
> > >
  
Richard Biener May 8, 2023, 7:28 a.m. UTC | #4
On Mon, May 8, 2023 at 9:09 AM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Sun, May 7, 2023 at 11:53 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Mon, May 8, 2023 at 8:51 AM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Sun, May 7, 2023 at 7:05 AM Andrew Pinski via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > So the function factor_out_conditional_conversion already supports
> > > > diamond shaped bb forms, just need to be called for such a thing.
> > > >
> > > > harden-cond-comp.c needed to be changed as we would optimize out the
> > > > conversion now and that causes the compare hardening not needing to
> > > > split the block which it was testing. So change it such that there
> > > > would be no chance of optimization.
> > > >
> > > > Also add two testcases that showed the improvement. PR 103771 is
> > > > solved in ifconvert also for the vectorizer but now it is solved
> > > > in a general sense.
> > > >
> > > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > >
> > > So what does it do for the diamond case?  Factor out _common_ operations
> > > only or also promote conditionally executed operations to unconditionally
> > > executed?  Apart from cost considerations (with the looping patch any number
> > > might end up promoted) there's correctness issues for negation (signed overflow)
> > > and for non-call exceptions there's possible external throws.
> > >
> > > If it only factors out common operations the patch is OK (but the testcases
> > > suggest that's not the case).
>
> Right because the testcases are also testing if there was another
> phiopt that happened in the same pass rather than having to wait
> again.
>
> >> Otherwise we should add a limit as to how many
> > > ops we hoist (maybe one for the diamond case?).  Thinking over it the same
> > > issue of course exists for the non-diamond case?
> >
> > Oh, and we hoist the stuff without being sure the final PHI will be if-converted
> > in the end, correct?  Can we somehow improve on that, aka tentatively
> > convert if-convert the PHI first?
>
> Yes it hoists the stuff without any checks on if the final PHI will
> have happened.
> We have the same issue with hoisting adjacent loads.
> I will think of a way to handle to maybe handle this better and even
> undoing it; I will have to get back to you the details though.
> And yes that was already an existing issue with the non-diamond case
> of this; there was some extra checks added to prevent moving some
> casts from being moved out of the PHI due to extra zero extends with
> respect of __builtin_popcount/clz, etc.

OK.  Would be nice if you can somehow handle this.  Even adding a
static limit to the number of hoisted ops isn't good - multiple phiopt
passes will simply run up to N times such limit :/  Undoing might be
"easiest", best would be separating analysis and transform phase
somehow.

Let's go with the patch as-is in the meantime.

Thanks,
Richard.

> Thanks,
> Andrew
>
>
>
> >
> > Richard.
> >
> > > Richard.
> > >
> > > >         PR tree-optimization/49959
> > > >         PR tree-optimization/103771
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         * tree-ssa-phiopt.cc (pass_phiopt::execute): Support
> > > >         Diamond shapped bb form for factor_out_conditional_conversion.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         * c-c++-common/torture/harden-cond-comp.c: Change testcase
> > > >         slightly to avoid the new phiopt optimization.
> > > >         * gcc.dg/tree-ssa/abs-2.c: New test.
> > > >         * gcc.dg/tree-ssa/pr103771.c: New test.
> > > > ---
> > > >  .../c-c++-common/torture/harden-cond-comp.c   |  6 +++---
> > > >  gcc/testsuite/gcc.dg/tree-ssa/abs-2.c         | 20 +++++++++++++++++++
> > > >  gcc/testsuite/gcc.dg/tree-ssa/pr103771.c      | 18 +++++++++++++++++
> > > >  gcc/tree-ssa-phiopt.cc                        |  2 +-
> > > >  4 files changed, 42 insertions(+), 4 deletions(-)
> > > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > > >
> > > > diff --git a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > > > index 5aad890a1d3..dcf364ee993 100644
> > > > --- a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > > > +++ b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
> > > > @@ -1,11 +1,11 @@
> > > >  /* { dg-do compile } */
> > > >  /* { dg-options "-fharden-conditional-branches -fharden-compares -fdump-tree-hardcbr -fdump-tree-hardcmp -ffat-lto-objects" } */
> > > >
> > > > -int f(int i, int j) {
> > > > +int f(int i, int j, int k, int l) {
> > > >    if (i == 0)
> > > > -    return j != 0;
> > > > +    return (j != 0) + l;
> > > >    else
> > > > -    return i * j != 0;
> > > > +    return (i * j != 0) * k;
> > > >  }
> > > >
> > > >  /* { dg-final { scan-tree-dump-times "Splitting edge" 2 "hardcbr" } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > > > new file mode 100644
> > > > index 00000000000..328b1802541
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
> > > > @@ -0,0 +1,20 @@
> > > > +/* PR tree-optimization/49959 */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O -fdump-tree-phiopt1-details" } */
> > > > +
> > > > +#define ABS(X)    (((X)>0)?(X):-(X))
> > > > +unsigned long
> > > > +test_abs(int *cur)
> > > > +{
> > > > +  unsigned long sad = 0;
> > > > +  if (cur[0] > 0)
> > > > +    sad = cur[0];
> > > > +  else
> > > > +    sad = -cur[0];
> > > > +  return sad;
> > > > +}
> > > > +
> > > > +/* We should figure out that test_abs has an ABS_EXPR in it. */
> > > > +/* { dg-final { scan-tree-dump " = ABS_EXPR" "phiopt1"} } */
> > > > +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from" 1 "phiopt1"} } */
> > > > +
> > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > > > new file mode 100644
> > > > index 00000000000..97c9db846cb
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
> > > > @@ -0,0 +1,18 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3 -fdump-tree-phiopt1-details" } */
> > > > +/* { dg-final { scan-tree-dump-times "changed to factor conversion out from COND_EXPR." 1 "phiopt1" } } */
> > > > +
> > > > +typedef unsigned char uint8_t;
> > > > +
> > > > +static uint8_t x264_clip_uint8 (int x)
> > > > +{
> > > > +  return x & (~255) ? (-x) >> 31 : x;
> > > > +}
> > > > +
> > > > +void
> > > > +mc_weight (uint8_t* __restrict dst, uint8_t* __restrict src,
> > > > +          int i_width,int i_scale)
> > > > +{
> > > > +  for(int x = 0; x < i_width; x++)
> > > > +    dst[x] = x264_clip_uint8 (src[x] * i_scale);
> > > > +}
> > > > diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> > > > index f14b7e8b7e6..41fea78dc8d 100644
> > > > --- a/gcc/tree-ssa-phiopt.cc
> > > > +++ b/gcc/tree-ssa-phiopt.cc
> > > > @@ -4087,7 +4087,7 @@ pass_phiopt::execute (function *)
> > > >
> > > >        gphi *newphi;
> > > >        if (single_pred_p (bb1)
> > > > -         && !diamond_p
> > > > +         && EDGE_COUNT (merge->preds) == 2
> > > >           && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> > > >                                                           arg0, arg1,
> > > >                                                           cond_stmt)))
> > > > --
> > > > 2.31.1
> > > >
  

Patch

diff --git a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
index 5aad890a1d3..dcf364ee993 100644
--- a/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
+++ b/gcc/testsuite/c-c++-common/torture/harden-cond-comp.c
@@ -1,11 +1,11 @@ 
 /* { dg-do compile } */
 /* { dg-options "-fharden-conditional-branches -fharden-compares -fdump-tree-hardcbr -fdump-tree-hardcmp -ffat-lto-objects" } */
 
-int f(int i, int j) {
+int f(int i, int j, int k, int l) {
   if (i == 0)
-    return j != 0;
+    return (j != 0) + l;
   else
-    return i * j != 0;
+    return (i * j != 0) * k;
 }
 
 /* { dg-final { scan-tree-dump-times "Splitting edge" 2 "hardcbr" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
new file mode 100644
index 00000000000..328b1802541
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-2.c
@@ -0,0 +1,20 @@ 
+/* PR tree-optimization/49959 */
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt1-details" } */
+
+#define ABS(X)    (((X)>0)?(X):-(X))
+unsigned long
+test_abs(int *cur)
+{
+  unsigned long sad = 0;
+  if (cur[0] > 0)
+    sad = cur[0];
+  else
+    sad = -cur[0];
+  return sad;
+}
+
+/* We should figure out that test_abs has an ABS_EXPR in it. */
+/* { dg-final { scan-tree-dump " = ABS_EXPR" "phiopt1"} } */
+/* { dg-final { scan-tree-dump-times "changed to factor conversion out from" 1 "phiopt1"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
new file mode 100644
index 00000000000..97c9db846cb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103771.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-phiopt1-details" } */
+/* { dg-final { scan-tree-dump-times "changed to factor conversion out from COND_EXPR." 1 "phiopt1" } } */
+
+typedef unsigned char uint8_t;
+
+static uint8_t x264_clip_uint8 (int x)
+{
+  return x & (~255) ? (-x) >> 31 : x;
+}
+
+void
+mc_weight (uint8_t* __restrict dst, uint8_t* __restrict src,
+	   int i_width,int i_scale)
+{
+  for(int x = 0; x < i_width; x++)
+    dst[x] = x264_clip_uint8 (src[x] * i_scale);
+}
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index f14b7e8b7e6..41fea78dc8d 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -4087,7 +4087,7 @@  pass_phiopt::execute (function *)
 
       gphi *newphi;
       if (single_pred_p (bb1)
-	  && !diamond_p
+	  && EDGE_COUNT (merge->preds) == 2
 	  && (newphi = factor_out_conditional_conversion (e1, e2, phi,
 							  arg0, arg1,
 							  cond_stmt)))