[v4] A new copy propagation and PHI elimination pass

Message ID ZXnX24ZRkpHr7B-m@localhost.localdomain
State Accepted
Headers
Series [v4] A new copy propagation and PHI elimination pass |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Filip Kastl Dec. 13, 2023, 4:12 p.m. UTC
  > > > Hi,
> > > 
> > > this is a patch that I submitted two months ago as an RFC. I added some polish
> > > since.
> > > 
> > > It is a new lightweight pass that removes redundant PHI functions and as a
> > > bonus does basic copy propagation. With Jan Hubi?ka we measured that it is able
> > > to remove usually more than 5% of all PHI functions when run among early passes
> > > (sometimes even 13% or more). Those are mostly PHI functions that would be
> > > later optimized away but with this pass it is possible to remove them early
> > > enough so that they don't get streamed when runing LTO (and also potentially
> > > inlined at multiple places). It is also able to remove some redundant PHIs
> > > that otherwise would still be present during RTL expansion.
> > > 
> > > Jakub Jel?nek was concerned about debug info coverage so I compiled cc1plus
> > > with and without this patch. These are the sizes of .debug_info and
> > > .debug_loclists
> > > 
> > > .debug_info without patch 181694311
> > > .debug_info    with patch 181692320
> > > +0.0011% change
> > > 
> > > .debug_loclists without patch 47934753
> > > .debug_loclists    with patch 47934966
> > > -0.0004% change
> > > 
> > > I wanted to use dwlocstat to compare debug coverages but didn't manage to get
> > > the program working on my machine sadly. Hope this suffices. Seems to me that
> > > my patch doesn't have a significant impact on debug info.
> > > 
> > > Bootstraped and tested* on x86_64-pc-linux-gnu.
> > > 
> > > * One testcase (pr79691.c) did regress. However that is because the test is
> > > dependent on a certain variable not being copy propagated. I will go into more
> > > detail about this in a reply to this mail.
> > > 
> > > Ok to commit?
> > 
> > This is a second version of the patch.  In this version, I modified the
> > pr79691.c testcase so that it works as intended with other changes from the
> > patch.
> > 
> > The pr79691.c testcase checks that we get constants from snprintf calls and
> > that they simplify into a single constant.  The testcase doesn't account for
> > the fact that this constant may be further copy propagated which is exactly
> > what happens with this patch applied.
> > 
> > Bootstrapped and tested on x86_64-pc-linux-gnu.
> > 
> > Ok to commit?
> 
> This is the third version of the patch. In this version, I adressed most of
> Richards remarks about the second version. Here is a summary of changes I made:
> 
> - Rename the pass from tree-ssa-sccopy.cc to gimple-ssa-sccopy.cc
> - Use simple_dce_from_worklist to remove propagated statements
> - Use existing replace_uses API instead of reinventing it
>   - This allowed me to get rid of some now redundant cleanup code
> - Encapsulate the SCC finding into a class
> - Rework stmt_may_generate_copy to get rid of redundant checks
> - Add check that PHI doesn't contain two non-SSA-name values to
>   stmt_may_generate_copy
> - Regarding alignment and value ranges in stmt_may_generate_copy: For now use
>   the conservative check that Richard suggested
> - Index array of vertices that SCC discovery uses by SSA name version numbers
>   instead of numbering statements myself.
> 
> 
> I didn't make any changes based on these remarks:
> 
> 1 It might be nice to optimize SCCs of size 1 somehow, not sure how
>   many times these appear - possibly prevent them from even entering
>   the SCC discovery?
> 
> It would be nice. But the only way to do this that I see right now is to first
> propagate SCCs of size 1 and then the rest. This would mean adding a new copy
> propagation procedure. It wouldn't be a trivial procedure. Efficiency of the
> pass relies on having SCCs topogically sorted so this procedure would have to
> implement some topological sort algorithm.
> 
> This could be done. It could save allocating some vec<>s (right now, SCCs of
> size 1 are represented by a vec<> with a single element). But is it worth it to
> optimize the pass this way right now? If possible, I'd like to see that the
> pass works and sort out any problems people encounter with it before I start
> optimizing it.
> 
> 2 Instead of collecting all stmts that may generate a copy at the beginning of
>   the pass into a vec<>, let the SCC discovery check that statements may
>   generate a copy on the fly.
> 
> This would be a big change to the pass, it would require a lot of reworking.
> I'm also not sure if this would help reduce the number of allocated vec<>s that
> much because I'll still want to represent SCCs by vec<>s.
> 
> Again - its possible I'll want to rework the pass in this way in the future but
> I'd like to leave it as it is for now.
> 
> 3 Add a comment saying that the pass is doing optimistic copy propagation
> 
> I don't think the pass works in an optimistic way. It doesn't assume that all
> variables are copies of each other at any point. It instead identifies copy
> statements (or PHI SCCs that act as copy statements) and then for each of these
> it propagates: By that I mean if a copy statement says that _3 is a copy of _2,
> then it replaces all uses of _3 by _2.
> 
> But its possible that I still misinterpret what 'optimistic' means. If
> mentioning that it works in an optimistic way would help readability of the
> pass, I'm happy to add that comment.
>
> Richard, is the patch ok as it is now? Or do you think that making changes 1, 2
> or 3 is necessary? Or are there any other issues which should be adressed?
>
> Filip

This is the fourth version of this patch. Changes in this version:
- Disabled the pass for -Og
  - This meant that I could revert my changes to gcc.dg/tree-ssa/pr79691.c
- Fixed formatting of the patch (like braces around a single-statment if body)
- compute_sccs (auto_vec<>...) -> compute_sccs (vec<> ...)
- auto_vec<gimple *> worklist; worklist = ...; ->
  auto_vec<gimple *> worklist = ...;

I've also checked that there are no memory leak issues. Valgrind doesn't detect
any aditional leaks when running with sccopy vs running without sccopy:

[fkastl@scala playground]$ valgrind $HOME/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2
==27363== Memcheck, a memory error detector
==27363== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==27363== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==27363== Command: /home/fkastl/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2
==27363== 
==27363== 
==27363== HEAP SUMMARY:
==27363==     in use at exit: 174,234 bytes in 92 blocks
==27363==   total heap usage: 408 allocs, 316 frees, 228,050 bytes allocated
==27363== 
==27363== LEAK SUMMARY:
==27363==    definitely lost: 5,935 bytes in 23 blocks
==27363==    indirectly lost: 82 bytes in 5 blocks
==27363==      possibly lost: 0 bytes in 0 blocks
==27363==    still reachable: 168,217 bytes in 64 blocks
==27363==                       of which reachable via heuristic:
==27363==                         newarray           : 1,544 bytes in 1 blocks
==27363==         suppressed: 0 bytes in 0 blocks
==27363== Rerun with --leak-check=full to see details of leaked memory
==27363== 
==27363== For lists of detected and suppressed errors, rerun with: -s
==27363== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

[fkastl@scala playground]$ valgrind $HOME/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 -fdisable-tree-sccopy1 -fdisable-tree-sccopy2
==27372== Memcheck, a memory error detector
==27372== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==27372== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==27372== Command: /home/fkastl/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 -fdisable-tree-sccopy1 -fdisable-tree-sccopy2
==27372== 
cc1: note: disable pass tree-sccopy1 for functions in the range of [0, 4294967295]
cc1: note: disable pass tree-sccopy2 for functions in the range of [0, 4294967295]
==27372== 
==27372== HEAP SUMMARY:
==27372==     in use at exit: 174,282 bytes in 92 blocks
==27372==   total heap usage: 408 allocs, 316 frees, 228,674 bytes allocated
==27372== 
==27372== LEAK SUMMARY:
==27372==    definitely lost: 5,935 bytes in 23 blocks
==27372==    indirectly lost: 82 bytes in 5 blocks
==27372==      possibly lost: 0 bytes in 0 blocks
==27372==    still reachable: 168,265 bytes in 64 blocks
==27372==                       of which reachable via heuristic:
==27372==                         newarray           : 1,544 bytes in 1 blocks
==27372==         suppressed: 0 bytes in 0 blocks
==27372== Rerun with --leak-check=full to see details of leaked memory
==27372== 
==27372== For lists of detected and suppressed errors, rerun with: -s
==27372== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Richard, I wrote down the bitmap vs hash_set optimizations you suggested so
that I can work on the pass more later and optimize it.

I didn't yet bootstrap this version. I just regtested it. Will bootstrap and
regtest on the most current commit. Once that is successfully done, is the pass
OK to be pushed to main?

Filip

-- >8 --

This patch adds the strongly-connected copy propagation (SCCOPY) pass.
It is a lightweight GIMPLE copy propagation pass that also removes some
redundant PHI statements. It handles degenerate PHIs, e.g.:

_5 = PHI <_1>;
_6 = PHI <_6, _6, _1, _1>;
_7 = PHI <16, _7>;
// Replaces occurences of _5 and _6 by _1 and _7 by 16

It also handles more complicated situations, e.g.:

_8 = PHI <_9, _10>;
_9 = PHI <_8, _10>;
_10 = PHI <_8, _9, _1>;
// Replaces occurences of _8, _9 and _10 by _1

gcc/ChangeLog:

	* Makefile.in: Added sccopy pass.
	* passes.def: Added sccopy pass before LTO streaming and before
	  RTL expansion.
	* tree-pass.h (make_pass_sccopy): Added sccopy pass.
	* gimple-ssa-sccopy.cc: New file.

gcc/testsuite/ChangeLog:

	* gcc.dg/sccopy-1.c: New test.

Signed-off-by: Filip Kastl <fkastl@suse.cz>
---
 gcc/Makefile.in                 |   1 +
 gcc/gimple-ssa-sccopy.cc        | 680 ++++++++++++++++++++++++++++++++
 gcc/passes.def                  |   2 +
 gcc/testsuite/gcc.dg/sccopy-1.c |  78 ++++
 gcc/tree-pass.h                 |   1 +
 5 files changed, 762 insertions(+)
 create mode 100644 gcc/gimple-ssa-sccopy.cc
 create mode 100644 gcc/testsuite/gcc.dg/sccopy-1.c
  

Comments

Richard Biener Dec. 13, 2023, 4:15 p.m. UTC | #1
> Am 13.12.2023 um 17:12 schrieb Filip Kastl <fkastl@suse.cz>:
> 
> 
>> 
>>>> Hi,
>>>> 
>>>> this is a patch that I submitted two months ago as an RFC. I added some polish
>>>> since.
>>>> 
>>>> It is a new lightweight pass that removes redundant PHI functions and as a
>>>> bonus does basic copy propagation. With Jan Hubi?ka we measured that it is able
>>>> to remove usually more than 5% of all PHI functions when run among early passes
>>>> (sometimes even 13% or more). Those are mostly PHI functions that would be
>>>> later optimized away but with this pass it is possible to remove them early
>>>> enough so that they don't get streamed when runing LTO (and also potentially
>>>> inlined at multiple places). It is also able to remove some redundant PHIs
>>>> that otherwise would still be present during RTL expansion.
>>>> 
>>>> Jakub Jel?nek was concerned about debug info coverage so I compiled cc1plus
>>>> with and without this patch. These are the sizes of .debug_info and
>>>> .debug_loclists
>>>> 
>>>> .debug_info without patch 181694311
>>>> .debug_info    with patch 181692320
>>>> +0.0011% change
>>>> 
>>>> .debug_loclists without patch 47934753
>>>> .debug_loclists    with patch 47934966
>>>> -0.0004% change
>>>> 
>>>> I wanted to use dwlocstat to compare debug coverages but didn't manage to get
>>>> the program working on my machine sadly. Hope this suffices. Seems to me that
>>>> my patch doesn't have a significant impact on debug info.
>>>> 
>>>> Bootstraped and tested* on x86_64-pc-linux-gnu.
>>>> 
>>>> * One testcase (pr79691.c) did regress. However that is because the test is
>>>> dependent on a certain variable not being copy propagated. I will go into more
>>>> detail about this in a reply to this mail.
>>>> 
>>>> Ok to commit?
>>> 
>>> This is a second version of the patch.  In this version, I modified the
>>> pr79691.c testcase so that it works as intended with other changes from the
>>> patch.
>>> 
>>> The pr79691.c testcase checks that we get constants from snprintf calls and
>>> that they simplify into a single constant.  The testcase doesn't account for
>>> the fact that this constant may be further copy propagated which is exactly
>>> what happens with this patch applied.
>>> 
>>> Bootstrapped and tested on x86_64-pc-linux-gnu.
>>> 
>>> Ok to commit?
>> 
>> This is the third version of the patch. In this version, I adressed most of
>> Richards remarks about the second version. Here is a summary of changes I made:
>> 
>> - Rename the pass from tree-ssa-sccopy.cc to gimple-ssa-sccopy.cc
>> - Use simple_dce_from_worklist to remove propagated statements
>> - Use existing replace_uses API instead of reinventing it
>>  - This allowed me to get rid of some now redundant cleanup code
>> - Encapsulate the SCC finding into a class
>> - Rework stmt_may_generate_copy to get rid of redundant checks
>> - Add check that PHI doesn't contain two non-SSA-name values to
>>  stmt_may_generate_copy
>> - Regarding alignment and value ranges in stmt_may_generate_copy: For now use
>>  the conservative check that Richard suggested
>> - Index array of vertices that SCC discovery uses by SSA name version numbers
>>  instead of numbering statements myself.
>> 
>> 
>> I didn't make any changes based on these remarks:
>> 
>> 1 It might be nice to optimize SCCs of size 1 somehow, not sure how
>>  many times these appear - possibly prevent them from even entering
>>  the SCC discovery?
>> 
>> It would be nice. But the only way to do this that I see right now is to first
>> propagate SCCs of size 1 and then the rest. This would mean adding a new copy
>> propagation procedure. It wouldn't be a trivial procedure. Efficiency of the
>> pass relies on having SCCs topogically sorted so this procedure would have to
>> implement some topological sort algorithm.
>> 
>> This could be done. It could save allocating some vec<>s (right now, SCCs of
>> size 1 are represented by a vec<> with a single element). But is it worth it to
>> optimize the pass this way right now? If possible, I'd like to see that the
>> pass works and sort out any problems people encounter with it before I start
>> optimizing it.
>> 
>> 2 Instead of collecting all stmts that may generate a copy at the beginning of
>>  the pass into a vec<>, let the SCC discovery check that statements may
>>  generate a copy on the fly.
>> 
>> This would be a big change to the pass, it would require a lot of reworking.
>> I'm also not sure if this would help reduce the number of allocated vec<>s that
>> much because I'll still want to represent SCCs by vec<>s.
>> 
>> Again - its possible I'll want to rework the pass in this way in the future but
>> I'd like to leave it as it is for now.
>> 
>> 3 Add a comment saying that the pass is doing optimistic copy propagation
>> 
>> I don't think the pass works in an optimistic way. It doesn't assume that all
>> variables are copies of each other at any point. It instead identifies copy
>> statements (or PHI SCCs that act as copy statements) and then for each of these
>> it propagates: By that I mean if a copy statement says that _3 is a copy of _2,
>> then it replaces all uses of _3 by _2.
>> 
>> But its possible that I still misinterpret what 'optimistic' means. If
>> mentioning that it works in an optimistic way would help readability of the
>> pass, I'm happy to add that comment.
>> 
>> Richard, is the patch ok as it is now? Or do you think that making changes 1, 2
>> or 3 is necessary? Or are there any other issues which should be adressed?
>> 
>> Filip
> 
> This is the fourth version of this patch. Changes in this version:
> - Disabled the pass for -Og
>  - This meant that I could revert my changes to gcc.dg/tree-ssa/pr79691.c
> - Fixed formatting of the patch (like braces around a single-statment if body)
> - compute_sccs (auto_vec<>...) -> compute_sccs (vec<> ...)
> - auto_vec<gimple *> worklist; worklist = ...; ->
>  auto_vec<gimple *> worklist = ...;
> 
> I've also checked that there are no memory leak issues. Valgrind doesn't detect
> any aditional leaks when running with sccopy vs running without sccopy:
> 
> [fkastl@scala playground]$ valgrind $HOME/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2
> ==27363== Memcheck, a memory error detector
> ==27363== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==27363== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
> ==27363== Command: /home/fkastl/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2
> ==27363== 
> ==27363== 
> ==27363== HEAP SUMMARY:
> ==27363==     in use at exit: 174,234 bytes in 92 blocks
> ==27363==   total heap usage: 408 allocs, 316 frees, 228,050 bytes allocated
> ==27363== 
> ==27363== LEAK SUMMARY:
> ==27363==    definitely lost: 5,935 bytes in 23 blocks
> ==27363==    indirectly lost: 82 bytes in 5 blocks
> ==27363==      possibly lost: 0 bytes in 0 blocks
> ==27363==    still reachable: 168,217 bytes in 64 blocks
> ==27363==                       of which reachable via heuristic:
> ==27363==                         newarray           : 1,544 bytes in 1 blocks
> ==27363==         suppressed: 0 bytes in 0 blocks
> ==27363== Rerun with --leak-check=full to see details of leaked memory
> ==27363== 
> ==27363== For lists of detected and suppressed errors, rerun with: -s
> ==27363== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> 
> [fkastl@scala playground]$ valgrind $HOME/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 -fdisable-tree-sccopy1 -fdisable-tree-sccopy2
> ==27372== Memcheck, a memory error detector
> ==27372== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==27372== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
> ==27372== Command: /home/fkastl/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 -fdisable-tree-sccopy1 -fdisable-tree-sccopy2
> ==27372== 
> cc1: note: disable pass tree-sccopy1 for functions in the range of [0, 4294967295]
> cc1: note: disable pass tree-sccopy2 for functions in the range of [0, 4294967295]
> ==27372== 
> ==27372== HEAP SUMMARY:
> ==27372==     in use at exit: 174,282 bytes in 92 blocks
> ==27372==   total heap usage: 408 allocs, 316 frees, 228,674 bytes allocated
> ==27372== 
> ==27372== LEAK SUMMARY:
> ==27372==    definitely lost: 5,935 bytes in 23 blocks
> ==27372==    indirectly lost: 82 bytes in 5 blocks
> ==27372==      possibly lost: 0 bytes in 0 blocks
> ==27372==    still reachable: 168,265 bytes in 64 blocks
> ==27372==                       of which reachable via heuristic:
> ==27372==                         newarray           : 1,544 bytes in 1 blocks
> ==27372==         suppressed: 0 bytes in 0 blocks
> ==27372== Rerun with --leak-check=full to see details of leaked memory
> ==27372== 
> ==27372== For lists of detected and suppressed errors, rerun with: -s
> ==27372== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> 
> Richard, I wrote down the bitmap vs hash_set optimizations you suggested so
> that I can work on the pass more later and optimize it.
> 
> I didn't yet bootstrap this version. I just regtested it. Will bootstrap and
> regtest on the most current commit. Once that is successfully done, is the pass
> OK to be pushed to main?

Yes.

Thanks,
Richard 


> Filip
> 
> -- >8 --
> 
> This patch adds the strongly-connected copy propagation (SCCOPY) pass.
> It is a lightweight GIMPLE copy propagation pass that also removes some
> redundant PHI statements. It handles degenerate PHIs, e.g.:
> 
> _5 = PHI <_1>;
> _6 = PHI <_6, _6, _1, _1>;
> _7 = PHI <16, _7>;
> // Replaces occurences of _5 and _6 by _1 and _7 by 16
> 
> It also handles more complicated situations, e.g.:
> 
> _8 = PHI <_9, _10>;
> _9 = PHI <_8, _10>;
> _10 = PHI <_8, _9, _1>;
> // Replaces occurences of _8, _9 and _10 by _1
> 
> gcc/ChangeLog:
> 
>    * Makefile.in: Added sccopy pass.
>    * passes.def: Added sccopy pass before LTO streaming and before
>      RTL expansion.
>    * tree-pass.h (make_pass_sccopy): Added sccopy pass.
>    * gimple-ssa-sccopy.cc: New file.
> 
> gcc/testsuite/ChangeLog:
> 
>    * gcc.dg/sccopy-1.c: New test.
> 
> Signed-off-by: Filip Kastl <fkastl@suse.cz>
> ---
> gcc/Makefile.in                 |   1 +
> gcc/gimple-ssa-sccopy.cc        | 680 ++++++++++++++++++++++++++++++++
> gcc/passes.def                  |   2 +
> gcc/testsuite/gcc.dg/sccopy-1.c |  78 ++++
> gcc/tree-pass.h                 |   1 +
> 5 files changed, 762 insertions(+)
> create mode 100644 gcc/gimple-ssa-sccopy.cc
> create mode 100644 gcc/testsuite/gcc.dg/sccopy-1.c
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 753f2f36618..2e6f2fef1ba 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1497,6 +1497,7 @@ OBJS = \
>    gimple-ssa-backprop.o \
>    gimple-ssa-isolate-paths.o \
>    gimple-ssa-nonnull-compare.o \
> +    gimple-ssa-sccopy.o \
>    gimple-ssa-split-paths.o \
>    gimple-ssa-store-merging.o \
>    gimple-ssa-strength-reduction.o \
> diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc
> new file mode 100644
> index 00000000000..ac5ec32eb32
> --- /dev/null
> +++ b/gcc/gimple-ssa-sccopy.cc
> @@ -0,0 +1,680 @@
> +/* Strongly-connected copy propagation pass for the GNU compiler.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   Contributed by Filip Kastl <fkastl@suse.cz>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "backend.h"
> +#include "tree.h"
> +#include "gimple.h"
> +#include "tree-pass.h"
> +#include "ssa.h"
> +#include "gimple-iterator.h"
> +#include "vec.h"
> +#include "hash-set.h"
> +#include <algorithm>
> +#include "ssa-iterators.h"
> +#include "gimple-fold.h"
> +#include "gimplify.h"
> +#include "tree-cfg.h"
> +#include "tree-eh.h"
> +#include "builtins.h"
> +#include "tree-ssa-dce.h"
> +#include "fold-const.h"
> +
> +/* Strongly connected copy propagation pass.
> +
> +   This is a lightweight copy propagation pass that is also able to eliminate
> +   redundant PHI statements.  The pass considers the following types of copy
> +   statements:
> +
> +   1 An assignment statement with a single argument.
> +
> +   _3 = _2;
> +   _4 = 5;
> +
> +   2 A degenerate PHI statement.  A degenerate PHI is a PHI that only refers to
> +     itself or one other value.
> +
> +   _5 = PHI <_1>;
> +   _6 = PHI <_6, _6, _1, _1>;
> +   _7 = PHI <16, _7>;
> +
> +   3 A set of PHI statements that only refer to each other or to one other
> +     value.
> +
> +   _8 = PHI <_9, _10>;
> +   _9 = PHI <_8, _10>;
> +   _10 = PHI <_8, _9, _1>;
> +
> +   All of these statements produce copies and can be eliminated from the
> +   program.  For a copy statement we identify the value it creates a copy of
> +   and replace references to the statement with the value -- we propagate the
> +   copy.
> +
> +   _3 = _2; // Replace all occurences of _3 by _2
> +
> +   _8 = PHI <_9, _10>;
> +   _9 = PHI <_8, _10>;
> +   _10 = PHI <_8, _9, _1>; // Replace all occurences of _8, _9 and _10 by _1
> +
> +   To find all three types of copy statements we use an algorithm based on
> +   strongly-connected components (SCCs) in dataflow graph.  The algorithm was
> +   introduced in an article from 2013[1]. We describe the algorithm bellow.
> +
> +   To identify SCCs we implement the Robert Tarjan's SCC algorithm.  For the
> +   SCC computation we wrap potential copy statements in the 'vertex' struct.
> +   To each of these statements we also assign a vertex number ('vxnum'). Since
> +   the main algorithm has to be able to compute SCCs of subgraphs of the whole
> +   dataflow graph we use GIMPLE stmt flags to prevent Tarjan's algorithm from
> +   leaving the subgraph.
> +
> +   References:
> +
> +     [1] Simple and Efficient Construction of Static Single Assignmemnt Form,
> +     Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
> +     Section 3.2.  */
> +
> +/* Bitmap tracking statements which were propagated to be removed at the end of
> +   the pass.  */
> +
> +static bitmap dead_stmts;
> +
> +/* State of vertex during SCC discovery.
> +
> +   unvisited  Vertex hasn't yet been popped from worklist.
> +   vopen      DFS has visited vertex for the first time.  Vertex has been put
> +          on Tarjan stack.
> +   closed     DFS has backtracked through vertex.  At this point, vertex
> +          doesn't have any unvisited neighbors.
> +   in_scc     Vertex has been popped from Tarjan stack.  */
> +
> +enum vstate
> +{
> +  unvisited,
> +  vopen,
> +  closed,
> +  in_scc
> +};
> +
> +/* Information about a vertex.  Used by SCC discovery.  */
> +
> +struct vertex
> +{
> +  bool active; /* scc_discovery::compute_sccs () only considers a subgraph of
> +          the whole dataflow graph.  It uses this flag so that it knows
> +          which vertices are part of this subgraph.  */
> +  vstate state;
> +  unsigned index;
> +  unsigned lowlink;
> +};
> +
> +/* SCC discovery.
> +
> +   Used to find SCCs in a dataflow graph.  Implements Tarjan's SCC
> +   algorithm.  */
> +
> +class scc_discovery
> +{
> +public:
> +  scc_discovery ();
> +  ~scc_discovery ();
> +  auto_vec<vec<gimple *>> compute_sccs (vec<gimple *> &stmts);
> +
> +private:
> +  unsigned curr_generation = 0;
> +  vertex* vertices; /* Indexed by SSA_NAME_VERSION.  */
> +  auto_vec<unsigned> worklist; /* DFS stack.  */
> +  auto_vec<unsigned> stack; /* Tarjan stack.  */
> +
> +  void visit_neighbor (tree neigh_tree, unsigned parent_vxnum);
> +};
> +
> +scc_discovery::scc_discovery ()
> +{
> +  /* Create vertex struct for each SSA name.  */
> +  vertices = XNEWVEC (struct vertex, num_ssa_names);
> +  unsigned i = 0;
> +  for (i = 0; i < num_ssa_names; i++)
> +    vertices[i].active = false;
> +}
> +
> +scc_discovery::~scc_discovery ()
> +{
> +  XDELETEVEC (vertices);
> +}
> +
> +/* Part of 'scc_discovery::compute_sccs ()'.  */
> +
> +void
> +scc_discovery::visit_neighbor (tree neigh_tree, unsigned parent_version)
> +{
> +  if (TREE_CODE (neigh_tree) != SSA_NAME)
> +    return; /* Skip any neighbor that isn't an SSA name.  */
> +  unsigned neigh_version = SSA_NAME_VERSION (neigh_tree);
> +
> +  /* Skip neighbors outside the subgraph that Tarjan currently works
> +     with.  */
> +  if (!vertices[neigh_version].active)
> +    return;
> +
> +  vstate neigh_state = vertices[neigh_version].state;
> +  vstate parent_state = vertices[parent_version].state;
> +  if (parent_state == vopen) /* We're currently opening parent.  */
> +    {
> +      /* Put unvisited neighbors on worklist.  Update lowlink of parent
> +     vertex according to indices of neighbors present on stack.  */
> +      switch (neigh_state)
> +    {
> +    case unvisited:
> +      worklist.safe_push (neigh_version);
> +      break;
> +    case vopen:
> +    case closed:
> +      vertices[parent_version].lowlink
> +        = std::min (vertices[parent_version].lowlink,
> +            vertices[neigh_version].index);
> +      break;
> +    case in_scc:
> +      /* Ignore these edges.  */
> +      break;
> +    }
> +    }
> +  else if (parent_state == closed) /* We're currently closing parent.  */
> +    {
> +      /* Update lowlink of parent vertex according to lowlinks of
> +     children of parent (in terms of DFS tree).  */
> +      if (neigh_state == closed)
> +    {
> +      vertices[parent_version].lowlink
> +        = std::min (vertices[parent_version].lowlink,
> +            vertices[neigh_version].lowlink);
> +    }
> +    }
> +}
> +
> +/* Compute SCCs in dataflow graph on given statements 'stmts'.  Ignore
> +   statements outside 'stmts'.  Return the SCCs in a reverse topological
> +   order.
> +
> +   stmt_may_generate_copy () must be true for all statements from 'stmts'!  */
> +
> +auto_vec<vec<gimple *>>
> +scc_discovery::compute_sccs (vec<gimple *> &stmts)
> +{
> +  auto_vec<vec<gimple *>> sccs;
> +
> +  for (gimple *stmt : stmts)
> +    {
> +      unsigned i;
> +      switch (gimple_code (stmt))
> +    {
> +      case GIMPLE_ASSIGN:
> +        i = SSA_NAME_VERSION (gimple_assign_lhs (stmt));
> +        break;
> +      case GIMPLE_PHI:
> +        i = SSA_NAME_VERSION (gimple_phi_result (stmt));
> +        break;
> +      default:
> +        gcc_unreachable ();
> +    }
> +
> +      vertices[i].index = 0;
> +      vertices[i].lowlink = 0;
> +      vertices[i].state = unvisited;
> +      vertices[i].active = true; /* Mark the subgraph we'll be working on so
> +                    that we don't leave it.  */
> +
> +      worklist.safe_push (i);
> +    }
> +
> +  /* Worklist loop.  */
> +  unsigned curr_index = 0;
> +  while (!worklist.is_empty ())
> +    {
> +      unsigned i = worklist.pop ();
> +      gimple *stmt = SSA_NAME_DEF_STMT (ssa_name (i));
> +      vstate state = vertices[i].state;
> +
> +      if (state == unvisited)
> +    {
> +      vertices[i].state = vopen;
> +
> +      /* Assign index to this vertex.  */
> +      vertices[i].index = curr_index;
> +      vertices[i].lowlink = curr_index;
> +      curr_index++;
> +
> +      /* Put vertex on stack and also on worklist to be closed later.  */
> +      stack.safe_push (i);
> +      worklist.safe_push (i);
> +    }
> +      else if (state == vopen)
> +    vertices[i].state = closed;
> +
> +      /* Visit neighbors of this vertex.  */
> +      tree op;
> +      gphi *phi;
> +      switch (gimple_code (stmt))
> +    {
> +      case GIMPLE_PHI:
> +        phi = as_a <gphi *> (stmt);
> +        unsigned j;
> +        for (j = 0; j < gimple_phi_num_args (phi); j++)
> +          {
> +        op = gimple_phi_arg_def (phi, j);
> +        visit_neighbor (op, i);
> +          }
> +        break;
> +      case GIMPLE_ASSIGN:
> +        op = gimple_assign_rhs1 (stmt);
> +        visit_neighbor (op, i);
> +        break;
> +      default:
> +        gcc_unreachable ();
> +    }
> +
> +      /* If we've just closed a root vertex of an scc, pop scc from stack.  */
> +      if (state == vopen && vertices[i].lowlink == vertices[i].index)
> +    {
> +      vec<gimple *> scc = vNULL;
> +
> +      unsigned j;
> +      do
> +        {
> +          j = stack.pop ();
> +          scc.safe_push (SSA_NAME_DEF_STMT (ssa_name (j)));
> +          vertices[j].state = in_scc;
> +        }
> +      while (j != i);
> +
> +      sccs.safe_push (scc);
> +    }
> +    }
> +
> +  if (!stack.is_empty ())
> +    gcc_unreachable ();
> +
> +  /* Clear 'active' flags.  */
> +  for (gimple *stmt : stmts)
> +    {
> +      unsigned i;
> +      switch (gimple_code (stmt))
> +    {
> +      case GIMPLE_ASSIGN:
> +        i = SSA_NAME_VERSION (gimple_assign_lhs (stmt));
> +        break;
> +      case GIMPLE_PHI:
> +        i = SSA_NAME_VERSION (gimple_phi_result (stmt));
> +        break;
> +      default:
> +        gcc_unreachable ();
> +    }
> +
> +      vertices[i].active = false;
> +    }
> +
> +  return sccs;
> +}
> +
> +/* Could this statement potentially be a copy statement?
> +
> +   This pass only considers statements for which this function returns 'true'.
> +   Those are basically PHI functions and assignment statements similar to
> +
> +   _2 = _1;
> +   or
> +   _2 = 5;  */
> +
> +static bool
> +stmt_may_generate_copy (gimple *stmt)
> +{
> +  /* A PHI may generate a copy.  */
> +  if (gimple_code (stmt) == GIMPLE_PHI)
> +    {
> +      gphi *phi = as_a <gphi *> (stmt);
> +
> +      /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs.  */
> +      if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
> +    return false;
> +
> +      unsigned i;
> +      for (i = 0; i < gimple_phi_num_args (phi); i++)
> +    {
> +      tree op = gimple_phi_arg_def (phi, i);
> +      if (TREE_CODE (op) == SSA_NAME
> +          && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
> +        return false;
> +    }
> +
> +      /* If PHI has more than one unique non-SSA arguments, it won't generate a
> +     copy.  */
> +      tree const_op = NULL_TREE;
> +      for (i = 0; i < gimple_phi_num_args (phi); i++)
> +    {
> +      tree op = gimple_phi_arg_def (phi, i);
> +      if (TREE_CODE (op) != SSA_NAME)
> +        {
> +          if (const_op && !operand_equal_p (op, const_op))
> +        return false;
> +          const_op = op;
> +        }
> +    }
> +
> +      return true;
> +    }
> +
> +  /* Or a statement of type _2 = _1; OR _2 = 5; may generate a copy.  */
> +
> +  if (!gimple_assign_single_p (stmt))
> +    return false;
> +
> +  tree lhs = gimple_assign_lhs (stmt);
> +  tree rhs = gimple_assign_rhs1 (stmt);
> +
> +  if (TREE_CODE (lhs) != SSA_NAME)
> +    return false;
> +
> +  /* lhs shouldn't flow through any abnormal edges.  */
> +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> +    return false;
> +
> +  if (is_gimple_min_invariant (rhs))
> +    return true;  /* A statement of type _2 = 5;.  */
> +
> +  if (TREE_CODE (rhs) != SSA_NAME)
> +    return false;
> +
> +  /* rhs shouldn't flow through any abnormal edges.  */
> +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs))
> +    return false;
> +
> +  /* It is possible that lhs has more alignment or value range information.  By
> +     propagating we would lose this information.  So in the case that alignment
> +     or value range information differs, we are conservative and do not
> +     propagate.
> +
> +     FIXME: Propagate alignment and value range info the same way copy-prop
> +     does.  */
> +  if (POINTER_TYPE_P (TREE_TYPE (lhs))
> +      && POINTER_TYPE_P (TREE_TYPE (rhs))
> +      && SSA_NAME_PTR_INFO (lhs) != SSA_NAME_PTR_INFO (rhs))
> +    return false;
> +  if (!POINTER_TYPE_P (TREE_TYPE (lhs))
> +      && !POINTER_TYPE_P (TREE_TYPE (rhs))
> +      && SSA_NAME_RANGE_INFO (lhs) != SSA_NAME_RANGE_INFO (rhs))
> +    return false;
> +
> +  return true;  /* A statement of type _2 = _1;.  */
> +}
> +
> +/* Return all statements in cfun that could generate copies.  All statements
> +   for which stmt_may_generate_copy returns 'true'.  */
> +
> +static auto_vec<gimple *>
> +get_all_stmt_may_generate_copy (void)
> +{
> +  auto_vec<gimple *> result;
> +
> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      gimple_stmt_iterator gsi;
> +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +    {
> +      gimple *s = gsi_stmt (gsi);
> +      if (stmt_may_generate_copy (s))
> +        result.safe_push (s);
> +    }
> +
> +      gphi_iterator pi;
> +      for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi))
> +    {
> +      gimple *s = pi.phi ();
> +      if (stmt_may_generate_copy (s))
> +        result.safe_push (s);
> +    }
> +    }
> +
> +  return result;
> +}
> +
> +/* For each statement from given SCC, replace its usages by value
> +   VAL.  */
> +
> +static void
> +replace_scc_by_value (vec<gimple *> scc, tree val)
> +{
> +  for (gimple *stmt : scc)
> +    {
> +      tree name = gimple_get_lhs (stmt);
> +      replace_uses_by (name, val);
> +      bitmap_set_bit (dead_stmts, SSA_NAME_VERSION (name));
> +    }
> +
> +  if (dump_file)
> +    fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ());
> +}
> +
> +/* Part of 'sccopy_propagate ()'.  */
> +
> +static void
> +sccopy_visit_op (tree op, hash_set<tree> &outer_ops,
> +         hash_set<gimple *> &scc_set, bool &is_inner,
> +         tree &last_outer_op)
> +{
> +  bool op_in_scc = false;
> +
> +  if (TREE_CODE (op) == SSA_NAME)
> +    {
> +      gimple *op_stmt = SSA_NAME_DEF_STMT (op);
> +      if (scc_set.contains (op_stmt))
> +    op_in_scc = true;
> +    }
> +
> +  if (!op_in_scc)
> +    {
> +      outer_ops.add (op);
> +      last_outer_op = op;
> +      is_inner = false;
> +    }
> +}
> +
> +/* Main function of this pass.  Find and propagate all three types of copy
> +   statements (see pass description above).
> +
> +   This is an implementation of an algorithm from the paper Simple and
> +   Efficient Construction of Static Single Assignmemnt Form[1].  It is based
> +   on strongly-connected components (SCCs) in dataflow graph.  The original
> +   algorithm only considers PHI statements.  We extend it to also consider
> +   assignment statements of type _2 = _1;.
> +
> +   The algorithm is based on this definition of a set of redundant PHIs[1]:
> +
> +     A non-empty set P of PHI functions is redundant iff the PHI functions just
> +     reference each other or one other value
> +
> +   It uses this lemma[1]:
> +
> +     Let P be a redundant set of PHI functions.  Then there is a
> +     strongly-connected component S subset of P that is also redundant.
> +
> +   The algorithm works in this way:
> +
> +     1 Find SCCs
> +     2 For each SCC S in topological order:
> +     3   Construct set 'inner' of statements that only have other statements
> +     from S on their right hand side
> +     4   Construct set 'outer' of values that originate outside S and appear on
> +     right hand side of some statement from S
> +     5   If |outer| = 1, outer only contains a value v.  Statements in S only
> +     refer to each other or to v -- they are redundant.  Propagate v.
> +     Else, recurse on statements in inner.
> +
> +   The implementation is non-recursive.
> +
> +   References:
> +
> +     [1] Simple and Efficient Construction of Static Single Assignmemnt Form,
> +     Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
> +     Section 3.2.  */
> +
> +static void
> +sccopy_propagate ()
> +{
> +  auto_vec<gimple *> useful_stmts = get_all_stmt_may_generate_copy ();
> +  scc_discovery discovery;
> +
> +  auto_vec<vec<gimple *>> worklist = discovery.compute_sccs (useful_stmts);
> +
> +  while (!worklist.is_empty ())
> +    {
> +      vec<gimple *> scc = worklist.pop ();
> +
> +      auto_vec<gimple *> inner;
> +      hash_set<tree> outer_ops;
> +      tree last_outer_op = NULL_TREE;
> +
> +      /* Prepare hash set of PHIs in scc to query later.  */
> +      hash_set<gimple *> scc_set;
> +      for (gimple *stmt : scc)
> +    scc_set.add (stmt);
> +
> +      for (gimple *stmt : scc)
> +    {
> +      bool is_inner = true;
> +
> +      gphi *phi;
> +      tree op;
> +
> +      switch (gimple_code (stmt))
> +        {
> +          case GIMPLE_PHI:
> +        phi = as_a <gphi *> (stmt);
> +        unsigned j;
> +        for (j = 0; j < gimple_phi_num_args (phi); j++)
> +          {
> +            op = gimple_phi_arg_def (phi, j);
> +            sccopy_visit_op (op, outer_ops, scc_set, is_inner,
> +                   last_outer_op);
> +          }
> +        break;
> +          case GIMPLE_ASSIGN:
> +        op = gimple_assign_rhs1 (stmt);
> +        sccopy_visit_op (op, outer_ops, scc_set, is_inner,
> +                   last_outer_op);
> +        break;
> +          default:
> +        gcc_unreachable ();
> +        }
> +
> +      if (is_inner)
> +        inner.safe_push (stmt);
> +    }
> +
> +      if (outer_ops.elements () == 1)
> +    {
> +      /* The only operand in outer_ops.  */
> +      tree outer_op = last_outer_op;
> +      replace_scc_by_value (scc, outer_op);
> +    }
> +      else if (outer_ops.elements () > 1)
> +    {
> +      /* Add inner sccs to worklist.  */
> +      auto_vec<vec<gimple *>> inner_sccs
> +        = discovery.compute_sccs (inner);
> +      for (vec<gimple *> inner_scc : inner_sccs)
> +        worklist.safe_push (inner_scc);
> +    }
> +      else
> +    gcc_unreachable ();
> +
> +      scc.release ();
> +    }
> +}
> +
> +/* Called when pass execution starts.  */
> +
> +static void
> +init_sccopy (void)
> +{
> +  /* For propagated statements.  */
> +  dead_stmts = BITMAP_ALLOC (NULL);
> +}
> +
> +/* Called before pass execution ends.  */
> +
> +static void
> +finalize_sccopy (void)
> +{
> +  /* Remove all propagated statements.  */
> +  simple_dce_from_worklist (dead_stmts);
> +  BITMAP_FREE (dead_stmts);
> +
> +  /* Propagating a constant may create dead eh edges.  */
> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    gimple_purge_dead_eh_edges (bb);
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_sccopy =
> +{
> +  GIMPLE_PASS, /* type */
> +  "sccopy", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_NONE, /* tv_id */
> +  ( PROP_cfg | PROP_ssa ), /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */
> +};
> +
> +class pass_sccopy : public gimple_opt_pass
> +{
> +public:
> +  pass_sccopy (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_sccopy, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *) { return true; }
> +  virtual unsigned int execute (function *);
> +  opt_pass * clone () final override { return new pass_sccopy (m_ctxt); }
> +}; // class pass_sccopy
> +
> +unsigned
> +pass_sccopy::execute (function *)
> +{
> +  init_sccopy ();
> +  sccopy_propagate ();
> +  finalize_sccopy ();
> +  return 0;
> +}
> +
> +} // anon namespace
> +
> +gimple_opt_pass *
> +make_pass_sccopy (gcc::context *ctxt)
> +{
> +  return new pass_sccopy (ctxt);
> +}
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 1e1950bdb39..048ad3ee7a5 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -100,6 +100,7 @@ along with GCC; see the file COPYING3.  If not see
>      NEXT_PASS (pass_if_to_switch);
>      NEXT_PASS (pass_convert_switch);
>      NEXT_PASS (pass_cleanup_eh);
> +      NEXT_PASS (pass_sccopy);
>      NEXT_PASS (pass_profile);
>      NEXT_PASS (pass_local_pure_const);
>      NEXT_PASS (pass_modref);
> @@ -368,6 +369,7 @@ along with GCC; see the file COPYING3.  If not see
>     However, this also causes us to misdiagnose cases that should be
>     real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
>       NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
> +      NEXT_PASS (pass_sccopy);
>       NEXT_PASS (pass_tail_calls);
>       /* Split critical edges before late uninit warning to reduce the
>          number of false positives from it.  */
> diff --git a/gcc/testsuite/gcc.dg/sccopy-1.c b/gcc/testsuite/gcc.dg/sccopy-1.c
> new file mode 100644
> index 00000000000..1e61a6b320e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/sccopy-1.c
> @@ -0,0 +1,78 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fgimple -fdump-tree-sccopy -O2" } */
> +/* { dg-final { scan-tree-dump "Replacing SCC of size 2" "sccopy1" } } */
> +
> +int __GIMPLE (ssa, startwith ("sccopy"))
> +main ()
> +{
> +  int a;
> +  int y;
> +  int x;
> +  int _1;
> +  int _2;
> +  int _13;
> +
> +  __BB(2):
> +  if (x_7(D) == 5)
> +    goto __BB3;
> +  else
> +    goto __BB4;
> +
> +  __BB(3):
> +  a_10 = x_7(D);
> +  goto __BB5;
> +
> +  __BB(4):
> +  a_9 = y_8(D);
> +  goto __BB5;
> +
> +  __BB(5):
> +  a_3 = __PHI (__BB3: a_10, __BB4: a_9);
> +  if (x_7(D) == y_8(D))
> +    goto __BB6;
> +  else
> +    goto __BB11;
> +
> +  __BB(6):
> +  a_11 = a_3 + 1;
> +  goto __BB7;
> +
> +  __BB(7):
> +  a_4 = __PHI (__BB6: a_11, __BB11: a_6);
> +label1:
> +  if (x_7(D) != y_8(D))
> +    goto __BB8;
> +  else
> +    goto __BB10;
> +
> +  __BB(8):
> +  goto __BB9;
> +
> +  __BB(9):
> +  a_12 = __PHI (__BB8: a_4, __BB10: a_5);
> +  goto __BB10;
> +
> +  __BB(10,loop_header(1)):
> +  a_5 = __PHI (__BB7: a_4, __BB9: a_12);
> +label2:
> +  _1 = y_8(D) * 2;
> +  if (x_7(D) == _1)
> +    goto __BB9;
> +  else
> +    goto __BB11;
> +
> +  __BB(11):
> +  a_6 = __PHI (__BB5: a_3, __BB10: a_5);
> +  _2 = x_7(D) * 3;
> +  if (y_8(D) == _2)
> +    goto __BB7;
> +  else
> +    goto __BB12;
> +
> +  __BB(12):
> +  _13 = 0;
> +  return _13;
> +
> +}
> +
> +
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 09e6ada5b2f..94a48461590 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -399,6 +399,7 @@ extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_sccopy (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_split_paths (gcc::context *ctxt);
> extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);
> --
> 2.43.0
>
  
Filip Kastl Dec. 14, 2023, 10:26 a.m. UTC | #2
Successfully bootstrapped and regtested on x86_64-linux. Will push to master.

Filip
  

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 753f2f36618..2e6f2fef1ba 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1497,6 +1497,7 @@  OBJS = \
 	gimple-ssa-backprop.o \
 	gimple-ssa-isolate-paths.o \
 	gimple-ssa-nonnull-compare.o \
+	gimple-ssa-sccopy.o \
 	gimple-ssa-split-paths.o \
 	gimple-ssa-store-merging.o \
 	gimple-ssa-strength-reduction.o \
diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc
new file mode 100644
index 00000000000..ac5ec32eb32
--- /dev/null
+++ b/gcc/gimple-ssa-sccopy.cc
@@ -0,0 +1,680 @@ 
+/* Strongly-connected copy propagation pass for the GNU compiler.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   Contributed by Filip Kastl <fkastl@suse.cz>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "gimple-iterator.h"
+#include "vec.h"
+#include "hash-set.h"
+#include <algorithm>
+#include "ssa-iterators.h"
+#include "gimple-fold.h"
+#include "gimplify.h"
+#include "tree-cfg.h"
+#include "tree-eh.h"
+#include "builtins.h"
+#include "tree-ssa-dce.h"
+#include "fold-const.h"
+
+/* Strongly connected copy propagation pass.
+
+   This is a lightweight copy propagation pass that is also able to eliminate
+   redundant PHI statements.  The pass considers the following types of copy
+   statements:
+
+   1 An assignment statement with a single argument.
+
+   _3 = _2;
+   _4 = 5;
+
+   2 A degenerate PHI statement.  A degenerate PHI is a PHI that only refers to
+     itself or one other value.
+
+   _5 = PHI <_1>;
+   _6 = PHI <_6, _6, _1, _1>;
+   _7 = PHI <16, _7>;
+
+   3 A set of PHI statements that only refer to each other or to one other
+     value.
+
+   _8 = PHI <_9, _10>;
+   _9 = PHI <_8, _10>;
+   _10 = PHI <_8, _9, _1>;
+
+   All of these statements produce copies and can be eliminated from the
+   program.  For a copy statement we identify the value it creates a copy of
+   and replace references to the statement with the value -- we propagate the
+   copy.
+
+   _3 = _2; // Replace all occurences of _3 by _2
+
+   _8 = PHI <_9, _10>;
+   _9 = PHI <_8, _10>;
+   _10 = PHI <_8, _9, _1>; // Replace all occurences of _8, _9 and _10 by _1
+
+   To find all three types of copy statements we use an algorithm based on
+   strongly-connected components (SCCs) in dataflow graph.  The algorithm was
+   introduced in an article from 2013[1]. We describe the algorithm bellow.
+
+   To identify SCCs we implement the Robert Tarjan's SCC algorithm.  For the
+   SCC computation we wrap potential copy statements in the 'vertex' struct.
+   To each of these statements we also assign a vertex number ('vxnum'). Since
+   the main algorithm has to be able to compute SCCs of subgraphs of the whole
+   dataflow graph we use GIMPLE stmt flags to prevent Tarjan's algorithm from
+   leaving the subgraph.
+
+   References:
+
+     [1] Simple and Efficient Construction of Static Single Assignmemnt Form,
+     Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
+     Section 3.2.  */
+
+/* Bitmap tracking statements which were propagated to be removed at the end of
+   the pass.  */
+
+static bitmap dead_stmts;
+
+/* State of vertex during SCC discovery.
+
+   unvisited  Vertex hasn't yet been popped from worklist.
+   vopen      DFS has visited vertex for the first time.  Vertex has been put
+	      on Tarjan stack.
+   closed     DFS has backtracked through vertex.  At this point, vertex
+	      doesn't have any unvisited neighbors.
+   in_scc     Vertex has been popped from Tarjan stack.  */
+
+enum vstate
+{
+  unvisited,
+  vopen,
+  closed,
+  in_scc
+};
+
+/* Information about a vertex.  Used by SCC discovery.  */
+
+struct vertex
+{
+  bool active; /* scc_discovery::compute_sccs () only considers a subgraph of
+		  the whole dataflow graph.  It uses this flag so that it knows
+		  which vertices are part of this subgraph.  */
+  vstate state;
+  unsigned index;
+  unsigned lowlink;
+};
+
+/* SCC discovery.
+
+   Used to find SCCs in a dataflow graph.  Implements Tarjan's SCC
+   algorithm.  */
+
+class scc_discovery
+{
+public:
+  scc_discovery ();
+  ~scc_discovery ();
+  auto_vec<vec<gimple *>> compute_sccs (vec<gimple *> &stmts);
+
+private:
+  unsigned curr_generation = 0;
+  vertex* vertices; /* Indexed by SSA_NAME_VERSION.  */
+  auto_vec<unsigned> worklist; /* DFS stack.  */
+  auto_vec<unsigned> stack; /* Tarjan stack.  */
+
+  void visit_neighbor (tree neigh_tree, unsigned parent_vxnum);
+};
+
+scc_discovery::scc_discovery ()
+{
+  /* Create vertex struct for each SSA name.  */
+  vertices = XNEWVEC (struct vertex, num_ssa_names);
+  unsigned i = 0;
+  for (i = 0; i < num_ssa_names; i++)
+    vertices[i].active = false;
+}
+
+scc_discovery::~scc_discovery ()
+{
+  XDELETEVEC (vertices);
+}
+
+/* Part of 'scc_discovery::compute_sccs ()'.  */
+
+void
+scc_discovery::visit_neighbor (tree neigh_tree, unsigned parent_version)
+{
+  if (TREE_CODE (neigh_tree) != SSA_NAME)
+    return; /* Skip any neighbor that isn't an SSA name.  */
+  unsigned neigh_version = SSA_NAME_VERSION (neigh_tree);
+
+  /* Skip neighbors outside the subgraph that Tarjan currently works
+     with.  */
+  if (!vertices[neigh_version].active)
+    return;
+
+  vstate neigh_state = vertices[neigh_version].state;
+  vstate parent_state = vertices[parent_version].state;
+  if (parent_state == vopen) /* We're currently opening parent.  */
+    {
+      /* Put unvisited neighbors on worklist.  Update lowlink of parent
+	 vertex according to indices of neighbors present on stack.  */
+      switch (neigh_state)
+	{
+	case unvisited:
+	  worklist.safe_push (neigh_version);
+	  break;
+	case vopen:
+	case closed:
+	  vertices[parent_version].lowlink
+	    = std::min (vertices[parent_version].lowlink,
+			vertices[neigh_version].index);
+	  break;
+	case in_scc:
+	  /* Ignore these edges.  */
+	  break;
+	}
+    }
+  else if (parent_state == closed) /* We're currently closing parent.  */
+    {
+      /* Update lowlink of parent vertex according to lowlinks of
+	 children of parent (in terms of DFS tree).  */
+      if (neigh_state == closed)
+	{
+	  vertices[parent_version].lowlink
+	    = std::min (vertices[parent_version].lowlink,
+			vertices[neigh_version].lowlink);
+	}
+    }
+}
+
+/* Compute SCCs in dataflow graph on given statements 'stmts'.  Ignore
+   statements outside 'stmts'.  Return the SCCs in a reverse topological
+   order.
+
+   stmt_may_generate_copy () must be true for all statements from 'stmts'!  */
+
+auto_vec<vec<gimple *>>
+scc_discovery::compute_sccs (vec<gimple *> &stmts)
+{
+  auto_vec<vec<gimple *>> sccs;
+
+  for (gimple *stmt : stmts)
+    {
+      unsigned i;
+      switch (gimple_code (stmt))
+	{
+	  case GIMPLE_ASSIGN:
+	    i = SSA_NAME_VERSION (gimple_assign_lhs (stmt));
+	    break;
+	  case GIMPLE_PHI:
+	    i = SSA_NAME_VERSION (gimple_phi_result (stmt));
+	    break;
+	  default:
+	    gcc_unreachable ();
+	}
+
+      vertices[i].index = 0;
+      vertices[i].lowlink = 0;
+      vertices[i].state = unvisited;
+      vertices[i].active = true; /* Mark the subgraph we'll be working on so
+				    that we don't leave it.  */
+
+      worklist.safe_push (i);
+    }
+
+  /* Worklist loop.  */
+  unsigned curr_index = 0;
+  while (!worklist.is_empty ())
+    {
+      unsigned i = worklist.pop ();
+      gimple *stmt = SSA_NAME_DEF_STMT (ssa_name (i));
+      vstate state = vertices[i].state;
+
+      if (state == unvisited)
+	{
+	  vertices[i].state = vopen;
+
+	  /* Assign index to this vertex.  */
+	  vertices[i].index = curr_index;
+	  vertices[i].lowlink = curr_index;
+	  curr_index++;
+
+	  /* Put vertex on stack and also on worklist to be closed later.  */
+	  stack.safe_push (i);
+	  worklist.safe_push (i);
+	}
+      else if (state == vopen)
+	vertices[i].state = closed;
+
+      /* Visit neighbors of this vertex.  */
+      tree op;
+      gphi *phi;
+      switch (gimple_code (stmt))
+	{
+	  case GIMPLE_PHI:
+	    phi = as_a <gphi *> (stmt);
+	    unsigned j;
+	    for (j = 0; j < gimple_phi_num_args (phi); j++)
+	      {
+		op = gimple_phi_arg_def (phi, j);
+		visit_neighbor (op, i);
+	      }
+	    break;
+	  case GIMPLE_ASSIGN:
+	    op = gimple_assign_rhs1 (stmt);
+	    visit_neighbor (op, i);
+	    break;
+	  default:
+	    gcc_unreachable ();
+	}
+
+      /* If we've just closed a root vertex of an scc, pop scc from stack.  */
+      if (state == vopen && vertices[i].lowlink == vertices[i].index)
+	{
+	  vec<gimple *> scc = vNULL;
+
+	  unsigned j;
+	  do
+	    {
+	      j = stack.pop ();
+	      scc.safe_push (SSA_NAME_DEF_STMT (ssa_name (j)));
+	      vertices[j].state = in_scc;
+	    }
+	  while (j != i);
+
+	  sccs.safe_push (scc);
+	}
+    }
+
+  if (!stack.is_empty ())
+    gcc_unreachable ();
+
+  /* Clear 'active' flags.  */
+  for (gimple *stmt : stmts)
+    {
+      unsigned i;
+      switch (gimple_code (stmt))
+	{
+	  case GIMPLE_ASSIGN:
+	    i = SSA_NAME_VERSION (gimple_assign_lhs (stmt));
+	    break;
+	  case GIMPLE_PHI:
+	    i = SSA_NAME_VERSION (gimple_phi_result (stmt));
+	    break;
+	  default:
+	    gcc_unreachable ();
+	}
+
+      vertices[i].active = false;
+    }
+
+  return sccs;
+}
+
+/* Could this statement potentially be a copy statement?
+
+   This pass only considers statements for which this function returns 'true'.
+   Those are basically PHI functions and assignment statements similar to
+
+   _2 = _1;
+   or
+   _2 = 5;  */
+
+static bool
+stmt_may_generate_copy (gimple *stmt)
+{
+  /* A PHI may generate a copy.  */
+  if (gimple_code (stmt) == GIMPLE_PHI)
+    {
+      gphi *phi = as_a <gphi *> (stmt);
+
+      /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs.  */
+      if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
+	return false;
+
+      unsigned i;
+      for (i = 0; i < gimple_phi_num_args (phi); i++)
+	{
+	  tree op = gimple_phi_arg_def (phi, i);
+	  if (TREE_CODE (op) == SSA_NAME
+	      && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
+	    return false;
+	}
+
+      /* If PHI has more than one unique non-SSA arguments, it won't generate a
+	 copy.  */
+      tree const_op = NULL_TREE;
+      for (i = 0; i < gimple_phi_num_args (phi); i++)
+	{
+	  tree op = gimple_phi_arg_def (phi, i);
+	  if (TREE_CODE (op) != SSA_NAME)
+	    {
+	      if (const_op && !operand_equal_p (op, const_op))
+		return false;
+	      const_op = op;
+	    }
+	}
+
+      return true;
+    }
+
+  /* Or a statement of type _2 = _1; OR _2 = 5; may generate a copy.  */
+
+  if (!gimple_assign_single_p (stmt))
+    return false;
+
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME)
+    return false;
+
+  /* lhs shouldn't flow through any abnormal edges.  */
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
+    return false;
+
+  if (is_gimple_min_invariant (rhs))
+    return true;  /* A statement of type _2 = 5;.  */
+
+  if (TREE_CODE (rhs) != SSA_NAME)
+    return false;
+
+  /* rhs shouldn't flow through any abnormal edges.  */
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs))
+    return false;
+
+  /* It is possible that lhs has more alignment or value range information.  By
+     propagating we would lose this information.  So in the case that alignment
+     or value range information differs, we are conservative and do not
+     propagate.
+
+     FIXME: Propagate alignment and value range info the same way copy-prop
+     does.  */
+  if (POINTER_TYPE_P (TREE_TYPE (lhs))
+      && POINTER_TYPE_P (TREE_TYPE (rhs))
+      && SSA_NAME_PTR_INFO (lhs) != SSA_NAME_PTR_INFO (rhs))
+    return false;
+  if (!POINTER_TYPE_P (TREE_TYPE (lhs))
+      && !POINTER_TYPE_P (TREE_TYPE (rhs))
+      && SSA_NAME_RANGE_INFO (lhs) != SSA_NAME_RANGE_INFO (rhs))
+    return false;
+
+  return true;  /* A statement of type _2 = _1;.  */
+}
+
+/* Return all statements in cfun that could generate copies.  All statements
+   for which stmt_may_generate_copy returns 'true'.  */
+
+static auto_vec<gimple *>
+get_all_stmt_may_generate_copy (void)
+{
+  auto_vec<gimple *> result;
+
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      gimple_stmt_iterator gsi;
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *s = gsi_stmt (gsi);
+	  if (stmt_may_generate_copy (s))
+	    result.safe_push (s);
+	}
+
+      gphi_iterator pi;
+      for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi))
+	{
+	  gimple *s = pi.phi ();
+	  if (stmt_may_generate_copy (s))
+	    result.safe_push (s);
+	}
+    }
+
+  return result;
+}
+
+/* For each statement from given SCC, replace its usages by value
+   VAL.  */
+
+static void
+replace_scc_by_value (vec<gimple *> scc, tree val)
+{
+  for (gimple *stmt : scc)
+    {
+      tree name = gimple_get_lhs (stmt);
+      replace_uses_by (name, val);
+      bitmap_set_bit (dead_stmts, SSA_NAME_VERSION (name));
+    }
+
+  if (dump_file)
+    fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ());
+}
+
+/* Part of 'sccopy_propagate ()'.  */
+
+static void
+sccopy_visit_op (tree op, hash_set<tree> &outer_ops,
+		 hash_set<gimple *> &scc_set, bool &is_inner,
+		 tree &last_outer_op)
+{
+  bool op_in_scc = false;
+
+  if (TREE_CODE (op) == SSA_NAME)
+    {
+      gimple *op_stmt = SSA_NAME_DEF_STMT (op);
+      if (scc_set.contains (op_stmt))
+	op_in_scc = true;
+    }
+
+  if (!op_in_scc)
+    {
+      outer_ops.add (op);
+      last_outer_op = op;
+      is_inner = false;
+    }
+}
+
+/* Main function of this pass.  Find and propagate all three types of copy
+   statements (see pass description above).
+
+   This is an implementation of an algorithm from the paper Simple and
+   Efficient Construction of Static Single Assignmemnt Form[1].  It is based
+   on strongly-connected components (SCCs) in dataflow graph.  The original
+   algorithm only considers PHI statements.  We extend it to also consider
+   assignment statements of type _2 = _1;.
+
+   The algorithm is based on this definition of a set of redundant PHIs[1]:
+
+     A non-empty set P of PHI functions is redundant iff the PHI functions just
+     reference each other or one other value
+
+   It uses this lemma[1]:
+
+     Let P be a redundant set of PHI functions.  Then there is a
+     strongly-connected component S subset of P that is also redundant.
+
+   The algorithm works in this way:
+
+     1 Find SCCs
+     2 For each SCC S in topological order:
+     3   Construct set 'inner' of statements that only have other statements
+	 from S on their right hand side
+     4   Construct set 'outer' of values that originate outside S and appear on
+	 right hand side of some statement from S
+     5   If |outer| = 1, outer only contains a value v.  Statements in S only
+	 refer to each other or to v -- they are redundant.  Propagate v.
+	 Else, recurse on statements in inner.
+
+   The implementation is non-recursive.
+
+   References:
+
+     [1] Simple and Efficient Construction of Static Single Assignmemnt Form,
+     Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
+     Section 3.2.  */
+
+static void
+sccopy_propagate ()
+{
+  auto_vec<gimple *> useful_stmts = get_all_stmt_may_generate_copy ();
+  scc_discovery discovery;
+
+  auto_vec<vec<gimple *>> worklist = discovery.compute_sccs (useful_stmts);
+
+  while (!worklist.is_empty ())
+    {
+      vec<gimple *> scc = worklist.pop ();
+
+      auto_vec<gimple *> inner;
+      hash_set<tree> outer_ops;
+      tree last_outer_op = NULL_TREE;
+
+      /* Prepare hash set of PHIs in scc to query later.  */
+      hash_set<gimple *> scc_set;
+      for (gimple *stmt : scc)
+	scc_set.add (stmt);
+
+      for (gimple *stmt : scc)
+	{
+	  bool is_inner = true;
+
+	  gphi *phi;
+	  tree op;
+
+	  switch (gimple_code (stmt))
+	    {
+	      case GIMPLE_PHI:
+		phi = as_a <gphi *> (stmt);
+		unsigned j;
+		for (j = 0; j < gimple_phi_num_args (phi); j++)
+		  {
+		    op = gimple_phi_arg_def (phi, j);
+		    sccopy_visit_op (op, outer_ops, scc_set, is_inner,
+				   last_outer_op);
+		  }
+		break;
+	      case GIMPLE_ASSIGN:
+		op = gimple_assign_rhs1 (stmt);
+		sccopy_visit_op (op, outer_ops, scc_set, is_inner,
+			       last_outer_op);
+		break;
+	      default:
+		gcc_unreachable ();
+	    }
+
+	  if (is_inner)
+	    inner.safe_push (stmt);
+	}
+
+      if (outer_ops.elements () == 1)
+	{
+	  /* The only operand in outer_ops.  */
+	  tree outer_op = last_outer_op;
+	  replace_scc_by_value (scc, outer_op);
+	}
+      else if (outer_ops.elements () > 1)
+	{
+	  /* Add inner sccs to worklist.  */
+	  auto_vec<vec<gimple *>> inner_sccs
+	    = discovery.compute_sccs (inner);
+	  for (vec<gimple *> inner_scc : inner_sccs)
+	    worklist.safe_push (inner_scc);
+	}
+      else
+	gcc_unreachable ();
+
+      scc.release ();
+    }
+}
+
+/* Called when pass execution starts.  */
+
+static void
+init_sccopy (void)
+{
+  /* For propagated statements.  */
+  dead_stmts = BITMAP_ALLOC (NULL);
+}
+
+/* Called before pass execution ends.  */
+
+static void
+finalize_sccopy (void)
+{
+  /* Remove all propagated statements.  */
+  simple_dce_from_worklist (dead_stmts);
+  BITMAP_FREE (dead_stmts);
+
+  /* Propagating a constant may create dead eh edges.  */
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, cfun)
+    gimple_purge_dead_eh_edges (bb);
+}
+
+namespace {
+
+const pass_data pass_data_sccopy =
+{
+  GIMPLE_PASS, /* type */
+  "sccopy", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  ( PROP_cfg | PROP_ssa ), /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */
+};
+
+class pass_sccopy : public gimple_opt_pass
+{
+public:
+  pass_sccopy (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_sccopy, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *) { return true; }
+  virtual unsigned int execute (function *);
+  opt_pass * clone () final override { return new pass_sccopy (m_ctxt); }
+}; // class pass_sccopy
+
+unsigned
+pass_sccopy::execute (function *)
+{
+  init_sccopy ();
+  sccopy_propagate ();
+  finalize_sccopy ();
+  return 0;
+}
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_sccopy (gcc::context *ctxt)
+{
+  return new pass_sccopy (ctxt);
+}
diff --git a/gcc/passes.def b/gcc/passes.def
index 1e1950bdb39..048ad3ee7a5 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -100,6 +100,7 @@  along with GCC; see the file COPYING3.  If not see
 	  NEXT_PASS (pass_if_to_switch);
 	  NEXT_PASS (pass_convert_switch);
 	  NEXT_PASS (pass_cleanup_eh);
+	  NEXT_PASS (pass_sccopy);
 	  NEXT_PASS (pass_profile);
 	  NEXT_PASS (pass_local_pure_const);
 	  NEXT_PASS (pass_modref);
@@ -368,6 +369,7 @@  along with GCC; see the file COPYING3.  If not see
 	 However, this also causes us to misdiagnose cases that should be
 	 real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
       NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
+      NEXT_PASS (pass_sccopy);
       NEXT_PASS (pass_tail_calls);
       /* Split critical edges before late uninit warning to reduce the
          number of false positives from it.  */
diff --git a/gcc/testsuite/gcc.dg/sccopy-1.c b/gcc/testsuite/gcc.dg/sccopy-1.c
new file mode 100644
index 00000000000..1e61a6b320e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/sccopy-1.c
@@ -0,0 +1,78 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fgimple -fdump-tree-sccopy -O2" } */
+/* { dg-final { scan-tree-dump "Replacing SCC of size 2" "sccopy1" } } */
+
+int __GIMPLE (ssa, startwith ("sccopy"))
+main ()
+{
+  int a;
+  int y;
+  int x;
+  int _1;
+  int _2;
+  int _13;
+
+  __BB(2):
+  if (x_7(D) == 5)
+    goto __BB3;
+  else
+    goto __BB4;
+
+  __BB(3):
+  a_10 = x_7(D);
+  goto __BB5;
+
+  __BB(4):
+  a_9 = y_8(D);
+  goto __BB5;
+
+  __BB(5):
+  a_3 = __PHI (__BB3: a_10, __BB4: a_9);
+  if (x_7(D) == y_8(D))
+    goto __BB6;
+  else
+    goto __BB11;
+
+  __BB(6):
+  a_11 = a_3 + 1;
+  goto __BB7;
+
+  __BB(7):
+  a_4 = __PHI (__BB6: a_11, __BB11: a_6);
+label1:
+  if (x_7(D) != y_8(D))
+    goto __BB8;
+  else
+    goto __BB10;
+
+  __BB(8):
+  goto __BB9;
+
+  __BB(9):
+  a_12 = __PHI (__BB8: a_4, __BB10: a_5);
+  goto __BB10;
+
+  __BB(10,loop_header(1)):
+  a_5 = __PHI (__BB7: a_4, __BB9: a_12);
+label2:
+  _1 = y_8(D) * 2;
+  if (x_7(D) == _1)
+    goto __BB9;
+  else
+    goto __BB11;
+
+  __BB(11):
+  a_6 = __PHI (__BB5: a_3, __BB10: a_5);
+  _2 = x_7(D) * 3;
+  if (y_8(D) == _2)
+    goto __BB7;
+  else
+    goto __BB12;
+
+  __BB(12):
+  _13 = 0;
+  return _13;
+
+}
+
+
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 09e6ada5b2f..94a48461590 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -399,6 +399,7 @@  extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_sccopy (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_split_paths (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);